Update README.md
README.md CHANGED
@@ -14,17 +14,12 @@ In addition to cross entropy and cosine teacher-student losses, DistilProtBert w
 
 
 Access to [git](https://github.com/yarongef/DistilProtBert)
-#
+# DistilProtBert comparison to ProtBert
 
-
-
-
-
-
-| **Model** | **# of Parameters** | **# of Hidden layers** | **# of Pretraining sequences** | **Pretraining hardware** |
-|:--------------:|:--------------:|:-----------------:|:-------------------------:|:------------------------:|
-| ProtBert | 420M | 30 | 216M | 512 16GB TPUs |
-| DistilProtBert | 230M | 15 | 43M | 5 v100 32GB GPUs |
+| **Model** | **# of parameters** | **# of hidden layers** | **Pretraining dataset** | **# of pretraining sequences** | **Pretraining hardware** |
+|:--------------:|:-------------------:|:----------------------:|:-----------------------:|:------------------------------:|:------------------------:|
+| ProtBert | 420M | 30 | UniRef100 | 216M | 512 16GB Tpus |
+| DistilProtBert | 230M | 15 | UniRef50 | 43M | 5 v100 32GB GPUs |
 
 ## Intended uses & limitations
 
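The updated table reports parameter and hidden-layer counts for both models. As a rough sanity check (not part of this commit), those two columns can be reproduced from the published checkpoints; the Hub model IDs `Rostlab/prot_bert` and `yarongef/DistilProtBert` are assumptions here and may need adjusting.

```python
# Sketch only: reproduce the "# of parameters" / "# of hidden layers" columns
# of the comparison table. The model IDs below are assumed, not taken from this commit.
from transformers import AutoModel

for model_id in ["Rostlab/prot_bert", "yarongef/DistilProtBert"]:
    model = AutoModel.from_pretrained(model_id)
    n_params = sum(p.numel() for p in model.parameters())  # total trainable + frozen weights
    print(f"{model_id}: ~{n_params / 1e6:.0f}M parameters, "
          f"{model.config.num_hidden_layers} hidden layers")
```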