Update README.md
GLORT2 (GLORT2 Low Rank Transformer Transformer) is a transformer model where ev…
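Since the README only names the "low rank" idea (the model description is cut off above), here is a generic sketch of what low-rank factorization of a linear layer usually means — this is an illustration with made-up sizes, not the actual GLORT2 code: a dense `d_in × d_out` weight is replaced by two thin factors `A` (`d_in × r`) and `B` (`r × d_out`), which shrinks the parameter count whenever the rank `r` is small.

```python
import numpy as np

def low_rank_linear(x, A, B):
    """Apply a low-rank linear map y = x @ (A @ B), computed as (x @ A) @ B
    so the full d_in x d_out weight matrix is never materialized."""
    return (x @ A) @ B

# Illustrative sizes only -- not GLORT2's actual config.
d_in, d_out, r = 512, 512, 64
rng = np.random.default_rng(0)
A = rng.standard_normal((d_in, r))
B = rng.standard_normal((r, d_out))

x = rng.standard_normal((4, d_in))
y = low_rank_linear(x, A, B)
assert y.shape == (4, d_out)

dense_params = d_in * d_out             # one full weight matrix
factored_params = d_in * r + r * d_out  # two thin factors
print(dense_params, factored_params)    # the factored form is 4x smaller here
```

With `r = 64` the factored layer uses `512*64 + 64*512 = 65,536` parameters versus `262,144` for the dense weight, at the cost of restricting the map to rank at most `r`.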
Also, sorry — I just realized there's some residue from where I copied the model code from my own projects, including some "expanded lm head size" stuff. Just ignore that if you're looking at the config and code; this isn't a serious project, so I don't care too much that it's there.
| model | 512-token strided perplexity on a Pile test set | tokens |
| --- | --- | --- |
| cerebras 111m | 21.55 | 2.2b |
| cerebras 256m | 15.20 | 5.1b |
| pythia 70m | 22.39 | 300b |
| pythia 160m | 13.93 | 300b |
| pythia 410m | 9.62 | 300b |
| GLORT2 (205m) | 13.05 | 2.2b |
| custom llama w/ same settings as cerebras 111m | 13.88 | 2.2b |

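For reference, "strided" perplexity is typically computed by sliding a fixed-size window over the test tokens and scoring only the tokens not covered by a previous window, then exponentiating the mean negative log-likelihood. The sketch below uses a stand-in `log_prob_fn` in place of a real model forward pass, since the actual evaluation code isn't shown here:

```python
import math

def strided_perplexity(tokens, log_prob_fn, window=512, stride=512):
    """Perplexity of `tokens` under a sliding-window evaluation.

    `log_prob_fn(context, token)` is a stand-in for a real LM call: it
    returns the log-probability of `token` given `context`. Each token is
    scored exactly once; with stride < window, earlier windows overlap
    later ones so scored tokens get more left context.
    """
    nll, count, prev_end = 0.0, 0, 0
    for begin in range(0, len(tokens), stride):
        end = min(begin + window, len(tokens))
        # Only tokens not already scored by a previous window count.
        for i in range(prev_end, end):
            context = tokens[begin:i]  # left context within this window
            nll -= log_prob_fn(context, tokens[i])
            count += 1
        prev_end = end
        if end == len(tokens):
            break
    return math.exp(nll / count)

# Sanity check: a uniform model over a vocab of size V has perplexity V.
V = 10
uniform = lambda context, tok: math.log(1.0 / V)
print(strided_perplexity(list(range(100)), uniform, window=16, stride=16))
```

With `stride == window` (as in the 512-token setting above) the windows don't overlap, so this reduces to chunking the test set into 512-token blocks and averaging the per-token loss across all of them.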
| Tasks          | Version | Filter | n-shot | Metric   |  Value |   | Stderr |
|----------------|--------:|--------|-------:|----------|-------:|---|-------:|
| arc_challenge  |       1 | none   |     25 | acc      | 0.1706 | ± | 0.0110 |
|                |         | none   |     25 | acc_norm | 0.2099 | ± | 0.0119 |
| truthfulqa_mc2 |       2 | none   |      0 | acc      | 0.4599 | ± | 0.0154 |
| winogrande     |       1 | none   |      5 | acc      | 0.5083 | ± | 0.0141 |