crumb committed (verified) · Commit e03fb62 · 1 Parent(s): 207c4da

Update README.md

Files changed (1): README.md +17 -0

README.md CHANGED
@@ -8,3 +8,20 @@ GLORT2 (GLORT2 Low Rank Transformer Transformer) is a transformer model where ev
 
  also, sorry: I just realized there's some residue from where I copied the model code out of my own projects, including some "expanded lm head size" stuff. Just ignore that if you're looking at the config and code; this isn't a serious project, so I don't care too much that it's there.
 
+ | model | 512-token strided perplexity (Pile test set) | training tokens |
+ | --- | --- | --- |
+ | cerebras 111m | 21.55 | 2.2b |
+ | cerebras 256m | 15.20 | 5.1b |
+ | pythia 70m | 22.39 | 300b |
+ | pythia 160m | 13.93 | 300b |
+ | pythia 410m | 9.62 | 300b |
+ | GLORT2 (205m) | 13.05 | 2.2b |
+ | custom llama w/ same settings as cerebras 111m | 13.88 | 2.2b |
+
+
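For context on the metric: "512-token strided" perplexity is the usual sliding-window evaluation, where each window gets up to the model's full context length but only the tokens not already covered by the previous window count toward the average NLL. A minimal sketch of just the aggregation logic (assuming stride 512; `nll_fn` here is a hypothetical per-token scoring callback standing in for a real model forward pass, not part of this repo):

```python
import math

def strided_perplexity(nll_fn, n_tokens, max_len=2048, stride=512):
    # Sliding-window perplexity: each window may see up to `max_len` tokens
    # of context, but only the tokens NOT already scored by the previous
    # window contribute to the running NLL total.
    total_nll, total_count, prev_end = 0.0, 0, 0
    for begin in range(0, n_tokens, stride):
        end = min(begin + max_len, n_tokens)
        for i in range(prev_end, end):       # freshly scored tokens only
            total_nll += nll_fn(i, begin)    # NLL of token i given context [begin, i)
        total_count += end - prev_end
        prev_end = end
        if end == n_tokens:
            break
    return math.exp(total_nll / total_count)

# Sanity check: a model that is uniform over a 100-word vocabulary
# (constant NLL of log 100 per token) has perplexity exactly 100.
print(strided_perplexity(lambda i, ctx: math.log(100.0), 10_000))
```

With stride equal to `max_len` this degenerates to non-overlapping chunks; a smaller stride gives each scored token more context at the cost of more forward passes.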
+ | Tasks          | Version | Filter | n-shot | Metric   | Value  |   | Stderr |
+ |----------------|--------:|--------|-------:|----------|-------:|---|-------:|
+ | arc_challenge  |       1 | none   |     25 | acc      | 0.1706 | ± | 0.0110 |
+ |                |         | none   |     25 | acc_norm | 0.2099 | ± | 0.0119 |
+ | truthfulqa_mc2 |       2 | none   |      0 | acc      | 0.4599 | ± | 0.0154 |
+ | winogrande     |       1 | none   |      5 | acc      | 0.5083 | ± | 0.0141 |
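Side note on reading harness tables like the one above: the Stderr column is, as far as I can tell, just the plain binomial standard error of the accuracy, sqrt(p(1-p)/n). Quick sanity check against the arc_challenge row, assuming the standard 1,172-question ARC-Challenge test split:

```python
import math

def binomial_stderr(acc, n):
    # standard error of the mean of n Bernoulli(acc) samples
    return math.sqrt(acc * (1.0 - acc) / n)

# arc_challenge: acc 0.1706 over an assumed 1172-question test split
se = binomial_stderr(0.1706, 1172)
print(round(se, 4))  # ~0.011, matching the Stderr column above
```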