task,metric,value,err,version
anli_r1,acc,0.336,0.014944140233795027,0
anli_r2,acc,0.354,0.015129868238451772,0
anli_r3,acc,0.3325,0.013605417345710526,0
arc_challenge,acc,0.27986348122866894,0.013119040897725922,0
arc_challenge,acc_norm,0.29692832764505117,0.013352025976725223,0
arc_easy,acc,0.5841750841750841,0.010113348244647869,0
arc_easy,acc_norm,0.5614478114478114,0.010182010275471116,0
boolq,acc,0.6085626911314985,0.008536430524403957,1
cb,acc,0.48214285714285715,0.0673769750864465,1
cb,f1,0.40945083014048533,,1
copa,acc,0.77,0.04229525846816506,0
hellaswag,acc,0.4340768771161123,0.004946221512145273,0
hellaswag,acc_norm,0.5635331607249552,0.004949335356881862,0
piqa,acc,0.736126224156692,0.010282996367695562,0
piqa,acc_norm,0.7421109902067464,0.010206956662056246,0
rte,acc,0.49458483754512633,0.030094698123239966,0
sciq,acc,0.888,0.009977753031397234,0
sciq,acc_norm,0.869,0.010674874844837952,0
storycloze_2016,acc,0.6905398182789952,0.01068995674518907,0
winogrande,acc,0.5453827940015785,0.013994481027065997,0