|     Tasks      |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|----------------|------:|------|-----:|--------|---|-----:|---|-----:|
|kobest_boolq    |      1|none  |     5|acc     |↑  |0.5726|±  |0.0132|
|                |       |none  |     5|f1      |↑  |0.5725|±  |   N/A|
|kobest_copa     |      1|none  |     5|acc     |↑  |0.5200|±  |0.0158|
|                |       |none  |     5|f1      |↑  |0.5189|±  |   N/A|
|kobest_hellaswag|      1|none  |     5|acc     |↑  |0.3640|±  |0.0215|
|                |       |none  |     5|acc_norm|↑  |0.4380|±  |0.0222|
|                |       |none  |     5|f1      |↑  |0.3592|±  |   N/A|
|kobest_sentineg |      1|none  |     5|acc     |↑  |0.5642|±  |0.0249|
|                |       |none  |     5|f1      |↑  |0.5554|±  |   N/A|
|kobest_wic      |      1|none  |     5|acc     |↑  |0.5087|±  |0.0141|
|                |       |none  |     5|f1      |↑  |0.4979|±  |   N/A|

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.2995|±  |0.0126|
|     |       |strict-match    |     5|exact_match|↑  |0.2987|±  |0.0126|