Update README.md
README.md
CHANGED
@@ -59,6 +59,6 @@ This model performs significantly worse than Memphis-CoT on benchmarks, despite
 | Model | GSM8K (5-shot) | AGIEval (English/Nous subset, acc_norm) | BIG Bench Hard (CoT, few-shot*) |
 |:---------------------------------------------------------------------------|:---------------|:----------------------------------------|:--------------------------------|
 | [StableLM 3B Base](https://hf.co/stabilityai/stablelm-3b-4e1t) | 2.05% | 25.14% | 36.75% |
-| [Memphis-CoT 3B](https://hf.co/euclaise/Memphis-CoT-3B) |
+| [Memphis-CoT 3B](https://hf.co/euclaise/Memphis-CoT-3B) | 18.8% | 27.22% | 36.92% |
 | [Memphis-scribe 3B](https://hf.co/euclaise/Memphis-scribe-3B) | 9.55% | 24.78% | |
 *5-shot, as performed automatically by LM Evaluation Harness bbh_cot_fewshot even with num_fewshot=0