lighteternal
committed on
Update README.md
README.md
CHANGED
@@ -24,20 +24,19 @@ I recommend using the prompt template of Llama3: https://llama.meta.com/docs/mod
 
 | Task | Metric | Ours (%) | Llama38BInstr. (%) |OpenBioLLM8B (%) |
 |--------------------------------------|--------------------------|------------------|------------|-------------|
-| **ARC Challenge** | Accuracy | 59.39 | 57.17 | 55.38 |
-| | Normalized Accuracy | 63.65 | 60.75 | 58.62 |
-| **Hellaswag** | Accuracy | 62.59
-| | Normalized Accuracy | 81.53 | 78.55 | 80.76 |
-| **Winogrande** | Accuracy | 75.93 | 74.51 | 70.88 |
-| **GSM8K** | Accuracy | 59.36 | 68.69 | 10.
-| **HendrycksTest-
-|
-| **HendrycksTest-
-| **HendrycksTest-
-| **HendrycksTest-
-| **HendrycksTest-
-
-| **HendrycksTest-Professional Medicine** | Accuracy | 71.69 | 71.69 | 69.41 |
+| **ARC Challenge** | Accuracy | **59.39** | 57.17 | 55.38 |
+| | Normalized Accuracy | **63.65** | 60.75 | 58.62 |
+| **Hellaswag** | Accuracy | **62.59** | 59.04 | 61.83 |
+| | Normalized Accuracy | **81.53** | 78.55 | 80.76 |
+| **Winogrande** | Accuracy | **75.93** | 74.51 | 70.88 |
+| **GSM8K** | Accuracy | 59.36 | **68.69** | 10.15 |
+| **HendrycksTest-Anatomy** | Accuracy | **72.59** | 65.19 | 69.62 |
+| **HendrycksTest-Clinical Knowledge** | Accuracy | **77.83** | 74.72 | 60.38 |
+| **HendrycksTest-College Biology** | Accuracy | **81.94** | 79.86 | 79.86 |
+| **HendrycksTest-College Medicine** | Accuracy | **69.36** | 63.58 | 70.52 |
+| **HendrycksTest-Medical Genetics** | Accuracy | **86.00** | 80.00 | 80.00 |
+| **HendrycksTest-Professional Medicine** | Accuracy | **77.94** | 71.69 | 77.94 |
+|
 
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
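The added (+) rows bold one score per task, which appears intended to mark the best of the three models. A minimal sketch for checking that reading is given here, assuming the numbers in the added rows are the final ones. The names `COLUMNS`, `SCORES`, `best_columns`, and `leaders` are hypothetical helpers introduced for illustration; they are not part of the README or of mergekit.

```python
# Scores copied from the added (+) rows of the hunk above.
# Column order follows the table header: Ours, Llama38BInstr., OpenBioLLM8B.
COLUMNS = ("Ours", "Llama38BInstr.", "OpenBioLLM8B")

SCORES = {
    "ARC Challenge / Accuracy": (59.39, 57.17, 55.38),
    "ARC Challenge / Normalized Accuracy": (63.65, 60.75, 58.62),
    "Hellaswag / Accuracy": (62.59, 59.04, 61.83),
    "Hellaswag / Normalized Accuracy": (81.53, 78.55, 80.76),
    "Winogrande / Accuracy": (75.93, 74.51, 70.88),
    "GSM8K / Accuracy": (59.36, 68.69, 10.15),
    "HendrycksTest-Anatomy / Accuracy": (72.59, 65.19, 69.62),
    "HendrycksTest-Clinical Knowledge / Accuracy": (77.83, 74.72, 60.38),
    "HendrycksTest-College Biology / Accuracy": (81.94, 79.86, 79.86),
    "HendrycksTest-College Medicine / Accuracy": (69.36, 63.58, 70.52),
    "HendrycksTest-Medical Genetics / Accuracy": (86.00, 80.00, 80.00),
    "HendrycksTest-Professional Medicine / Accuracy": (77.94, 71.69, 77.94),
}


def best_columns(row):
    """Return the names of every column that holds the row maximum (ties included)."""
    top = max(row)
    return [name for name, value in zip(COLUMNS, row) if value == top]


def leaders(scores):
    """Map each task/metric label to the list of best-scoring columns."""
    return {task: best_columns(row) for task, row in scores.items()}


if __name__ == "__main__":
    for task, names in leaders(SCORES).items():
        print(f"{task}: {', '.join(names)}")
```

Run against the rows as shown, this agrees with the bolding except in two places. In the College Medicine row the bold is on Ours (69.36) although OpenBioLLM8B scores higher (70.52), and in the Professional Medicine row Ours and OpenBioLLM8B tie at 77.94 but only Ours is bolded.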