alexmarques committed
Commit 1343a1c · verified · 1 Parent(s): f40c407

Update README.md

Files changed (1):
  1. README.md +23 -23
README.md CHANGED
@@ -33,7 +33,7 @@ base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
 - **Model Developers:** Neural Magic
 
 Quantized version of [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct).
-It achieves scores within 1.3% of the scores of the unquantized model for MMLU, ARC-Challenge, GSM-8k, Hellaswag, Winogrande and TruthfulQA.
+It achieves scores within 1.0% of the scores of the unquantized model for MMLU, ARC-Challenge, GSM-8k, Hellaswag, Winogrande and TruthfulQA.
 
 ### Model Optimizations
 
@@ -152,31 +152,31 @@ This version of the lm-evaluation-harness includes versions of MMLU, ARC-Challen
 <tr>
 <td>MMLU (5-shot)
 </td>
-<td>69.43
+<td>68.32
 </td>
-<td>68.78
+<td>67.83
 </td>
-<td>99.1%
+<td>99.3%
 </td>
 </tr>
 <tr>
 <td>MMLU (CoT, 0-shot)
 </td>
-<td>72.56
+<td>72.83
 </td>
-<td>72.20
+<td>72.18
 </td>
-<td>99.5%
+<td>99.1%
 </td>
 </tr>
 <tr>
 <td>ARC Challenge (0-shot)
 </td>
-<td>81.57
+<td>81.40
 </td>
-<td>81.06
+<td>81.66
 </td>
-<td>99.4%
+<td>100.3%
 </td>
 </tr>
 <tr>
@@ -184,27 +184,27 @@ This version of the lm-evaluation-harness includes versions of MMLU, ARC-Challen
 </td>
 <td>82.79
 </td>
-<td>81.96
+<td>84.84
 </td>
-<td>99.0%
+<td>102.5%
 </td>
 </tr>
 <tr>
 <td>Hellaswag (10-shot)
 </td>
-<td>80.01
+<td>80.47
 </td>
-<td>79.85
+<td>79.96
 </td>
-<td>99.8%
+<td>99.4%
 </td>
 </tr>
 <tr>
 <td>Winogrande (5-shot)
 </td>
-<td>77.90
+<td>78.06
 </td>
-<td>77.11
+<td>77.27
 </td>
 <td>99.0%
 </td>
@@ -212,21 +212,21 @@ This version of the lm-evaluation-harness includes versions of MMLU, ARC-Challen
 <tr>
 <td>TruthfulQA (0-shot, mc2)
 </td>
-<td>54.04
+<td>54.48
 </td>
-<td>54.19
+<td>54.17
 </td>
-<td>100.3%
+<td>99.4%
 </td>
 </tr>
 <tr>
 <td><strong>Average</strong>
 </td>
-<td><strong>74.04</strong>
+<td><strong>74.05</strong>
 </td>
-<td><strong>73.59</strong>
+<td><strong>73.99</strong>
 </td>
-<td><strong>99.4%</strong>
+<td><strong>99.9%</strong>
 </td>
 </tr>
 </table>
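The Recovery column in the table appears to be the quantized score expressed as a percentage of the unquantized score, though the commit does not state the formula. A minimal sketch under that assumption, using the post-commit values for the rows whose labels are visible in the hunks (one changed row's benchmark name falls outside the hunk context and is omitted here):

```python
# Assumption: recovery = 100 * quantized_score / unquantized_score.
# Benchmark names and scores are copied from the new side of the diff above.
scores = {
    # benchmark: (unquantized, quantized)
    "MMLU (5-shot)": (68.32, 67.83),
    "MMLU (CoT, 0-shot)": (72.83, 72.18),
    "ARC Challenge (0-shot)": (81.40, 81.66),
    "Hellaswag (10-shot)": (80.47, 79.96),
    "Winogrande (5-shot)": (78.06, 77.27),
    "TruthfulQA (0-shot, mc2)": (54.48, 54.17),
    "Average": (74.05, 73.99),
}

for name, (base, quant) in scores.items():
    recovery = 100.0 * quant / base
    print(f"{name}: {recovery:.1f}%")  # e.g. "Average: 99.9%"
```

Rounded to one decimal place, every value computed this way matches the Recovery column in the updated table, which supports the assumed formula.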