MaziyarPanahi committed on
Commit 1f8989d
1 Parent(s): e288c94

clean up the evals (#10)

- clean up the evals (a28484100ec7164da081963a94ce5d9c49872a4d)

Files changed (1)
  1. README.md +15 -25
README.md CHANGED
@@ -147,6 +147,20 @@ GGUF (2/3/4/5/6/8 bits): [MaziyarPanahi/phi-2-logical-sft-GGUF](https://huggingf
 ### Response:
 ```
 
+## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__phi-2-logical-sft)
+
+| Metric |Value|
+|---------------------------------|----:|
+|Avg. |61.50|
+|AI2 Reasoning Challenge (25-Shot)|61.35|
+|HellaSwag (10-Shot) |75.14|
+|MMLU (5-Shot) |57.40|
+|TruthfulQA (0-shot) |44.39|
+|Winogrande (5-shot) |74.90|
+|GSM8k (5-shot) |55.80|
+
+
 ## Examples
 
 ```
@@ -222,19 +236,6 @@ Now, let's eliminate the first possibility, because it contradicts the premise t
 ---
 
 
-
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
 ## Training procedure
 
 ### Training hyperparameters
@@ -359,17 +360,6 @@ special_tokens:
 pad_token: "<|endoftext|>"
 ```
 
-</details><br>
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__phi-2-logical-sft)
+</details>
 
-| Metric |Value|
-|---------------------------------|----:|
-|Avg. |61.50|
-|AI2 Reasoning Challenge (25-Shot)|61.35|
-|HellaSwag (10-Shot) |75.14|
-|MMLU (5-Shot) |57.40|
-|TruthfulQA (0-shot) |44.39|
-|Winogrande (5-shot) |74.90|
-|GSM8k (5-shot) |55.80|
 