Adding Evaluation Results

#3
Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -192,4 +192,17 @@ The training was done for 2 epochs. We used 8x[H100s](https://www.nvidia.com/en
  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
  ## Safety
- ...
+ ...
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_anthracite-org__magnum-v3-9b-chatml)
+
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |19.29|
+ |IFEval (0-Shot) |12.75|
+ |BBH (3-Shot) |35.32|
+ |MATH Lvl 5 (4-Shot)| 5.66|
+ |GPQA (0-shot) |12.75|
+ |MuSR (0-shot) |13.24|
+ |MMLU-PRO (5-shot) |36.02|
+
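
As a quick sanity check on the table in this change, the reported `Avg.` is consistent with the unweighted mean of the six benchmark scores. The sketch below assumes that this is how the average is computed, which the PR itself does not state. The names `scores` and `mean_score` are illustrative only and are not part of any leaderboard tooling.

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "IFEval (0-Shot)": 12.75,
    "BBH (3-Shot)": 35.32,
    "MATH Lvl 5 (4-Shot)": 5.66,
    "GPQA (0-shot)": 12.75,
    "MuSR (0-shot)": 13.24,
    "MMLU-PRO (5-shot)": 36.02,
}


def mean_score(values):
    """Return the unweighted mean of the given scores, rounded to 2 decimals.

    Assumption: the leaderboard average is a plain arithmetic mean.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average}")
```

If the computed value differs from the `Avg.` row, either a score was transcribed incorrectly or the average uses a different weighting.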