Daemontatox committed on
Commit
cc50b4f
1 Parent(s): c418e48

Adding Evaluation Results


This is an automated PR created with [this space](https://huggingface.co/spaces/T145/open-llm-leaderboard-results-to-modelcard)!

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

Please report any issues here: https://huggingface.co/spaces/T145/open-llm-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -178,3 +178,17 @@ Summarized results can be found [here](https://huggingface.co/datasets/open-llm-
 |MuSR (0-shot)      |    16.21|
 |MMLU-PRO (5-shot)  |    36.43|
 
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/Daemontatox__AetherSett-details)!
+Summarized results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FAetherSett&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!
+
+|      Metric       |Value (%)|
+|-------------------|--------:|
+|**Average**        |    29.92|
+|IFEval (0-Shot)    |    53.70|
+|BBH (3-Shot)       |    34.74|
+|MATH Lvl 5 (4-Shot)|    30.74|
+|GPQA (0-shot)      |     7.72|
+|MuSR (0-shot)      |    16.21|
+|MMLU-PRO (5-shot)  |    36.43|
+
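
The **Average** row appears consistent with an unweighted mean of the six benchmark scores in the table. The following is a minimal sketch for checking that arithmetic, assuming a plain arithmetic mean rounded to two decimals. The `scores` dictionary and the `mean_score` helper are illustrative names and are not part of the PR or the leaderboard tooling.

```python
# Scores copied from the table added in this diff (values in %).
scores = {
    "IFEval (0-Shot)": 53.70,
    "BBH (3-Shot)": 34.74,
    "MATH Lvl 5 (4-Shot)": 30.74,
    "GPQA (0-shot)": 7.72,
    "MuSR (0-shot)": 16.21,
    "MMLU-PRO (5-shot)": 36.43,
}


def mean_score(values):
    """Unweighted arithmetic mean, rounded to two decimals (an assumption)."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Average: {average:.2f}")
```

If the printed value differs from the **Average** row after a future leaderboard update, the table in the model card is probably stale and the PR should be regenerated.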