C10X committed on
Commit
768b76e
1 Parent(s): f91a136

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -152,3 +152,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 |MuSR (0-shot) | 6.70|
 |MMLU-PRO (5-shot) |28.08|
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_c10x__longthinker)
+
+| Metric |Value|
+|-------------------|----:|
+|Avg. |19.47|
+|IFEval (0-Shot) |36.09|
+|BBH (3-Shot) |28.42|
+|MATH Lvl 5 (4-Shot)|15.63|
+|GPQA (0-shot) | 1.90|
+|MuSR (0-shot) | 6.70|
+|MMLU-PRO (5-shot) |28.08|
+
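
The `Avg.` row in the added table is consistent with an unweighted mean of the six benchmark scores. The following is a minimal Python sketch of that check, assuming plain averaging; the leaderboard's own aggregation code is not part of this PR, and the names `scores` and `mean_score` are illustrative only.

```python
# Benchmark scores copied from the table added to README.md in this PR.
scores = {
    "IFEval (0-Shot)": 36.09,
    "BBH (3-Shot)": 28.42,
    "MATH Lvl 5 (4-Shot)": 15.63,
    "GPQA (0-shot)": 1.90,
    "MuSR (0-shot)": 6.70,
    "MMLU-PRO (5-shot)": 28.08,
}


def mean_score(values):
    """Return the unweighted mean of the scores, rounded to two decimals.

    Assumption: the reported average is a simple arithmetic mean.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. = {average:.2f}")
```

If the printed value differs from the `Avg.` row, either a score was transcribed incorrectly or the average was computed some other way.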