MaziyarPanahi committed on
Commit
41d8f3a
1 Parent(s): 614b091

Update README.md

Files changed (1)
  1. README.md +11 -16
README.md CHANGED
@@ -127,9 +127,18 @@ This model is an advanced iteration of the powerful `Qwen/Qwen2.5-72B`, specific

  coming soon!

- # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ # 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.2-instruct-78b)

- Leaderboard 2 coming soon!
+ | Metric |Value|
+ |-------------------|----:|
+ |Avg. |52.02|
+ |IFEval (0-Shot) |80.63|
+ |BBH (3-Shot) |62.61|
+ |MATH Lvl 5 (4-Shot)|39.95|
+ |GPQA (0-shot) |20.36|
+ |MuSR (0-shot) |38.53|
+ |MMLU-PRO (5-shot) |70.03|

  # Prompt Template

@@ -172,17 +181,3 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-3.2-instruct-7
  # Ethical Considerations

  As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
-
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.2-instruct-78b)
-
- | Metric |Value|
- |-------------------|----:|
- |Avg. |52.02|
- |IFEval (0-Shot) |80.63|
- |BBH (3-Shot) |62.61|
- |MATH Lvl 5 (4-Shot)|39.95|
- |GPQA (0-shot) |20.36|
- |MuSR (0-shot) |38.53|
- |MMLU-PRO (5-shot) |70.03|
-
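
The `Avg.` row in the results table that this commit moves up in the README can be sanity-checked against the six benchmark rows. The sketch below assumes `Avg.` is the unweighted arithmetic mean of those six scores; the commit does not state how the figure was computed, and the dictionary name `scores` is only illustrative.

```python
# Scores copied verbatim from the table in this commit.
scores = {
    "IFEval (0-Shot)": 80.63,
    "BBH (3-Shot)": 62.61,
    "MATH Lvl 5 (4-Shot)": 39.95,
    "GPQA (0-shot)": 20.36,
    "MuSR (0-shot)": 38.53,
    "MMLU-PRO (5-shot)": 70.03,
}

# Assumption: "Avg." is the plain mean of the six benchmark scores.
average = sum(scores.values()) / len(scores)

# Format to two decimals, matching the precision used in the table.
print(f"Avg. = {average:.2f}")  # prints "Avg. = 52.02"
```

Under that assumption the computed value agrees with the `52.02` reported in the table.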