MaziyarPanahi committed on
Commit
30a21b4
1 Parent(s): 95b6eed

Update README.md (#8)


- Update README.md (6d91f5f5906844f1c330c22787c89546bd7fbfbe)

Files changed (1)
  1. README.md +12 -15
README.md CHANGED
@@ -127,9 +127,19 @@ This model is an advanced iteration of the powerful `Qwen/Qwen2.5-72B`, specific
 
 Thanks to `mradermacher`: [calme-3.1-instruct-78b-GGUF](https://huggingface.co/mradermacher/calme-3.1-instruct-78b-GGUF)
 
-# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.1-instruct-78b)
+
+| Metric            |Value|
+|-------------------|----:|
+|Avg.               |51.20|
+|IFEval (0-Shot)    |81.36|
+|BBH (3-Shot)       |62.41|
+|MATH Lvl 5 (4-Shot)|38.75|
+|GPQA (0-shot)      |19.46|
+|MuSR (0-shot)      |36.50|
+|MMLU-PRO (5-shot)  |68.72|
 
-Leaderboard 2 coming soon!
 
 # Prompt Template
 
@@ -173,16 +183,3 @@ model = AutoModelForCausalLM.from_pretrained("MaziyarPanahi/calme-3.1-instruct-7
 
 As with any large language model, users should be aware of potential biases and limitations. We recommend implementing appropriate safeguards and human oversight when deploying this model in production environments.
-
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-3.1-instruct-78b)
-
-| Metric            |Value|
-|-------------------|----:|
-|Avg.               |51.20|
-|IFEval (0-Shot)    |81.36|
-|BBH (3-Shot)       |62.41|
-|MATH Lvl 5 (4-Shot)|38.75|
-|GPQA (0-shot)      |19.46|
-|MuSR (0-shot)      |36.50|
-|MMLU-PRO (5-shot)  |68.72|
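
As a sanity check on the leaderboard table this commit adds: the reported `Avg.` (51.20) is the plain arithmetic mean of the six benchmark scores. A minimal standalone verification (metric names and values copied from the diff; the averaging rule is an observation from these numbers, not documented in the commit):

```python
# Benchmark scores from the leaderboard table added in this commit.
scores = {
    "IFEval (0-Shot)": 81.36,
    "BBH (3-Shot)": 62.41,
    "MATH Lvl 5 (4-Shot)": 38.75,
    "GPQA (0-shot)": 19.46,
    "MuSR (0-shot)": 36.50,
    "MMLU-PRO (5-shot)": 68.72,
}

# Arithmetic mean, rounded to two decimals like the table's Avg. row.
avg = round(sum(scores.values()) / len(scores), 2)
print(avg)  # → 51.2
```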