leaderboard-pr-bot committed on
Commit
2b5859e
1 Parent(s): b407c1e

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -169,3 +169,17 @@ These benchmarks currently have us at #1 on ARC-c, ARC-e, Hellaswag, and OpenBoo
  The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.
 
  Compute provided by our project sponsor Redmond AI, thank you!!
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__Nous-Hermes-13B-SuperHOT-8K-fp16)
+
+ | Metric | Value |
+ |-----------------------|---------------------------|
+ | Avg. | 49.3 |
+ | ARC (25-shot) | 55.29 |
+ | HellaSwag (10-shot) | 81.87 |
+ | MMLU (5-shot) | 48.23 |
+ | TruthfulQA (0-shot) | 51.19 |
+ | Winogrande (5-shot) | 75.3 |
+ | GSM8K (5-shot) | 1.21 |
+ | DROP (3-shot) | 32.03 |
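
The PR does not say how the Avg. row is derived. A minimal sketch, assuming it is the unweighted mean of the seven benchmark scores listed in the table, reproduces the reported value. The `scores` dictionary and the `unweighted_mean` helper are illustrative names, not part of the leaderboard tooling.

```python
# Benchmark scores copied from the table added in this PR.
scores = {
    "ARC (25-shot)": 55.29,
    "HellaSwag (10-shot)": 81.87,
    "MMLU (5-shot)": 48.23,
    "TruthfulQA (0-shot)": 51.19,
    "Winogrande (5-shot)": 75.3,
    "GSM8K (5-shot)": 1.21,
    "DROP (3-shot)": 32.03,
}


def unweighted_mean(values):
    """Return the plain arithmetic mean (assumed aggregation, not confirmed by the PR)."""
    values = list(values)
    return sum(values) / len(values)


avg = unweighted_mean(scores.values())
print(f"Avg. = {avg:.2f}")
```

Under that assumption the result rounds to the 49.3 shown in the Avg. row, so the table is internally consistent.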