Adding Evaluation Results

#1
Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -169,3 +169,17 @@ These benchmarks currently have us at #1 on ARC-c, ARC-e, Hellaswag, and OpenBoo
  The model is available for download on Hugging Face. It is suitable for a wide range of language tasks, from generating creative text to understanding and following complex instructions.

  Compute provided by our project sponsor Redmond AI, thank you!!
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_TheBloke__Nous-Hermes-13B-SuperHOT-8K-fp16).
+
+ | Metric              | Value |
+ |---------------------|------:|
+ | Avg.                | 49.3  |
+ | ARC (25-shot)       | 55.29 |
+ | HellaSwag (10-shot) | 81.87 |
+ | MMLU (5-shot)       | 48.23 |
+ | TruthfulQA (0-shot) | 51.19 |
+ | Winogrande (5-shot) | 75.3  |
+ | GSM8K (5-shot)      | 1.21  |
+ | DROP (3-shot)       | 32.03 |
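The "Avg." row can be sanity-checked against the individual benchmark scores; assuming the leaderboard average is the plain unweighted mean of the seven benchmarks (which matches the figure reported here), a minimal check looks like:

```python
# Sanity check: "Avg." as the unweighted mean of the seven benchmark scores.
# The assumption that the leaderboard uses a plain mean is ours, not stated in the PR.
scores = {
    "ARC (25-shot)": 55.29,
    "HellaSwag (10-shot)": 81.87,
    "MMLU (5-shot)": 48.23,
    "TruthfulQA (0-shot)": 51.19,
    "Winogrande (5-shot)": 75.3,
    "GSM8K (5-shot)": 1.21,
    "DROP (3-shot)": 32.03,
}

avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 49.3
```

The rounded mean reproduces the 49.3 shown in the table above.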