Commit 412a040
1 Parent(s): 16d36cd
Adding Evaluation Results (#6)
- Adding Evaluation Results (a6ff2f09c14e69d58d4711e69e80199b8f377314)
Co-authored-by: Open LLM Leaderboard PR Bot <leaderboard-pr-bot@users.noreply.huggingface.co>
README.md
CHANGED
@@ -90,4 +90,17 @@ ASSISTANT: To help your vehicle start, I will guide you through a step-by-step p
 By following these steps, you should be able to diagnose and potentially fix the issue causing your car to not start. However, if after going through these checks and still having trouble, it is recommended to seek assistance from a qualified mechanic.
 ```
 
-[Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+[Buy me a coffee](https://www.buymeacoffee.com/ehartford)
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_ehartford__dolphin-llama2-7b)
+
+| Metric | Value |
+|-----------------------|---------------------------|
+| Avg. | 41.88 |
+| ARC (25-shot) | 46.59 |
+| HellaSwag (10-shot) | 67.52 |
+| MMLU (5-shot) | 48.37 |
+| TruthfulQA (0-shot) | 49.72 |
+| Winogrande (5-shot) | 63.77 |
+| GSM8K (5-shot) | 5.69 |
+| DROP (3-shot) | 11.53 |
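
As a quick sanity check on the added table, the sketch below recomputes the `Avg.` row from the seven benchmark scores. It assumes `Avg.` is the unweighted arithmetic mean of those seven values; the diff itself does not say how the average was derived, and the names `scores` and `mean_score` are illustrative only.

```python
# Benchmark scores copied from the table added in this commit.
scores = {
    "ARC (25-shot)": 46.59,
    "HellaSwag (10-shot)": 67.52,
    "MMLU (5-shot)": 48.37,
    "TruthfulQA (0-shot)": 49.72,
    "Winogrande (5-shot)": 63.77,
    "GSM8K (5-shot)": 5.69,
    "DROP (3-shot)": 11.53,
}


def mean_score(values):
    """Return the unweighted arithmetic mean (assumed definition of Avg.)."""
    values = list(values)
    return sum(values) / len(values)


average = mean_score(scores.values())
print(f"{average:.2f}")  # prints 41.88, matching the Avg. row
```

Under that assumption the recomputed value agrees with the reported 41.88, so the table is internally consistent.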