license: gpl-3.0
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 49.13 |
| ARC (25-shot) | 57.25 |
| HellaSwag (10-shot) | 80.88 |
| MMLU (5-shot) | 52.92 |
| TruthfulQA (0-shot) | 50.55 |
| Winogrande (5-shot) | 74.11 |
| GSM8K (5-shot) | 14.1 |
| DROP (3-shot) | 14.07 |
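
The `Avg.` row is consistent with an unweighted mean of the seven benchmark scores. The sketch below checks that arithmetic. It assumes the average is computed this way, which this card does not state explicitly, and the names `scores` and `leaderboard_average` are illustrative, not part of any leaderboard tooling.

```python
# Benchmark scores copied from the table above (percent).
scores = {
    "ARC (25-shot)": 57.25,
    "HellaSwag (10-shot)": 80.88,
    "MMLU (5-shot)": 52.92,
    "TruthfulQA (0-shot)": 50.55,
    "Winogrande (5-shot)": 74.11,
    "GSM8K (5-shot)": 14.1,
    "DROP (3-shot)": 14.07,
}


def leaderboard_average(values):
    """Unweighted arithmetic mean, rounded to two decimals.

    Assumption: every benchmark carries equal weight in the average.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(f"Avg. = {average}")  # prints "Avg. = 49.13"
```

Because the mean is unweighted, the two low scores (GSM8K and DROP) pull the average well below the other five benchmarks.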