
Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 44.95 |
| ARC (25-shot)       | 55.97 |
| HellaSwag (10-shot) | 77.89 |
| MMLU (5-shot)       | 49.48 |
| TruthfulQA (0-shot) | 44.11 |
| Winogrande (5-shot) | 74.11 |
| GSM8K (5-shot)      | 5.91  |
| DROP (3-shot)       | 7.14  |
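
As a rough sanity check, the Avg. row can be compared against the seven per-benchmark scores. The sketch below assumes Avg. is the unweighted arithmetic mean of those seven values; the source does not state how the average is computed, and the names `scores` and `mean_score` are illustrative only.

```python
# Per-benchmark scores copied from the results above (percent).
scores = {
    "ARC (25-shot)": 55.97,
    "HellaSwag (10-shot)": 77.89,
    "MMLU (5-shot)": 49.48,
    "TruthfulQA (0-shot)": 44.11,
    "Winogrande (5-shot)": 74.11,
    "GSM8K (5-shot)": 5.91,
    "DROP (3-shot)": 7.14,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Assumption: the reported average weights every benchmark equally.
average = mean_score(scores.values())
print(f"Mean of {len(scores)} benchmarks: {average}")
```

With the rounded values shown, this mean comes out at 44.94, slightly below the reported 44.95. The gap would be consistent with the average having been computed from unrounded scores, though that is an assumption rather than something the results state.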