# Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric               | Value |
|----------------------|------:|
| Avg.                 | 35.11 |
| ARC (25-shot)        | 37.54 |
| HellaSwag (10-shot)  | 31.84 |
| MMLU (5-shot)        | 37.2  |
| TruthfulQA (0-shot)  | 38.89 |
| Winogrande (5-shot)  | 73.4  |
| GSM8K (5-shot)       | 21.61 |
| DROP (3-shot)        | 5.31  |
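
The Avg. row appears to be the unweighted mean of the seven task scores; a minimal sketch verifying that, assuming equal weighting across tasks:

```python
# Sanity check (assumption): "Avg." is the unweighted mean of the
# seven per-task scores reported in the table above.
scores = {
    "ARC (25-shot)": 37.54,
    "HellaSwag (10-shot)": 31.84,
    "MMLU (5-shot)": 37.2,
    "TruthfulQA (0-shot)": 38.89,
    "Winogrande (5-shot)": 73.4,
    "GSM8K (5-shot)": 21.61,
    "DROP (3-shot)": 5.31,
}
avg = sum(scores.values()) / len(scores)
print(f"{avg:.2f}")  # -> 35.11, matching the reported Avg.
```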