license: mit
datasets:
  - v2ray/jannie-log
language:
  - en
pipeline_tag: conversational
tags:
  - not-for-all-audiences

LLaMA 2 Jannie 70B QLoRA

Fine-tuned on the moxxie proxy log.

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 64.76 |
| ARC (25-shot)       | 68.94 |
| HellaSwag (10-shot) | 86.9  |
| MMLU (5-shot)       | 69.37 |
| TruthfulQA (0-shot) | 53.67 |
| Winogrande (5-shot) | 82.95 |
| GSM8K (5-shot)      | 31.77 |
| DROP (3-shot)       | 59.75 |
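
The Avg. figure is consistent with an unweighted mean of the seven benchmark scores listed above. That it is computed this way is an assumption checked against the numbers here, not something the card states. A minimal sketch of the check, where the `scores` dictionary is just an illustrative container for the reported values:

```python
# Reported benchmark scores from the results above.
scores = {
    "ARC (25-shot)": 68.94,
    "HellaSwag (10-shot)": 86.9,
    "MMLU (5-shot)": 69.37,
    "TruthfulQA (0-shot)": 53.67,
    "Winogrande (5-shot)": 82.95,
    "GSM8K (5-shot)": 31.77,
    "DROP (3-shot)": 59.75,
}

# Assumption: "Avg." is the plain arithmetic mean of the seven scores.
average = sum(scores.values()) / len(scores)

print(f"Avg. {average:.2f}")
```

The GSM8K score is well below the others and pulls the mean down, so the average alone understates performance on the remaining benchmarks.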