datasets:
  - stingning/ultrachat

UltraLM-65b

This repository contains the delta weights for UltraLM-65b, a chat language model trained on UltraChat.

Model Details

Model Description

The model is fine-tuned from LLaMA-65b using the following multi-turn chat template:

User: instruction 1
Assistant: response 1<eos_token>
User: instruction 2
Assistant: response 2<eos_token>
...
  • License: UltraLM is based on LLaMA and should be used under LLaMA's model license.
  • Finetuned from model: LLaMA-65b
  • Finetuned on data: UltraChat
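As a rough illustration of the multi-turn template above, the following sketch serializes a list of (instruction, response) pairs into a single training string. The function name `format_conversation` is hypothetical, and two details are assumptions rather than facts from this model card: that `<eos_token>` is the LLaMA end-of-sequence string `</s>`, and that turns are separated by a single newline.

```python
from typing import List, Tuple


def format_conversation(turns: List[Tuple[str, str]], eos_token: str = "</s>") -> str:
    """Serialize (instruction, response) pairs in the User/Assistant template.

    Assumptions (not confirmed by the model card): eos_token defaults to
    LLaMA's "</s>", and turns are joined by a single newline.
    """
    lines = []
    for instruction, response in turns:
        lines.append(f"User: {instruction}")
        # Each assistant response is terminated by the end-of-sequence token.
        lines.append(f"Assistant: {response}{eos_token}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_conversation([("Hi", "Hello!"), ("Bye", "Goodbye!")]))
```

In practice the end-of-sequence string should be read from the tokenizer that ships with the recovered model rather than hard-coded.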

Model Sources

Uses

To use this model, you need to recover the full model from the delta weights and perform inference following the template below:

[Optional]User: system prompt
User: user input
Assistant: 
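The inference template above could be assembled with a small helper like the one below. This is a minimal sketch: `build_prompt` is a hypothetical name, and the newline separators and the trailing space after `Assistant:` are assumptions read off the template as displayed.

```python
from typing import Optional


def build_prompt(user_input: str, system_prompt: Optional[str] = None) -> str:
    """Build an inference prompt in the User/Assistant template.

    The optional system prompt is passed as an extra leading "User:" turn,
    as shown in the template. Newline separators are an assumption.
    """
    lines = []
    if system_prompt:
        lines.append(f"User: {system_prompt}")
    lines.append(f"User: {user_input}")
    # The prompt ends with an open assistant turn for the model to complete.
    lines.append("Assistant: ")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What is UltraChat?", system_prompt="Answer briefly."))
```

The resulting string would then be tokenized and passed to the recovered model's generation routine, with generation stopping at the end-of-sequence token.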

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 58.99 |
| ARC (25-shot) | 67.06 |
| HellaSwag (10-shot) | 84.98 |
| MMLU (5-shot) | 63.48 |
| TruthfulQA (0-shot) | 53.51 |
| Winogrande (5-shot) | 81.14 |
| GSM8K (5-shot) | 32.75 |
| DROP (3-shot) | 30.0 |