---
license: apache-2.0
datasets:
  - hkust-nlp/deita-6k-v0
language:
  - en
---

# Model Card for Deita 7B V1.0 SFT

Deita is an open-source project designed to facilitate **Automatic Data Selection** for instruction tuning in Large Language Models (LLMs). Deita 7B V1.0 SFT (6K) is a fine-tuned version of Mistral-7B-v0.1, trained on 6K automatically selected, lightweight, high-quality alignment SFT examples: Deita 6K V0.
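To make the idea concrete, here is a purely illustrative sketch of score-based data selection. It is **not** the Deita implementation: the `Example` fields, the combined score, and the selection budget below are all hypothetical placeholders.

```python
# A purely illustrative sketch of score-based data selection -- NOT the Deita
# implementation. Field names, scores, and the budget are hypothetical.
from dataclasses import dataclass


@dataclass
class Example:
    text: str
    complexity: float  # hypothetical complexity score in [0, 1]
    quality: float     # hypothetical quality score in [0, 1]


def select_top_k(pool: list[Example], k: int) -> list[Example]:
    """Rank candidates by a combined score and keep the top k.

    The real Deita pipeline also accounts for diversity among the
    selected samples, which this toy ranking omits.
    """
    ranked = sorted(pool, key=lambda ex: ex.complexity * ex.quality, reverse=True)
    return ranked[:k]


pool = [
    Example("Explain quicksort step by step.", complexity=0.8, quality=0.9),
    Example("Hi", complexity=0.1, quality=0.5),
    Example("Prove that sqrt(2) is irrational.", complexity=0.9, quality=0.8),
]
print([ex.text for ex in select_top_k(pool, k=2)])
```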

## Model description

- **Model type:** a model fine-tuned on automatically selected, lightweight, high-quality alignment SFT data.
- **Language(s) (NLP):** primarily English
- **Finetuned from model:** Mistral-7B-v0.1

## Model Sources

## Performance

| Model | Align | Data Size | MT-Bench | AlpacaEval (%) | OpenLLM (Avg.) |
|---|---|---|---|---|---|
| **Proprietary Models** | | | | | |
| GPT-4-Turbo | ? | -- | 9.32 | 97.70 | -- |
| GPT-4 | SFT + PPO | -- | 8.99 | 95.03 | -- |
| Claude-2 | SFT + PPO | -- | 8.06 | 91.36 | -- |
| GPT-3.5-turbo | SFT + PPO | -- | 7.94 | 89.37 | -- |
| **Open-sourced Models based on Mistral-7B** | | | | | |
| Mistral-7B-Instruct-v0.1 | -- | -- | 6.84 | 69.65 | 60.45 |
| Zephyr-7B-sft | SFT | 200K SFT | 5.32 | 75.12 | 60.93 |
| Zephyr-7B-beta | SFT + DPO | 200K SFT + 60K DPO | 7.34 | 90.60 | 66.36 |
| OpenChat-3.5 | C-RLFT | >70K C-RLFT | 7.81 | 88.51 | -- |
| Starling-7B | C-RLFT + APA | >70K C-RLFT + 183K APA | 8.09 | 91.99 | -- |
| Random | SFT | 10K SFT | 5.89 | 56.90 | 61.72 |
| DEITA-7B-v1.0-sft | SFT | 6K SFT | 7.22 | 80.78 | 64.94 |
| DEITA-7B-v1.0-sft | SFT | 10K SFT | 7.32 | 81.67 | 64.00 |
| DEITA-7B-v1.0 | SFT + DPO | 6K SFT + 10K DPO | 7.55 | 90.06 | 69.86 |

## Input Format

The model is trained with the vicuna_v1.1 template:

```
A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hello! ASSISTANT: Hi!</s>USER: How are you? ASSISTANT:
```
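A minimal generation sketch with `transformers`, assuming the Hugging Face repo id `hkust-nlp/deita-7b-v1.0-sft` (inferred from this card's title; adjust the id if the model lives elsewhere):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hkust-nlp/deita-7b-v1.0-sft"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a vicuna_v1.1-style prompt: system preamble, then alternating
# "USER:" / "ASSISTANT:" turns; end with "ASSISTANT:" to cue the reply.
system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)
prompt = f"{system} USER: How are you? ASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
)
reply = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Note that the prompt must end with `ASSISTANT:` (with no reply after it), so that the model generates the assistant turn.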

## Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 128
- total_train_batch_size: 512
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 6.0
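For reference, here is a hedged sketch mapping these values onto `transformers.TrainingArguments`. The original training script is not shown on this card, so the argument mapping, output path, and precision flag are assumptions:

```python
from transformers import TrainingArguments

# A sketch of the card's hyperparameters as TrainingArguments; output_dir and
# bf16 are assumptions (precision is not stated on the card).
training_args = TrainingArguments(
    output_dir="deita-7b-v1.0-sft",    # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=1,     # 1 per GPU x 4 GPUs x 128 accumulation
    per_device_eval_batch_size=1,      #   = 512 effective train batch size
    seed=42,
    gradient_accumulation_steps=128,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=6.0,
    bf16=True,                         # assumed precision
)
```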

## Framework versions

- Transformers 4.34.1
- Pytorch 2.1.0+cu121
- Datasets 2.14.6
- Tokenizers 0.14.1