PeterV09 and AndrewZeng committed
Commit de16c09
1 Parent(s): dd4513e

Update README.md (#3)


- Update README.md (8ad4fa76c099397803667e7427ef9b0e0b4eba88)


Co-authored-by: WeihaoZeng <AndrewZeng@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +3 -3
README.md CHANGED

@@ -10,6 +10,8 @@ language:
 
 # Model Card for Deita 7B V1.0
 
+[GitHub](https://github.com/hkust-nlp/deita) | [Paper](https://arxiv.org/abs/2312.15685)
+
 Deita is an open-sourced project designed to facilitate **Automatic Data Selection** for instruction tuning in Large Language Models (LLMs).
 Deita 7B V1.0 is a fine-tuned + DPO version of Mistral-7B-v0.1 that was trained on **6K** automatically selected lightweight, high-quality alignment SFT data: [Deita 6K V0](https://huggingface.co/datasets/hkust-nlp/deita-6k-v0) and **10K** randomly sampled alignment preference data from Ultrafeedback.
 
@@ -27,8 +29,7 @@ Deita 7B V1.0 is a fine-tuned + DPO version of Mistral-7B-v0.1 that was trained
 ## Performance
 
 
-<details>
-<summary>See full evaluations</summary>
+
 
 | Model | Align | Data Size | MT-Bench | AlpacaEval(%) | OpenLLM (Avg.) |
 |------------------------------------------------|-----------|------------|----------|---------------|----------------|
@@ -63,7 +64,6 @@ Deita 7B V1.0 is a fine-tuned + DPO version of Mistral-7B-v0.1 that was trained
 | DEITA-7B-v1.0 | SFT + DPO | 6K SFT + 10K DPO | 7.55 | 90.06 | 69.86 |
 
 
-</details>
 
 
 ## Input Format
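
The card text in this diff references the automatically selected SFT data and the resulting model. As a minimal sketch (not part of the card or this commit), they could be pulled from the Hub roughly as below; only the dataset repo `hkust-nlp/deita-6k-v0` is named in the diff, so the model repo ID and split name are assumptions to verify on the Hub.

```python
# Minimal sketch; repo/split names marked as assumptions are not stated in the diff.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# The 6K automatically selected SFT examples (dataset repo named in the README diff).
# The "train" split name is an assumption.
deita_6k = load_dataset("hkust-nlp/deita-6k-v0", split="train")

# Assumed model repo ID, inferred from the card title "Deita 7B V1.0".
model_id = "hkust-nlp/deita-7b-v1.0"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the checkpoint's native dtype
    device_map="auto",   # requires the `accelerate` package for device placement
)
```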