AndrewZeng committed on
Commit 53f6961 · 1 Parent(s): 230e39d

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -10,6 +10,8 @@ language:
 
  # Model Card for Deita 7B V1.0 SFT
 
+ [GitHub](https://github.com/hkust-nlp/deita) | [Paper](https://arxiv.org/abs/2312.15685)
+
  Deita is an open-sourced project designed to facilitate **Automatic Data Selection** for instruction tuning in Large Language Models (LLMs).
  Deita 7B V1.0 SFT is a fine-tuned version of Mistral-7B-v0.1 that was trained on 6k automatically selected lightweight, high-quality alignment SFT data: [Deita 6K V0](https://huggingface.co/datasets/hkust-nlp/deita-6k-v0).
 
@@ -27,8 +29,6 @@ Deita 7B V1.0 SFT is a fine-tuned version of Mistral-7B-v0.1 that was trained on
  ## Performance
 
 
- <details>
- <summary>See full evaluations</summary>
 
  | Model | Align | Data Size | MT-Bench | AlpacaEval(%) | OpenLLM (Avg.) |
  |------------------------------------------------|-----------|------------|----------|---------------|----------------|
@@ -63,7 +63,7 @@ Deita 7B V1.0 SFT is a fine-tuned version of Mistral-7B-v0.1 that was trained on
  | DEITA-7B-v1.0 | SFT + DPO | 6K SFT + 10K DPO | 7.55 | 90.06 | 69.86 |
 
 
- </details>
+
 
 
  ## Input Format