Update README.md
README.md
CHANGED
@@ -16,6 +16,7 @@ pipeline_tag: text-generation
 
 **SeqKD-gpt2-760M** is a gpt2-large (760M) model distilled from [gpt2-xlarge (1.5B)](https://huggingface.co/MiniLLM/teacher-gpt2-1.5B) on [databricks-dolly-15k](https://huggingface.co/datasets/aisquared/databricks-dolly-15k) with sequence-level forward KLD.
 
+It is used as a baseline for [MiniLLM](https://huggingface.co/MiniLLM/MiniLLM-gpt2-760M).
 
 ## Other Baselines
 + [SFT w/o KD](https://huggingface.co/MiniLLM/SFT-gpt2-760M)