Add library name, pipeline tag, paper link, and Github link #1

opened by nielsr (HF staff)

README.md CHANGED
```diff
@@ -1,13 +1,19 @@
 ---
-license: llama3
-datasets:
-- BAAI/Infinity-Instruct
 base_model:
 - meta-llama/Meta-Llama-3.1-8B-Instruct
+datasets:
+- BAAI/Infinity-Instruct
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
 ---
 
 We prune Llama-3.1-8B-Instruct to 1.4B parameters and fine-tune it with the LLM-Neo method, which combines LoRA and KD. The training data is 1 million lines sampled from BAAI/Infinity-Instruct.
 
+For more information, please refer to the paper: [LLM-Neo: Parameter Efficient Knowledge Distillation for Large Language Models](https://huggingface.co/papers/2411.06839)
+
+Code can be found here: https://github.com/yang3121099/LLM-Neo
+
 ## Benchmarks
 
 In this section, we report the results for Llama3.1-Neo-1B-100w on standard automatic benchmarks. For all evaluations, we use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
```
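The added `library_name: transformers` and `pipeline_tag: text-generation` metadata are what let the Hub surface the standard text-generation widget and usage snippet for this checkpoint. A minimal sketch of that usage, assuming the model loads with the stock `transformers` API (the repository id below is a placeholder, not taken from this PR):

```python
# Minimal text-generation usage implied by the new metadata.
# NOTE: the repo id is a placeholder; substitute the actual model repository.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="<org>/Llama3.1-Neo-1B-100w",
    device_map="auto",
)

output = generator(
    "Explain knowledge distillation in one sentence.",
    max_new_tokens=64,
    do_sample=False,
)
print(output[0]["generated_text"])
```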
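The README summarizes LLM-Neo as LoRA combined with knowledge distillation (KD) on a pruned student; the linked paper is the authoritative reference for the recipe. The sketch below only illustrates the general LoRA-plus-KD idea, assuming a `peft` LoRA adapter on the student and a temperature-scaled KL term against the frozen teacher. The student path, LoRA hyperparameters, and loss weighting are illustrative assumptions, not the authors' implementation.

```python
# Illustrative LoRA + KD training step (a sketch of the general idea, not LLM-Neo reference code).
import torch
import torch.nn.functional as F
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Frozen teacher: the original instruct model named in the README front matter.
teacher = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
teacher.eval()

# Student: the pruned 1.4B model; the path here is a placeholder.
student = AutoModelForCausalLM.from_pretrained("<path-to-pruned-1.4B-student>")
student = get_peft_model(
    student,
    LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
)

def neo_kd_step(batch, alpha=0.5, temperature=2.0):
    """Combined loss: cross-entropy on the labels plus KL divergence to the teacher."""
    with torch.no_grad():
        teacher_logits = teacher(**batch).logits
    out = student(**batch, labels=batch["input_ids"])  # .loss is the usual CE term
    kd = F.kl_div(
        F.log_softmax(out.logits / temperature, dim=-1),
        F.log_softmax(teacher_logits / temperature, dim=-1),
        log_target=True,
        reduction="batchmean",
    ) * temperature**2
    return alpha * out.loss + (1.0 - alpha) * kd
```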
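For the benchmark numbers, the README points to lm-evaluation-harness. A sketch of driving such an evaluation from Python, assuming the v0.4+ `simple_evaluate` API; the task list and repository id are placeholders rather than the exact setup behind the reported results:

```python
# Illustrative lm-evaluation-harness run; task names and repo id are placeholders.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=<org>/Llama3.1-Neo-1B-100w,dtype=bfloat16",
    tasks=["hellaswag", "arc_easy"],
    batch_size=8,
)
print(results["results"])
```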