Update README.md
---
language:
- en
license: cc-by-nc-4.0
tags:
- text-generation-inference
- transformers
- mistral
- trl
base_model: alnrg2arg/blockchainlabs_7B_merged_test2_4
datasets:
- Intel/orca_dpo_pairs
---

This is a model from blockchainlab test 2.4 - alnrg2arg/blockchainlabs_7B_merged_test2_4.

The goal of this project is to build a small LLM for on-device use.

The overall pipeline for this iteration is:

1. Merge models to build the 7B base model.
2. Prune the model to 50% sparsity to reduce the parameter count.
3. Run DPO as the recovery phase after pruning.
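The pruning code itself is not part of this card. As a rough illustration of step 2, unstructured magnitude pruning at 50% sparsity zeroes the half of the weights with the smallest absolute values. A minimal plain-Python sketch (illustrative only, not the actual pruning code used for this model):

```python
def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(len(weights) * sparsity)  # number of weights to zero
    if k == 0:
        return list(weights)
    # magnitude of the k-th smallest |w| becomes the pruning threshold
    threshold = sorted(abs(w) for w in weights)[k - 1]
    pruned, removed = [], 0
    for w in weights:
        if abs(w) <= threshold and removed < k:
            pruned.append(0.0)
            removed += 1
        else:
            pruned.append(w)
    return pruned

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6]
pruned = magnitude_prune(weights, sparsity=0.5)
# exactly half of the weights are now zero; the large-magnitude ones survive
```

Real pruning (e.g. on a 7B model) operates layer by layer on tensors, but the selection rule is the same.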

This unpruned model is intended as a baseline for comparison with the pruned model.

This is the code and the parameters I chose for this model's DPO stage.
```python
import torch
from transformers import TrainingArguments, AutoModelForCausalLM
from trl import DPOTrainer

# `model`, `tokenizer`, and `dataset` are assumed to be loaded earlier
dpo_trainer = DPOTrainer(
    model = model,
    ref_model = None,  # None: the trainer builds a frozen reference copy of `model`
    args = TrainingArguments(
        per_device_train_batch_size = 8,
        gradient_accumulation_steps = 8,
        warmup_ratio = 0.1,
        num_train_epochs = 3,
        learning_rate = 5e-6,
        fp16 = not torch.cuda.is_bf16_supported(),
        bf16 = torch.cuda.is_bf16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.0,
        lr_scheduler_type = "linear",
        seed = 42,
        output_dir = "output_DPO",
    ),
    beta = 0.1,  # strength of the KL penalty toward the reference model
    train_dataset = dataset,
    # eval_dataset = raw_datasets["test"],
    tokenizer = tokenizer,
    max_length = 1024,
    max_prompt_length = 512,
)
```
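DPOTrainer expects `train_dataset` rows in prompt/chosen/rejected form, while Intel/orca_dpo_pairs ships system/question/chosen/rejected columns (per its dataset card). A hedged sketch of one possible mapping (the exact preprocessing used for this model is not shown in this card):

```python
def to_dpo_format(example):
    # Intel/orca_dpo_pairs rows carry "system", "question", "chosen", "rejected"
    prompt = (example["system"] + "\n" + example["question"]).strip()
    return {
        "prompt": prompt,
        "chosen": example["chosen"],
        "rejected": example["rejected"],
    }

# Assumed usage with Hugging Face datasets:
# from datasets import load_dataset
# dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(to_dpo_format)
```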

The code and parameters are borrowed from https://colab.research.google.com/drive/1SKrKGV-BZoU4kv5q3g0jtE_OhRgPtrrQ?usp=sharing
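For intuition on the `beta = 0.1` setting: DPO minimizes `-log sigmoid(beta * margin)`, where the margin compares how much more the policy prefers the chosen answer over the rejected one, relative to the reference model. A minimal per-pair sketch (plain Python, not TRL's batched implementation):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (inputs are summed token log-probs)."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): small when the policy prefers the chosen completion
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy prefers the chosen completion more than the reference does: low loss
loss_good = dpo_loss(-10.0, -14.0, -12.0, -13.0, beta=0.1)
# Policy prefers the rejected completion instead: higher loss
loss_bad = dpo_loss(-14.0, -10.0, -13.0, -12.0, beta=0.1)
```

A larger `beta` sharpens the penalty for drifting from the reference model; 0.1 is a common default in DPO recipes.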