Youliang committed on
Commit 859c13d
1 Parent(s): 71aec37

Update README.md

Files changed (1)
  1. README.md +89 -3
README.md CHANGED
@@ -1,3 +1,89 @@
- ---
- license: apache-2.0
- ---
+ ---
+ library_name: peft
+ license: apache-2.0
+ tags:
+ - generated_from_trainer
+ base_model: meta-llama/Meta-Llama-3-8B-Instruct
+ model-index:
+ - name: lora_Meta-Llama-3-8B_derta
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You should probably proofread and complete it, then remove this comment. -->
+
+ # lora_Meta-Llama-3-8B_derta
+
+ This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the [Evol-Instruct](https://huggingface.co/datasets/WizardLMTeam/WizardLM_evol_instruct_70k) and [BeaverTails](https://huggingface.co/datasets/PKU-Alignment/BeaverTails) datasets.
+
+ ## Model description
+
+ Please refer to the paper [Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training](https://arxiv.org/abs/2407.09121) and the GitHub repository [DeRTa](https://github.com/RobustNLP/DeRTa).
+ The model was trained for a further 100 steps with DeRTa on top of LLaMA3-8B-Instruct.
+
+ Input format:
+ ```
+ [INST] Your Instruction [/INST]
+ ```
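+
+ A minimal usage sketch follows, assuming the adapter in this repository is attached to the base model with peft; the adapter path below is a placeholder, so substitute the actual repository id.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
+ adapter_id = "path/to/lora_Meta-Llama-3-8B_derta"  # placeholder: point at this repo
+
+ # Load the base model and attach the LoRA adapter.
+ tokenizer = AutoTokenizer.from_pretrained(base_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     base_id, torch_dtype=torch.bfloat16, device_map="auto"
+ )
+ model = PeftModel.from_pretrained(model, adapter_id)
+ model.eval()
+
+ # Query using the input format documented above.
+ prompt = "[INST] Your Instruction [/INST]"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ with torch.no_grad():
+     out = model.generate(**inputs, max_new_tokens=256)
+ print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+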
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training (the effective batch size is derived in the note after this list):
+ - learning_rate: 0.0001
+ - train_batch_size: 8
+ - eval_batch_size: 1
+ - seed: 1
+ - distributed_type: multi-GPU
+ - num_devices: 8
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 128
+ - total_eval_batch_size: 8
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - num_epochs: 2.0
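+
+ As noted above, total_train_batch_size follows from the other settings: 8 sequences per device × 8 GPUs × 2 gradient-accumulation steps = 128 sequences per optimizer update.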
+
+ The LoRA config is:
+ ```
+ {
+   "lora_r": 96,
+   "lora_alpha": 16,
+   "lora_dropout": 0.05,
+   "lora_target_modules": [
+     "q_proj",
+     "v_proj",
+     "k_proj",
+     "o_proj",
+     "gate_proj",
+     "down_proj",
+     "up_proj",
+     "w1",
+     "w2",
+     "w3"
+   ]
+ }
+ ```
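+
+ A rough peft equivalent of this configuration is sketched below; the mapping from the JSON keys above to LoraConfig arguments is an assumption (the JSON likely mirrors a training script's flags), and the "w1"/"w2"/"w3" entries match no Llama module names, so they are omitted here.
+
+ ```python
+ # Sketch: assumed peft.LoraConfig equivalent of the JSON config above.
+ from transformers import AutoModelForCausalLM
+ from peft import LoraConfig, get_peft_model
+
+ lora_config = LoraConfig(
+     r=96,                 # "lora_r" in the JSON above
+     lora_alpha=16,
+     lora_dropout=0.05,
+     target_modules=[
+         "q_proj", "v_proj", "k_proj", "o_proj",
+         "gate_proj", "down_proj", "up_proj",
+         # "w1"/"w2"/"w3" from the JSON are omitted: they match no
+         # module names in the Llama architecture.
+     ],
+     task_type="CAUSAL_LM",
+ )
+
+ base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
+ model = get_peft_model(base, lora_config)
+ model.print_trainable_parameters()  # sanity check: only LoRA weights are trainable
+ ```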
+ ### Training results
+
+
+ ### Framework versions
+
+ - PEFT 0.10.0
+ - Transformers 4.40.0
+ - PyTorch 2.2.0+cu118
+ - Datasets 2.10.0
+ - Tokenizers 0.19.1