fangloveskari committed on
Commit
80f4599
1 Parent(s): 07e9990

update training, export model and evaluation part

Files changed (1)
  1. README.md +47 -7
README.md CHANGED
@@ -8,7 +8,7 @@ license: llama2
8
 
9
  # Dolphin_ORCA_PlatyPus_LLaMA_70b
10
 
11
- #### Dataset
12
  Here is the list of datasets used:
13
  * Dolphin
14
  * Open-Platypus
@@ -21,26 +21,66 @@ Here is the list of datasets used:
21
 
22
  <br>
23
 
24
- #### license disclaimer:
25
 
26
- This model is bound by the license & usage restrictions of the original Llama-2 model, and comes with no warranty or guarantees of any kind.
27
 
28
  <br>
29
 
30
- #### Evaluation
31
 
32
- TODO
33
 
34
  <br>
35
 
36
 
37
- #### Limitations & Biases:
38
 
39
  Llama 2 and fine-tuned variants are a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2 and any fine-tuned variant's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2 variants, developers should perform safety testing and tuning tailored to their specific applications of the model.
40
 
41
  Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/
42
 
43
-
44
  <br>
45
 
46
  ### Citation:
 
8
 
9
  # Dolphin_ORCA_PlatyPus_LLaMA_70b
10
 
11
+ ### Dataset
12
  Here is the list of datasets used:
13
  * Dolphin
14
  * Open-Platypus
 
21
 
22
  <br>
23
 
24
+ ### Training Framework and Parameters
25
 
26
+ #### Framework
27
+ https://github.com/hiyouga/LLaMA-Efficient-Tuning
28
+ We added flash_attention_2 and ORCA dataset support, along with some minor modifications.
29
+
30
+ <br>
31
+
32
+ #### Parameters
33
+ We list some training parameters here:
34
+ | Parameter | Value |
35
+ |-----------------------|-------------|
36
+ | Finetune_Type | QLoRA(NF4) |
37
+ | LoRA_Rank | 16 |
38
+ | LoRA_Alpha | 16 |
39
+ | Batch_Size | 14 |
40
+ | GPUs | 8xA100(80G) |
41
+ | LR_Scheduler | cosine |
42
+ | LR | 3e-4 |
43
+ | Epoch | 1 |
44
+ | DeepSpeed | ZERO-2 |
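
For reference, the parameters in the table above can be captured in a small config object. This is only a sketch: the class `TrainConfig` and its field names are hypothetical and are not LLaMA-Efficient-Tuning arguments; the values are copied from the table. The `lora_scaling` property uses the standard LoRA scaling factor `alpha / rank`, which these settings put at 1.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values listed in the parameter table."""

    finetune_type: str = "QLoRA(NF4)"
    lora_rank: int = 16
    lora_alpha: int = 16
    batch_size: int = 14  # the table does not say whether this is per device or global
    gpus: str = "8xA100(80G)"
    lr_scheduler: str = "cosine"
    learning_rate: float = 3e-4
    epochs: int = 1
    deepspeed: str = "ZERO-2"

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scaling applied to the low-rank update: alpha / rank.
        return self.lora_alpha / self.lora_rank


config = TrainConfig()
print(config.lora_scaling)
```

With rank and alpha both set to 16, the adapter update is applied without extra up- or down-scaling.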
45
+
46
+ <br>
47
+
48
+ ### Model Export
49
+ We tried two methods to fuse the adapter back into the base model:
50
+ * https://github.com/hiyouga/LLaMA-Efficient-Tuning/blob/main/src/export_model.py
51
+ * https://github.com/jondurbin/qlora/blob/main/qmerge.py
52
+
53
+ In general, the second method yields better ARC (+0.15) and TruthfulQA (+0.3) scores, but the other two, MMLU (-0.2) and HellaSwag (-0.2), seem to degrade (observed only on my model).
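
As background for what "fusing the adapter" means, here is a minimal pure-Python sketch of the generic LoRA merge, `W' = W + (alpha / rank) * B @ A`. It is an illustration of the general idea only, not the logic of either script linked above: the function name `merge_lora` and the toy matrices are hypothetical, and the sketch ignores how each script handles the NF4-quantized base weights.

```python
def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a) using nested lists.

    Shapes: weight is (out, in), lora_a is (rank, in), lora_b is (out, rank).
    """
    scale = alpha / rank
    merged = [row[:] for row in weight]  # copy so the input is left untouched
    for i in range(len(weight)):
        for j in range(len(weight[0])):
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scale * delta
    return merged


# Toy example with rank 1 (values chosen for illustration only).
base = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0, 2.0]]      # (rank=1, in=2)
b = [[0.5], [1.0]]    # (out=2, rank=1)
merged = merge_lora(base, a, b, alpha=2, rank=1)
print(merged)
```

Small score differences between merge paths, like those reported above, are plausible when the base weights pass through different quantization or precision steps before the update is added, but the source does not state the cause.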
54
 
55
  <br>
56
 
57
+ ### Evaluation
58
 
59
+ | Metric | Value |
60
+ |-----------------------|-------|
61
+ | ARC (25-shot) | 72.27 |
62
+ | HellaSwag (10-shot) | 87.74 |
63
+ | MMLU (5-shot) | 70.23 |
64
+ | TruthfulQA (0-shot) | 63.37 |
65
+ | Avg. | 73.40 |
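
As a quick sanity check, the reported average can be recomputed from the four benchmark scores in the table above. The snippet assumes `Avg.` is the plain arithmetic mean of the four rows, which the numbers bear out.

```python
# Scores copied from the evaluation table.
scores = {
    "ARC (25-shot)": 72.27,
    "HellaSwag (10-shot)": 87.74,
    "MMLU (5-shot)": 70.23,
    "TruthfulQA (0-shot)": 63.37,
}

# Unweighted mean over the four benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```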
66
 
67
  <br>
68
 
69
+ ### License Disclaimer:
70
+
71
+ This model is bound by the license & usage restrictions of the original Llama-2 model, and comes with no warranty or guarantees of any kind.
72
+
73
+ <br>
74
 
75
+
76
+
77
+
78
+ ### Limitations & Biases:
79
 
80
  Llama 2 and fine-tuned variants are a new technology that carries risks with use. Testing conducted to date has been in English, and has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, Llama 2 and any fine-tuned variant's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 2 variants, developers should perform safety testing and tuning tailored to their specific applications of the model.
81
 
82
  Please see the Responsible Use Guide available at https://ai.meta.com/llama/responsible-use-guide/
83
 
 
84
  <br>
85
 
86
  ### Citation: