---
library_name: peft
tags:
- generated_from_trainer
model-index:
- name: qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1
  results: []
---


[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.3.0`
```yaml
base_model: ./yi-34b-rawrr-dpo-1
base_model_config: ./yi-34b-rawrr-dpo-1
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: false
is_llama_derived_model: true

load_in_8bit: false
load_in_4bit: true
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: torch.bfloat16
torch_dtype: bf16
strict: false
datasets:
  - path: /run/media/..../aezakmi_v2/aezakmi_v2_draft2.jsonl
    type: alpaca_w_system2.load_open_orca_chatml
    conversation: chatml
dataset_prepared_path: last_run_prepared
val_set_size: 0.01
adapter: qlora
lora_model_dir:
sequence_len: 1400
sample_packing: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - v_proj
  - k_proj
  - o_proj
  - gate_proj
  - down_proj
  - up_proj
lora_target_linear: true
lora_fan_in_fan_out:
wandb_project:
wandb_watch:
wandb_run_id:
wandb_log_model:
output_dir: ./qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1
pad_to_sequence_len: false
micro_batch_size: 1
gradient_accumulation_steps: 1
num_epochs: 2.4
optimizer: adamw_bnb_8bit
torchdistx_path:
lr_scheduler: constant
learning_rate: 0.00005
train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false
bfloat16: true
flash_optimum: false
gradient_checkpointing: true
early_stopping_patience:
save_safetensors:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
save_total_limit: 10
deepspeed:
seed: 42
warmup_steps: 100
eval_steps: 5000000
save_steps: 500
eval_table_size: 
eval_table_max_new_tokens:
debug:
weight_decay:
fsdp:
fsdp_config:
special_tokens:
  bos_token: "<|startoftext|>"
  eos_token: "<|endoftext|>"
  unk_token: "<unk>"

```

</details><br>

# qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1

This LoRA was trained on top of [adamo1139/yi-34b-200k-rawrr-dpo-1](https://huggingface.co/adamo1139/yi-34b-200k-rawrr-dpo-1).
If you want to re-create the model from this, first get yourself a llama-fied Yi-34B-200K, then merge in the LoRA [adamo1139/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2](https://huggingface.co/adamo1139/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2), and finally merge in this LoRA.

This is all still pretty experimental. I think I will re-run the DPO training on rawrr, maybe with a somewhat longer context and a higher learning rate or epoch count, since I want an even stronger anti-refusal effect. The model is already better in that regard than Yi-34B-200K-AEZAKMI-v2, but it's not perfect.

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure


The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 100
- num_epochs: 2.4

### Training results



### Framework versions

- PEFT 0.7.0
- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0