---
library_name: peft
tags:
- generated_from_trainer
model-index:
- name: qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1
results: []
---
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>
axolotl version: `0.3.0`
```yaml
base_model: ./yi-34b-rawrr-dpo-1
base_model_config: ./yi-34b-rawrr-dpo-1
model_type: LlamaForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: false
is_llama_derived_model: true
load_in_8bit: false
load_in_4bit: true
bnb_4bit_use_double_quant: true
bnb_4bit_compute_dtype: torch.bfloat16
torch_dtype: bf16
strict: false
datasets:
- path: /run/media/..../aezakmi_v2/aezakmi_v2_draft2.jsonl
type: alpaca_w_system2.load_open_orca_chatml
conversation: chatml
dataset_prepared_path: last_run_prepared
val_set_size: 0.01
adapter: qlora
lora_model_dir:
sequence_len: 1400
sample_packing: true
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
- q_proj
- v_proj
- k_proj
- o_proj
- gate_proj
- down_proj
- up_proj
lora_target_linear: true
lora_fan_in_fan_out:
wandb_project:
wandb_watch:
wandb_run_id:
wandb_log_model:
output_dir: ./qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1
pad_to_sequence_len: false
micro_batch_size: 1
gradient_accumulation_steps: 1
num_epochs: 2.4
optimizer: adamw_bnb_8bit
torchdistx_path:
lr_scheduler: constant
learning_rate: 0.00005
train_on_inputs: false
group_by_length: false
bf16: true
fp16: false
tf32: false
bfloat16: true
flash_optimum: false
gradient_checkpointing: true
early_stopping_patience:
save_safetensors:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
save_total_limit: 10
deepspeed:
seed: 42
warmup_steps: 100
eval_steps: 5000000
save_steps: 500
eval_table_size:
eval_table_max_new_tokens:
debug:
weight_decay:
fsdp:
fsdp_config:
special_tokens:
bos_token: "<|startoftext|>"
eos_token: "<|endoftext|>"
unk_token: "<unk>"
```
</details><br>
# qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1
This LoRA was trained from [adamo1139/yi-34b-200k-rawrr-dpo-1](https://huggingface.co/adamo1139/yi-34b-200k-rawrr-dpo-1).

If you want to re-create the model from this, first get yourself a llama-fied Yi-34B-200K, then merge in the LoRA [adamo1139/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2](https://huggingface.co/adamo1139/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2), and then merge in this LoRA, as shown in the sketch below.
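A minimal sketch of that two-step merge with `peft` and `transformers`, assuming the llama-fied base model and this adapter are available locally (all local paths are placeholders, not the exact paths used in training):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "./yi-34b-200k-llamafied"  # placeholder: llama-fied Yi-34B-200K

# Step 1: merge the rawrr DPO LoRA into the base weights.
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(
    model, "adamo1139/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r2"
)
model = model.merge_and_unload()

# Step 2: merge this AEZAKMI LoRA on top of the intermediate model.
model = PeftModel.from_pretrained(model, "./qlora-yi-34b-200k-aezakmi-v2-rawrr-v1-run1")
model = model.merge_and_unload()
model.save_pretrained("./yi-34b-200k-aezakmi-v2-rawrr-v1")

# Save the tokenizer alongside the merged weights.
tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.save_pretrained("./yi-34b-200k-aezakmi-v2-rawrr-v1")
```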
It's all still pretty experimental. I think I will re-run the DPO training on rawrr, maybe with a somewhat longer context and a higher learning rate or epoch count, since I want an even stronger anti-refusal effect. This model is already better in that regard than Yi-34B-200K-AEZAKMI-v2, but it's not perfect.
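Since the config above sets `conversation: chatml`, the merged model should be prompted in ChatML format. A hedged inference sketch (the merged-model path and the system prompt are illustrative only):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MERGED = "./yi-34b-200k-aezakmi-v2-rawrr-v1"  # placeholder: merged model from the sketch above

tokenizer = AutoTokenizer.from_pretrained(MERGED)
model = AutoModelForCausalLM.from_pretrained(
    MERGED, torch_dtype=torch.bfloat16, device_map="auto"
)

# ChatML prompt format, matching the `conversation: chatml` training setting.
prompt = (
    "<|im_start|>system\nA chat.<|im_end|>\n"
    "<|im_start|>user\nHow do I merge two LoRA adapters?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```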
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
The following `bitsandbytes` quantization config was used during training:
- quant_method: bitsandbytes
- load_in_8bit: False
- load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: bfloat16
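For reference, the same quantization setup expressed as a `transformers` `BitsAndBytesConfig` (a sketch that mirrors the values listed above):

```python
import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Pass as `quantization_config=bnb_config` to AutoModelForCausalLM.from_pretrained().
```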
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 100
- num_epochs: 2.4
### Training results
### Framework versions
- PEFT 0.7.0
- Transformers 4.37.0.dev0
- Pytorch 2.0.1+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0 |