Update README.md
README.md CHANGED
@@ -19,10 +19,11 @@ language:
 
 ### Model Sources [optional]
 
-Built on qwen1.5 14b, fine-tuned with LoRA
+Built on qwen1.5 14b, fine-tuned with LoRA using the llamafactory framework
 
 The training parameters are as follows:
 
+```yaml
 quantization_bit: 4
 
 stage: sft
@@ -54,12 +55,11 @@ overwrite_output_dir: true
 flash_attn: fa2
 per_device_train_batch_size: 2
 gradient_accumulation_steps: 8
-# The earlier 3e-4 learning rate seemed a bit too high; the loss oscillated quite a lot
 learning_rate: 0.0001
 num_train_epochs: 3
 weight_decay: 0.01
 optim: adamw_torch
-
+# The 8-bit optimizer seems to have problems
 lr_scheduler_type: cosine
 warmup_steps: 0.01
 bf16: true
@@ -69,7 +69,7 @@ val_size: 0.001
 per_device_eval_batch_size: 1
 evaluation_strategy: steps
 eval_steps: 250
-
+```
 
 ## Uses
 
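The hunks above edit a LLaMA-Factory SFT config embedded in the model card. For reference, below is a minimal sketch of what the complete YAML might look like after this change; the base model path, dataset, template, LoRA target, and output directory do not appear in the diff, so those keys are placeholders rather than the author's actual values.

```yaml
# Hypothetical completion of the config fragments shown in the diff.
# Keys marked "placeholder" are not visible in the hunks and are assumptions.
model_name_or_path: Qwen/Qwen1.5-14B-Chat   # placeholder base model
quantization_bit: 4                          # QLoRA-style 4-bit quantized base weights
stage: sft
do_train: true
finetuning_type: lora
lora_target: all                             # placeholder; actual target modules not shown
dataset: my_sft_dataset                      # placeholder dataset name
template: qwen
output_dir: saves/qwen1.5-14b-lora-sft       # placeholder path
overwrite_output_dir: true

flash_attn: fa2
per_device_train_batch_size: 2
gradient_accumulation_steps: 8               # effective batch size of 16 per device
learning_rate: 0.0001                        # lowered from 3e-4 after the loss oscillated
num_train_epochs: 3
weight_decay: 0.01
optim: adamw_torch                           # the commit note says the 8-bit optimizer seemed problematic
lr_scheduler_type: cosine
warmup_steps: 0.01                           # fractional value suggests it is treated as a warmup ratio
bf16: true

val_size: 0.001
per_device_eval_batch_size: 1
evaluation_strategy: steps
eval_steps: 250
```

With LLaMA-Factory installed, a config like this is typically launched with `llamafactory-cli train qwen1.5-14b-lora-sft.yaml`.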