aaronday3 committed on
Commit
47461c9
1 Parent(s): 7d06943

Update README.md

Files changed (1)
  1. README.md +53 -1
README.md CHANGED
@@ -14,4 +14,56 @@ Write a story using this writing prompt: As a prank a witch detached your cock a

  Apparently RP has also become a bit less sloppy by coincidence.

- We are looking into opening the datasets up. I'm a bit tired at the moment, but you can also just grab this torrent of the entire Reddit, pick only the subreddits you want, and DIY: [https://academictorrents.com/details/56aa49f9653ba545f48df2e33679f014d2829c10]
+ We are looking into opening the datasets up. I'm a bit tired at the moment, but you can also just grab this torrent of the entire Reddit, pick only the subreddits you want, and DIY: [https://academictorrents.com/details/56aa49f9653ba545f48df2e33679f014d2829c10]
+
+ (For context: this model was a test run on a small dataset. It will be scaled up later.)
+
+
+ ## Training Config:
+
+ Thanks a lot to llamafactory; this was the easiest training run I've ever done.
+
+ ```
+ llamafactory-cli train \
+     --stage kto \
+     --do_train True \
+     --model_name_or_path cognitivecomputations/dolphin-2.9.1-llama-3-8b \
+     --preprocessing_num_workers 16 \
+     --finetuning_type lora \
+     --quantization_bit 8 \
+     --template chatml \
+     --flash_attn auto \
+     --use_unsloth True \
+     --dataset_dir /workspace/kto \
+     --dataset kto_dataset \
+     --cutoff_len 2048 \
+     --learning_rate 5e-05 \
+     --num_train_epochs 3.0 \
+     --max_samples 100000 \
+     --per_device_train_batch_size 2 \
+     --gradient_accumulation_steps 8 \
+     --lr_scheduler_type cosine \
+     --max_grad_norm 1.0 \
+     --logging_steps 5 \
+     --save_steps 500 \
+     --warmup_steps 50 \
+     --optim adamw_torch \
+     --packing False \
+     --report_to all \
+     --output_dir saves/LLaMA3-8B/lora/train_2024-06-15-15-18-25 \
+     --bf16 True \
+     --plot_loss True \
+     --ddp_timeout 180000000 \
+     --include_num_input_tokens_seen True \
+     --lora_rank 32 \
+     --lora_alpha 32 \
+     --lora_dropout 0 \
+     --lora_target all \
+     --pref_beta 0.1 \
+     --pref_ftx 0 \
+     --pref_loss sigmoid \
+     --val_size 0.05 \
+     --eval_strategy steps \
+     --eval_steps 50 \
+     --per_device_eval_batch_size 2
+ ```
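
For reference, `--stage kto` with `--dataset kto_dataset` points at data registered under the `--dataset_dir` (`/workspace/kto`); LLaMA-Factory reads KTO data as sharegpt-style messages plus a per-example boolean tag. The snippet below is only a rough sketch of what such an entry and its `dataset_info.json` registration could look like; the file name, column names, and example text are assumptions, not the actual dataset used for this run.

```python
# Sketch of a KTO-style record and its dataset_info.json registration,
# modeled on LLaMA-Factory's KTO demo data. Names here are assumptions:
# the real kto_dataset may use different columns and content.
import json

records = [
    {
        "messages": [
            {"role": "user", "content": "Write a story using this writing prompt: ..."},
            {"role": "assistant", "content": "A sample completion goes here."},
        ],
        "label": True,  # True = desirable completion, False = undesirable
    },
]

dataset_info_entry = {
    "kto_dataset": {
        "file_name": "kto_dataset.json",  # would live under --dataset_dir (/workspace/kto)
        "formatting": "sharegpt",
        "columns": {"messages": "messages", "kto_tag": "label"},
    }
}

# Write the example records and print the registration entry for inspection.
with open("kto_dataset.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2)
print(json.dumps(dataset_info_entry, indent=2))
```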
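
If you go the DIY route with the torrent above, the dumps are (to my knowledge) zstd-compressed files with one JSON object per line, each carrying a `subreddit` field. A minimal filtering sketch under that assumption follows; the paths and subreddit names are placeholders.

```python
# Minimal sketch: stream a zstd-compressed Reddit dump and keep only chosen subreddits.
# Assumes one JSON object per line with a "subreddit" field; adjust names/paths to your dump.
import io
import json

import zstandard  # pip install zstandard

WANTED = {"WritingPrompts"}  # placeholder: the subreddits you actually want


def filter_dump(src_path: str, dst_path: str) -> None:
    # Reddit dumps are compressed with a large window, hence max_window_size.
    dctx = zstandard.ZstdDecompressor(max_window_size=2**31)
    with open(src_path, "rb") as src, open(dst_path, "w", encoding="utf-8") as dst:
        reader = io.TextIOWrapper(dctx.stream_reader(src), encoding="utf-8", errors="replace")
        for line in reader:
            try:
                obj = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if obj.get("subreddit") in WANTED:
                dst.write(json.dumps(obj) + "\n")


if __name__ == "__main__":
    filter_dump("reddit_dump.zst", "filtered.jsonl")
```

From the filtered JSONL you would still need to pair prompts with desirable and undesirable completions before it matches the KTO format sketched above.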