huihui-ai committed
Commit f4208c5 · verified · 1 Parent(s): 984059a

Update README.md

Files changed (1)
  1. README.md +3 -11
README.md CHANGED
@@ -20,7 +20,6 @@ The [SFT (Supervised Fine-Tuning)](https://github.com/modelscope/ms-swift) proce
 1. Create the environment.
 
 ```
-
 mkdir MicroThinker-1B-Preview
 cd MicroThinker-1B-Preview
 conda create -yn ms-swift python=3.11
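
Note: the rest of the environment setup falls outside this hunk. A minimal sketch of the usual continuation, assuming a standard ms-swift install (these commands are an assumption, not part of the commit):

```
# Assumed continuation: activate the environment and install ms-swift
conda activate ms-swift
pip install ms-swift -U
```
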
@@ -37,7 +36,6 @@ cd ..
 2. Download the model and datasets.
 
 ```
-
 huggingface-cli download huihui-ai/Llama-3.2-1B-Instruct-abliterated --local-dir ./huihui-ai/Llama-3.2-1B-Instruct-abliterated
 huggingface-cli download --repo-type dataset huihui-ai/QWQ-LONGCOT-500K --local-dir ./data/QWQ-LONGCOT-500K
 huggingface-cli download --repo-type dataset huihui-ai/LONGCOT-Refine-500K --local-dir ./data/LONGCOT-Refine-500K
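
Note: the training commands later in this README reference data/qwq_500k.jsonl and data/refine_from_qwen2_5.jsonl rather than the download directories above. Assuming the jsonl files carry those names inside the downloaded dataset folders (the file names are an assumption, not confirmed by this commit), a copy along these lines bridges the gap:

```
# Assumed file names inside the downloaded dataset directories
cp data/QWQ-LONGCOT-500K/qwq_500k.jsonl data/
cp data/LONGCOT-Refine-500K/refine_from_qwen2_5.jsonl data/
```
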
@@ -47,7 +45,6 @@ huggingface-cli download --repo-type dataset huihui-ai/LONGCOT-Refine-500K --lo
 3. Train for 1 epoch using only the huihui-ai/QWQ-LONGCOT-500K dataset (#20000):
 
 ```
-
 swift sft --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --model_type llama3_2 --train_type lora --dataset "data/qwq_500k.jsonl#20000" --torch_dtype bfloat16 --num_train_epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --target_modules all-linear --gradient_accumulation_steps 16 --eval_steps 50 --save_steps 50 --save_total_limit 2 --logging_steps 5 --max_length 16384 --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft --system "You are a helpful assistant. You should think step-by-step." --warmup_ratio 0.05 --dataloader_num_workers 4 --model_author "huihui-ai" --model_name "huihui-ai-robot"
 ```
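
Note: the "#20000" suffix on the --dataset argument asks ms-swift to sample 20,000 rows from the file. A quick sanity check of how many rows are actually available, assuming the jsonl from step 2 is in place (a sketch, not part of the commit):

```
# Each jsonl line is one training sample; "#20000" draws 20,000 of them
wc -l data/qwq_500k.jsonl
```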
 
@@ -56,7 +53,6 @@ swift sft --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --model_type llama
 Replace the directories below with specific ones.
 
 ```
-
 swift infer --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft/v0-20250102-153619/checkpoint-1237 --merge_lora true
 ```
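
Note: with --merge_lora true, ms-swift typically writes `checkpoint-1237-merged` next to the adapter checkpoint (the exact location here is an assumption based on that convention). The next step expects the merged model under `huihui/`, so a copy along these lines is needed:

```
# Assumed location of the merged output; adjust to your own run directory
cp -r output/Llama-3.2-1B-Instruct-abliterated/lora/sft/v0-20250102-153619/checkpoint-1237-merged huihui/
```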
 
@@ -66,34 +62,30 @@ This should create a new model directory: `checkpoint-1237-merged`, Copy or move
 5. Perform inference on the fine-tuned model.
 
 ```
-
 swift infer --model huihui/checkpoint-1237-merged --stream true --infer_backend pt --max_new_tokens 8192
 ```
 
 
-5. Combined training for 1 epoch with the huihui-ai/QWQ-LONGCOT-500K (#20000) and huihui-ai/LONGCOT-Refine (#20000) datasets:
+6. Combined training for 1 epoch with the huihui-ai/QWQ-LONGCOT-500K (#20000) and huihui-ai/LONGCOT-Refine (#20000) datasets:
 
 ```
-
 swift sft --model huihui-ai/checkpoint-1237-merged --model_type llama3_2 --train_type lora --dataset "data/qwq_500k.jsonl#20000" "data/refine_from_qwen2_5.jsonl#20000" --torch_dtype bfloat16 --num_train_epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --target_modules all-linear --gradient_accumulation_steps 16 --eval_steps 50 --save_steps 50 --save_total_limit 2 --logging_steps 5 --max_length 16384 --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft2 --system "You are a helpful assistant. You should think step-by-step." --warmup_ratio 0.05 --dataloader_num_workers 4 --model_author "huihui-ai" --model_name "huihui-ai-robot"
 ```
 
 
-6. Save the final fine-tuned model.
+7. Save the final fine-tuned model.
 Replace the directories below with specific ones.
 
 ```
-
 swift infer --model huihui-ai/checkpoint-1237-merged --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft2/v0-20250103-121319/checkpoint-1237 --merge_lora true
 ```
 
 
 This should create a new model directory: `checkpoint-1237-merged`. Rename the directory to `MicroThinker-1B-Preview`, then copy or move it to the `huihui` directory.
 
-7. Perform inference on the final fine-tuned model.
+8. Perform inference on the final fine-tuned model.
 
 ```
-
 swift infer --model huihui/MicroThinker-1B-Preview --stream true --infer_backend pt --max_new_tokens 8192
 ```
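
Note: once inference with the final model looks good, the merged directory can be pushed to the Hub with huggingface-cli (the target repo name and a prior `huggingface-cli login` are assumptions, not part of this commit):

```
# Upload the merged model folder to the Hub (requires write access to the repo)
huggingface-cli upload huihui-ai/MicroThinker-1B-Preview huihui/MicroThinker-1B-Preview .
```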
 
 