huihui-ai committed
Commit 984059a · verified · 1 parent: c73f2d5

Update README.md

Files changed (1): README.md (+36 -15)
README.md CHANGED
@@ -1,6 +1,5 @@
  ---
  license: apache-2.0
- ---
  datasets:
  - huihui-ai/QWQ-LONGCOT-500K
  - huihui-ai/LONGCOT-Refine-500K
@@ -20,7 +19,8 @@ The model was trained using 1 RTX 4090 GPU(24GB)
  The [SFT (Supervised Fine-Tuning)](https://github.com/modelscope/ms-swift) process is divided into several steps, and no code needs to be written.
  1. Create the environment.

- '''
+ ```
+
  mkdir MicroThinker-1B-Preview
  cd MicroThinker-1B-Preview
  conda create -yn ms-swift python=3.11
@@ -31,48 +31,69 @@ git clone https://github.com/modelscope/ms-swift.git
  cd ms-swift
  pip install -e .
  cd ..
- '''
+ ```
+

  2. Download the model and dataset.

- '''
+ ```
+
  huggingface-cli download huihui-ai/Llama-3.2-1B-Instruct-abliterated --local-dir ./huihui-ai/Llama-3.2-1B-Instruct-abliterated
  huggingface-cli download --repo-type dataset huihui-ai/QWQ-LONGCOT-500K --local-dir ./data/QWQ-LONGCOT-500K
  huggingface-cli download --repo-type dataset huihui-ai/LONGCOT-Refine-500K --local-dir ./data/LONGCOT-Refine-500K
- '''
+ ```
+

  3. Used only the huihui-ai/QWQ-LONGCOT-500K dataset (#20000), Trained for 1 epoch:

- '''
+ ```
+
  swift sft --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --model_type llama3_2 --train_type lora --dataset "data/qwq_500k.jsonl#20000" --torch_dtype bfloat16 --num_train_epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --target_modules all-linear --gradient_accumulation_steps 16 --eval_steps 50 --save_steps 50 --save_total_limit 2 --logging_steps 5 --max_length 16384 --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft --system "You are a helpful assistant. You should think step-by-step." --warmup_ratio 0.05 --dataloader_num_workers 4 --model_author "huihui-ai" --model_name "huihui-ai-robot"
- '''
+ ```
+

  4. Save the fine-tuned model.
  Replace the directories below with specific ones.

- '''
+ ```
+
  swift infer --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft/v0-20250102-153619/checkpoint-1237 --merge_lora true
- '''
+ ```
+

  This should create a new model directory: `checkpoint-1237-merged`, Copy or move this directory to the `huihui` directory.

+ 5. Perform inference on the fine-tuned model.
+
+ ```
+
+ swift infer --model huihui/checkpoint-1237-merged --stream true --infer_backend pt --max_new_tokens 8192
+ ```
+
+
  5. Combined training with huihui-ai/QWQ-LONGCOT-500K (#20000) and huihui-ai/LONGCOT-Refine datasets (#20000), Trained for 1 epoch:

- '''
+ ```
+
  swift sft --model huihui-ai/checkpoint-1237-merged --model_type llama3_2 --train_type lora --dataset "data/qwq_500k.jsonl#20000" "data/refine_from_qwen2_5.jsonl#20000" --torch_dtype bfloat16 --num_train_epochs 1 --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --learning_rate 1e-4 --lora_rank 8 --lora_alpha 32 --target_modules all-linear --gradient_accumulation_steps 16 --eval_steps 50 --save_steps 50 --save_total_limit 2 --logging_steps 5 --max_length 16384 --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft2 --system "You are a helpful assistant. You should think step-by-step." --warmup_ratio 0.05 --dataloader_num_workers 4 --model_author "huihui-ai" --model_name "huihui-ai-robot"
- '''
+ ```
+

  6. Save the final fine-tuned model.
  Replace the directories below with specific ones.

- '''
+ ```
+
  swift infer --model huihui-ai/checkpoint-1237-merged --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft2/v0-20250103-121319/checkpoint-1237 --merge_lora true
- '''
+ ```
+

  This should create a new model directory: `checkpoint-1237-merged`, Rename the directory to `MicroThinker-1B-Preview`, Copy or move this directory to the `huihui` directory.

  7. Perform inference on the final fine-tuned model.

- '''
+ ```
+
  swift infer --model huihui/MicroThinker-1B-Preview --stream true --infer_backend pt --max_new_tokens 8192
- '''
+ ```
+
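
Usage note (not part of the README diff above): the `--merge_lora true` step saves a fully merged checkpoint, so the final `huihui/MicroThinker-1B-Preview` directory should also be loadable directly with the `transformers` library rather than only through `swift infer`. The sketch below is illustrative only; the local path and system prompt come from the README, while the example question, dtype, and sampling settings are assumptions.

```python
# Minimal sketch: run the merged MicroThinker-1B-Preview checkpoint with transformers.
# Assumption: the merged model was copied to ./huihui/MicroThinker-1B-Preview (as in the README)
# and is a standard Llama-3.2-style causal LM; adjust path, dtype, and device for your setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "huihui/MicroThinker-1B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.bfloat16, device_map="auto"
)

# Reuse the system prompt from SFT so the model is steered to think step-by-step.
messages = [
    {"role": "system", "content": "You are a helpful assistant. You should think step-by-step."},
    {"role": "user", "content": "How many times does the letter 'r' appear in 'strawberry'?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generation settings are illustrative; the README uses max_new_tokens 8192 with swift infer.
output_ids = model.generate(input_ids, max_new_tokens=8192, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```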