---
license: apache-2.0
datasets:
- huihui-ai/QWQ-LONGCOT-500K
- huihui-ai/LONGCOT-Refine-500K
base_model:
- huihui-ai/Llama-3.2-1B-Instruct-abliterated
tags:
- llama3.2
- abliterated
- uncensored
library_name: transformers
pipeline_tag: text-generation
language:
- en
---
# MicroThinker-1B-Preview
MicroThinker-1B-Preview is a new model fine-tuned from [huihui-ai/Llama-3.2-1B-Instruct-abliterated](https://huggingface.co/huihui-ai/Llama-3.2-1B-Instruct-abliterated), focused on advancing AI reasoning capabilities.
## Use with ollama
You can use [huihui_ai/microthinker](https://ollama.com/huihui_ai/microthinker) directly:
```
ollama run huihui_ai/microthinker
```
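You can also query the model programmatically through the Ollama Python client. This is a minimal sketch, assuming the Ollama server is running locally, the model has been pulled, and the `ollama` package is installed (`pip install ollama`); the test prompt is the one from the "Test example" step below.
```
# Sketch: send a chat request to a locally running Ollama server.
# Assumes `ollama run huihui_ai/microthinker` has already pulled the model.
import ollama

response = ollama.chat(
    model="huihui_ai/microthinker",
    messages=[
        {"role": "user", "content": 'How many \'r\' characters are there in the word "strawberry"?'},
    ],
)
print(response["message"]["content"])
```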
## Training Details
This is just a preliminary test, but the performance is already quite good.
The test environment is described below.
The model was trained on a single RTX 4090 GPU (24 GB).
The fine-tuning process used only 20,000 records from each dataset.
The [SFT (Supervised Fine-Tuning)](https://github.com/modelscope/ms-swift) process with ms-swift is divided into several steps, and no custom training code needs to be written.
1. Create the environment.
```
conda create -yn ms-swift python=3.11
conda activate ms-swift
git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
cd ..
```
2. Download the model and dataset.
```
huggingface-cli download huihui-ai/Llama-3.2-1B-Instruct-abliterated --local-dir ./huihui-ai/Llama-3.2-1B-Instruct-abliterated
huggingface-cli download --repo-type dataset huihui-ai/QWQ-LONGCOT-500K --local-dir ./data/QWQ-LONGCOT-500K
huggingface-cli download --repo-type dataset huihui-ai/LONGCOT-Refine-500K --local-dir ./data/LONGCOT-Refine-500K
```
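Optionally, sanity-check the downloaded JSONL files before training. The snippet below only prints the top-level keys of the first record, so it makes no assumptions about the exact schema of the datasets.
```
# Optional: peek at the first record of a downloaded JSONL file to confirm
# it was fetched correctly and to see its field names.
import json

with open("data/QWQ-LONGCOT-500K/qwq_500k.jsonl", "r", encoding="utf-8") as f:
    first_record = json.loads(f.readline())

print(list(first_record.keys()))
```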
3. First stage: train on only the huihui-ai/QWQ-LONGCOT-500K dataset (#20000, i.e. the first 20,000 records) for 1 epoch:
```
swift sft \
    --model huihui-ai/Llama-3.2-1B-Instruct-abliterated \
    --model_type llama3_2 \
    --train_type lora \
    --dataset "data/QWQ-LONGCOT-500K/qwq_500k.jsonl#20000" \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --gradient_accumulation_steps 16 \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 16384 \
    --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft \
    --system "You are a helpful assistant. You should think step-by-step." \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --model_author "huihui-ai" \
    --model_name "MicroThinker"
```
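For readers more familiar with peft, the LoRA-related flags above correspond roughly to the configuration below. This is illustrative only; ms-swift builds its own configuration internally.
```
# Rough peft equivalent of the LoRA flags used above (illustrative only).
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,                           # --lora_rank 8
    lora_alpha=32,                 # --lora_alpha 32
    target_modules="all-linear",   # --target_modules all-linear
    task_type="CAUSAL_LM",
)
```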
4. Save the fine-tuned model. After you're done, type `exit` to quit.
Replace the checkpoint directories below with your actual output paths.
```
swift infer --model huihui-ai/Llama-3.2-1B-Instruct-abliterated --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft/v0-20250102-153619/checkpoint-1237 --stream true --merge_lora true
```
This should create a new model directory, `checkpoint-1237-merged`. Copy or move it into the `huihui` directory.
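For reference, the `--merge_lora true` step is conceptually equivalent to folding the adapter weights into the base model with peft, as in the sketch below. This is an illustration of the idea using the paths from the steps above, not what swift runs internally.
```
# Conceptual equivalent of "--merge_lora true": merge the LoRA adapter into the
# base weights and save a standalone checkpoint (illustrative sketch only).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "huihui-ai/Llama-3.2-1B-Instruct-abliterated"
adapter = "output/Llama-3.2-1B-Instruct-abliterated/lora/sft/v0-20250102-153619/checkpoint-1237"

base = AutoModelForCausalLM.from_pretrained(base_id)
merged = PeftModel.from_pretrained(base, adapter).merge_and_unload()

merged.save_pretrained("huihui/checkpoint-1237-merged")
AutoTokenizer.from_pretrained(base_id).save_pretrained("huihui/checkpoint-1237-merged")
```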
5. Perform inference on the fine-tuned model.
```
swift infer --model huihui/checkpoint-1237-merged --stream true --infer_backend pt --max_new_tokens 8192
```
6. Second stage: train on the combined huihui-ai/QWQ-LONGCOT-500K (#20000) and huihui-ai/LONGCOT-Refine-500K (#20000) datasets for 1 epoch:
```
swift sft \
    --model huihui-ai/checkpoint-1237-merged \
    --model_type llama3_2 \
    --train_type lora \
    --dataset "data/QWQ-LONGCOT-500K/qwq_500k.jsonl#20000" "data/LONGCOT-Refine-500K/refine_from_qwen2_5.jsonl#20000" \
    --torch_dtype bfloat16 \
    --num_train_epochs 1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4 \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --gradient_accumulation_steps 16 \
    --eval_steps 50 \
    --save_steps 50 \
    --save_total_limit 2 \
    --logging_steps 5 \
    --max_length 16384 \
    --output_dir output/Llama-3.2-1B-Instruct-abliterated/lora/sft2 \
    --system "You are a helpful assistant. You should think step-by-step." \
    --warmup_ratio 0.05 \
    --dataloader_num_workers 4 \
    --model_author "huihui-ai" \
    --model_name "MicroThinker"
```
7. Save the final fine-tuned model. After you're done, type `exit` to quit.
Replace the checkpoint directories below with your actual output paths.
```
swift infer --model huihui-ai/checkpoint-1237-merged --adapters output/Llama-3.2-1B-Instruct-abliterated/lora/sft2/v0-20250103-121319/checkpoint-2474 --stream true --merge_lora true
```
This should create a new model directory, `checkpoint-2474-merged`. Rename it to `MicroThinker-1B-Preview`, then copy or move it into the `huihui` directory.
8. Perform inference on the final fine-tuned model.
```
swift infer --model huihui/MicroThinker-1B-Preview --stream true --infer_backend pt --max_new_tokens 8192
```
9. Test example:
```
How many 'r' characters are there in the word "strawberry"?
```
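To run the same check outside of swift, here is a minimal transformers sketch. The local path and system prompt are taken from the steps above; this is illustrative only, and requires `torch` and `accelerate` to be installed.
```
# Minimal transformers check of the merged model (illustrative sketch).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="huihui/MicroThinker-1B-Preview",  # local merged checkpoint from step 7
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    # Same system prompt as used during SFT.
    {"role": "system", "content": "You are a helpful assistant. You should think step-by-step."},
    {"role": "user", "content": 'How many \'r\' characters are there in the word "strawberry"?'},
]
result = generator(messages, max_new_tokens=2048)
print(result[0]["generated_text"][-1]["content"])
```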