Qwen2.5-0.5B Fine-Tuned on GSM8K with DeepSeek Augmentation
Model Overview
This model is a fine-tuned version of Qwen2.5-0.5B, specifically trained for mathematical reasoning tasks using the GSM8K dataset, with additional Chain-of-Thought (CoT) reasoning augmentation from DeepSeek-V3. The model has been fine-tuned to generate detailed step-by-step solutions to grade school math problems, ensuring better logical reasoning and interpretability.
Key Features
- Base Model: Qwen/Qwen2.5-0.5B
- Fine-Tuned On: eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
- Optimized for: Mathematical problem-solving & step-by-step reasoning
- Fine-tuned with: LoRA (Low-Rank Adaptation) for parameter-efficient training
- Chain-of-Thought (CoT): Generates clear and structured reasoning for each problem
- Inference-ready: Available on the Hugging Face Hub
Model Details
Description
- Developed by: [Your Name or Organization]
- Funded by: [Optional: Mention if funded]
- Shared by: Hugging Face Hub
- Model Type: Causal Language Model (Text Generation)
- Languages: English (en)
- License: MIT License
- Fine-tuned from: Qwen/Qwen2.5-0.5B
Model Repository
- Hugging Face Model Page: Fine-tuned Qwen2.5-0.5B
How to Load & Use This Model
You can load this model with the Hugging Face `transformers` library as follows:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repo ID (replace with the actual HF repo)
model_name = "your-repo-id"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move the model to GPU if available
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example inference
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
inputs = tokenizer(question, return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=200)  # generate up to 200 new tokens

# Decode and print the response
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
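Alternatively, the high-level pipeline API offers a shorter path to the same result; the snippet below is a minimal sketch that reuses the same placeholder repo ID with a toy question.

```python
from transformers import pipeline

# Text-generation pipeline pointing at the same placeholder repo ID as above
generator = pipeline("text-generation", model="your-repo-id")

question = "A baker made 24 cookies and sold half of them. How many cookies are left?"
result = generator(question, max_new_tokens=200)
print(result[0]["generated_text"])
```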
Training Details
Training Data
The model was fine-tuned on the GSM8K dataset, specifically the augmented dataset:
eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
This dataset contains:
- 8K training samples (`train` split)
- 1K testing samples (`test` split)
- Features: `"question"`, `"answer"`, and `"cot"` (Chain-of-Thought)
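As a quick sanity check, the dataset can be loaded directly with the `datasets` library; the snippet below is a minimal sketch that assumes the split names and column names listed above.

```python
from datasets import load_dataset

# Load the augmented GSM8K dataset (8K train / 1K test samples)
dataset = load_dataset(
    "eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1"
)
print(dataset)

# Inspect one training example: question, Chain-of-Thought, and final answer
example = dataset["train"][0]
print(example["question"])
print(example["cot"])
print(example["answer"])
```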
Training Procedure
- Preprocessing: Each question was formatted using a prompt template to encourage step-by-step reasoning.
- Training Framework: Used `transformers`, `trl`, and `unsloth` for efficient fine-tuning.
- Fine-Tuning Strategy: LoRA (Low-Rank Adaptation), as sketched after this list
  - Applied to the query and value projection layers (`q_proj`, `v_proj`)
  - LoRA hyperparameters: `r=8`, `lora_alpha=16`, `lora_dropout=0.1`
- Optimization:
  - Mixed precision training (`fp16`)
  - Batch size: 16
  - Gradient accumulation: 1
  - Learning rate: 2e-4
- Training Time: Approx. 10,446 seconds (~3 hours)
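The full training script is not reproduced in this card; the sketch below only illustrates the LoRA setup described above using `peft`, and the prompt template shown is a hypothetical placeholder rather than the exact template used during fine-tuning.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Hypothetical prompt template; the exact template used in training is not published here
def format_prompt(question: str) -> str:
    return (
        "Solve the following grade school math problem step by step.\n"
        f"Question: {question}\n"
        "Answer:"
    )

# LoRA configuration matching the hyperparameters listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # query and value projections only
    task_type="CAUSAL_LM",
)

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```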
Performance & Evaluation
Training Performance
| Step | Loss   | Grad Norm | Learning Rate | Epoch  |
|------|--------|-----------|---------------|--------|
| 10   | 2.1319 | 3.656     | 2e-4          | 0.0107 |
| 1000 | 0.2013 | 0.328     | 2.3e-7        | 9.98   |
| 9340 | 0.2048 | 0.341     | 2.1e-8        | 9.99   |
Testing & Expected Results
The model was evaluated on the 1K test samples and showed strong accuracy in multi-step problem-solving.
Example expected response:
```
To solve the problem, we first find the clips sold in May:
Clips in May = 48 / 2 = 24
Next, we find the total:
Total Clips = 48 + 24 = 72
#### Answer: 72
```
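The exact evaluation script is not included in this card; the snippet below is a minimal sketch for scoring generations in this format, assuming the final answer always follows a `#### Answer:` marker as in the example above.

```python
import re

def extract_answer(text):
    """Return the number following the '#### Answer:' marker, or None if absent."""
    match = re.search(r"####\s*Answer:\s*(-?[\d.,]+)", text)
    return match.group(1).replace(",", "") if match else None

# Example: compare one generated solution against the reference answer
generated = (
    "Clips in May = 48 / 2 = 24\n"
    "Total Clips = 48 + 24 = 72\n"
    "#### Answer: 72"
)
print(extract_answer(generated) == "72")  # True
```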
Bias, Risks, and Limitations
Potential Risks
- May hallucinate incorrect reasoning steps if prompts are unclear.
- Could struggle with complex mathematical problems outside its training data.
- Limited generalization to non-math reasoning tasks.
Recommendations
- If using this model for critical applications, verify outputs with human review.
- For better performance, fine-tune on larger datasets with real-world numerical reasoning.
Environmental Impact
Estimated Carbon Emissions:
- Hardware Used: NVIDIA A100 GPU
- Training Time: ~3 hours
- Estimated CO2 Emitted: ~5.6 kg CO2eq (using ML Impact Calculator)
Citation
If you use this model in your research, please cite it as:
```bibtex
@misc{Upcoming,
  title={Upcoming},
  author={Yiqiao},
  year={2025}
}
```