A newer version of this model is available: Qwen/Qwen2.5-0.5B

Qwen2.5-0.5B Fine-Tuned on GSM8K with DeepSeek Augmentation

Model Overview 🚀

This model is a fine-tuned version of Qwen2.5-0.5B, trained for mathematical reasoning on the GSM8K dataset with additional Chain-of-Thought (CoT) reasoning generated by DeepSeek-V3. It is tuned to produce detailed step-by-step solutions to grade school math problems, so that each intermediate step of its reasoning can be inspected.

🔹 Key Features

  • Base Model: Qwen/Qwen2.5-0.5B
  • Fine-Tuned On: eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1
  • Optimized for: Mathematical problem-solving & step-by-step reasoning
  • Fine-tuned with: LoRA (Low-Rank Adaptation) for parameter-efficient training
  • Chain-of-Thought (CoT): Generates clear and structured reasoning for each problem
  • Inference-ready: Available on 🤗 Hugging Face Hub

Model Details 📜

📝 Description

  • Developed by: [Your Name or Organization]
  • Funded by: [Optional: Mention if funded]
  • Shared by: Hugging Face Hub
  • Model Type: Causal Language Model (Text Generation)
  • Languages: English (en)
  • License: MIT License
  • Fine-tuned from: Qwen/Qwen2.5-0.5B

📂 Model Repository


📥 How to Load & Use This Model

You can load this model using 🤗 transformers as follows:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define model repo ID (replace with the actual HF repo)
model_name = "your-repo-id"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Move model to GPU (if available)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Example inference
question = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
inputs = tokenizer(question, return_tensors="pt").to(device)

# max_new_tokens limits only the generated text; max_length would also count the prompt tokens
output = model.generate(**inputs, max_new_tokens=200)

# Decode and print response
print(tokenizer.decode(output[0], skip_special_tokens=True))

🔬 Training Details

🗄️ Training Data

The model was fine-tuned on the GSM8K dataset, specifically the augmented dataset: 🔹 eagle0504/openai-gsm8k-enhanced-using-together-ai-deepseek-train8k-test1k-v1

This dataset contains:

  • 8K training samples (train split)
  • 1K testing samples (test split)
  • Features: "question", "answer", and "cot" (Chain-of-Thought)
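
As a rough illustration of how a record with these three fields could be turned into a single training string, here is a minimal sketch. The template wording, the `format_example` helper, and the sample record are hypothetical and are not taken from the actual training script; only the field names and the `#### Answer:` ending (which appears in this card's example response) come from the card.

```python
# Hypothetical prompt template; the real one used for training is not shown in this card.
PROMPT_TEMPLATE = (
    "Question: {question}\n"
    "Let's solve this step by step.\n"
    "{cot}\n"
    "#### Answer: {answer}"
)


def format_example(record: dict) -> str:
    """Build one training string from a record with question/answer/cot fields."""
    return PROMPT_TEMPLATE.format(
        question=record["question"].strip(),
        cot=record["cot"].strip(),
        answer=record["answer"].strip(),
    )


# Made-up record in the dataset's schema, for illustration only.
sample = {
    "question": "Tom has 3 apples and buys 4 more. How many apples does he have?",
    "cot": "Tom starts with 3 apples. After buying 4 more he has 3 + 4 = 7.",
    "answer": "7",
}

print(format_example(sample))
```

Keeping the template in one constant makes it easier to apply the same wording at inference time, which usually matters for a model fine-tuned on a fixed prompt format.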

⚙️ Training Procedure

  • Preprocessing: Each question was formatted using a prompt template to encourage step-by-step reasoning.
  • Training Framework: Used transformers, trl, and unsloth for efficient fine-tuning.
  • Fine-Tuning Strategy: LoRA (Low-Rank Adaptation)
    • Applied to query and value projection layers (q_proj, v_proj)
    • LoRA hyperparameters:
      • r=8, lora_alpha=16, lora_dropout=0.1
  • Optimization:
    • Mixed Precision Training (fp16)
    • Batch Size: 16
    • Gradient Accumulation: 1
    • Learning Rate: 2e-4
  • Training Time: Approx. 10,446 seconds (~3 hours)
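
To give a sense of scale for the LoRA settings above, the following sketch counts the trainable adapter parameters. The layer dimensions (24 decoder layers, hidden size 896, 14 query heads and 2 key/value heads of dimension 64) are assumptions based on the published Qwen2.5-0.5B configuration and should be checked against the model's config.json; the `lora_params` helper is hypothetical.

```python
# Assumed Qwen2.5-0.5B dimensions (verify against the model's config.json).
NUM_LAYERS = 24
HIDDEN_SIZE = 896
HEAD_DIM = 64
NUM_Q_HEADS = 14
NUM_KV_HEADS = 2

# LoRA settings stated in this card.
LORA_R = 8
LORA_ALPHA = 16


def lora_params(in_features: int, out_features: int, r: int) -> int:
    """LoRA adds A (r x in_features) and B (out_features x r) per adapted layer."""
    return r * in_features + out_features * r


q_proj = lora_params(HIDDEN_SIZE, NUM_Q_HEADS * HEAD_DIM, LORA_R)   # 896 -> 896
v_proj = lora_params(HIDDEN_SIZE, NUM_KV_HEADS * HEAD_DIM, LORA_R)  # 896 -> 128
total = NUM_LAYERS * (q_proj + v_proj)

print(f"trainable LoRA parameters: {total:,}")
print(f"update scaling (lora_alpha / r): {LORA_ALPHA / LORA_R}")
```

Under these assumptions the adapters hold about 0.54 million trainable parameters, a small fraction of the base model, and the low-rank update is scaled by lora_alpha / r = 2.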

📊 Performance & Evaluation

✅ Training Performance

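| Step | Loss   | Grad Norm | Learning Rate | Epoch  |
|------|--------|-----------|---------------|--------|
| 10   | 2.1319 | 3.656     | 2e-4          | 0.0107 |
| 1000 | 0.2013 | 0.328     | 2.3e-7        | 9.98   |
| 9340 | 0.2048 | 0.341     | 2.1e-8        | 9.99   |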
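
The logged step and epoch values imply a steps-per-epoch figure, which is a quick sanity check on the length of the run. The sketch below uses only the first and last rows of the log; the variable names are illustrative.

```python
# (step, epoch) pairs copied from the first and last rows of the training log.
log_rows = [(10, 0.0107), (9340, 9.99)]

for step, epoch in log_rows:
    steps_per_epoch = step / epoch
    print(f"step {step}: about {steps_per_epoch:.0f} optimizer steps per epoch")

# Estimate from the last row, which is the least affected by rounding.
last_step, last_epoch = log_rows[-1]
estimated_steps_per_epoch = round(last_step / last_epoch)
estimated_total_epochs = round(last_epoch)
```

Both rows give roughly 935 optimizer steps per epoch, which points to a run of about 10 epochs.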

🧪 Testing & Expected Results

The model was evaluated on the 1K-sample test split and is reported to show strong accuracy in multi-step problem-solving; this card does not give a numeric accuracy figure.

Example expected response:

To solve the problem, we first find the clips sold in May:
  Clips in May = 48 / 2 = 24
Next, we find the total:
  Total Clips = 48 + 24 = 72
#### Answer: 72
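
Scoring a model like this on the test split typically means pulling the final number out of each generated solution and comparing it with the reference. Here is a minimal sketch, assuming outputs end with a `#### Answer: <number>` line as in the example above; the `extract_answer` and `exact_match_accuracy` helpers are hypothetical and are not the evaluation code used for this card.

```python
import re
from typing import Optional

# Matches the final-answer line, e.g. "#### Answer: 72" or "#### 1,234.5".
ANSWER_RE = re.compile(r"####\s*(?:Answer:)?\s*(-?[\d,]+(?:\.\d+)?)")


def extract_answer(text: str) -> Optional[str]:
    """Return the last final-answer number in text, without thousands separators."""
    matches = ANSWER_RE.findall(text)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def exact_match_accuracy(predictions: list, references: list) -> float:
    """Fraction of predictions whose extracted answer equals the reference string."""
    if not references:
        return 0.0
    hits = sum(
        extract_answer(pred) == ref for pred, ref in zip(predictions, references)
    )
    return hits / len(references)


response = (
    "To solve the problem, we first find the clips sold in May:\n"
    "  Clips in May = 48 / 2 = 24\n"
    "Next, we find the total:\n"
    "  Total Clips = 48 + 24 = 72\n"
    "#### Answer: 72"
)
print(extract_answer(response))  # 72
print(exact_match_accuracy([response, "no final answer here"], ["72", "5"]))  # 0.5
```

Comparing answers as strings is a simplification: "72" and "72.0" would count as different, so a stricter evaluation might normalize numbers before comparing.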

🛑 Bias, Risks, and Limitations

⚠️ Potential Risks

  • May hallucinate incorrect reasoning steps if prompts are unclear.
  • Could struggle with complex mathematical problems outside its training data.
  • Limited generalization to non-math reasoning tasks.

🎯 Recommendations

  • If using this model for critical applications, verify outputs with human review.
  • For better performance, fine-tune on larger datasets with real-world numerical reasoning.

🌎 Environmental Impact

Estimated Carbon Emissions:

  • Hardware Used: NVIDIA A100 GPU
  • Training Time: ~3 hours
  • Estimated CO2 Emitted: ~5.6 kg CO2eq (using ML Impact Calculator)

📚 Citation

If you use this model in your research, please cite it as:

@misc{Upcoming,
  title={Upcoming},
  author={Yiqiao},
  year={2025}
}
