tensopolis's picture
Update README.md
7ad800a verified
metadata
base_model: unsloth/Qwen2.5-3B-Instruct
datasets:
  - open-r1/OpenR1-Math-220k
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2.5
  - trl
  - sft
license_name: qwen-research
license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
language:
  - en
image

qwen2.5-3b-or1-tensopolis

This model is a reasoning fine-tune of unsloth/Qwen2.5-3B-Instruct. Trained in 1xA100 for about 50 hours. Please refer to the base model and dataset for more information about license, prompt format, etc.

Base model: Qwen/Qwen2.5-3B-Instruct

Dataset: open-r1/OpenR1-Math-220k

This mistral model was trained 2x faster with Unsloth and Huggingface's TRL library.