---
base_model: unsloth/Qwen2.5-3B-Instruct
datasets:
- open-r1/OpenR1-Math-220k
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2.5
- trl
- sft
license_name: qwen-research
license_link: https://huggingface.co/Qwen/Qwen2.5-3B-Instruct/blob/main/LICENSE
language:
- en
---

# qwen2.5-3b-or1-tensopolis

This model is a reasoning fine-tune of unsloth/Qwen2.5-3B-Instruct, trained on a single A100 for about 50 hours. Please refer to the base model and dataset for more information about the license, prompt format, etc.
- **Base model:** Qwen/Qwen2.5-3B-Instruct
- **Dataset:** open-r1/OpenR1-Math-220k
This qwen2.5 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
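A minimal usage sketch with the transformers library. The repository id below is an assumption for illustration (the actual Hugging Face path of this model may differ), and generation parameters are placeholders; the chat-message format follows the Qwen2.5-Instruct convention inherited from the base model.

```python
# Hypothetical repo id -- replace with the model's actual Hugging Face path.
model_id = "tensopolis/qwen2.5-3b-or1-tensopolis"

# Qwen2.5-Instruct models use the standard chat-message format.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 12 * 13? Reason step by step."},
]

def generate(model_id: str, messages: list, max_new_tokens: int = 512) -> str:
    # Imports kept inside the function so the snippet is inspectable
    # without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Apply the model's chat template and add the generation prompt.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(
        outputs[0][inputs.shape[-1]:], skip_special_tokens=True
    )
```

Calling `generate(model_id, messages)` returns the model's reasoning-style completion as a string.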