Auto Regressive Thinker (Art) v0 3B

Art v0 3B is our inaugural model in the Art series, fine-tuned from Qwen/Qwen2.5-3B-Instruct on a specialized dataset generated with Gemini 2.0 Flash Thinking. Read more about the Art series.

Model Details

  • Base Model: Qwen2.5-3B-Instruct
  • Architecture: Transformer
  • Size: 3B parameters

Usage

The model produces an explicit reasoning trace wrapped in dedicated tags, followed by its final response:

<|start_reasoning|> model's reasoning process <|end_reasoning|> model's response
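
For convenience, a completion can be split into its reasoning and its final answer by looking for these tags. Below is a minimal sketch; the helper function name is ours and not part of the model's API:

```python
def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, response) from a completion tagged as shown above."""
    start_tag, end_tag = "<|start_reasoning|>", "<|end_reasoning|>"
    if start_tag in text and end_tag in text:
        before, _, rest = text.partition(start_tag)
        reasoning, _, response = rest.partition(end_tag)
        return reasoning.strip(), (before + response).strip()
    # No tags found: treat the whole completion as the response.
    return "", text.strip()
```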

Recommendations

  • Use the model without quantization
  • Use the tokenizer's chat template
  • Use a low temperature (0.1-0.3) and a repetition_penalty of 1.1 (see the sketch below)
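
The following is a hedged sketch of generation with the transformers library that follows the recommendations above: bfloat16 weights (no quantization), the tokenizer's chat template, a low temperature, and repetition_penalty 1.1. The model ID is taken from this card; the prompt and max_new_tokens value are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AGI-0/Art-v0-3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # no quantization, per the recommendations
    device_map="auto",
)

messages = [{"role": "user", "content": "Why is the sky blue?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.2,           # low temperature, within the 0.1-0.3 range
    repetition_penalty=1.1,
)

# Keep special tokens so the <|start_reasoning|>/<|end_reasoning|> tags are visible.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=False))
```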

Training Details

This experimental model was trained on a curated dataset generated using Gemini 2.0 Flash Thinking. Detailed training methodology, dataset, and code are available exclusively to our community members.

About Us

We are a community-funded AI research lab focused on advancing open-source AGI development. Our community members support us through Patreon donations.

Community Access

Our supporters get exclusive access to:

  • Training dataset
  • Training code and methodology
  • Behind-the-scenes development insights
  • Future model previews

Join Our Community
