Chess Reasoner Model

This is a LoRA fine-tune of Qwen2.5-14B-Instruct that analyzes chess positions and suggests moves. It was trained with GRPO (Group Relative Policy Optimization) to maximize move quality as evaluated by Stockfish.

Model Details

Base Model: Qwen/Qwen2.5-14B-Instruct
Training Method: LoRA with GRPO
Training Data: Generated chess positions with Stockfish evaluations
Input Format: Chess positions in FEN notation with ASCII board visualization
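
The card does not spell out the exact prompt layout, so the following is only a minimal sketch of how a FEN string could be paired with an ASCII board. The helper names `fen_to_ascii` and `build_prompt`, the `.` marker for empty squares, and the "FEN:" / "Board:" labels are assumptions for illustration, not the format used in training.

```python
def fen_to_ascii(fen: str) -> str:
    """Render the piece-placement field of a FEN string as an 8x8 text board."""
    placement = fen.split()[0]  # first FEN field: ranks 8 down to 1, separated by "/"
    rows = []
    for rank in placement.split("/"):
        squares = []
        for ch in rank:
            if ch.isdigit():
                squares.extend("." * int(ch))  # a digit encodes that many empty squares
            else:
                squares.append(ch)  # uppercase = White piece, lowercase = Black piece
        if len(squares) != 8:
            raise ValueError(f"bad FEN rank: {rank!r}")
        rows.append(" ".join(squares))
    if len(rows) != 8:
        raise ValueError("FEN must describe 8 ranks")
    return "\n".join(rows)


def build_prompt(fen: str) -> str:
    """Hypothetical prompt layout: the FEN line followed by the ASCII board."""
    return f"FEN: {fen}\n\nBoard:\n{fen_to_ascii(fen)}"


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(build_prompt(START_FEN))
```

If the board rendering used during training differs (for example, with rank and file labels), the prompt should follow that layout instead.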

Output Format:

<think>
Analysis of the position
</think>
<move>
Chosen move in UCI format (e.g., e2e4)
</move>
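
A response in this format can be split into its analysis and move with a couple of regular expressions. This is a sketch using only the standard library. `parse_response` is a hypothetical helper name, and the UCI pattern checks syntax only (it does not know whether the move is legal in the position).

```python
import re
from typing import Optional, Tuple

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
MOVE_RE = re.compile(r"<move>(.*?)</move>", re.DOTALL)
# UCI move: from-square, to-square, optional promotion piece (e.g. e7e8q).
UCI_RE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def parse_response(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (analysis, uci_move); either part is None if missing or malformed."""
    think = THINK_RE.search(text)
    move = MOVE_RE.search(text)
    analysis = think.group(1).strip() if think else None
    uci = move.group(1).strip().lower() if move else None
    if uci is not None and not UCI_RE.fullmatch(uci):
        uci = None  # reject anything that is not syntactically valid UCI
    return analysis, uci


sample = "<think>\nControl the center.\n</think>\n<move>\ne2e4\n</move>"
print(parse_response(sample))
```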

Usage

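from unsloth import FastLanguageModel
from peft import set_peft_model_state_dict
from safetensors.torch import load_file

# Load the base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=1024,
    load_in_4bit=True,
    fast_inference=True,
)

# Wrap the base model with the same LoRA configuration used for training
# (rank, alpha, and target modules must match the saved adapter)
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=64,
)

# Load the LoRA weights. The adapter file contains only the LoRA tensors, so a
# strict model.load_state_dict() would fail on the missing base-model keys;
# PEFT's helper maps the adapter keys onto the wrapped model instead.
state_dict = load_file("adapter_model.safetensors")
set_peft_model_state_dict(model, state_dict)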

Training

The model was trained on chess positions with rewards based on move quality as evaluated by Stockfish. The reward function considers:

Move legality
Move syntax
Move quality (compared to Stockfish's top 3 moves)
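
The three criteria above could be combined roughly as follows. This is an illustrative sketch, not the reward function used in training. The name `move_reward`, the weights, and the rank bonuses are all assumptions. The caller is assumed to supply the legal moves and Stockfish's top moves as UCI strings.

```python
import re
from typing import Iterable, Sequence

UCI_RE = re.compile(r"[a-h][1-8][a-h][1-8][qrbn]?")


def move_reward(move: str, legal_moves: Iterable[str], top_moves: Sequence[str]) -> float:
    """Illustrative reward: syntax, then legality, then rank among the engine's top 3.

    All numeric weights below are assumed values chosen for the example.
    """
    reward = 0.0
    if not UCI_RE.fullmatch(move):
        return reward  # malformed move earns nothing
    reward += 0.2  # well-formed UCI syntax
    if move not in set(legal_moves):
        return reward  # syntactically valid but illegal in this position
    reward += 0.3  # legal move
    if move in top_moves[:3]:
        rank = top_moves.index(move)  # 0 = engine's best move
        reward += (1.0, 0.6, 0.3)[rank]  # larger bonus for higher-ranked moves
    return reward


legal = ["e2e4", "d2d4", "g1f3", "a2a3"]
top3 = ["e2e4", "d2d4", "g1f3"]
for candidate in ["e2e4", "a2a3", "e2e5", "Nf3"]:
    print(candidate, move_reward(candidate, legal, top3))
```

Staging the checks this way gives partial credit for a well-formed or legal move even when it is not among the engine's preferred choices, which is one common way to shape rewards early in training.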

License

This model inherits the license of the base Qwen2.5-14B-Instruct model.
