Chess Reasoner Model
This is a LoRA adapter for Qwen2.5-14B-Instruct, fine-tuned to analyze chess positions and suggest moves. It was trained with GRPO (Group Relative Policy Optimization) to maximize move quality as evaluated by Stockfish.
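As a rough illustration of the core idea behind GRPO: several completions are sampled for the same position, each is scored, and every score is then normalised against the others in its group. The sketch below shows one common formulation of that step. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration, not details taken from this model's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise rewards within one group of completions for the same prompt.

    Completions that scored above the group mean get a positive advantage,
    those below it a negative one. `eps` (hypothetical value) avoids division
    by zero when every completion received the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four sampled answers for one position, two good and two bad.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When all completions in a group receive the same reward, every advantage is zero and that group contributes no learning signal.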
Model Details
- Base Model: Qwen/Qwen2.5-14B-Instruct
- Training Method: LoRA with GRPO
- Training Data: Generated chess positions with Stockfish evaluations
- Input Format: Chess positions in FEN notation with ASCII board visualization
- Output Format:
<think> Analysis of the position </think>
<move> Chosen move in UCI format (e.g., e2e4) </move>
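The input and output formats above can be handled with a few lines of standard-library Python. This is a minimal sketch under stated assumptions: the exact ASCII board layout used during training is not documented here, so `fen_to_ascii` shows only one plausible rendering, and `fen_to_ascii`, `parse_response`, and both regular expressions are hypothetical helpers, not part of this model's code.

```python
import re


def fen_to_ascii(fen: str) -> str:
    """Render the piece-placement field of a FEN string as an 8x8 text board.

    Assumed layout: rank 8 first, squares separated by spaces, '.' for empty.
    """
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        squares = []
        for ch in rank:
            if ch.isdigit():
                squares.extend("." * int(ch))  # a digit means that many empty squares
            else:
                squares.append(ch)
        if len(squares) != 8:
            raise ValueError(f"bad FEN rank: {rank!r}")
        rows.append(" ".join(squares))
    if len(rows) != 8:
        raise ValueError("FEN must describe 8 ranks")
    return "\n".join(rows)


# UCI move: from-square, to-square, optional promotion piece (e.g. e7e8q).
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")
RESPONSE_RE = re.compile(r"<think>(.*?)</think>\s*<move>(.*?)</move>", re.DOTALL)


def parse_response(text: str):
    """Return (analysis, move), or None if the text does not match the format."""
    match = RESPONSE_RE.search(text)
    if match is None:
        return None
    analysis, move = match.group(1).strip(), match.group(2).strip()
    if not UCI_RE.match(move):
        return None
    return analysis, move
```

Note that `parse_response` checks only the syntax of the move. Whether the move is legal in the given position has to be checked separately, for example with a chess library.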
Usage
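from unsloth import FastLanguageModel
from peft import set_peft_model_state_dict
from safetensors.torch import load_file

# Load the 4-bit quantized base model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-14B-Instruct",
    max_seq_length=1024,
    load_in_4bit=True,
    fast_inference=True,
)

# Wrap the base model with the LoRA configuration of the adapter
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=64,
)

# Load the LoRA weights. The adapter file contains only the LoRA tensors, so a
# strict model.load_state_dict(state_dict) would fail on the missing base-model
# keys; PEFT's helper maps the adapter keys onto the wrapped model instead.
state_dict = load_file("adapter_model.safetensors")
set_peft_model_state_dict(model, state_dict)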
Training
The model was trained on chess positions with rewards based on move quality as evaluated by Stockfish. The reward function considers:
- Move legality
- Move syntax
- Move quality (compared to Stockfish's top 3 moves)
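The three criteria above could be combined into a single scalar reward roughly as follows. This is a sketch only: the actual reward function and its weights are not given in this card, so `move_reward`, its penalty values, and the per-rank bonuses are all illustrative assumptions. The legal moves and the engine's top moves are passed in as plain Python collections; in practice they would come from a chess library and a Stockfish query.

```python
import re

# UCI move: from-square, to-square, optional promotion piece.
UCI_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")


def move_reward(move, legal_moves, top_moves):
    """Score one proposed move.

    move        -- the string taken from the <move> tag
    legal_moves -- set of legal UCI moves in the position
    top_moves   -- the engine's best moves, best first (up to 3)
    All weights are illustrative, not the values used in training.
    """
    if not UCI_RE.match(move):
        return -1.0                      # malformed syntax
    if move not in legal_moves:
        return -0.5                      # well-formed but illegal
    reward = 0.2                         # baseline for any legal move
    best = list(top_moves[:3])
    if move in best:
        rank = best.index(move)          # 0 = engine's first choice
        reward += (1.0, 0.6, 0.3)[rank]  # larger bonus for higher-ranked moves
    return reward


# Example: the opening position, with a hypothetical engine ranking.
legal = {"e2e4", "d2d4", "g1f3", "a2a3"}
top3 = ["e2e4", "d2d4", "g1f3"]
score = move_reward("d2d4", legal, top3)
```

Ordering the checks from syntax to legality to quality means a completion earns the quality bonus only after passing the cheaper checks.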
License
This model inherits the license of the base Qwen2.5-14B-Instruct model.