πŸš€ BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving


State-of-the-art tactic generation model in Lean4

This repository contains the latest tactic generator model checkpoint from BFS-Prover, a state-of-the-art theorem proving system in Lean4. While the full BFS-Prover system integrates multiple components for scalable theorem proving, we are releasing the core tactic generation model here. Given a proof state in Lean4, the model generates a tactic that transforms the current proof state into a new state, progressively working towards completing the proof.

πŸ“„ Paper: BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving

✨ Model Details

  • Base Model: Qwen2.5-Math-7B
  • Training Approach:
    • Supervised Fine-Tuning (SFT) on state-tactic pairs
    • Direct Preference Optimization (DPO) using compiler feedback (see the sketch after this list)
  • Training Data Sources:
    • Mathlib (via LeanDojo)
    • Lean-Github repositories
    • Lean-Workbook
    • Autoformalized NuminaMath-CoT dataset
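
For concreteness, here is a minimal sketch of how DPO preference pairs could be assembled from compiler feedback: tactics the compiler accepts become "chosen" responses and tactics it rejects become "rejected" ones. The check_tactic helper and the pair format (as used by common DPO trainers such as TRL's DPOTrainer) are illustrative assumptions, not a description of the actual training pipeline.

# Hypothetical sketch: turn compiler feedback into DPO preference pairs.
# check_tactic(state, tactic) is an assumed helper that returns True if the tactic
# compiles against the state (produces a valid new proof state), False otherwise.
def build_dpo_pairs(states, candidate_tactics, check_tactic):
    pairs = []
    for state, tactics in zip(states, candidate_tactics):
        accepted = [t for t in tactics if check_tactic(state, t)]
        rejected = [t for t in tactics if not check_tactic(state, t)]
        for good in accepted:
            for bad in rejected:
                pairs.append({
                    "prompt": state + ":::",  # same prompt format as at inference time
                    "chosen": good,
                    "rejected": bad,
                })
    return pairs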

πŸ“ˆ Performance

BFS-Prover achieves state-of-the-art performance on the MiniF2F test benchmark. Here's a detailed comparison:

πŸ” MiniF2F Test Benchmark Results

| Prover System | Search Method | Critic Model | Tactic Budget | Score |
|---|---|---|---|---|
| BFS-Prover | BFS | No | Accumulative | 72.95% |
| BFS-Prover | BFS | No | 2048Γ—2Γ—600 | 70.83% Β± 0.89% |
| HunyuanProver | BFS | Yes | 600Γ—8Γ—400 | 68.4% |
| InternLM2.5-StepProver | BFS | Yes | 256Γ—32Γ—600 | 65.9% |
| DeepSeek-Prover-V1.5 | MCTS | No | 32Γ—16Γ—400 | 63.5% |

πŸ”‘ Key Advantages

  • βœ… Achieves higher performance without requiring a critic model (value function); a sketch of such a critic-free search loop is shown below
  • βœ… Uses a simpler search method (best-first search) rather than MCTS
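
As a rough illustration of how best-first search can rank proof states without a critic, the following minimal sketch scores each node by the accumulated log-probability of its tactic path. The generate_tactics and apply_tactic helpers (e.g., a LeanDojo-style interface that returns candidate tactics with log-probabilities and checks them against Lean) are assumptions for illustration, not part of this release.

import heapq
import itertools

def best_first_search(init_state, generate_tactics, apply_tactic, max_expansions=600):
    # Minimal best-first search sketch (no learned critic):
    # a node's priority is the sum of log-probabilities of the tactics on its path.
    #   generate_tactics(state) -> list of (tactic, logprob) candidates, e.g. from this model
    #   apply_tactic(state, tactic) -> new proof state, None if the tactic fails,
    #                                  or "DONE" if the proof is complete
    tie = itertools.count()  # tie-breaker so the heap never compares states directly
    frontier = [(0.0, next(tie), init_state, [])]  # (negated score, tie, state, tactic path)
    for _ in range(max_expansions):
        if not frontier:
            return None
        neg_score, _, state, path = heapq.heappop(frontier)
        for tactic, logprob in generate_tactics(state):
            result = apply_tactic(state, tactic)
            if result == "DONE":
                return path + [tactic]  # proof found
            if result is None:
                continue  # tactic rejected by the compiler
            heapq.heappush(frontier, (neg_score - logprob, next(tie), result, path + [tactic]))
    return None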

βš™οΈ Usage

  • The model expects Lean4 tactic states in the format "{state}:::"
  • The ::: separator signals the model to generate a tactic for the given state.
  • The model will echo back the input state followed by the generated tactic.
# Example code for loading and using the tactic generator model
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("bytedance-research/BFS-Prover")
tokenizer = AutoTokenizer.from_pretrained("bytedance-research/BFS-Prover")

# Lean4 proof state followed by the ":::" separator
state = "h : x = y + 2 ⊒ x - 1 = y + 1"
sep = ":::"
prompt = state + sep  # Creates "h : x = y + 2 ⊒ x - 1 = y + 1:::"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# The model echoes the prompt, so the tactic is the text after the separator
tactic = tokenizer.decode(outputs[0], skip_special_tokens=True).split(sep)[1].strip()
print(tactic)

# Complete example:
# Input state:  "h : x = y + 2 ⊒ x - 1 = y + 1"
# Full prompt:  "h : x = y + 2 ⊒ x - 1 = y + 1:::"
# Model output: "h : x = y + 2 ⊒ x - 1 = y + 1:::simp [h]"
# Final tactic: "simp [h]"
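
In a tree-search setting, several candidate tactics are usually sampled per proof state. One way to do this with the standard transformers generate API, reusing model, tokenizer, inputs, and sep from the example above, is shown below; the sampling settings are illustrative, not those used in the paper.

# Sample several candidate tactics for the same proof state (illustrative settings)
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,
    num_return_sequences=4,
    max_new_tokens=64,
    pad_token_id=tokenizer.eos_token_id,
)
candidates = [
    tokenizer.decode(o, skip_special_tokens=True).split(sep)[1].strip()
    for o in outputs
]
print(candidates)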

πŸ“š Citation

If you use this model in your research, please cite our paper:

@article{xin2025bfs,
  title={BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving},
  author={Xin, Ran and Xi, Chenguang and Yang, Jie and Chen, Feng and Wu, Hang and Xiao, Xia and Sun, Yifan and Zheng, Shen and Shen, Kai},
  journal={arXiv preprint arXiv:2502.03438},
  year={2025}
}

πŸ“„ License

This model is released under the Apache 2.0 license: https://choosealicense.com/licenses/apache-2.0/

πŸ“§ Contact

For questions and feedback about the tactic generator model, please contact:
