rasdani/qwen2-math-1_5b-step-dpo

Text Generation, alignment-handbook, Generated from Trainer, text-generation-inference, Inference Endpoints

qwen2-math-1_5b-step-dpo / trainer_state.json

Model save (40e5cfe, verified, 4 months ago), 696 kB

File too large to display, you can check the raw version instead.