DeepSeekMath-Base-SFT-Step-DPO / train_results.json
upload model
07ddda3
{
    "epoch": 20.0,
    "total_flos": 0.0,
    "train_loss": 0.18995573704399296,
    "train_runtime": 12341.7767,
    "train_samples": 7548,
    "train_samples_per_second": 12.232,
    "train_steps_per_second": 0.096
}
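
The throughput fields above can be cross-checked against each other. The sketch below is a sanity check, not the training code: it assumes that `train_samples_per_second` is the total number of samples seen (`train_samples` times `epoch`) divided by `train_runtime` in seconds, and that multiplying `train_steps_per_second` by `train_runtime` gives a rough count of optimizer steps. Both formulas are assumptions that happen to agree with the reported numbers. The variable names (`samples_seen`, `derived_sps`, `approx_steps`, `samples_per_step`) are illustrative and do not appear in the file.

```python
import json

# Values copied from the train_results.json shown above.
results = json.loads("""
{
    "epoch": 20.0,
    "train_runtime": 12341.7767,
    "train_samples": 7548,
    "train_samples_per_second": 12.232,
    "train_steps_per_second": 0.096
}
""")

# Assumption: throughput = samples seen over all epochs / wall-clock seconds.
samples_seen = results["train_samples"] * results["epoch"]
derived_sps = samples_seen / results["train_runtime"]

# Assumption: steps/second * runtime approximates the optimizer step count.
# The reported rate is rounded to three decimals, so this is only an estimate.
approx_steps = results["train_steps_per_second"] * results["train_runtime"]

# Rough samples consumed per optimizer step (an estimate of the effective
# batch size, which the file itself does not state).
samples_per_step = samples_seen / approx_steps

runtime_hours = results["train_runtime"] / 3600

print(f"derived samples/s : {derived_sps:.3f}")
print(f"reported samples/s: {results['train_samples_per_second']:.3f}")
print(f"approx. steps     : {approx_steps:.0f}")
print(f"samples per step  : {samples_per_step:.1f}")
print(f"runtime (hours)   : {runtime_hours:.2f}")
```

If the derived and reported samples-per-second values agree to three decimals, the fields are internally consistent under the assumed definition. The samples-per-step figure is only as precise as the rounded `train_steps_per_second` allows.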