WDong/dpo_0621
Tags: PEFT, Safetensors, llama-factory, lora, Generated from Trainer
License: other
dpo_0621/train_results.json
WDong: Upload 17 files (commit 5ec813d, verified, 4 months ago)
221 Bytes
{
  "epoch": 2.994495412844037,
  "total_flos": 7.837376281021809e+17,
  "train_loss": 0.23205441671113172,
  "train_runtime": 8090.8922,
  "train_samples_per_second": 1.616,
  "train_steps_per_second": 0.05
}
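The raw figures are easier to interpret once a few derived quantities are computed from them. The following is a minimal Python sketch, not part of the repository: it assumes `train_runtime` is reported in seconds (as the Hugging Face Trainer normally does), and the helper name `derive` and the output keys are hypothetical. Because the two throughput values are rounded in the file, every derived number is only approximate.

```python
import json

# Metrics copied verbatim from train_results.json.
RAW = """
{
  "epoch": 2.994495412844037,
  "total_flos": 7.837376281021809e+17,
  "train_loss": 0.23205441671113172,
  "train_runtime": 8090.8922,
  "train_samples_per_second": 1.616,
  "train_steps_per_second": 0.05
}
"""


def derive(metrics):
    """Estimate run-level quantities from the logged rates.

    Hypothetical helper: assumes train_runtime is in seconds. The rates
    are rounded in the file, so the results are rough estimates only.
    """
    runtime = metrics["train_runtime"]
    samples_per_s = metrics["train_samples_per_second"]
    steps_per_s = metrics["train_steps_per_second"]
    return {
        "runtime_hours": runtime / 3600,
        "approx_samples_seen": runtime * samples_per_s,
        "approx_optimizer_steps": runtime * steps_per_s,
        "approx_samples_per_step": samples_per_s / steps_per_s,
    }


metrics = json.loads(RAW)
derived = derive(metrics)
for name, value in derived.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the run lasted a little over two hours, covered roughly 13,000 samples across about three epochs, and processed on the order of 32 samples per optimizer step. The last figure is sensitive to the rounding of `train_steps_per_second` and should be read as a ballpark value, not as the configured batch size.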