XueyingJia/pythia-1b-deduped-hh-online-dpo
Text Generation
Transformers
Safetensors
XueyingJia/online_dpo_repo
gpt_neox
Generated from Trainer
trl
online-dpo
conversational
text-generation-inference
Inference Endpoints
arxiv:2402.04792
main
Commit History
Create config.json (e1b0859, verified), XueyingJia committed on Nov 24, 2024
End of training (bf86b84, verified), XueyingJia committed on Nov 24, 2024
Model save (f0f67a2, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 15075 (847c56d, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 13572 (7bdead4, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 12064 (e1a8902, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 10556 (37a1550, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 9048 (de587e3, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 7540 (4efc623, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 6032 (03ced79, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 4524 (1745fea, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 3016 (a607cfa, verified), XueyingJia committed on Nov 24, 2024
Training in progress, step 1508 (9f68452, verified), XueyingJia committed on Nov 24, 2024
initial commit (fb87370, verified), XueyingJia committed on Nov 24, 2024