blai88/reward_modeling_anthropic_hh
Tags: PEFT, Safetensors, opt, trl, reward-trainer, Generated from Trainer
License: llama2
Branch: main
Commit History
c6a2310  End of training  (blai88 committed on Jul 6, verified)
39ad302  End of training  (blai88 committed on Jul 6, verified)
0a64042  initial commit  (blai88 committed on Jul 6, verified)