mnoukhov/pythia410m-dpo-tldr
PEFT
TensorBoard
Safetensors
Generated from Trainer
License: apache-2.0
01300dd
pythia410m-dpo-tldr/code/__pycache__
1 contributor
History: 1 commit
This model has 1 file scanned as unsafe.
Latest commit: mnoukhov/pythia410m-dpo-tldr (01300dd, verified) by mnoukhov, 9 months ago
File                                    Size     Last commit                   Updated
callbacks.cpython-311.pyc               18.8 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago
generate_and_eval.cpython-311.pyc       15.5 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago
generate_and_llm_judge.cpython-311.pyc  17.2 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago
generate_vllm.cpython-311.pyc           13.5 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago
gpt_reward_modeling.cpython-311.pyc     24.1 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago
scalar_rm_model.cpython-311.pyc         13.7 kB  mnoukhov/pythia410m-dpo-tldr  9 months ago