koshirowada/pythia_70m_dpo
Tags: Text Generation, Transformers, TensorBoard, Safetensors, tatsu-lab/alpaca_farm, gpt_neox, Generated from Trainer, trl, dpo, text-generation-inference, Inference Endpoints, arxiv:2305.18290
koshirowada committed on Nov 20, 2024
Commit 3debf66 · verified · 1 parent: eebbfdd
Update README.md
Files changed (1)
README.md (+2 -0)
@@ -7,6 +7,8 @@ tags:
 - trl
 - dpo
 licence: license
+datasets:
+- tatsu-lab/alpaca_farm
 ---
 
 # Model Card for pythia_70m_dpo