---
license: apache-2.0
datasets:
- Anthropic/hh-rlhf
language:
- en
pipeline_tag: text-generation
---
The reference model, obtained by supervised fine-tuning on the chosen responses of the Anthropic/hh-rlhf dataset.
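
As a rough illustration of what "fine-tuning on the chosen response" can look like in practice, the sketch below turns hh-rlhf style rows into prompt/completion pairs using only the `chosen` transcript. It is a minimal sketch, not the preprocessing used for this model: the helper names `split_chosen` and `build_sft_pairs` are hypothetical, the toy row is made up, and it assumes each transcript ends with a final `Assistant:` turn, as in the public dataset format.

```python
from typing import Dict, Iterable, List, Tuple

# hh-rlhf transcripts mark turns with "\n\nHuman:" and "\n\nAssistant:".
ASSISTANT_TAG = "\n\nAssistant:"


def split_chosen(chosen: str) -> Tuple[str, str]:
    """Split a transcript into (prompt, final assistant reply).

    Hypothetical helper: the prompt keeps everything up to and including
    the last "Assistant:" tag, the completion is the text after it.
    """
    idx = chosen.rfind(ASSISTANT_TAG)
    if idx == -1:
        raise ValueError("no assistant turn found in transcript")
    cut = idx + len(ASSISTANT_TAG)
    return chosen[:cut], chosen[cut:].strip()


def build_sft_pairs(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Build SFT examples from the `chosen` field only; `rejected` is ignored."""
    pairs = []
    for row in rows:
        prompt, completion = split_chosen(row["chosen"])
        pairs.append({"prompt": prompt, "completion": completion})
    return pairs


# Toy row in the dataset's format (invented for illustration).
rows = [
    {
        "chosen": "\n\nHuman: Hi\n\nAssistant: Hello!",
        "rejected": "\n\nHuman: Hi\n\nAssistant: Go away.",
    }
]
pairs = build_sft_pairs(rows)
```

With real data, `rows` could come from `datasets.load_dataset("Anthropic/hh-rlhf", split="train")`, and the resulting pairs could be passed to any standard causal language model fine-tuning loop.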