hard_dpo / README.md
---
license: apache-2.0
datasets:
  - Anthropic/hh-rlhf
language:
  - en
pipeline_tag: text-generation
---

The reference model, obtained by supervised fine-tuning (SFT) on the chosen responses of the Anthropic/hh-rlhf dataset.
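As a rough illustration of what fine-tuning on the chosen response can look like, the sketch below splits one `chosen` transcript into a prompt and the final assistant reply, which would serve as the SFT target. It assumes the `\n\nHuman:` / `\n\nAssistant:` turn format used by Anthropic/hh-rlhf; the helper name `split_chosen` and the sample text are hypothetical, and this is not necessarily the preprocessing used to train this model.

```python
from typing import Tuple

# Turn marker used in hh-rlhf transcripts (assumed format).
ASSISTANT_TAG = "\n\nAssistant:"


def split_chosen(transcript: str) -> Tuple[str, str]:
    """Split a transcript into (prompt, final assistant response).

    The prompt keeps everything up to and including the last
    "Assistant:" marker; the response is the text that follows it.
    """
    idx = transcript.rfind(ASSISTANT_TAG)
    if idx == -1:
        raise ValueError("no assistant turn found in transcript")
    cut = idx + len(ASSISTANT_TAG)
    return transcript[:cut], transcript[cut:].strip()


# Hypothetical sample in the same shape as a `chosen` field.
sample = (
    "\n\nHuman: What is 2 + 2?"
    "\n\nAssistant: 4."
    "\n\nHuman: And 3 + 3?"
    "\n\nAssistant: 6."
)
prompt, response = split_chosen(sample)
```

In an SFT setup along these lines, the loss would typically be computed only on `response` tokens, with `prompt` supplied as context. The real transcripts could be loaded with `datasets.load_dataset("Anthropic/hh-rlhf")` and read from the `chosen` field.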