nhn_dpo_v3_nox-solar-10.7b-v4_DPO
Our Team
- Youjin Chung
- Jingyeom Kim
Model
Base Model
Hardware and Software
- Hardware: 8× NVIDIA A100 GPUs for training our model
- Software: DeepSpeed library & Hugging Face TRL Trainer
Dataset
- DPO_dataset
  - Self-built DPO dataset (built using AI-hub data)
  - English datasets such as OpenOrca DPO, translated into Korean (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, translated with our own model)
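A DPO dataset pairs each prompt with a preferred and a rejected completion. As a minimal sketch (the example text below is hypothetical; the `prompt`/`chosen`/`rejected` field names follow the convention expected by TRL's `DPOTrainer`):

```python
# Hypothetical DPO preference record; field names follow the common
# prompt/chosen/rejected schema used by TRL's DPOTrainer.
record = {
    "prompt": "Summarize the following article in one sentence: ...",
    "chosen": "The article explains how DPO fine-tunes a model on preference pairs.",
    "rejected": "I don't know.",
}

# A dataset is then just a list of such records.
dpo_dataset = [record]
```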
Training Method
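DPO optimizes the policy to widen the log-probability margin between chosen and rejected completions relative to a frozen reference model. A minimal per-example sketch of the standard DPO loss (not the exact training code used here; `beta=0.1` is an assumed default):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)),
    where each margin is the policy-vs-reference log-prob ratio."""
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference (both ratios zero), the loss is log 2; it falls as the policy assigns relatively more probability to the chosen completion.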
Benchmark
0-shot (macro F1)
| kobest_boolq | kobest_copa | kobest_hellaswag | kobest_sentineg |
|---|---|---|---|
| 0.931613 | 0.740751 | 0.468602 | 0.488465 |
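The scores above are macro F1: the F1 score is computed per class and then averaged with equal weight, so minority classes count as much as majority ones. A plain-Python sketch of the metric:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)
```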