# nhn_dpo_v3_nox-solar-10.7b-v4_DPO

## Our Team
- Youjin Chung
- Jingyeom Kim
## Model

### Base Model

## Hardware and Software
- Hardware: 8× A100 GPUs for training our model
- Software: DeepSpeed library & Hugging Face TRL Trainer
## Dataset

- DPO dataset
  - In-house DPO dataset (built using AI-Hub data)
  - Translations of English datasets such as OpenOrca DPO (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024, translated with our own model)
## Training Method
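The model was trained with DPO using the TRL Trainer, as noted above. The DPO objective itself can be illustrated with a minimal pure-Python sketch; the `beta=0.1` value and the log-probability inputs below are illustrative assumptions, not the actual training hyperparameters.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen or rejected
    response under the trained policy or the frozen reference model.
    beta controls how strongly the policy is kept near the reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # loss = -log(sigmoid(margin)) = softplus(-margin),
    # written in a numerically stable form
    return max(0.0, -margin) + math.log1p(math.exp(-abs(margin)))

# Loss drops below log(2) once the policy prefers the chosen response
# more strongly than the reference model does (margin > 0).
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

In practice TRL's `DPOTrainer` computes this loss batch-wise over token-level log-probabilities; the sketch only shows the per-pair objective being minimized.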
## Benchmark

### 0-shot (macro F1)

| kobest_boolq | kobest_copa | kobest_hellaswag | kobest_sentineg |
|---|---|---|---|
| 0.931613 | 0.740751 | 0.468602 | 0.488465 |
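The scores above are macro F1, which computes F1 per class and averages the per-class scores with equal weight. A minimal pure-Python sketch of the metric (the evaluation harness actually used is not specified in this card):

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged without class weighting."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but true label was t
            fn[t] += 1  # missed an instance of class t
    f1s = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# One error on a class-0 example: class-0 F1 = 2/3, class-1 F1 = 0.8
print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

Unlike micro or weighted F1, macro F1 penalizes poor performance on rare classes just as heavily as on frequent ones.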