---
license: cc-by-nc-4.0
base_model: Edentns/DataVortexS-10.7B-dpo-v1.11
tags:
- trl
- dpo
- generated_from_trainer
model-index:
- name: nhn_dpo_v3_DataVortexS-10.7B-dpo-v1.11_DPO
  results: []
---

# ENERGY-DRINK-LOVE/DataVortexS_dpov3

### Our Team
* Youjin Chung
* Jingyeom Kim

## Model

### Base Model
* [Edentns/DataVortexS-10.7B-dpo-v1.11](https://huggingface.co/Edentns/DataVortexS-10.7B-dpo-v1.11)

### Hardware and Software
* Hardware: 8 × A100 GPUs for training
* Software: DeepSpeed library & Hugging Face TRL Trainer

### Dataset
* DPO dataset
  * In-house DPO dataset (built from AI-Hub datasets)
  * English DPO datasets such as OpenOrca translated with our own model (ENERGY-DRINK-LOVE/translate_share_gpt_dedup_llama_SFT_1024)

### Training Method
* [DPO](https://arxiv.org/abs/2305.18290)

## Benchmark

**[Ko LM Eval Harness](https://github.com/Beomi/ko-lm-evaluation-harness)**

**[Ko-LLM-Leaderboard](https://www.aihub.or.kr/leaderboard/view.do?currMenu=500&topMenu=102)**
* Ranked 7th as of 2024-03-16
* ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6551c0e37bbfce18781a8748/S4cpra6iTlzCdN7PP6A3o.png)

| Average | Ko-ARC | Ko-HellaSwag | Ko-MMLU | Ko-TruthfulQA | Ko-CommonGen V2 |
| ------: | -----: | -----------: | ------: | ------------: | --------------: |
|   60.18 |  56.23 |        69.15 |   52.76 |         67.87 |           54.90 |
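For readers unfamiliar with the DPO objective linked above, the sketch below shows the per-example loss from the DPO paper: the negative log-sigmoid of the scaled difference between the policy and reference log-probability ratios of the chosen and rejected responses. The function name and example values are illustrative only, not taken from this model's training code.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss (illustrative sketch, not the actual training code).

    beta scales how strongly the policy is pushed away from the reference model.
    """
    # Log-probability ratios of policy vs. frozen reference model
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # -log sigmoid(beta * (chosen_ratio - rejected_ratio))
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy equals the reference model the loss is log 2; it falls as the policy assigns relatively more probability to the chosen response than to the rejected one.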