---
license: mit
language:
- en
tags:
- ODIN
- RLHF
- PPO
---

## Model Details

This is an official implementation of the ODIN-ppo-L230-7B model, a chat assistant trained by fine-tuning LLaMA on the Open-Assistant dataset via PPO. "L230" indicates that the average output length on the LIMA test set is ~230 tokens. ODIN is the reward model used for the training.

## Model Description

- **Developed by:** [Lichang-Chen](https://huggingface.co/Lichang-Chen) and [Chen Zhu](https://scholar.google.com/citations?hl=zh-CN&user=m-om5O8AAAAJ)
- **Model type:** RLHF model.
- **Language(s) (NLP):** English
- **Finetuned from model:** [Vicuna-7b](https://huggingface.co/lmsys/vicuna-7b-v1.5)

### Model Sources

- **Repository:** [ODIN](https://github.com/Lichang-Chen/ODIN)
- **Paper:** [ODIN: Disentangled Reward Mitigates Hacking in RLHF](https://huggingface.co/papers/2402.07319)
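
### Usage Example

Below is a minimal inference sketch, not taken from the official repository. It assumes the checkpoint is published in the standard LLaMA/Vicuna format loadable via `transformers`, that the repo id `Lichang-Chen/ODIN-ppo-L230-7B` matches this card (an assumption), and that the model follows the Vicuna-v1.5 prompt template, since it was fine-tuned from Vicuna-7b-v1.5.

```python
# Minimal sketch: loading the model with Hugging Face transformers.
# The repo id and prompt template below are assumptions, not confirmed
# by this card or the official ODIN repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Lichang-Chen/ODIN-ppo-L230-7B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Vicuna-v1.5-style prompt template (assumed, since the base model is
# lmsys/vicuna-7b-v1.5).
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: What is reward hacking in RLHF? ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)

# Strip the prompt tokens and decode only the assistant's reply.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(reply)
```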