---
license: llama3
---

[Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) finetuned on [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) with [ORPO](https://arxiv.org/abs/2403.07691).

Max length was reduced to 1024 tokens. LoRA (r=16) and 4-bit quantization were used to increase memory efficiency.

| **Benchmark**         | **Llama 3 8B** | **Llama 3 8B Inst** | **Llama 3 8B ORPO V1** | **Llama 3 8B ORPO V2 (WIP)** |
|-----------------------|:--------------:|:-------------------:|:----------------------:|:----------------------------:|
| **MMLU**              | 62.12          | 63.92               | 61.87                  |                              |
| **BoolQ**             | 81.04          | 83.21               | 82.42                  |                              |
| **Winogrande**        | 73.24          | 72.06               | 74.43                  |                              |
| **ARC-Challenge**     | 53.24          | 56.91               | 52.90                  |                              |
| **TriviaQA**          | 63.33          | 51.09               | 63.93                  |                              |
| **GSM-8K (flexible)** | 50.27          | 75.13               | 52.16                  |                              |
| **SQuAD V2 (F1)**     | 32.48          | 29.68               | 33.68                  |                              |
| **LogiQA**            | 29.23          | 32.87               | 30.26                  |                              |

All scores were obtained with [lm-evaluation-harness v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness).
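
## Training sketch

The setup described above (ORPO on the preference mix, LoRA r=16, 4-bit loading, max length 1024) could be reproduced roughly as follows with TRL's `ORPOTrainer`. This is a minimal sketch, not the exact training script: only the base model, dataset, max length, LoRA rank, and 4-bit loading come from this card; every other hyperparameter (LoRA alpha, target modules, batch size, learning rate, epochs) is an illustrative assumption.

```python
# Minimal ORPO fine-tuning sketch. Values marked "assumption" are NOT from
# this card; only max_length=1024, LoRA r=16, and 4-bit loading are stated.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

model_id = "meta-llama/Meta-Llama-3-8B"

# Load the base model in 4-bit to reduce memory usage, as stated in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # assumption
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama 3 base ships no pad token
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)

# LoRA adapter with rank 16, as stated; alpha and target modules are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Preference data with chosen/rejected pairs; depending on the TRL version,
# conversational columns may need to be rendered with a chat template first.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

args = ORPOConfig(
    output_dir="./llama3-8b-orpo",
    max_length=1024,                # reduced max length, as stated in the card
    max_prompt_length=512,          # assumption
    per_device_train_batch_size=2,  # assumption
    gradient_accumulation_steps=4,  # assumption
    learning_rate=8e-6,             # assumption
    num_train_epochs=1,             # assumption
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```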
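
## Evaluation sketch

The scores above can in principle be reproduced through the harness's Python API. The snippet below is a sketch, not the exact evaluation command: the task names are assumptions and should be checked against the harness's task registry (`lm-eval --tasks list`).

```python
# Sketch of scoring the finetuned model with lm-evaluation-harness v0.4.2.
# Task names are assumptions; verify them with `lm-eval --tasks list`.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Meta-Llama-3-8B",  # swap in the finetuned checkpoint
    tasks=["mmlu", "boolq", "winogrande", "arc_challenge", "triviaqa", "gsm8k"],
    batch_size=8,  # assumption
)
print(results["results"])
```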