Z3R6X committed on
Commit 1b68599
Parent: 5f38df4

Update README.md

Files changed (1): README.md (+9, -5)
README.md CHANGED
@@ -4,12 +4,16 @@ license: llama3
 [Llama 3 8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) finetuned on [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k) with [ORPO](https://arxiv.org/abs/2403.07691).\
 Max length was reduced to 1024 tokens. LoRA (r=16) and 4-bit quantization were used to increase memory efficiency.
 
-| **Benchmark** | **LLaMa 3 8B** | **LLaMa 3 8B Inst** | **LLaMa 3 8B ORPO V1** | **LLaMa 3 8B ORPO V2** |
+| **Benchmark** | **LLaMa 3 8B** | **LLaMa 3 8B Inst** | **LLaMa 3 8B ORPO V1** | **LLaMa 3 8B ORPO V2 (WIP)** |
 |--------------------|:-----------------:|:----------------:|:---------------:|----------------|
-| **MMLU** | 62.12 | - | 61.87 | - |
-| **BoolQ** | 81.04 | - | 82.42 | - |
-| **Winogrande** | 73.24 | - | 74.43 | - |
-| **ARC ch** | 53.24 | - | 52.90 | - |
+| **MMLU** | 62.12 | 63.92 | 61.87 | |
+| **BoolQ** | 81.04 | 83.21 | 82.42 | |
+| **Winogrande** | 73.24 | 72.06 | 74.43 | |
+| **ARC-Challenge** | 53.24 | 56.91 | 52.90 | |
+| **TriviaQA** | 63.33 | 51.09 | 63.93 | |
+| **GSM-8K (flexible)** | 50.27 | 75.13 | 52.16 | |
+| **SQuAD V2 (f1)** | 32.48 | 29.68 | 33.68 | |
+| **LogiQA** | 29.23 | 32.87 | 30.26 | |
 All scores obtained with [lm-evaluation-harness v0.4.2](https://github.com/EleutherAI/lm-evaluation-harness)
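
For readers who want to reproduce a setup like the one described in the README (ORPO on mlabonne/orpo-dpo-mix-40k with LoRA r=16, 4-bit quantization, and a 1024-token max length), here is a minimal sketch using TRL's `ORPOTrainer` from the ORPO-release era (roughly trl 0.8). The base model id, dataset preprocessing, and every hyperparameter other than `max_length=1024` and `r=16` are assumptions, not values taken from this commit.

```python
# Minimal ORPO finetuning sketch; assumptions are marked in comments, this is not the exact recipe from this repo.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import ORPOConfig, ORPOTrainer

base_model = "meta-llama/Meta-Llama-3-8B"

# 4-bit quantization (NF4) to reduce memory, as the README describes.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# The dataset stores chosen/rejected as chat turns; real runs usually apply the
# chat template to build prompt/chosen/rejected strings before training.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")

# LoRA r=16 as stated in the README; alpha, dropout, and target modules are assumptions.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# max_length=1024 as stated in the README; all other hyperparameters are placeholders.
orpo_config = ORPOConfig(
    output_dir="llama3-8b-orpo",
    max_length=1024,
    max_prompt_length=512,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=8e-6,
    num_train_epochs=1,
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=orpo_config,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```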
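
The scores above reference lm-evaluation-harness v0.4.2. Below is a hedged sketch of how such scores are typically produced with that harness's Python API; the model id is a placeholder (this commit does not name the evaluated repo), and exact task names should be checked against `lm_eval --tasks list` for the installed version.

```python
# Sketch of benchmarking with lm-evaluation-harness v0.4.x (pip install lm-eval==0.4.2).
# "your-org/llama3-8b-orpo" is a placeholder, not the actual repo id from this commit.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/llama3-8b-orpo,dtype=bfloat16",
    tasks=["mmlu", "boolq", "winogrande", "arc_challenge", "triviaqa", "gsm8k", "logiqa"],
    batch_size=8,
)

# Aggregate scores per task; metric keys (acc, acc_norm, exact_match, f1, ...) vary by task.
for task, metrics in results["results"].items():
    print(task, metrics)
```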