jpacifico committed on
Commit d34bbd5
1 Parent(s): 92bbdc4

Update README.md

Files changed (1):
  1. README.md +53 -50
README.md CHANGED
@@ -21,66 +21,69 @@ Window context = 4k tokens
 
 ### OpenLLM Leaderboard
 
-TBD
+TBD.
 
 ### MT-Bench-French
 
-Chocolatine-14B-Instruct-DPO-v1.2 is outperforming its base model Phi-3-medium-4k-instruct on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM-judge.
+Chocolatine-14B-Instruct-DPO-v1.2 outperforms its previous versions and its base model Phi-3-medium-4k-instruct on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french), used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench) and GPT-4-Turbo as LLM-judge.
 
 ```
 ########## First turn ##########
-score
-model turn
-gpt-4o-mini 1 9.28750
-Chocolatine-14B-Instruct-DPO-v1.2 1 8.61250
-Phi-3-medium-4k-instruct 1 8.22500
-gpt-3.5-turbo 1 8.13750
-Chocolatine-3B-Instruct-DPO-Revised 1 7.98750
-Daredevil-8B 1 7.88750
-NeuralDaredevil-8B-abliterated 1 7.62500
-Phi-3-mini-4k-instruct 1 7.21250
-Meta-Llama-3.1-8B-Instruct 1 7.05000
-vigostral-7b-chat 1 6.78750
-Mistral-7B-Instruct-v0.3 1 6.75000
-gemma-2-2b-it 1 6.45000
-French-Alpaca-7B-Instruct_beta 1 5.68750
-vigogne-2-7b-chat 1 5.66250
+score
+model turn
+gpt-4o-mini 1 9.2875
+Chocolatine-14B-Instruct-4k-DPO 1 8.6375
+Chocolatine-14B-Instruct-DPO-v1.2 1 8.6125
+Phi-3.5-mini-instruct 1 8.5250
+Chocolatine-3B-Instruct-DPO-v1.2 1 8.3750
+Phi-3-medium-4k-instruct 1 8.2250
+gpt-3.5-turbo 1 8.1375
+Chocolatine-3B-Instruct-DPO-Revised 1 7.9875
+Daredevil-8B 1 7.8875
+Meta-Llama-3.1-8B-Instruct 1 7.0500
+vigostral-7b-chat 1 6.7875
+Mistral-7B-Instruct-v0.3 1 6.7500
+gemma-2-2b-it 1 6.4500
+French-Alpaca-7B-Instruct_beta 1 5.6875
+vigogne-2-7b-chat 1 5.6625
 
 ########## Second turn ##########
-score
-model turn
-gpt-4o-mini 2 8.912500
-Chocolatine-14B-Instruct-DPO-v1.2 2 8.337500
-Chocolatine-3B-Instruct-DPO-Revised 2 7.937500
-Phi-3-medium-4k-instruct 2 7.750000
-gpt-3.5-turbo 2 7.679167
-NeuralDaredevil-8B-abliterated 2 7.125000
-Daredevil-8B 2 7.087500
-Meta-Llama-3.1-8B-Instruct 2 6.787500
-Mistral-7B-Instruct-v0.3 2 6.500000
-Phi-3-mini-4k-instruct 2 6.487500
-vigostral-7b-chat 2 6.162500
-gemma-2-2b-it 2 6.100000
-French-Alpaca-7B-Instruct_beta 2 5.487395
-vigogne-2-7b-chat 2 2.775000
+score
+model turn
+gpt-4o-mini 2 8.912500
+Chocolatine-14B-Instruct-DPO-v1.2 2 8.337500
+Chocolatine-3B-Instruct-DPO-Revised 2 7.937500
+Chocolatine-3B-Instruct-DPO-v1.2 2 7.862500
+Phi-3-medium-4k-instruct 2 7.750000
+Chocolatine-14B-Instruct-4k-DPO 2 7.737500
+gpt-3.5-turbo 2 7.679167
+Phi-3.5-mini-instruct 2 7.575000
+Daredevil-8B 2 7.087500
+Meta-Llama-3.1-8B-Instruct 2 6.787500
+Mistral-7B-Instruct-v0.3 2 6.500000
+vigostral-7b-chat 2 6.162500
+gemma-2-2b-it 2 6.100000
+French-Alpaca-7B-Instruct_beta 2 5.487395
+vigogne-2-7b-chat 2 2.775000
 
 ########## Average ##########
-score
-model
-gpt-4o-mini 9.100000
-Chocolatine-14B-Instruct-DPO-v1.2 8.475000
-Phi-3-medium-4k-instruct 7.987500
-Chocolatine-3B-Instruct-DPO-Revised 7.962500
-gpt-3.5-turbo 7.908333
-Daredevil-8B 7.487500
-NeuralDaredevil-8B-abliterated 7.375000
-Meta-Llama-3.1-8B-Instruct 6.918750
-Phi-3-mini-4k-instruct 6.850000
-Mistral-7B-Instruct-v0.3 6.625000
-vigostral-7b-chat 6.475000
-gemma-2-2b-it 6.275000
-French-Alpaca-7B-Instruct_beta 5.587866
-vigogne-2-7b-chat 4.218750
+score
+model
+gpt-4o-mini 9.100000
+Chocolatine-14B-Instruct-DPO-v1.2 8.475000
+Chocolatine-14B-Instruct-4k-DPO 8.187500
+Chocolatine-3B-Instruct-DPO-v1.2 8.118750
+Phi-3.5-mini-instruct 8.050000
+Phi-3-medium-4k-instruct 7.987500
+Chocolatine-3B-Instruct-DPO-Revised 7.962500
+gpt-3.5-turbo 7.908333
+Daredevil-8B 7.487500
+Meta-Llama-3.1-8B-Instruct 6.918750
+Mistral-7B-Instruct-v0.3 6.625000
+vigostral-7b-chat 6.475000
+gemma-2-2b-it 6.275000
+French-Alpaca-7B-Instruct_beta 5.587866
+vigogne-2-7b-chat 4.218750
 ```
 
 ### Usage
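The "Average" table in the MT-Bench output above is simply the per-model arithmetic mean of the first-turn and second-turn scores. A minimal sketch of that relationship, using three rows copied from the updated table in this diff (variable names are illustrative, not part of the benchmark tooling):

```python
# Per-turn MT-Bench-French scores, copied from the "+" side of the diff above.
first_turn = {
    "gpt-4o-mini": 9.2875,
    "Chocolatine-14B-Instruct-DPO-v1.2": 8.6125,
    "Chocolatine-14B-Instruct-4k-DPO": 8.6375,
}
second_turn = {
    "gpt-4o-mini": 8.9125,
    "Chocolatine-14B-Instruct-DPO-v1.2": 8.3375,
    "Chocolatine-14B-Instruct-4k-DPO": 7.7375,
}

# The "Average" table is the mean of the two turns for each model.
average = {m: (first_turn[m] + second_turn[m]) / 2 for m in first_turn}

# Print in the same ranked layout as the MT-Bench output.
for model, score in sorted(average.items(), key=lambda kv: -kv[1]):
    print(f"{model:<40} {score:.6f}")
```

Running this reproduces the corresponding "Average" rows (9.100000, 8.475000, 8.187500), which is a quick sanity check on the reported numbers.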