jpacifico committed on
Commit a9859a2
1 Parent(s): a6d04ca

Update README.md

Files changed (1)
  1. README.md +68 -49
README.md CHANGED
@@ -39,63 +39,82 @@ Chocolatine is the best-performing 3B model on the [OpenLLM Leaderboard](https:/

  ### MT-Bench-French

- Chocolatine-3B-Instruct-DPO-Revised is outperforming GPT-3.5-Turbo on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french) by Bofeng Huang,
  used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench)

  ```
  ########## First turn ##########
-                                         score
- model                               turn
- gpt-3.5-turbo                       1   8.1375
- Chocolatine-3B-Instruct-DPO-Revised 1   7.9875
- Daredevil-8B                        1   7.8875
- Daredevil-8B-abliterated            1   7.8375
- Chocolatine-3B-Instruct-DPO-v1.0    1   7.6875
- NeuralDaredevil-8B-abliterated      1   7.6250
- Phi-3-mini-4k-instruct              1   7.2125
- Meta-Llama-3-8B-Instruct            1   7.1625
- vigostral-7b-chat                   1   6.7875
- Mistral-7B-Instruct-v0.3            1   6.7500
- Mistral-7B-Instruct-v0.2            1   6.2875
- French-Alpaca-7B-Instruct_beta      1   5.6875
- vigogne-2-7b-chat                   1   5.6625
- vigogne-2-7b-instruct               1   5.1375

  ########## Second turn ##########
-                                         score
- model                               turn
- Chocolatine-3B-Instruct-DPO-Revised 2   7.937500
- gpt-3.5-turbo                       2   7.679167
- Chocolatine-3B-Instruct-DPO-v1.0    2   7.612500
- NeuralDaredevil-8B-abliterated      2   7.125000
- Daredevil-8B                        2   7.087500
- Daredevil-8B-abliterated            2   6.873418
- Meta-Llama-3-8B-Instruct            2   6.800000
- Mistral-7B-Instruct-v0.2            2   6.512500
- Mistral-7B-Instruct-v0.3            2   6.500000
- Phi-3-mini-4k-instruct              2   6.487500
- vigostral-7b-chat                   2   6.162500
- French-Alpaca-7B-Instruct_beta      2   5.487395
- vigogne-2-7b-chat                   2   2.775000
- vigogne-2-7b-instruct               2   2.240506

  ########## Average ##########
-                                        score
- model
- Chocolatine-3B-Instruct-DPO-Revised    7.962500
- gpt-3.5-turbo                          7.908333
- Chocolatine-3B-Instruct-DPO-v1.0       7.650000
- Daredevil-8B                           7.487500
- NeuralDaredevil-8B-abliterated         7.375000
- Daredevil-8B-abliterated               7.358491
- Meta-Llama-3-8B-Instruct               6.981250
- Phi-3-mini-4k-instruct                 6.850000
- Mistral-7B-Instruct-v0.3               6.625000
- vigostral-7b-chat                      6.475000
- Mistral-7B-Instruct-v0.2               6.400000
- French-Alpaca-7B-Instruct_beta         5.587866
- vigogne-2-7b-chat                      4.218750
- vigogne-2-7b-instruct                  3.698113
  ```

  ### Usage
 

  ### MT-Bench-French

+ Chocolatine-3B-Instruct-DPO-Revised outperforms GPT-3.5-Turbo on [MT-Bench-French](https://huggingface.co/datasets/bofenghuang/mt-bench-french),
  used with [multilingual-mt-bench](https://github.com/Peter-Devine/multilingual_mt_bench)
+ Notably, this latest version of the Chocolatine-3B model approaches the performance of Phi-3-Medium (14B) in French.

  ```
  ########## First turn ##########
+                                         score
+ model                               turn
+ gpt-4o-mini                         1   9.28750
+ Chocolatine-14B-Instruct-4k-DPO     1   8.63750
+ Chocolatine-14B-Instruct-DPO-v1.2   1   8.61250
+ Phi-3-medium-4k-instruct            1   8.22500
+ gpt-3.5-turbo                       1   8.13750
+ Chocolatine-3B-Instruct-DPO-Revised 1   7.98750
+ Daredevil-8B                        1   7.88750
+ Daredevil-8B-abliterated            1   7.83750
+ Chocolatine-3B-Instruct-DPO-v1.0    1   7.68750
+ NeuralDaredevil-8B-abliterated      1   7.62500
+ Phi-3-mini-4k-instruct              1   7.21250
+ Meta-Llama-3-8B-Instruct            1   7.16250
+ Meta-Llama-3.1-8B-Instruct          1   7.05000
+ vigostral-7b-chat                   1   6.78750
+ Mistral-7B-Instruct-v0.3            1   6.75000
+ gemma-2-2b-it                       1   6.45000
+ Mistral-7B-Instruct-v0.2            1   6.28750
+ French-Alpaca-7B-Instruct_beta      1   5.68750
+ vigogne-2-7b-chat                   1   5.66250
+ vigogne-2-7b-instruct               1   5.13750

  ########## Second turn ##########
+                                         score
+ model                               turn
+ gpt-4o-mini                         2   8.912500
+ Chocolatine-14B-Instruct-DPO-v1.2   2   8.337500
+ Chocolatine-3B-Instruct-DPO-Revised 2   7.937500
+ Phi-3-medium-4k-instruct            2   7.750000
+ Chocolatine-14B-Instruct-4k-DPO     2   7.737500
+ gpt-3.5-turbo                       2   7.679167
+ Chocolatine-3B-Instruct-DPO-v1.0    2   7.612500
+ NeuralDaredevil-8B-abliterated      2   7.125000
+ Daredevil-8B                        2   7.087500
+ Daredevil-8B-abliterated            2   6.873418
+ Meta-Llama-3-8B-Instruct            2   6.800000
+ Meta-Llama-3.1-8B-Instruct          2   6.787500
+ Mistral-7B-Instruct-v0.2            2   6.512500
+ Mistral-7B-Instruct-v0.3            2   6.500000
+ Phi-3-mini-4k-instruct              2   6.487500
+ vigostral-7b-chat                   2   6.162500
+ gemma-2-2b-it                       2   6.100000
+ French-Alpaca-7B-Instruct_beta      2   5.487395
+ vigogne-2-7b-chat                   2   2.775000
+ vigogne-2-7b-instruct               2   2.240506

  ########## Average ##########
+                                        score
+ model
+ gpt-4o-mini                             9.100000
+ Chocolatine-14B-Instruct-DPO-v1.2       8.475000
+ Chocolatine-14B-Instruct-4k-DPO         8.187500
+ Phi-3-medium-4k-instruct                7.987500
+ Chocolatine-3B-Instruct-DPO-Revised     7.962500
+ gpt-3.5-turbo                           7.908333
+ Chocolatine-3B-Instruct-DPO-v1.0        7.650000
+ Daredevil-8B                            7.487500
+ NeuralDaredevil-8B-abliterated          7.375000
+ Daredevil-8B-abliterated                7.358491
+ Meta-Llama-3-8B-Instruct                6.981250
+ Meta-Llama-3.1-8B-Instruct              6.918750
+ Phi-3-mini-4k-instruct                  6.850000
+ Mistral-7B-Instruct-v0.3                6.625000
+ vigostral-7b-chat                       6.475000
+ Mistral-7B-Instruct-v0.2                6.400000
+ gemma-2-2b-it                           6.275000
+ French-Alpaca-7B-Instruct_beta          5.587866
+ vigogne-2-7b-chat                       4.218750
+ vigogne-2-7b-instruct                   3.698113
  ```

  ### Usage
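
The three blocks above are the per-turn and averaged judge scores as printed by the evaluation harness. For models with complete judgments on both turns, the Average column is the unweighted mean of the two turn scores. Below is a minimal sanity-check sketch in plain Python, with scores copied from the tables above; the equal-turn weighting is an assumption inferred from the numbers rather than stated by the benchmark, and it does not hold where a turn has missing judgments (e.g. Daredevil-8B-abliterated).

```
# Sanity check: for models judged on every question in both turns,
# the "Average" score equals the unweighted mean of the two turn scores.
# Values are copied from the MT-Bench-French tables in this README.
turn_scores = {
    "gpt-4o-mini": (9.28750, 8.912500),
    "Phi-3-medium-4k-instruct": (8.22500, 7.750000),
    "Chocolatine-3B-Instruct-DPO-Revised": (7.98750, 7.937500),
}

for model, (turn1, turn2) in turn_scores.items():
    print(f"{model:<36} {(turn1 + turn2) / 2:.6f}")

# Output, matching the "Average" table:
# gpt-4o-mini                          9.100000
# Phi-3-medium-4k-instruct             7.987500
# Chocolatine-3B-Instruct-DPO-Revised  7.962500
```

Scores with denominators other than the question count (such as 6.873418 or 2.240506) suggest the harness averages over individual judgments, so turns with missing judgments shift the overall mean away from the simple two-turn average.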