---
license: cc-by-2.0
---
Finetune of miqu-70b-sf, a dequant of miqudev's leak of Mistral-70B (allegedly an early Mistral Medium). My diffs are available under CC-0 in the Senku-70B repo; this "Full" repo is the merge with the leaked model, so you can use the other repository to save bandwidth.

EQ-Bench: 84.89 
GSM8k: 77.18 (71.04 when using ChatML)
Hellaswag: 87.67 

Edit: Upon further testing, an EQ-Bench score of 85.09 was achieved using ChatML instead of Mistral's prompt format.

I recommend using the ChatML format instead; I will run more benchmarks. It also fixes the bug where the Miqu dequant fails to emit a stop token.
```
<|im_start|>system
Provide some context and/or instructions to the model.
<|im_end|>
<|im_start|>user
The user's message goes here
<|im_end|>
<|im_start|>assistant <|im_end|>
```
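
For reference, here is a minimal sketch of generating with this ChatML format via Hugging Face transformers. The repo id is a placeholder for this repository, and it assumes `<|im_start|>` / `<|im_end|>` are present in the tokenizer vocabulary; adjust as needed for your setup.

```python
# Minimal sketch: build a ChatML prompt by hand and generate with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Senku-70B-Full"  # placeholder; substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Assemble the prompt following the template shown above.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Summarize the benefits of unit testing in two sentences.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Stop generation at <|im_end|> so the reply terminates cleanly
# (this assumes <|im_end|> maps to a single token id in the tokenizer).
im_end_id = tokenizer.convert_tokens_to_ids("<|im_end|>")
outputs = model.generate(**inputs, max_new_tokens=256, eos_token_id=im_end_id)

# Print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```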

Credit to https://twitter.com/hu_yifei for providing the GSM8k & Hellaswag scores. This is the first open-weight model to dethrone GPT-4 on EQ-Bench.