From your work, I found a new way to do model ensembling (1)
#14 opened 7 months ago by xxx1

Adding Evaluation Results
#12 opened 8 months ago by leaderboard-pr-bot

The function_calling and translation abilities are weaker than Mixtral 8x7B (1)
#11 opened 10 months ago by bingw5

Add mixture of experts tag
#10 opened 10 months ago by davanstrien

How does this model work? Can you share your idea or training process? Thanks
#9 opened 10 months ago by zachzhou

Add merge tag (2)
#8 opened 10 months ago by osanseviero

VRAM (2)
#7 opened 10 months ago by DKRacingFan

Source code and paper? (8)
#6 opened 10 months ago by josephykwang

How does the MoE work? (3)
#5 opened 10 months ago by PacmanIncarnate

Quants, please? (6)
#4 opened 10 months ago by Yhyu13

What is your config? (1)
#3 opened 10 months ago by Weyaxi

Should not be called Mixtral; the models made into the MoE are Yi-based (9)
#2 opened 10 months ago by teknium

Add merge tags
#1 opened 10 months ago by JusticeDike