
Fairly pleased

#2 opened by Utochi

I'm decently happy with this model using the Q5_0 variant. I can't say that it's perfect, but it really tries its best. Tested on a 2700-token character card with complex instructions, and it does better than most. It doesn't quite match the precision of the QuartetAnemoi Q2_K variant, but what it lacks in instruction-following precision it makes up for in speed and creativity, IMO.
Using the Faraday GUI with default settings at 6k context.
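For anyone who wants to try a similar setup outside Faraday, here's a rough sketch using llama-cpp-python. The model filename, context size, and prompts are placeholders I'm assuming, not what Faraday actually does under the hood:

```python
from llama_cpp import Llama

# Placeholder path: point this at your Q5_0 GGUF file.
llm = Llama(
    model_path="models/model-Q5_0.gguf",
    n_ctx=6144,        # roughly the 6k context mentioned above
    n_gpu_layers=-1,   # offload all layers to GPU if one is available
)

# Minimal chat-style test; the character card would go in the system prompt.
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are the character described in the card."},
        {"role": "user", "content": "Hello!"},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```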

@Undi95 any chance you could make a 40B MoE with Llama-3?
