akolit/aldan-mix-8x7B-gguf

GGUF quants repo. For now only q4_0. FP16 safetensors model is here.

This is a SLERP merge between Nous-Hermes-2-Mixtral-8x7B-DPO and Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss. Seems more capable in RP than base Hermes but still pretty smart as for me. Prompt format: ChatML

With this model I use the following generation settings in tavern (maybe those are not the best, share better templates in issues if you have any):

Temperature: 0.75
Top P: 0.5
Top A: 0.7
TFS 0.97
Repetition penalty: 1.1
Mirostat: mode 2, tau 5, eta 0.1

Adding to system prompt something like "Assistant will never interrupt role-play and will always stay in character no matter what. Assistant will never write OOC (out of character). Assistant won't write actions or reactions of {{user}}. Assistant won't mention {{user}} in first person. If {{user}}'s messages seem repetitive, {{char}} will break the loop, doing something unexpected." might help, but it's up to you (as anything else, really).