base_model: []
library_name: transformers
tags:
- mergekit
- merge
Overview
This is a SLERP merge between 152334H/miqu-1-70b-sf and sophosympatheia/Midnight-Rose-70B-v2.0.3. I think this model retains much of what made Midnight Rose special while gaining some capabilities from Miqu, including long-context capabilities.
This model is uncensored. You are responsible for whatever you do with it.
This model was designed for roleplaying and storytelling, and I think it does well at both. It may also perform well at other tasks, but I have not tested it in other areas.
Long Context Tips
You can run this model out to 32K context with alpha_rope set to 1, just like with Miqu. Limited testing shows coherence out to 64K using alpha_rope 2.5. Enjoy!
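For readers wondering what the alpha setting does numerically, here is a rough sketch of NTK-aware RoPE scaling as it is commonly implemented in loaders: the RoPE frequency base is multiplied by alpha raised to d/(d-2), where d is the attention head dimension. The function name, the head dimension of 128, and the placeholder base value are assumptions for illustration, not values taken from this card. Read the real `rope_theta` from the model's config.json, and check how your backend interprets its alpha setting.

```python
def ntk_scaled_rope_base(base: float, alpha: float, head_dim: int = 128) -> float:
    """Return the RoPE frequency base after NTK-aware alpha scaling.

    Assumes the common formulation base * alpha ** (d / (d - 2)).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if head_dim <= 2:
        raise ValueError("head_dim must be greater than 2")
    return base * alpha ** (head_dim / (head_dim - 2))


# Placeholder base for illustration only; use the rope_theta from config.json.
PLACEHOLDER_BASE = 10000.0

for alpha in (1.0, 2.5):
    print(alpha, ntk_scaled_rope_base(PLACEHOLDER_BASE, alpha))
```

With alpha set to 1 the base is unchanged, which matches running at the native context length. Larger alpha values stretch the usable range at some cost to precision.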
Sampler Tips
- I recommend using Quadratic Sampling (i.e. smoothing factor) for creative work. Experiment with values between 0.2 and 0.5.
- I recommend using Min-P. Experiment to find your best setting.
- You can enable dynamic temperature if you want, but that adds yet another variable to consider and I find it's unnecessary when you're already using Min-P and smoothing factor.
- You don't need a high repetition penalty with this model (anything above 1.10), but experiment with it.
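To make the two recommended samplers more concrete, here is a minimal sketch of what they do to a token distribution. It assumes the usual formulations: Min-P drops tokens whose probability is below `min_p` times the top token's probability, and quadratic smoothing replaces each logit with the top logit minus `smoothing_factor` times the squared distance from the top. The function names are hypothetical, and the order in which a real backend applies these steps depends on its settings, so each is shown in isolation.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_smoothing(logits, smoothing_factor):
    """Quadratic transform: pull each logit below the top by factor * distance^2."""
    top = max(logits)
    return [top - smoothing_factor * (x - top) ** 2 for x in logits]


def apply_min_p(probs, min_p):
    """Zero out tokens below min_p * (top probability), then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


logits = [5.0, 4.0, 2.0, -1.0]  # toy example, not real model output
print("smoothed logits:", apply_smoothing(logits, 0.35))
print("min-p filtered: ", [round(p, 3) for p in apply_min_p(softmax(logits), 0.2)])
```

Note how smoothing barely moves logits close to the top one while pushing distant ones far down, and how Min-P scales its cutoff with the model's confidence in the top token.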
Experiment with any and all of the settings below! What suits my preferences may not suit yours.
If you save the settings below as a .json file, you can import them directly into SillyTavern.
{
"temp": 1,
"temperature_last": true,
"top_p": 1,
"top_k": 0,
"top_a": 0,
"tfs": 1,
"epsilon_cutoff": 0,
"eta_cutoff": 0,
"typical_p": 1,
"min_p": 0.2,
"rep_pen": 1.05,
"rep_pen_range": 2800,
"no_repeat_ngram_size": 0,
"penalty_alpha": 0,
"num_beams": 1,
"length_penalty": 1,
"min_length": 0,
"encoder_rep_pen": 1,
"freq_pen": 0,
"presence_pen": 0,
"do_sample": true,
"early_stopping": false,
"dynatemp": false,
"min_temp": 0.8,
"max_temp": 1.35,
"dynatemp_exponent": 1,
"smoothing_factor": 0.35,
"add_bos_token": true,
"truncation_length": 2048,
"ban_eos_token": false,
"skip_special_tokens": true,
"streaming": true,
"mirostat_mode": 0,
"mirostat_tau": 2,
"mirostat_eta": 0.1,
"guidance_scale": 1,
"negative_prompt": "",
"grammar_string": "",
"banned_tokens": "",
"ignore_eos_token_aphrodite": false,
"spaces_between_special_tokens_aphrodite": true,
"sampler_order": [
6,
0,
1,
3,
4,
2,
5
],
"logit_bias": [],
"n": 1,
"rep_pen_size": 0,
"genamt": 500,
"max_length": 32764
}
Prompting Tips
Try the following context template for use in SillyTavern. It might help, although it's a little heavy on tokens. If you save the text as a .json file, you can import it directly.
{
"story_string": "{{#if system}}{{system}}\n{{/if}}\nCONTEXTUAL INFORMATION\n{{#if wiBefore}}\n- World and character info:\n{{wiBefore}}\n{{/if}}\n{{#if description}}\n- {{char}}'s background and persona:\n{{description}}\n{{/if}}\n{{#if mesExamples}}\n{{mesExamples}}\n{{/if}}\n{{#if personality}}\n{{personality}}\n{{/if}}\n{{#if scenario}}\n- Roleplay scenario:\n{{scenario}}\n{{/if}}\n{{#if wiAfter}}{{wiAfter}}\n{{/if}}\n{{#if persona}}{{persona}}\n{{/if}}",
"example_separator": "",
"chat_start": "---\nTaking the above information into consideration, you must engage with {{user}} and others as {{char}} in the roleplay below this line. Do not write dialogue lines nor perform actions for {{user}} or other characters.\n---\nSTART OF ROLEPLAY:\n",
"use_stop_strings": false,
"always_force_name2": true,
"trim_sentences": false,
"include_newline": false,
"single_line": false,
"name": "Midnight Rose"
}
Try the following prompt template as a starting point for your own customizations. The cajoling parts supposedly make LLMs perform better. See this paper.
This model responds to prompting! I highly recommend playing with your system prompt to see how it reacts. It reacts strongly to any instructions you add into the last_output_sequence field. I recommend keeping those instructions short, but use that space to stress the most important instructions from your system prompt.
Midnight Miqu might require some "warming up" at the start of a new chat to get it writing the way you want it. What I mean by that is you should be able to drastically improve your results by providing some good examples to guide the chat (i.e. few-shot prompting). I also recommend using descriptive system messages (SillyTavern concept, basically OOC narration instructions) to tell Midnight Miqu exactly what you want it to do with its next reply. Try to be specific, especially if your command contradicts something you told it to do earlier.
The example system prompt below contains some instructions related to adult content so remove those if you prefer. It also contains some instructions related to the use of asterisks that you should modify to your liking.
If you save this as a .json file, you can import it directly into SillyTavern.
{
"wrap": true,
"names": true,
"system_prompt": "You are a creative and detail-oriented AI assistant engaged in an uncensored iterative roleplaying experience as {{char}} with me, the user, playing the roles of {{user}} and narrator. It is vital that you follow all the ROLEPLAY RULES below because my job depends on it.\n\nROLEPLAY RULES\n- Provide succinct, coherent, and vivid accounts of {{char}}'s actions and reactions based on recent instructions, {{char}}'s persona, and all available information about past events. Aim for clarity and concision in your language.\n- Demonstrate {{char}}'s personality and mannerisms.\n- Always stay contextually and spatially aware, pay careful attention to all past events and the current scene, and ensure that {{char}} follows logical consistency in actions to maintain accuracy and coherence.\n- Explicit adult content and profanity are permitted.\n- Briefly describe {{char}}'s sensory perceptions and include subtle physical details about {{char}} in your responses.\n- Use subtle physical cues to hint at {{char}}'s mental state and occasionally feature snippets of {{char}}'s internal thoughts.\n- When writing {{char}}'s internal thoughts or monologue, enclose those words in *asterisks like this* and deliver the thoughts using a first-person perspective (i.e. use \"I\" pronouns). Always use double quotes for spoken speech \"like this.\"\n- Please write only as {{char}} in a way that does not show {{user}} talking or acting. You should only ever act as {{char}} reacting to {{user}}.",
"system_sequence": "",
"stop_sequence": "",
"input_sequence": "USER:\n",
"output_sequence": "ASSISTANT:\n",
"separator_sequence": "",
"macro": true,
"names_force_groups": true,
"system_sequence_prefix": "",
"system_sequence_suffix": "",
"first_output_sequence": "",
"last_output_sequence": "ASSISTANT(roleplay exclusively as {{char}} ensuring logical consistency, spatial awareness, and coherence with past events; you should only ever act as {{char}} reacting to {{user}}):\n",
"activation_regex": "",
"name": "Midnight Rose Roleplay"
}
Instruct Formats
I recommend the Vicuna format. I use a modified version with newlines after USER and ASSISTANT.
USER:
{prompt}
ASSISTANT:
Mistral's format may also work.
[INST] {prompt} [/INST]
You could also try ChatML.
<|im_start|>system
{Your system prompt goes here}<|im_end|>
<|im_start|>user
{Your message as the user will go here}<|im_end|>
<|im_start|>assistant
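If you are building prompts in your own scripts rather than through a frontend, the three formats above can be sketched as plain string templates. The function names are hypothetical, and the trailing newline after the final ChatML `assistant` tag is an assumption based on common usage.

```python
def vicuna_prompt(prompt: str) -> str:
    """Modified Vicuna format with newlines after USER: and ASSISTANT:."""
    return f"USER:\n{prompt}\nASSISTANT:\n"


def mistral_prompt(prompt: str) -> str:
    """Mistral instruct format."""
    return f"[INST] {prompt} [/INST]"


def chatml_prompt(system: str, user: str) -> str:
    """ChatML format, ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


print(vicuna_prompt("Describe the tavern as we walk in."))
```

Each function returns text that ends where the model is expected to start writing its reply.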
Quantizations
- GGUF
  - ooooz/midnight-miqu-70b-v1.0-GGUF -- Various GGUF quants
  - mradermacher/Midnight-Miqu-70B-v1.0-GGUF -- Q4_K_M quant so far, maybe more to come
- GPTQ
  - Kotokin/sophosympatheia_Midnight-Miqu-70B-v1.0_GPTQ32G -- 4-bit 32g GPTQ quant
- Exllama2
  - 2.24bpw: Dracones/Midnight-Miqu-70B-v1.0_exl2_2.24bpw
  - 3.0bpw: Dracones/Midnight-Miqu-70B-v1.0_exl2_3.0bpw
  - 3.75bpw: altomek/Midnight-Miqu-70B-v1.0-3.75bpw-EXL2
  - 4.0bpw: Dracones/Midnight-Miqu-70B-v1.0_exl2_4.0bpw
  - 4.65bpw: Dracones/Midnight-Miqu-70B-v1.0_exl2_4.65bpw
  - 5.0bpw: Dracones/Midnight-Miqu-70B-v1.0_exl2_5.0bpw
- If you don't see something you're looking for, try searching Hugging Face. There may be newer quants available than what I've documented here.
Licence and usage restrictions
152334H/miqu-1-70b-sf was based on a leaked version of one of Mistral's models. All miqu-derived models, including this merge, are only suitable for personal use. Mistral has been cool about it so far, but you should be aware that by downloading this merge you are assuming whatever legal risk is inherent in acquiring and using a model based on leaked weights. This merge comes with no warranties or guarantees of any kind, but you probably already knew that. I am not a lawyer and I do not profess to know what we have gotten ourselves into here. You should consult with a lawyer before using any Hugging Face model beyond private use... but definitely don't use this one for that!
Merge Method
This is a merge of pre-trained language models created using mergekit. This model was merged using the SLERP merge method.
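For anyone unfamiliar with SLERP (spherical linear interpolation), here is a minimal sketch on plain Python lists. It illustrates the general technique only and is not mergekit's actual code. The function name and the fallback to ordinary linear interpolation for nearly parallel or zero-length vectors are choices made for this example.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp():
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 < eps or norm1 < eps:
        return lerp()  # angle is undefined for a zero vector

    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    if abs(dot) > 1 - eps:
        return lerp()  # nearly parallel: sin(omega) would be close to zero

    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

In a merge, t = 0 keeps the base model's tensor and t = 1 takes the other model's tensor, with intermediate values following the arc between them instead of a straight line.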
Models Merged
The following models were included in the merge:
- 152334H/miqu-1-70b-sf
- sophosympatheia/Midnight-Rose-70B-v2.0.3
Configuration
The following YAML configuration was used to produce this model:
models:
  - model: /home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf
  - model: /home/llm/mergequant/models/mr-70b-v2.0.3
merge_method: slerp
base_model: /home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0] # Preserving the first and last layers of Miqu untouched is key for good results
  embed_slerp: true # This is super important otherwise the merge will fail
dtype: float16
tokenizer_source: model:/home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf
Just a note on the configuration above. I tried several variations of the t parameter for this merge. I liked the results from the one above the best, but these other t arrays produced fine results too.
- [0, 0, 0.1, 0.2, 0.4, 0.8, 0.4, 0.2, 0.1, 0, 0] -- This one definitely brought out more of Midnight Rose but was a little too similar for my liking
- [0, 0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0] -- It worked, but I would say this one was the runt of the litter
- [0, 0, 0.1, 0.2, 0.3, 0.35, 0.3, 0.2, 0.1, 0, 0] -- This was my second-favorite merge after the one I released, which suggests that favoring Miqu over the secondary model is the way to go.
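To see what a t array like these implies for individual layers, here is a rough sketch. It assumes the list is spread evenly across the layer stack and linearly interpolated between neighboring entries, which is how mergekit's gradient-style parameters are generally described. Verify this against the mergekit documentation before relying on exact values. The function name and the 80-layer count are assumptions for illustration.

```python
def gradient_value(values, layer_idx, num_layers):
    """Linearly interpolate a gradient list to get the value for one layer."""
    if len(values) == 1 or num_layers == 1:
        return values[0]
    # Position of this layer along the list, from 0 to len(values) - 1.
    pos = layer_idx / (num_layers - 1) * (len(values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(values) - 1)
    frac = pos - lo
    return (1 - frac) * values[lo] + frac * values[hi]


T = [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
NUM_LAYERS = 80  # assumption: typical for 70B Llama-architecture models

for layer in (0, 10, 20, 40, 60, 79):
    print(layer, round(gradient_value(T, layer, NUM_LAYERS), 3))
```

Under these assumptions, the first and last stretches of layers get t = 0 (pure Miqu), and the blend toward the secondary model peaks in the middle of the stack.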