---
base_model: []
library_name: transformers
tags:
- mergekit
- merge

---

<div style="width: auto; margin-left: auto; margin-right: auto">
<img src="https://i.imgur.com/Tn9MBg6.png" alt="MidnightMiqu" style="width: 100%; min-width: 400px; display: block; margin: auto;">
</div>

### Overview

This is a SLERP merge between [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf) and [sophosympatheia/Midnight-Rose-70B-v2.0.3](https://huggingface.co/sophosympatheia/Midnight-Rose-70B-v2.0.3).
I think this model retains much of what made Midnight Rose special while gaining some capabilities from Miqu, including long-context capabilities.

This model is uncensored. *You are responsible for whatever you do with it.*

This model was designed for roleplaying and storytelling and I think it does well at both. It may also perform well at other tasks but I have not tested its performance in other areas.

### Long Context Tips

You can run this model out to 32K context with alpha_rope set to 1, just like with Miqu. Limited testing shows coherence out to 64K using alpha_rope 2.5. Enjoy!
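
For intuition about what an alpha setting does, here is a minimal sketch. It assumes the NTK-aware scaling rule that ExLlama-style loaders are commonly described as using, where the rotary base is multiplied by `alpha ** (head_dim / (head_dim - 2))`. The function name, the base value of 1,000,000, and the head dimension of 128 are illustrative assumptions, not values taken from this model card; check the model's `config.json` and your loader's documentation before relying on them.

```python
def scaled_rope_base(base: float, alpha: float, head_dim: int) -> float:
    """Return the rotary embedding base after NTK-aware alpha scaling.

    Assumed rule: base * alpha ** (head_dim / (head_dim - 2)).
    An alpha of 1 leaves the base unchanged.
    """
    return base * alpha ** (head_dim / (head_dim - 2))


# Hypothetical values for illustration only.
ASSUMED_BASE = 1_000_000.0
ASSUMED_HEAD_DIM = 128

for alpha in (1.0, 2.5):
    new_base = scaled_rope_base(ASSUMED_BASE, alpha, ASSUMED_HEAD_DIM)
    print(f"alpha={alpha}: rotary base {new_base:,.0f}")
```

A larger base stretches the rotary wavelengths, which is why raising alpha can keep output coherent past the native context length.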

### Sampler Tips

* I recommend using Quadratic Sampling (i.e. smoothing factor) for creative work. Experiment with values between 0.2 and 0.5.
* I recommend using Min-P. Experiment to find your best setting.
* You can enable dynamic temperature if you want, but that adds yet another variable to consider and I find it's unnecessary when you're already using Min-P and smoothing factor.
* You don't need to use a high repetition penalty with this model, such as going above 1.10, but experiment with it.
  
Experiment with any and all of the settings below! What suits my preferences may not suit yours.

If you save the settings below as a .json file, you can import them directly into SillyTavern.
```
{
    "temp": 1,
    "temperature_last": true,
    "top_p": 1,
    "top_k": 0,
    "top_a": 0,
    "tfs": 1,
    "epsilon_cutoff": 0,
    "eta_cutoff": 0,
    "typical_p": 1,
    "min_p": 0.2,
    "rep_pen": 1.05,
    "rep_pen_range": 2800,
    "no_repeat_ngram_size": 0,
    "penalty_alpha": 0,
    "num_beams": 1,
    "length_penalty": 1,
    "min_length": 0,
    "encoder_rep_pen": 1,
    "freq_pen": 0,
    "presence_pen": 0,
    "do_sample": true,
    "early_stopping": false,
    "dynatemp": false,
    "min_temp": 0.8,
    "max_temp": 1.35,
    "dynatemp_exponent": 1,
    "smoothing_factor": 0.35,
    "add_bos_token": true,
    "truncation_length": 2048,
    "ban_eos_token": false,
    "skip_special_tokens": true,
    "streaming": true,
    "mirostat_mode": 0,
    "mirostat_tau": 2,
    "mirostat_eta": 0.1,
    "guidance_scale": 1,
    "negative_prompt": "",
    "grammar_string": "",
    "banned_tokens": "",
    "ignore_eos_token_aphrodite": false,
    "spaces_between_special_tokens_aphrodite": true,
    "sampler_order": [
        6,
        0,
        1,
        3,
        4,
        2,
        5
    ],
    "logit_bias": [],
    "n": 1,
    "rep_pen_size": 0,
    "genamt": 500,
    "max_length": 32764
}
```
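
To make the two samplers recommended in the tips above more concrete, here is a rough sketch of what Min-P and quadratic sampling (smoothing factor) do to a token distribution, as they are commonly described. The function names are hypothetical, and real backends may differ in details such as the order in which samplers are applied, so treat this as an illustration rather than any backend's actual implementation.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_smoothing(logits, smoothing_factor):
    """Quadratic sampling: penalize each logit by its squared distance
    from the top logit, scaled by the smoothing factor."""
    peak = max(logits)
    return [peak - smoothing_factor * (x - peak) ** 2 for x in logits]


def apply_min_p(probs, min_p):
    """Min-P: drop tokens whose probability is below min_p times the
    probability of the most likely token, then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


# Toy logits for four candidate tokens.
logits = [4.0, 3.0, 1.5, -2.0]
smoothed = apply_smoothing(logits, smoothing_factor=0.35)
probs = apply_min_p(softmax(smoothed), min_p=0.2)
print([round(p, 3) for p in probs])
```

Because the Min-P cutoff scales with the top token's probability, it prunes aggressively when the model is confident and leaves more candidates when it is not.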

### Prompting Tips

Try the following context template for use in SillyTavern. It might help, although it's a little heavy on tokens. If you save the text as a .json file, you can import it directly.

```
{
    "story_string": "{{#if system}}{{system}}\n{{/if}}\nCONTEXTUAL INFORMATION\n{{#if wiBefore}}\n- World and character info:\n{{wiBefore}}\n{{/if}}\n{{#if description}}\n- {{char}}'s background and persona:\n{{description}}\n{{/if}}\n{{#if mesExamples}}\n{{mesExamples}}\n{{/if}}\n{{#if personality}}\n{{personality}}\n{{/if}}\n{{#if scenario}}\n- Roleplay scenario:\n{{scenario}}\n{{/if}}\n{{#if wiAfter}}{{wiAfter}}\n{{/if}}\n{{#if persona}}{{persona}}\n{{/if}}",
    "example_separator": "",
    "chat_start": "---\nTaking the above information into consideration, you must engage with {{user}} and others as {{char}} in the roleplay below this line. Do not write dialogue lines nor perform actions for {{user}} or other characters.\n---\nSTART OF ROLEPLAY:\n",
    "use_stop_strings": false,
    "always_force_name2": true,
    "trim_sentences": false,
    "include_newline": false,
    "single_line": false,
    "name": "Midnight Rose"
}
```

Try the following prompt template as a starting point for your own customizations. The cajoling parts supposedly make LLMs perform better. See [this paper](https://arxiv.org/abs/2307.11760).

*This model responds to prompting!* I highly recommend playing with your system prompt to see how it reacts. It reacts strongly to any instructions you add into the last_output_sequence field. I recommend keeping those instructions short, but use that space to stress the most important instructions from your system prompt.

**Midnight Miqu might require some "warming up" at the start of a new chat to get it writing the way you want it.** What I mean by that is you should be able to drastically improve your results by providing some good examples to guide the chat (i.e. few-shot prompting). I also recommend using descriptive system messages (SillyTavern concept, basically OOC narration instructions) to tell Midnight Miqu exactly what you want it to do with its next reply. Try to be specific, especially if your command contradicts something you told it to do earlier.

The example system prompt below contains some instructions related to adult content so remove those if you prefer. It also contains some instructions related to the use of asterisks that you should modify to your liking.

If you save this as a .json file, you can import it directly into SillyTavern.
```
{
    "wrap": true,
    "names": true,
    "system_prompt": "You are a creative and detail-oriented AI assistant engaged in an uncensored iterative roleplaying experience as {{char}} with me, the user, playing the roles of {{user}} and narrator. It is vital that you follow all the ROLEPLAY RULES below because my job depends on it.\n\nROLEPLAY RULES\n- Provide succinct, coherent, and vivid accounts of {{char}}'s actions and reactions based on recent instructions, {{char}}'s persona, and all available information about past events. Aim for clarity and concision in your language.\n- Demonstrate {{char}}'s personality and mannerisms.\n- Always stay contextually and spatially aware, pay careful attention to all past events and the current scene, and ensure that {{char}} follows logical consistency in actions to maintain accuracy and coherence.\n- Explicit adult content and profanity are permitted.\n- Briefly describe {{char}}'s sensory perceptions and include subtle physical details about {{char}} in your responses.\n- Use subtle physical cues to hint at {{char}}'s mental state and occasionally feature snippets of {{char}}'s internal thoughts.\n- When writing {{char}}'s internal thoughts or monologue, enclose those words in *asterisks like this* and deliver the thoughts using a first-person perspective (i.e. use \"I\" pronouns). Always use double quotes for spoken speech \"like this.\"\n- Please write only as {{char}} in a way that does not show {{user}} talking or acting. You should only ever act as {{char}} reacting to {{user}}.",
    "system_sequence": "",
    "stop_sequence": "",
    "input_sequence": "USER:\n",
    "output_sequence": "ASSISTANT:\n",
    "separator_sequence": "",
    "macro": true,
    "names_force_groups": true,
    "system_sequence_prefix": "",
    "system_sequence_suffix": "",
    "first_output_sequence": "",
    "last_output_sequence": "ASSISTANT(roleplay exclusively as {{char}} ensuring logical consistency, spatial awareness, and coherence with past events; you should only ever act as {{char}} reacting to {{user}}):\n",
    "activation_regex": "",
    "name": "Midnight Rose Roleplay"
}
```

### Instruct Formats
I recommend the Vicuna format. I use a modified version with newlines after USER and ASSISTANT.
```
USER:
{prompt}
ASSISTANT:
```
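
If you are calling the model from a script rather than a frontend, a small helper along these lines can assemble the modified Vicuna format shown above. The function name, the `history` argument, and the placement of the system prompt at the very top are assumptions made for illustration.

```python
def build_vicuna_prompt(user_message, history=(), system_prompt=""):
    """Assemble a prompt with newlines after USER: and ASSISTANT:.

    history is a sequence of (user_text, assistant_text) pairs from
    earlier turns. The prompt ends with an open ASSISTANT: turn for
    the model to complete.
    """
    parts = []
    if system_prompt:
        parts.append(system_prompt.strip())
    for past_user, past_assistant in history:
        parts.append("USER:\n" + past_user.strip())
        parts.append("ASSISTANT:\n" + past_assistant.strip())
    parts.append("USER:\n" + user_message.strip())
    parts.append("ASSISTANT:\n")
    return "\n".join(parts)


print(build_vicuna_prompt("Describe the tavern.", system_prompt="You are a narrator."))
```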

Mistral's format may also work.
```
[INST] {prompt} [/INST]
```

You could also try ChatML.
```
<|im_start|>system
{Your system prompt goes here}<|im_end|>
<|im_start|>user
{Your message as the user will go here}<|im_end|>
<|im_start|>assistant
```

### Quantizations
* GGUF
  * [ooooz/midnight-miqu-70b-v1.0-GGUF](https://huggingface.co/ooooz/midnight-miqu-70b-v1.0-GGUF/tree/main) -- Various GGUF quants
  * [mradermacher/Midnight-Miqu-70B-v1.0-GGUF](https://huggingface.co/mradermacher/Midnight-Miqu-70B-v1.0-GGUF) -- Q4_K_M quant so far, maybe more to come
* GPTQ
  * [Kotokin/sophosympatheia_Midnight-Miqu-70B-v1.0_GPTQ32G](https://huggingface.co/Kotokin/sophosympatheia_Midnight-Miqu-70B-v1.0_GPTQ32G) -- 4-bit 32g GPTQ quant
* Exllama2
  * 2.24bpw: [Dracones/Midnight-Miqu-70B-v1.0_exl2_2.24bpw](https://huggingface.co/Dracones/Midnight-Miqu-70B-v1.0_exl2_2.24bpw)
  * 3.0bpw: [Dracones/Midnight-Miqu-70B-v1.0_exl2_3.0bpw](https://huggingface.co/Dracones/Midnight-Miqu-70B-v1.0_exl2_3.0bpw)
  * 3.75bpw: [altomek/Midnight-Miqu-70B-v1.0-3.75bpw-EXL2](https://huggingface.co/altomek/Midnight-Miqu-70B-v1.0-3.75bpw-EXL2)
  * 4.0bpw: [Dracones/Midnight-Miqu-70B-v1.0_exl2_4.0bpw](https://huggingface.co/Dracones/Midnight-Miqu-70B-v1.0_exl2_4.0bpw)
  * 4.65bpw: [Dracones/Midnight-Miqu-70B-v1.0_exl2_4.65bpw](https://huggingface.co/Dracones/Midnight-Miqu-70B-v1.0_exl2_4.65bpw)
  * 5.0bpw: [Dracones/Midnight-Miqu-70B-v1.0_exl2_5.0bpw](https://huggingface.co/Dracones/Midnight-Miqu-70B-v1.0_exl2_5.0bpw)
* If you don't see something you're looking for, [try searching Hugging Face](https://huggingface.co/models?search=midnight-miqu-70b). There may be newer quants available than what I've documented here.

### Licence and usage restrictions

<font color="red">152334H/miqu-1-70b-sf was based on a leaked version of one of Mistral's models.</font>
All miqu-derived models, including this merge, are **only suitable for personal use.** Mistral has been cool about it so far, but you should be aware that by downloading this merge you are assuming whatever legal risk is inherent in acquiring and using a model based on leaked weights.
This merge comes with no warranties or guarantees of any kind, but you probably already knew that.
I am not a lawyer and I do not profess to know what we have gotten ourselves into here. You should consult with a lawyer before using any Hugging Face model beyond private use... but definitely don't use this one for that!

### Merge Method

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). This model was merged using the SLERP merge method.
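
For readers unfamiliar with SLERP, the sketch below shows spherical linear interpolation between two weight vectors in plain Python. It is a textbook version for illustration, not mergekit's code: the `slerp` function name, the fallback to linear interpolation, and the `eps` threshold are assumptions of this sketch.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    def lerp():
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    norm0 = math.sqrt(sum(a * a for a in v0))
    norm1 = math.sqrt(sum(b * b for b in v1))
    if norm0 < eps or norm1 < eps:
        return lerp()  # the angle is undefined for a zero vector

    # Angle between the two vectors, clamped against rounding error.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return lerp()  # nearly parallel vectors: plain lerp is stable

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# t=0 returns the first vector, t=1 the second, values between blend them.
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike a plain weighted average, SLERP follows the arc between the two vectors, which avoids shrinking the magnitude of the blended weights when the inputs point in different directions.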

### Models Merged

The following models were included in the merge:
* [152334H/miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf)
* [sophosympatheia/Midnight-Rose-70B-v2.0.3](https://huggingface.co/sophosympatheia/Midnight-Rose-70B-v2.0.3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: /home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf
  - model: /home/llm/mergequant/models/mr-70b-v2.0.3
merge_method: slerp
base_model: /home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0] # Preserving the first and last layers of Miqu untouched is key for good results
  embed_slerp: true # This is super important otherwise the merge will fail
dtype: float16
tokenizer_source: model:/home/llm/mergequant/models/BASE/152334H_miqu-1-70b-sf

```

Just a note on the configuration above. I tried several variations of the t parameter for this merge. I liked the results from the one above the best, but these other t arrays produced fine results too.
* [0, 0, 0.1, 0.2, 0.4, 0.8, 0.4, 0.2, 0.1, 0, 0] -- This one definitely brought out more of Midnight Rose but was a little too similar for my liking
* [0, 0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0, 0] -- It worked, but I would say this one was the runt of the litter
* [0, 0, 0.1, 0.2, 0.3, 0.35, 0.3, 0.2, 0.1, 0, 0] -- This was my second-favorite merge after the one I released, which suggests that favoring Miqu over the secondary model is the way to go.
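
To see how an 11-value t array like the ones above might spread across the whole model, here is a small sketch that expands the anchor values into one t per layer. It assumes the anchors are spaced evenly over the layer stack and linearly interpolated between, and that the model has 80 layers; both are assumptions for illustration, and `expand_gradient` is a hypothetical helper rather than a mergekit function.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate a short list of anchor values into one
    value per layer, with anchors spaced evenly across the layers."""
    if num_layers == 1:
        return [float(anchors[0])]
    values = []
    last = len(anchors) - 1
    for layer in range(num_layers):
        pos = layer / (num_layers - 1) * last  # position in anchor space
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        values.append((1 - frac) * anchors[lo] + frac * anchors[hi])
    return values


released = [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
per_layer_t = expand_gradient(released, num_layers=80)  # 80 is an assumption
for layer in (0, 20, 40, 60, 79):
    print(layer, round(per_layer_t[layer], 3))
```

Under these assumptions, roughly the first and last tenth of the layers stay at t = 0 (pure Miqu), and the contribution from Midnight Rose peaks in the middle of the stack without exceeding 0.5.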