Text Generation
GGUF
English
All use cases
reasoning
thoughts
deep thinking
deepseek
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid writing
fiction
bfloat16
swearing
sillytavern
Lmstudio
backyard
horror
Qwen 2.5
context 128k
mergekit
Inference Endpoints
conversational

<h2>DeepSeek-R1-Distill-Qwen-25.5B with Brainstorm 40x (88 layers, 1043 tensors)</h2>

<img src="deepseek.jpg" style="float:right; width:300px; height:300px; padding:10px;">

Context: 128k.

Required: CHATML template.

The "thinking/reasoning" tech (for the model at this repo) is from the original:

[ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]

In this case, the Brainstorm 40x module was grafted directly onto "DeepSeek-R1-Distill-Qwen-14B", bringing it up to 88 layers and 25.5B parameters.

<B>USE CASES:</B>

This model is for all use cases, and it has a slightly more creative slant than a standard model.

This model can also be used for solving logic puzzles, riddles, and other problems using DeepSeek's enhanced "thinking" systems.

Thanks to those DeepSeek systems, this model can also solve problems, riddles, and puzzles normally beyond the abilities of a Llama 3.1 model.

This model MAY produce NSFW / uncensored content.

<B>Special Operation Instructions:</B>

TEMP/SETTINGS:

1. Set temp between 0 and .8; above this the "think" functions will activate differently. The most "stable" temp seems to be .6, with a variance of +-0.05. Lower it for more "logic" reasoning, raise it for more "creative" reasoning (max .8 or so). Also set context to at least 4096 to account for "thoughts" generation.
2. For temps of 1+, 2+, etc., thought(s) will expand and become deeper and richer.
3. Set "repeat penalty" to 1.02 to 1.07 (recommended).
4. This model requires a CHATML template (see notes on "System Prompt" / "Role" below, and the template sketch right after this list).
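
For reference, a generic CHATML prompt layout looks like the sketch below; the placeholder text in braces is illustrative, not part of the template, and most AI apps ship this layout as a built-in "ChatML" preset:

<PRE>
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{your prompt}<|im_end|>
<|im_start|>assistant
</PRE>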

PROMPTS:

1. If you enter a prompt without implied "step by step" requirements (e.g. generate a scene, write a story, give me 6 plots for xyz), "thinking" (one or more passes) MAY activate AFTER the first generation. (E.g. "Generate a scene" -> the scene will generate, followed by suggestions for improvement in "thoughts".)
2. If you enter a prompt where "thinking" is stated or implied (e.g. a puzzle, riddle, "solve this", "brainstorm this idea"), the "thoughts" process(es) in DeepSeek will activate almost immediately. Sometimes you need to regen to activate it.
3. You will also get a lot of variations - some will continue the generation, others will talk about how to improve it, and some (e.g. generation of a scene) will cause the characters to "reason" about their situation. In some cases, the model will ask you to continue generation / thoughts too.
4. In some cases the model's "thoughts" may appear in the generation itself.
5. State the maximum word count IN THE PROMPT for best results, especially for activation of "thinking" (see examples below).
6. Sometimes the "censorship" (from DeepSeek) will activate; regen the prompt to clear it.
7. You may want to try your prompt once at "default" or "safe" temp settings, again at temp 1.2, and a third time at 2.5, as an example. This will give you a broad range of "reasoning/thoughts/problem solving".

GENERATION - THOUGHTS/REASONING:

1. It may take one or more regens for "thinking" to "activate" (depending on the prompt).
2. The model can generate a LOT of "thoughts". Sometimes the most interesting ones are 3, 4, 5 or more levels deep.
3. Many times the "thoughts" are unique and very different from one another.
4. Temp/rep pen settings can affect reasoning/thoughts too.
5. Change up or add directives/instructions, or increase the detail level(s) in your prompt, to improve reasoning/thinking.
6. Adding to your prompt: "think outside the box", "brainstorm X number of ideas", or "focus on the most uncommon approaches" can drastically improve your results.

GENERAL SUGGESTIONS:

1. I have found that opening a "new chat" per prompt works best for "thinking/reasoning activation", with temp .6 and rep pen 1.05... THEN "regen" as required.
2. Sometimes the model will get completely unhinged and you will need to stop it manually.
3. Depending on your AI app, "thoughts" may appear with "< THINK >" and "</ THINK >" tags AND/OR the AI will generate "thoughts" directly in the main output or later output(s). (See the sketch after this list for one way to split the two apart.)
4. Although quant Q4_K_M was used for testing/examples, higher quants will provide better generation / more sound "reasoning/thinking".
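
If you are calling the model from your own code rather than an AI app, the minimal sketch below shows one way to separate the reasoning block from the visible answer. It assumes the standard <think>...</think> tags used by the system prompts further down; tag casing and spacing vary by app, so it matches loosely:

```python
import re

# Loose match: tolerates "<think>", "< THINK >", "</ think >", etc.
THINK_RE = re.compile(r"<\s*think\s*>(.*?)<\s*/\s*think\s*>", re.DOTALL | re.IGNORECASE)

def split_reasoning(text: str) -> tuple[str, str]:
    """Return (thoughts, answer) from raw model output.

    thoughts is "" when no think block is present.
    """
    match = THINK_RE.search(text)
    if not match:
        return "", text.strip()
    thoughts = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return thoughts, answer

# Usage (raw_output is the full text returned by the model):
# thoughts, answer = split_reasoning(raw_output)
```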

ADDITIONAL SUPPORT:

For additional generational support, general questions, detailed parameter info, and a lot more, see also:

NOTE: This is a CLASS 1/2 model.

https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters

---

<B>Recommended Settings (all) - For usage with "Think" / "Reasoning":</B>

temp: .6, rep pen: 1.07 (range: 1.02 to 1.12), rep pen range: 64, top_k: 40, top_p: .95, min_p: .05

Temps of 1+, 2+, 3+ will result in much deeper, richer, and "more interesting" thoughts and reasoning.

Model behaviour may change with other parameter(s) and/or sampler(s) activated - especially the "thinking/reasoning" process.
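
As one way to apply these values outside of a chat app, here is a minimal sketch using llama-cpp-python. The GGUF filename is hypothetical (substitute whichever quant you actually downloaded), and "rep pen range: 64" maps to the library's last_n_tokens_size setting:

```python
# Minimal sketch (assumes llama-cpp-python; the GGUF filename below is hypothetical).
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-25.5B-Brainstorm40x.Q4_K_M.gguf",  # hypothetical file
    n_ctx=4096,             # at least 4096 to leave room for "thoughts"
    last_n_tokens_size=64,  # "rep pen range: 64"
    chat_format="chatml",   # this model card requires the CHATML template
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": ""},  # recommended "blank" system prompt (see below)
        {"role": "user", "content": "Brainstorm 6 uncommon plot ideas for a sci-fi horror story."},
    ],
    temperature=0.6,
    top_k=40,
    top_p=0.95,
    min_p=0.05,
    repeat_penalty=1.07,
    max_tokens=2048,
)

print(response["choices"][0]["message"]["content"])
```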

---

<B>System Role / System Prompt - Augment The Model's Power:</B>

---

If you set / have a system prompt, this will affect both "generation" and "thinking/reasoning".

SIMPLE:

This is the generic system prompt used for generation and testing:

<PRE>
You are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.
</PRE>

This System Role/Prompt may give you a lot more "creative" results:

<PRE>
Use vivid and graphic words focusing on verbs and use current 2020 fiction writing style. Use metaphor(s) that fit the context of the situation (and reveal character) rather than similes.
</PRE>

RECOMMENDED:

This is the recommended System Prompt/Role - "Blank":

<PRE>
</PRE>

Please see DeepSeek-ai's repo for more information:

[ https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B ]

ADVANCED:

Logical and Creative - these will SIGNIFICANTLY alter the output, and many times improve it too.

They will also cause more thoughts, deeper thoughts, and in many cases more detailed/stronger thoughts too.

Keep in mind you may also want to test the model with NO system prompt at all - including the default one.

Special credit to Eric Hartford, Cognitivecomputations; these are based on his work.

CRITICAL:

Copy and paste exactly as shown; preserve formatting and line breaks.

SIDE NOTE:

These can be used in ANY DeepSeek / thinking model, including models not at this repo.

These, if used in a "non-thinking" model, will also alter model performance.

LOGICAL:

<PRE>
You are an AI assistant developed by the world wide community of ai experts.

Your primary directive is to provide well-reasoned, structured, and extensively detailed responses.

Formatting Requirements:

1. Always structure your replies using: <think>{reasoning}</think>{answer}
2. The <think></think> block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the <think></think> block may be left empty.
4. The user does not see the <think></think> section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a </think> and proceed to the {answer}

Response Guidelines:

1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Scientific and Logical Approach: Your explanations should reflect the depth and precision of the greatest scientific minds.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.
</PRE>

CREATIVE:

<PRE>
You are an AI assistant developed by a world wide community of ai experts.

Your primary directive is to provide highly creative, well-reasoned, structured, and extensively detailed responses.

Formatting Requirements:

1. Always structure your replies using: <think>{reasoning}</think>{answer}
2. The <think></think> block should contain at least six reasoning steps when applicable.
3. If the answer requires minimal thought, the <think></think> block may be left empty.
4. The user does not see the <think></think> section. Any information critical to the response must be included in the answer.
5. If you notice that you have engaged in circular reasoning or repetition, immediately terminate {reasoning} with a </think> and proceed to the {answer}

Response Guidelines:

1. Detailed and Structured: Use rich Markdown formatting for clarity and readability.
2. Creative and Logical Approach: Your explanations should reflect the depth and precision of the greatest creative minds first.
3. Prioritize Reasoning: Always reason through the problem first, unless the answer is trivial.
4. Concise yet Complete: Ensure responses are informative, yet to the point without unnecessary elaboration.
5. Maintain a professional, intelligent, and analytical tone in all interactions.
</PRE>
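
Continuing the earlier llama-cpp-python sketch (same llm object, same assumptions), one of these system roles simply goes in as the system message, and the output can then be separated with the split_reasoning() helper shown above:

```python
# Sketch only: ADVANCED_LOGICAL holds the LOGICAL system role copied verbatim from the
# <PRE> block above; "advanced_logical_prompt.txt" is a hypothetical local file.
ADVANCED_LOGICAL = open("advanced_logical_prompt.txt", encoding="utf-8").read()

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": ADVANCED_LOGICAL},
        {"role": "user", "content": "Solve this riddle step by step: what has keys but can't open locks?"},
    ],
    temperature=0.6,
    repeat_penalty=1.07,
    max_tokens=2048,
)

thoughts, answer = split_reasoning(response["choices"][0]["message"]["content"])
print("THOUGHTS:\n", thoughts)
print("ANSWER:\n", answer)
```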

---

<B>Software to Augment Generation:</B>

The AI Auto-Correct Engine (built and programmed by DavidAU) auto-corrects AI generation in real time, including modification of the live generation stream to and from the AI... creating a two-way street of information that operates, changes, and edits automatically. This system works with all GGUF, EXL2, HQQ, and other quants/compressions, as well as full-source models.

Software Link:

https://huggingface.co/DavidAU/AI_Autocorrect__Auto-Creative-Enhancement__Auto-Low-Quant-Optimization__gguf-exl2-hqq-SOFTWARE

For DeepSeek / reasoning models: this engine can adjust/correct both the "thinking" and the "output generation" in real time.

---

<H2>EXAMPLES:</H2>

Examples are created using quant IQ4_XS, "temp=.6", rep pen 1.06 (unless otherwise stated), minimal parameters, and the "CHATML" template.

The model has been tested with "temp" from ".1" to "5".

Below are the least creative outputs; the prompt is in BOLD.

IMPORTANT:

Higher quants / imatrix quants will have much stronger generation - words, sentences, ideas, dialog, and general quality.

---