---
license: apache-2.0
language:
- en
- zh
tags:
- MOE
- Qwen 2.5 MOE
- Mixture of Experts
- Uncensored
- 2X7B
- deepseek
- reasoning
- thinking
- creative
- 128k context
- general usage
- problem solving
- brainstorming
- solve riddles
- story generation
- plot generation
- storytelling
- fiction story
- story
- writing
- fiction
- Qwen 2.5
- mergekit
pipeline_tag: text-generation
---

(Quants uploading; examples to be added.)

<H2>Qwen2.5-MOE-2X7B-DeepSeek-Abliterated-Censored-15B-gguf</H2>

<img src="qwen-tiny.jpg" style="float:right; width:300px; height:300px; padding:5px;">

This is a Qwen2.5 MOE (Mixture of Experts) model composed of TWO Qwen 2.5 DeepSeek (censored/normal AND uncensored) 7B models,
creating a 15B model with the "Abliterated" (uncensored) version of DeepSeek Qwen 2.5 7B "in charge", so to speak.

The model is just over 15B because of the unique "shared expert" (roughly 2.5 models' worth of parameters here) used in Qwen MOEs.

This unusual configuration yields interesting "thinking/reasoning" that is stronger than either 7B model on its own.

Example generations are at the bottom of this page.

This model can be used for all use cases, and is also (mostly) uncensored.

Context: 128k.

You need to use the "Jinja Template" encoded in the GGUF to use this model. You may be able to use the Llama 3 and/or ChatML templates
if your AI/LLM app cannot access the "Jinja Template".

In LM Studio the "Jinja Template" should load by default.

In other apps, use the DeepSeek tokenizer and/or the "Jinja Template".
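
As a minimal sketch of loading the embedded template programmatically, assuming llama-cpp-python (recent versions apply the Jinja chat template stored in the GGUF metadata when no chat format is forced) and a hypothetical quant filename:

```python
from llama_cpp import Llama

# Hypothetical filename - substitute the quant you actually downloaded.
llm = Llama(
    model_path="Qwen2.5-MOE-2X7B-DeepSeek-Abliterated-Censored-15B-Q6_K.gguf",
    n_ctx=8192,  # 4k minimum, 8k+ suggested (see notes below)
)

# With no chat_format forced, recent llama-cpp-python versions fall back to
# the Jinja chat template embedded in the GGUF metadata.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a Mixture of Experts model is."}],
    temperature=0.6,
    max_tokens=1024,
)
print(result["choices"][0]["message"]["content"])
```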

This model contains two DeepSeek Distill reasoning/thinking experts working together and shows exceptional performance.

Also, the DeepSeek Qwen 7B model is based on Qwen's 7B Math model, so this model is slanted more toward math/logic problem solving and, I would say, more science-oriented subjects too.

This does not mean it will not work for your use case.

Also, because of how this model works (uncensored and censored in the same model), you may want to try 1-4 generations per prompt depending
on your use case, because even the "right" response will vary widely and in many cases be more "interesting".
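
A minimal sketch of that re-roll approach, reusing the hypothetical `llm` handle from the loading example above:

```python
# Generate 1-4 candidates for the same prompt and keep them all;
# with censored and uncensored experts in one model, responses vary widely.
prompt = [{"role": "user", "content": "Write a one-paragraph plot for a mystery story."}]

candidates = []
for i in range(4):
    result = llm.create_chat_completion(
        messages=prompt,
        temperature=0.6,  # suggested reasoning/thinking range is .4 to .8
        max_tokens=1024,
    )
    candidates.append(result["choices"][0]["message"]["content"])
    print(f"--- Candidate {i + 1} ---\n{candidates[-1]}\n")
```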

Examples are included below so you have some idea of what this model can do.

Keep in mind this model is two 7B-parameter models working together; it will come close to, but may not match, the power of a 14B or 32B reasoning/thinking model.

However, sometimes it will generate truly "out of the park" responses.

A temp of .4 to .8 is suggested (for best reasoning/thinking); however, the model will still operate at much higher temps like 1.8, 2.6, etc.

Depending on your prompt, change temp SLOWLY, i.e. .41, .42, .43 ... etc.

The model MAY function better if you break down the reasoning/thinking task(s) into smaller pieces.

For example: instead of asking for 6 plots for theme XYZ, ask it for ONE plot for theme XYZ at a time (see the sketch below).

Also set the context limit to 4k minimum; 8k+ is suggested.
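
A minimal sketch of the one-task-at-a-time approach, again reusing the hypothetical `llm` handle (loaded with `n_ctx=8192` above):

```python
# Ask for ONE plot per call instead of six in a single prompt; smaller
# reasoning tasks tend to keep the thinking stage focused. Each call is
# independent, so earlier plots are passed back in to avoid repeats.
theme = "XYZ"
plots = []
for i in range(6):
    prompt = f"Write ONE short story plot for the theme '{theme}'."
    if plots:
        prompt += " Avoid repeating these plots: " + " | ".join(plots)
    result = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,
        max_tokens=512,
    )
    plots.append(result["choices"][0]["message"]["content"])
    print(f"Plot {i + 1}:\n{plots[-1]}\n")
```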

I also suggest a quant of IQ4/Q4 or higher, as larger quants will reason/think and perform much better.

If you can run Q6/Q8, please use those.

IQ4XS will give very different responses versus other quants.

---

<B>Additional support / documents for this model to assist with generation / performance:</B>

Document #1:

Details how to use reasoning/thinking models and get maximum performance from them. It includes links to all reasoning/thinking models - GGUF and source - as well as adapters to turn any "regular" model into a "reasoning/thinking" model.

[ https://huggingface.co/DavidAU/How-To-Use-Reasoning-Thinking-Models-and-Create-Them ]

Document #2:

Details all parameters, settings, samplers, and advanced samplers for using not only my models but all models (and quants) online - regardless of repo - to their maximum potential. It includes a quick start, detailed notes, notes on AI/LLM apps, and other critical information and references. A must-read if you are using any AI/LLM right now.

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]

Software:

SOFTWARE patch (by me) for SillyTavern (a front end that connects to multiple AI apps / AIs such as KoboldCpp, LM Studio, Text Generation Web UI, and other APIs) to control and improve output generation of ANY AI model. It is also designed to control/wrangle some of my more "creative" models and make them perform well with little to no parameter/sampler adjustment.

[ https://huggingface.co/DavidAU/AI_Autocorrect__Auto-Creative-Enhancement__Auto-Low-Quant-Optimization__gguf-exl2-hqq-SOFTWARE ]

---

<h2>Example Generation:</h2>

IQ4XS quant, temp: 1.5, rep pen: 1.06, top_p: .95, min_p: .05, top_k: 40
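
For reference, the same settings expressed as llama-cpp-python sampler parameters (a sketch; your app may label these differently):

```python
# Sampler settings used for the example generations below.
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Your prompt here."}],
    temperature=1.5,
    repeat_penalty=1.06,
    top_p=0.95,
    min_p=0.05,
    top_k=40,
    max_tokens=2048,
)
```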

---

EXAMPLE #1:

---

<B>

</B>

[[[Thinking Start]]]


[[[Thinking End]]]

OUTPUT:

---

EXAMPLE #2:

---

<B>

</B>

[[[Thinking Start]]]


[[[Thinking End]]]

OUTPUT:

---

EXAMPLE #3:

---

<B>

</B>

[[[Thinking Start]]]


[[[Thinking End]]]

OUTPUT:

---

EXAMPLE #4:

---

<B>

</B>

[[[Thinking Start]]]


[[[Thinking End]]]

OUTPUT: