parameters guide
samplers guide
model generation
role play settings
quant selection
arm quants
iq quants vs q quants
optimal model setting
gibberish fixes
coherence
instruction following
quality generation
chat settings
quality settings
llamacpp server
llamacpp
lmstudio
sillytavern
koboldcpp
backyard
ollama
model generation steering
steering
model generation fixes
text generation webui
ggufs
exl2
full precision
quants
imatrix
neo imatrix
Update README.md
README.md
CHANGED
@@ -36,7 +36,7 @@ tags:

<h3>Maximizing Model Performance for All Quants Types And Full-Precision using Samplers, Advance Samplers and Parameters Guide</h3>

-(Updated: "INDEX", and added "Generation Steering" section)
+(Updated: "INDEX", and added "Generation Steering" section ; notes on Roleplay/Simulation added)

This document includes detailed information, references, and notes for general parameters, samplers and
advanced samplers to get the most out of your model's abilities including notes / settings for the most popular AI/LLM app in use (LLAMACPP, KoboldCPP, Text-Generation-WebUI, LMStudio, Sillytavern, Ollama and others).
@@ -124,7 +124,8 @@ SOURCE FILES for my Models / APPS to Run LLMs / AIs:
- TEXT-GENERATION-WEBUI
- KOBOLDCPP
- SILLYTAVERN
-- Lmstudio, Ollama, Llamacpp, and OTHER PROGRAMS
+- Lmstudio, Ollama, Llamacpp, Backyard, and OTHER PROGRAMS
+- Roleplay and Simulation Programs/Notes on models.

TESTING / Default / Generation Example PARAMETERS AND SAMPLERS
- Basic settings suggested for general model operation.
@@ -438,7 +439,7 @@ In section 1 a,b, and c, below are all the LLAMA_CPP parameters and samplers.

I have added notes below each one for adjustment / enhancement(s) for specific use cases.

-TEXT-GENERATION-WEBUI
+<B>TEXT-GENERATION-WEBUI</B>

In section 2, will be additional samplers, which become available when using "llamacpp_HF" loader in https://github.com/oobabooga/text-generation-webui
AND/OR https://github.com/LostRuins/koboldcpp ("KOBOLDCPP").
@@ -449,7 +450,7 @@ The "llamacpp_HF" (for "text-generation-webui") only requires the GGUF you want

This allows access to very advanced samplers in addition to all the parameters / samplers here.

-KOBOLDCPP
+<B>KOBOLDCPP:</B>

Note that https://github.com/LostRuins/koboldcpp also allows access to all LLAMACPP parameters/samplers too as well as additional advanced samplers too.

@@ -457,7 +458,7 @@ You can use almost all parameters, samplers and advanced samplers using "KOBOLDC

Note: This program has one of the newest samplers called "Anti-slop" which allows phrase/word banning at the generation level.

-SILLYTAVERN
+<B>SILLYTAVERN:</B>

Note that https://github.com/SillyTavern/SillyTavern also allows access to all LLAMACPP parameters/samplers too as well as additional advanced samplers too.

@@ -480,16 +481,12 @@ Currently, at time of this writing, connecting Silly Tavern via KoboldCPP or Tex

However for some, connecting to Lmstudio, LlamaCPP, or Ollama may be preferred.

-NOTE:
-
-It appears that Silly Tavern also supports "DRY" and "XTC" too ; but it is not yet in the documentation at the time of writing.
-
You may also want to check out how to connect SillyTavern to local AI "apps" running on your pc here:

https://docs.sillytavern.app/usage/api-connections/


-OTHER PROGRAMS
+<B>Lmstudio, Ollama, Llamacpp, and OTHER PROGRAMS</B>

Other programs like https://www.LMStudio.ai allows access to most of STANDARD samplers, where as others (llamacpp only here) you may need to add to the json file(s) for a model and/or template preset.

@@ -515,6 +512,29 @@ Most AI/LLM apps operate on Windows, Mac, and Linux.

Mobile devices (and O/S) are in many cases also supported.

+<B>Roleplay and Simulation Programs/Notes on models.</B>
+
+Text Generation Webui, KoboldCPP, and Silly Tavern (and AI/LLM apps connected via Silly Tavern) can all do roleplay / simulation AS WELL as "chat" and other creative activities.
+
+LMStudio (the app here directly), Ollama and other LLM/AI apps are for general usage, however they can be connected to Silly Tavern via API too.
+
+Backyard ( https://backyard.ai/ ) is software that is dedicated primarily to Roleplay / Simulation, however it can not be (at time of this writing) connected via API to Silly Tavern at this time.
+
+If you are using Backyard app, see special notes for "roleplay / simulation" and where applicable, "BACKYARD APP" for specific notes on using these app.
+
+Models that are Class 3/4 :
+
+Some of my models that are rated Class 3 or 4 maybe a little more challenging to operate with roleplay, especially if you can not access / control certain samplers.
+
+How to handle this issue is addressed in "Generational Steering" section (you control it) as well as Quick Reference, and Detailed Parameters, Samplers and Advanced Samplers Sections (automated control).
+
+Also, some of my models are available in multiple "classes", IE Dark Planet, and Grand Gutenberg.
+
+In these cases, Dark Planet 8B versions and Grand Gutenberg 12B ("Darkness" / "Madness") are class 1 - any use case, including role play and simulation.
+
+Likewise Darkest Planet 16.5B and Grand Gutenberg 23/23.5B are class 3 - great at roleplay/simulation, but need a bit more steering and/or parameter/samplers adjustments to work flawlessly for this use case.
+
+Note: Dark Planet 8B (class 1) is also a compressed version of Grand Horror 16B (a full on class 4)

---

@@ -552,6 +572,12 @@ You should set these as noted first. I would say these are the minimum settings

Note for Class 3/Class 4 models settings/samplers (discussed below) "repeat-last-n" is a CRITICAL setting.

+BACKYARD APP:
+
+In "Backyard" app, "repetition_penalty_range" is called "Repeat Penalty Tokens" (set on the "character card").
+
+For class 3/4 models (if using with Backyard app), set this to 64 OR LESS.
+

---

@@ -799,12 +825,22 @@ Likewise there may be some "name variation(s)" - in other LLM/AI apps - this is

</small>

-CLASS 3/4
+CLASS 3/4 MODELS:

If you are using a class 3 or class 4 model for use case(s) such as role play, multi-turn, chat etc etc, it is suggested to activate / set all samplers for class 3 but may be required for class 4 models.

Likewise for fine control of a class 3/4 via "DRY" and "Quadratic" samplers is detailed below. These allow you to dial up or dial down the model's raw power directly.

+ROLEPLAY / SIMULATION NOTES:
+
+If you are using a model (regardless of "class") for these uses cases, you may need to LOWER "temp" to get better instruction following.
+
+Instruction following issues can cascade over the "adventure" if the temp is set too high for the specific model(s) you are using.
+
+Likewise you may want to set MAXIMUM output tokens (a hard limit how much the model can output) to much lower values such as 128 to 300.
+
+(This will assist with steering, and stop the model from endlessly "yapping")
+
MICROSTAT Sampler - IMPORTANT:

Make sure to review MIROSTAT sampler settings below, due to behaviour of this specific sampler / affect on parameters/other samplers which varies from app to app too.
@@ -855,6 +891,10 @@ Too much temp can affect instruction following in some cases and sometimes not e

Newer model archs (L3,L3.1,L3.2, Mistral Nemo, Gemma2 etc) many times NEED more temp (1+) to get their best generations.

+ROLEPLAY / SIMULATION NOTE:
+
+If you are using a model (regardless of "class") for these uses cases, you may need to LOWER temp to get better instruction following.
+
<B>top-p</B>

top-p sampling (default: 0.9, 1.0 = disabled)
@@ -937,6 +977,12 @@ This setting also works in conjunction with all other "rep pens" below.

This parameter is the "RANGE" of tokens looked at for the samplers directly below.

+BACKYARD APP:
+
+In "Backyard" app, "repetition_penalty_range" is called "Repeat Penalty Tokens" (set on the "character card").
+
+For class 3/4 models (if using with Backyard app), set this to 64 OR LESS.
+
<B>SECONDARIES:</B>

<B>repeat-penalty</B>

<h3>Maximizing Model Performance for All Quants Types And Full-Precision using Samplers, Advance Samplers and Parameters Guide</h3>

(Updated: "INDEX", and added "Generation Steering" section; notes on Roleplay/Simulation added)

This document includes detailed information, references, and notes for general parameters, samplers and
advanced samplers to get the most out of your model's abilities, including notes / settings for the most popular AI/LLM apps in use (LLAMACPP, KoboldCPP, Text-Generation-WebUI, LMStudio, Sillytavern, Ollama and others).
- TEXT-GENERATION-WEBUI
- KOBOLDCPP
- SILLYTAVERN
- Lmstudio, Ollama, Llamacpp, Backyard, and OTHER PROGRAMS
- Roleplay and Simulation Programs/Notes on models.

TESTING / Default / Generation Example PARAMETERS AND SAMPLERS
- Basic settings suggested for general model operation.
I have added notes below each one for adjustment / enhancement(s) for specific use cases.

<B>TEXT-GENERATION-WEBUI</B>

Section 2 covers additional samplers, which become available when using the "llamacpp_HF" loader in https://github.com/oobabooga/text-generation-webui
AND/OR https://github.com/LostRuins/koboldcpp ("KOBOLDCPP").
This allows access to very advanced samplers in addition to all the parameters / samplers here.

<B>KOBOLDCPP:</B>

Note that https://github.com/LostRuins/koboldcpp also allows access to all LLAMACPP parameters/samplers, as well as additional advanced samplers.
Note: This program has one of the newest samplers, called "Anti-slop", which allows phrase/word banning at the generation level.

<B>SILLYTAVERN:</B>

Note that https://github.com/SillyTavern/SillyTavern also allows access to all LLAMACPP parameters/samplers, as well as additional advanced samplers.
However, for some, connecting to Lmstudio, LlamaCPP, or Ollama may be preferred.

You may also want to check out how to connect SillyTavern to local AI "apps" running on your pc here:

https://docs.sillytavern.app/usage/api-connections/

<B>Lmstudio, Ollama, Llamacpp, and OTHER PROGRAMS</B>

Other programs like https://www.LMStudio.ai allow access to most STANDARD samplers, whereas with others (llamacpp only here) you may need to add settings to the json file(s) for a model and/or template preset.
Mobile devices (and O/S) are in many cases also supported.

<B>Roleplay and Simulation Programs/Notes on models.</B>

Text Generation Webui, KoboldCPP, and Silly Tavern (and AI/LLM apps connected via Silly Tavern) can all do roleplay / simulation AS WELL as "chat" and other creative activities.

LMStudio (the app here directly), Ollama and other LLM/AI apps are for general usage; however, they can be connected to Silly Tavern via API too.

Backyard ( https://backyard.ai/ ) is software dedicated primarily to Roleplay / Simulation; however, it cannot (at the time of this writing) be connected via API to Silly Tavern.

If you are using the Backyard app, see the special notes for "roleplay / simulation" and, where applicable, "BACKYARD APP" for specific notes on using this app.

Models that are Class 3/4:

Some of my models that are rated Class 3 or 4 may be a little more challenging to operate for roleplay, especially if you cannot access / control certain samplers.

How to handle this issue is addressed in the "Generation Steering" section (you control it) as well as the Quick Reference and the Detailed Parameters, Samplers and Advanced Samplers sections (automated control).

Also, some of my models are available in multiple "classes", IE Dark Planet and Grand Gutenberg.

In these cases, Dark Planet 8B versions and Grand Gutenberg 12B ("Darkness" / "Madness") are class 1 - any use case, including role play and simulation.

Likewise, Darkest Planet 16.5B and Grand Gutenberg 23/23.5B are class 3 - great at roleplay/simulation, but they need a bit more steering and/or parameter/sampler adjustments to work flawlessly for this use case.

Note: Dark Planet 8B (class 1) is also a compressed version of Grand Horror 16B (a full on class 4).

---
|
573 |
Note for Class 3/Class 4 models settings/samplers (discussed below) "repeat-last-n" is a CRITICAL setting.
|
574 |
|
575 |
+
BACKYARD APP:
|
576 |
+
|
577 |
+
In "Backyard" app, "repetition_penalty_range" is called "Repeat Penalty Tokens" (set on the "character card").
|
578 |
+
|
579 |
+
For class 3/4 models (if using with Backyard app), set this to 64 OR LESS.
|
580 |
+
|
581 |
|
582 |
---
|
583 |
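As a rough sketch of what "repeat-last-n" / "Repeat Penalty Tokens" controls (illustrative Python only, not any app's actual implementation): the repetition penalties only look back over the last N generated tokens, so a small value such as 64 restricts how far back repetition is punished.

```python
def repeat_window(generated_tokens, repeat_last_n):
    """Return the set of token ids the repetition penalty will consider.

    Only the most recent `repeat_last_n` tokens are examined, so older
    repetition is invisible to the penalty. A small window (e.g. 64, as
    suggested above for class 3/4 models) reins the model in without
    penalizing long-range prose.
    """
    if repeat_last_n <= 0:  # 0 disables the window in the llama.cpp convention
        return set()
    return set(generated_tokens[-repeat_last_n:])

history = [5, 9, 9, 3, 7, 5, 2, 8]
print(repeat_window(history, 4))   # only the last 4 tokens are considered
print(repeat_window(history, 64))  # window larger than history: all tokens
```

This is why a smaller "Repeat Penalty Tokens" value gives the penalty a tighter, more recent focus.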
</small>

CLASS 3/4 MODELS:

If you are using a class 3 or class 4 model for use case(s) such as role play, multi-turn, chat, etc., it is suggested to activate / set all samplers for class 3; this may be required for class 4 models.

Likewise, fine control of a class 3/4 model via the "DRY" and "Quadratic" samplers is detailed below. These allow you to dial up or dial down the model's raw power directly.

ROLEPLAY / SIMULATION NOTES:

If you are using a model (regardless of "class") for these use cases, you may need to LOWER "temp" to get better instruction following.

Instruction following issues can cascade over the "adventure" if the temp is set too high for the specific model(s) you are using.

Likewise, you may want to set MAXIMUM output tokens (a hard limit on how much the model can output) to a much lower value, such as 128 to 300.

(This will assist with steering, and stop the model from endlessly "yapping".)

MIROSTAT Sampler - IMPORTANT:

Make sure to review the MIROSTAT sampler settings below, due to the behaviour of this specific sampler and its effect on parameters/other samplers, which varies from app to app.
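To make the temp advice concrete, here is a minimal softmax-with-temperature sketch (plain illustrative Python, not tied to any particular app): lower temp concentrates probability on the top token, which is why it tightens instruction following, while higher temp spreads probability out and invites drift.

```python
import math

def apply_temperature(logits, temp=1.0):
    """Convert raw logits to probabilities at a given temperature.

    Dividing logits by temp < 1 sharpens the distribution; temp > 1
    flattens it, giving lower-probability tokens more chances.
    """
    scaled = [l / temp for l in logits]
    m = max(scaled)                      # subtract max for numeric stability
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

logits = [2.0, 1.0, 0.0]
low = apply_temperature(logits, 0.6)     # peaked: top token dominates
high = apply_temperature(logits, 1.5)    # flatter: probability spreads out
print(low[0] > apply_temperature(logits, 1.0)[0] > high[0])  # True
```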
Newer model archs (L3, L3.1, L3.2, Mistral Nemo, Gemma2, etc.) many times NEED more temp (1+) to get their best generations.

ROLEPLAY / SIMULATION NOTE:

If you are using a model (regardless of "class") for these use cases, you may need to LOWER temp to get better instruction following.

<B>top-p</B>

top-p sampling (default: 0.9, 1.0 = disabled)
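A minimal sketch of what the top-p value does (illustrative Python over a toy distribution; real backends operate on sorted logits): keep the smallest set of top tokens whose cumulative probability reaches top-p, then renormalize.

```python
def top_p_filter(probs, top_p=0.9):
    """Nucleus (top-p) sampling filter over a token -> probability dict."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = {}, 0.0
    for tok, p in ranked:
        kept[tok] = p
        total += p
        if total >= top_p:          # nucleus reached: stop keeping tokens
            break
    return {tok: p / total for tok, p in kept.items()}  # renormalize

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(top_p_filter(probs, 0.9))  # keeps a, b, c (cumulative 0.95), drops d
print(top_p_filter(probs, 1.0))  # 1.0 = disabled: every token is kept
```

Lowering top-p trims the long tail of unlikely tokens, which is one reason it steadies rambling generations.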
This parameter is the "RANGE" of tokens looked at for the samplers directly below.

BACKYARD APP:

In the "Backyard" app, "repetition_penalty_range" is called "Repeat Penalty Tokens" (set on the "character card").

For class 3/4 models (if using the Backyard app), set this to 64 OR LESS.

<B>SECONDARIES:</B>

<B>repeat-penalty</B>
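As a sketch of the rule "repeat-penalty" applies over that RANGE (illustrative Python; real implementations such as llama.cpp operate on the full logit array): tokens seen in the recent window have their logits scaled so they become less likely.

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1):
    """Penalize tokens that appear in the recent window (CTRL-style rule).

    Positive logits are divided by the penalty and negative logits are
    multiplied by it, so a penalized token always becomes less likely.
    """
    out = dict(logits)
    for tok in set(recent_tokens):
        if tok in out:
            out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

logits = {"the": 2.0, "cat": 1.0, "sat": -0.5}
print(apply_repeat_penalty(logits, ["the", "sat"], penalty=1.25))
# "the" drops to 1.6, "sat" to -0.625, "cat" is untouched
```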