sophosympatheia committed on
Commit 9804d59
1 Parent(s): 6179695

Update README.md


Updates based on more testing at 16K context

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -22,9 +22,9 @@ This model was designed for roleplaying and storytelling and I think it does wel
 
  ### Long Context Tips
 
- You can run this model past 4096 context with alpha_rope set to 1, but I think it performs better if you set alpha_rope to what you would normally use for a Llama2 model with 4096 context. For example, alpha_rope 2.5 for 8K.
- Miqu can go up to 32K context in theory. I would expect performance to degrade as you exceed 8K, which is typical for Llama2 models, but the dropoff may not be as extreme with this merge thanks to Miqu.
- **UPDATE:** I was able to test my 5.0 bpw exl2 quant of this model out to 16K context just now using 8-bit cache with alpha_rope 1 and it was okay!
+ You can run this model past 4096 context with alpha_rope set to 1.
+ I have tested my 5.0bpw exl2 quant of this model out to 16K context using 8-bit cache with alpha_rope 1, and it performs great with no noticeable drop in quality as the context fills from under 4K to the full 16K.
+ Miqu can go up to 32K context, so in theory this merge can too. I will test that theory soon.
 
  ### Sampler Tips
 
@@ -47,7 +47,7 @@ If you save the below settings as a .json file, you can import them directly int
  "epsilon_cutoff": 0,
  "eta_cutoff": 0,
  "typical_p": 1,
- "min_p": 0.15,
+ "min_p": 0.2,
  "rep_pen": 1.05,
  "rep_pen_range": 2800,
  "no_repeat_ngram_size": 0,
@@ -64,7 +64,7 @@ If you save the below settings as a .json file, you can import them directly int
  "min_temp": 0.8,
  "max_temp": 1.35,
  "dynatemp_exponent": 1,
- "smoothing_factor": 0.4,
+ "smoothing_factor": 0.35,
  "add_bos_token": true,
  "truncation_length": 2048,
  "ban_eos_token": false,
@@ -92,7 +92,7 @@ If you save the below settings as a .json file, you can import them directly int
  "n": 1,
  "rep_pen_size": 0,
  "genamt": 500,
- "max_length": 8192
+ "max_length": 16128
  }
  ```
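The 16K test described in the Long Context Tips used a 5.0bpw exl2 quant with an 8-bit cache and alpha_rope 1. Below is a minimal sketch of an equivalent setup with the exllamav2 Python API; the model path and the 16384-token sequence length are illustrative assumptions, and the attribute names reflect my reading of exllamav2 and may differ between versions.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache_8bit, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/my-5.0bpw-exl2-quant"  # hypothetical local path to the quant
config.prepare()
config.max_seq_len = 16384        # 16K context, as tested in this update (assumed value)
config.scale_alpha_value = 1.0    # alpha_rope 1 = no NTK RoPE scaling

model = ExLlamaV2(config)
cache = ExLlamaV2Cache_8bit(model, lazy=True)      # 8-bit KV cache to fit 16K context
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 1.0
settings.min_p = 0.2              # mirrors the updated preset value

print(generator.generate_simple("Write the opening of a story:", settings, 200))
```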
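For context on the min_p bump from 0.15 to 0.2: min_p keeps only tokens whose probability is at least min_p times the probability of the most likely token, so 0.2 is a slightly stricter cutoff. Here is a minimal numpy sketch of that filter, for illustration only; it ignores the dynamic temperature, smoothing_factor, and repetition penalty that SillyTavern applies alongside it.

```python
import numpy as np

def min_p_filter(logits: np.ndarray, min_p: float = 0.2) -> np.ndarray:
    """Apply min_p filtering to a logit vector and return renormalized probabilities."""
    shifted = logits - logits.max()
    probs = np.exp(shifted) / np.exp(shifted).sum()   # plain softmax
    cutoff = min_p * probs.max()                      # scale the threshold by the top token
    probs = np.where(probs >= cutoff, probs, 0.0)     # drop tokens below the scaled cutoff
    return probs / probs.sum()

# With min_p = 0.2, any token less than 20% as likely as the best token is removed.
print(min_p_filter(np.array([4.0, 3.5, 2.0, 0.0]), min_p=0.2))
```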