Doctor-Shotgun committed
Commit
63254b0
1 Parent(s): c8fa24a

Update README.md

Files changed (1)
  1. README.md +23 -1
README.md CHANGED
@@ -15,10 +15,31 @@ datasets:
 
 Experimental model: a LimaRP QLoRA trained at 10k context length (greater than the length of the longest LimaRP sample when tokenized with Mistral's tokenizer) on [mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1) using [Charles Goddard](https://huggingface.co/chargoddard)'s ZLoss and Megablocks-based fork of transformers, then fused to [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) at 0.5 weight.
 
- I would try a temperature of ~1.5-2 and a min-p of ~0.03-0.05, since Mixtral appears to be highly confident in its responses and can enter repetition loops after several thousand tokens of output.
+ My current generation settings are:
+ ```
+ Temperature: 1.25
+ Min-p: 0.05
+ Repetition penalty: 1.05
+ Repetition penalty range: 1024
+ ```
+ This seems to avoid the Mixtral looping pitfalls for me so far. Play around with it and see what works well for you.
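As an aside not found in the card itself, here is a minimal sketch of how these sampler values might be applied with Hugging Face transformers. `min_p` needs a recent transformers release, and the repetition-penalty *range* is a frontend setting (SillyTavern, text-generation-webui, etc.) with no direct `generate()` equivalent.

```python
# Minimal usage sketch, NOT from the original model card. Assumes a recent
# transformers release (for min_p sampling) and enough memory to load Mixtral.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

prompt = "### Instruction:\n..."  # fill in a LimaRP v3 Alpaca-style prompt (see Usage below)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.25,         # settings quoted above
    min_p=0.05,
    repetition_penalty=1.05,  # the 1024-token penalty *range* is a frontend setting,
    max_new_tokens=300,       # not a standard generate() argument
)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```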
 
 [Peft Adapter](https://huggingface.co/Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora)
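The card says the QLoRA was fused into the Instruct model at 0.5 weight but does not show the merge itself; the sketch below is one assumed way to reproduce that kind of weighted fusion with peft's `add_weighted_adapter`, not the author's actual procedure.

```python
# Rough sketch, an assumption rather than the author's actual merge script:
# fusing the LimaRP QLoRA into Mixtral-8x7B-Instruct at 0.5 weight with peft.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mixtral-8x7B-Instruct-v0.1", device_map="auto", torch_dtype="auto"
)
model = PeftModel.from_pretrained(
    base, "Doctor-Shotgun/limarp-zloss-mixtral-8x7b-qlora", adapter_name="limarp"
)

# Re-weight the adapter to 0.5 before merging it into the base weights.
model.add_weighted_adapter(
    adapters=["limarp"], weights=[0.5], adapter_name="limarp-half", combination_type="linear"
)
model.set_adapter("limarp-half")
merged = model.merge_and_unload()  # bakes the scaled adapter into the base model
merged.save_pretrained("mixtral-8x7b-instruct-limarp-0.5")
```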
 
+ Quants courtesy of TheBloke:
+ - [GPTQ](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GPTQ)
+ - [GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF)
+ - [AWQ](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-AWQ)
+
+ Exl2 Quants courtesy of LoneStriker:
+ - [2.4bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-2.4bpw-h6-exl2)
+ - [3.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.0bpw-h6-exl2)
+ - [3.5bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.5bpw-h6-exl2)
+ - [3.75bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-3.75bpw-h6-exl2)
+ - [4.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-4.0bpw-h6-exl2)
+ - [5.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-5.0bpw-h6-exl2)
+ - [6.0bpw](https://huggingface.co/LoneStriker/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-6.0bpw-h6-exl2)
+
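For anyone wanting a quick local test, a GGUF quant can be run with llama-cpp-python roughly like this; the `.gguf` filename below is a placeholder, so check TheBloke's repo for the actual file names and pick a quant level that fits your hardware.

```python
# Illustrative sketch: running one of the GGUF quants with llama-cpp-python.
# The .gguf filename is a placeholder; check TheBloke's repo for the real names.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="TheBloke/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss-GGUF",
    filename="mixtral-8x7b-instruct-limarp-zloss.Q4_K_M.gguf",  # placeholder filename
)

llm = Llama(model_path=gguf_path, n_ctx=8192, n_gpu_layers=-1)
out = llm(
    "### Instruction:\n...",  # fill in a LimaRP v3 Alpaca-style prompt (see Usage below)
    max_tokens=300,
    temperature=1.25,   # the sampler settings quoted earlier in the card
    min_p=0.05,
    repeat_penalty=1.05,
)
print(out["choices"][0]["text"])
```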
 ## Usage:
 The intended prompt format is the Alpaca instruction format of LimaRP v3:
 ```
@@ -45,6 +66,7 @@ Character: {utterance}
 
 (etc.)
 ```
+ My current templates have been uploaded to a [folder](https://huggingface.co/Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss/tree/main/ST%20Templates).
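If you want those template files locally, one assumed way to fetch just that folder is with `huggingface_hub`:

```python
# Small sketch, assuming only the folder name visible in the link above:
# download just the "ST Templates" folder from the model repo.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="Doctor-Shotgun/Mixtral-8x7B-Instruct-v0.1-LimaRP-ZLoss",
    allow_patterns=["ST Templates/*"],
    local_dir="st-templates",
)
```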
 
 ## Message length control
 Due to the inclusion of LimaRP v3, it is possible to append a length modifier to the response instruction sequence, like this: