InferenceIllusionist committed · Commit 5f5cce6 · verified · 1 Parent(s): 89c7dee

Create README.md

---
license: apache-2.0
base_model_relation: quantized
quantized_by: Quant-Cartel
base_model: rAIfle/SorcererLM-8x22b-bf16
pipeline_tag: text-generation
tags:
- chat
- iMat
- GGUF
---
```
 e88 88e d8
d888 888b 8888 8888 ,"Y88b 888 8e d88
C8888 8888D 8888 8888 "8" 888 888 88b d88888
 Y888 888P Y888 888P ,ee 888 888 888 888
  "88 88" "88 88" "88 888 888 888 888
            b
            8b,

 e88'Y88 d8 888
d888 'Y ,"Y88b 888,8, d88 ,e e, 888
C8888 "8" 888 888 " d88888 d88 88b 888
 Y888 ,d ,ee 888 888 888 888 , 888
  "88,d88 "88 888 888 888 "YeeP" 888

 PROUDLY PRESENTS
```
# SorcererLM-8x22b-iMat-GGUF
Quantized with love from fp16.

Original model author: [rAIfle](https://huggingface.co/rAIfle/)

* Importance Matrix calculated using [groups_merged.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) in 105 chunks, n_ctx=512, and fp16 precision weights
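
For context, here is a rough sketch of how an importance matrix like this is typically generated and then applied during quantization with llama.cpp's tooling. Binary names, flags, and file names below assume a recent local llama.cpp build; the exact invocation used for these quants isn't documented here, so treat this as illustrative only.

```python
# Illustrative sketch: compute an importance matrix with llama.cpp's imatrix
# tool, then use it when quantizing. Paths, flags, and file names are
# assumptions for a recent llama.cpp build, not the exact commands used here.
import subprocess

MODEL_FP16 = "SorcererLM-8x22b-fp16.gguf"  # assumed fp16 GGUF conversion
CALIB_FILE = "groups_merged.txt"           # calibration text (link above)

# Compute the importance matrix over the calibration data at n_ctx=512.
# groups_merged.txt spans roughly 105 chunks at this context size.
subprocess.run([
    "./llama-imatrix",
    "-m", MODEL_FP16,
    "-f", CALIB_FILE,
    "-o", "imatrix.dat",
    "-c", "512",
], check=True)

# Quantize with the importance matrix applied (IQ4_XS chosen as an example).
subprocess.run([
    "./llama-quantize",
    "--imatrix", "imatrix.dat",
    MODEL_FP16,
    "SorcererLM-8x22b-IQ4_XS.gguf",
    "IQ4_XS",
], check=True)
```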

Original model README [here](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-bf16-epoch2) and below:

# SorcererLM-8x22b-bf16

Oh boy, here we go. Low-rank (`r=16, alpha=32`) 16-bit LoRA on top of [WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), trained on 2 epochs of (cleaned & deduped) c2-logs. As far as I can tell, this is an upgrade over `WizardLM-2-8x22B` for RP purposes.

Alongside this ready-to-use release I'm also releasing the LoRA itself, as well as its earlier `epoch1` checkpoint.

## Why A LoRA?

The choice was fully intentional. I briefly considered a full finetune (FFT), but for this particular use-case a LoRA seemed a better fit. `WizardLM-2-8x22B` is smart by itself, but the vocabulary it uses leaves much to be desired when it comes to RP. By training a low-rank LoRA on top of it to teach it some of Claude's writing style, we remedy that.
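
For illustration, this is roughly what an `r=16, alpha=32` low-rank adapter looks like when expressed with Hugging Face `peft`. The actual training was done with qlora-pipe (see the Training section below); the target modules and dropout in this sketch are assumptions, not the settings used for this model.

```python
# Illustrative only: a low-rank LoRA mirroring the r=16, alpha=32
# hyperparameters mentioned above, expressed with Hugging Face peft.
# The real training used qlora-pipe; target_modules/dropout are assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Note: actually materializing an 8x22B base model requires serious hardware.
base = AutoModelForCausalLM.from_pretrained(
    "alpindale/WizardLM-2-8x22B",
    torch_dtype=torch.bfloat16,  # 16-bit LoRA, not a full finetune
    device_map="auto",
)

lora_cfg = LoraConfig(
    r=16,            # rank from the card
    lora_alpha=32,   # alpha from the card
    lora_dropout=0.0,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the adapter weights train, the LoRA can be shipped separately from the merged release, which is exactly what's done here.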

## Prompting

- Use the templates in [Quant-Cartel/Recommended-Settings](https://huggingface.co/Quant-Cartel/Recommended-Settings) under the `SorcererLM`-folder.
- Or use Vicuna 1.1 and a sane context template. It's somewhat sensitive to samplers; I'd recommend Temperature 1, MinP 0.05 and a dash of DRY, but YMMV (a rough sketch of these settings follows below). Shorter prompts seem to work better, too.
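
A minimal sketch of those settings with `llama-cpp-python`, assuming a locally downloaded GGUF (the filename below is a placeholder) and a build recent enough to expose `min_p`. DRY is omitted since support for it varies by backend.

```python
# Sketch: Vicuna 1.1-style prompt with the suggested samplers, via llama-cpp-python.
# The GGUF filename is a placeholder; min_p needs a reasonably recent build.
from llama_cpp import Llama

llm = Llama(
    model_path="SorcererLM-8x22b-IQ4_XS.gguf",  # placeholder filename
    n_ctx=8192,
    n_gpu_layers=-1,  # offload everything if VRAM allows
)

system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
prompt = f"{system}\n\nUSER: Write a short scene introducing a weary sorcerer.\nASSISTANT:"

out = llm(
    prompt,
    max_tokens=512,
    temperature=1.0,  # Temperature 1
    min_p=0.05,       # MinP 0.05
    stop=["USER:"],
)
print(out["choices"][0]["text"])
```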

## Quantized Versions

- [iMat GGUFs](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-iMat-GGUF)
- [longcal exl2s](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-exl2-longcal)
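
If you'd rather grab a quant programmatically, a small `huggingface_hub` sketch follows. The filename is a placeholder, so check the repo's file listing for the actual quant names (larger quants may be split into multiple parts).

```python
# Sketch: download a single quant file from the iMat GGUF repo.
# The filename is a placeholder; see the repo's Files tab for real names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Quant-Cartel/SorcererLM-8x22b-iMat-GGUF",
    filename="SorcererLM-8x22b-IQ4_XS.gguf",  # placeholder
)
print(path)  # local cache path to the downloaded GGUF
```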

## Acknowledgments

The main shoutout I want to make is to my [Cartel](https://huggingface.co/Quant-Cartel) bros, [Envoid](https://huggingface.co/Envoid) and particularly [I^2](https://huggingface.co/InferenceIllusionist), for being amazing. I count this as a team effort, so they deserve kudos too if you like this.

## Training

Trained using [qlora-pipe](https://github.com/tdrussell/qlora-pipe). Configs included in the `train`-subfolder.

## Safety

... n/a