InferenceIllusionist committed · Commit 5f5cce6 · verified · 1 Parent(s): 89c7dee

Create README.md

---
license: apache-2.0
base_model_relation: quantized
quantized_by: Quant-Cartel
base_model: rAIfle/SorcererLM-8x22b-bf16
pipeline_tag: text-generation
tags:
- chat
- iMat
- GGUF
---
```
 e88 88e d8
d888 888b 8888 8888 ,"Y88b 888 8e d88
C8888 8888D 8888 8888 "8" 888 888 88b d88888
 Y888 888P Y888 888P ,ee 888 888 888 888
  "88 88" "88 88" "88 888 888 888 888
            b
            8b,

 e88'Y88 d8 888
d888 'Y ,"Y88b 888,8, d88 ,e e, 888
C8888 "8" 888 888 " d88888 d88 88b 888
 Y888 ,d ,ee 888 888 888 888 , 888
  "88,d88 "88 888 888 888 "YeeP" 888

 PROUDLY PRESENTS
```
# SorcererLM-8x22b-iMat-GGUF
Quantized with love from fp16.

Original model author: [rAIfle](https://huggingface.co/rAIfle/)

* Importance Matrix calculated using [groups_merged.txt](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) in 105 chunks, n_ctx=512, and fp16 precision weights
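
For context, here is a rough sketch of how an importance matrix like this is typically generated and then applied during quantization with llama.cpp's tooling. Binary names, flags, and file names below assume a recent local llama.cpp build; the exact invocation used for these quants isn't documented here, so treat this as illustrative only.

```python
# Illustrative sketch: compute an importance matrix with llama.cpp's imatrix
# tool, then use it when quantizing. Paths, flags, and file names are
# assumptions for a recent llama.cpp build, not the exact commands used here.
import subprocess

MODEL_FP16 = "SorcererLM-8x22b-fp16.gguf"  # assumed fp16 GGUF conversion
CALIB_FILE = "groups_merged.txt"           # calibration text (link above)

# Compute the importance matrix over the calibration data at n_ctx=512.
# groups_merged.txt spans roughly 105 chunks at this context size.
subprocess.run([
    "./llama-imatrix",
    "-m", MODEL_FP16,
    "-f", CALIB_FILE,
    "-o", "imatrix.dat",
    "-c", "512",
], check=True)

# Quantize with the importance matrix applied (IQ4_XS chosen as an example).
subprocess.run([
    "./llama-quantize",
    "--imatrix", "imatrix.dat",
    MODEL_FP16,
    "SorcererLM-8x22b-IQ4_XS.gguf",
    "IQ4_XS",
], check=True)
```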

Original model README [here](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-bf16-epoch2) and below:

# SorcererLM-8x22b-bf16

Oh boy, here we go. Low-rank (`r=16, alpha=32`) 16-bit LoRA on top of [WizardLM-2-8x22B](https://huggingface.co/alpindale/WizardLM-2-8x22B), trained on 2 epochs of (cleaned & deduped) c2-logs. As far as I can tell, this is an upgrade over `WizardLM-2-8x22B` for RP purposes.

Alongside this ready-to-use release I'm also releasing the LoRA itself, as well as its earlier `epoch1` checkpoint.

## Why A LoRA?

The choice was fully intentional. I briefly considered a full finetune (FFT), but for this particular use-case a LoRA seemed a better fit. `WizardLM-2-8x22B` is smart by itself, but the vocabulary it uses leaves much to be desired when it comes to RP. By training a low-rank LoRA on top of it to teach it some of Claude's writing style, we remedy that.
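
For illustration, this is roughly what an `r=16, alpha=32` low-rank adapter looks like when expressed with Hugging Face `peft`. The actual training was done with qlora-pipe (see the Training section below); the target modules and dropout in this sketch are assumptions, not the settings used for this model.

```python
# Illustrative only: a low-rank LoRA mirroring the r=16, alpha=32
# hyperparameters mentioned above, expressed with Hugging Face peft.
# The real training used qlora-pipe; target_modules/dropout are assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Note: actually materializing an 8x22B base model requires serious hardware.
base = AutoModelForCausalLM.from_pretrained(
    "alpindale/WizardLM-2-8x22B",
    torch_dtype=torch.bfloat16,  # 16-bit LoRA, not a full finetune
    device_map="auto",
)

lora_cfg = LoraConfig(
    r=16,            # rank from the card
    lora_alpha=32,   # alpha from the card
    lora_dropout=0.0,  # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Because only the adapter weights train, the LoRA can be shipped separately from the merged release, which is exactly what's done here.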

## Prompting

- Use the templates in [Quant-Cartel/Recommended-Settings](https://huggingface.co/Quant-Cartel/Recommended-Settings) under the `SorcererLM`-folder.
- Or use Vicuna 1.1 and a sane context template. It's somewhat sensitive to samplers; I'd recommend Temperature 1, MinP 0.05 and a dash of DRY, but YMMV (a rough sketch of these settings follows below). Shorter prompts seem to work better, too.
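
A minimal sketch of those settings with `llama-cpp-python`, assuming a locally downloaded GGUF (the filename below is a placeholder) and a build recent enough to expose `min_p`. DRY is omitted since support for it varies by backend.

```python
# Sketch: Vicuna 1.1-style prompt with the suggested samplers, via llama-cpp-python.
# The GGUF filename is a placeholder; min_p needs a reasonably recent build.
from llama_cpp import Llama

llm = Llama(
    model_path="SorcererLM-8x22b-IQ4_XS.gguf",  # placeholder filename
    n_ctx=8192,
    n_gpu_layers=-1,  # offload everything if VRAM allows
)

system = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
prompt = f"{system}\n\nUSER: Write a short scene introducing a weary sorcerer.\nASSISTANT:"

out = llm(
    prompt,
    max_tokens=512,
    temperature=1.0,  # Temperature 1
    min_p=0.05,       # MinP 0.05
    stop=["USER:"],
)
print(out["choices"][0]["text"])
```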

## Quantized Versions

- [iMat GGUFs](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-iMat-GGUF)
- [longcal exl2s](https://huggingface.co/Quant-Cartel/SorcererLM-8x22b-exl2-longcal)
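
If you'd rather grab a quant programmatically, a small `huggingface_hub` sketch follows. The filename is a placeholder, so check the repo's file listing for the actual quant names (larger quants may be split into multiple parts).

```python
# Sketch: download a single quant file from the iMat GGUF repo.
# The filename is a placeholder; see the repo's Files tab for real names.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Quant-Cartel/SorcererLM-8x22b-iMat-GGUF",
    filename="SorcererLM-8x22b-IQ4_XS.gguf",  # placeholder
)
print(path)  # local cache path to the downloaded GGUF
```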

## Acknowledgments

The main shoutout I want to make is to my [Cartel](https://huggingface.co/Quant-Cartel) bros, [Envoid](https://huggingface.co/Envoid) and particularly [I^2](https://huggingface.co/InferenceIllusionist), for being amazing. I count this as a team effort, so they deserve kudos too if you like this.

## Training

Trained using [qlora-pipe](https://github.com/tdrussell/qlora-pipe). Configs included in the `train`-subfolder.

## Safety

... n/a