---
license: other
---

[WizardLM-33B-V1.0-Uncensored](https://huggingface.co/ehartford/WizardLM-33B-V1.0-Uncensored) merged with kaiokendev's [33b SuperHOT 8k LoRA](https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test), quantized at 4 bit.

It was created with GPTQ-for-LLaMA using group size 32 and act-order enabled, to keep perplexity as close as possible to the FP16 model.

I highly suggest using ExLlama to avoid VRAM issues.

Set `compress_pos_emb` to match your context length (`max_seq_len`); since the base model's native context is 2048, the scaling factor is `max_seq_len / 2048`:

- If `max_seq_len = 4096`, use `compress_pos_emb = 2`
- If `max_seq_len = 8192`, use `compress_pos_emb = 4`
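
For reference, a minimal loading sketch using the standalone ExLlama repo (turboderp/exllama). The model directory and file names below are placeholders; adjust them to wherever you downloaded this model:

```python
# Minimal ExLlama loading sketch (assumes the turboderp/exllama repo layout;
# run from inside the repo so the local modules import correctly).
from model import ExLlama, ExLlamaCache, ExLlamaConfig
from tokenizer import ExLlamaTokenizer
from generator import ExLlamaGenerator

# Placeholder paths -- point these at your local copy of this model.
model_dir = "/models/WizardLM-33B-V1.0-Uncensored-SuperHOT-8k-GPTQ"

config = ExLlamaConfig(f"{model_dir}/config.json")
config.model_path = f"{model_dir}/4bit-32g.safetensors"  # placeholder filename
config.max_seq_len = 8192       # extended context from the SuperHOT LoRA
config.compress_pos_emb = 4.0   # scaling factor: max_seq_len / 2048

model = ExLlama(config)
tokenizer = ExLlamaTokenizer(f"{model_dir}/tokenizer.model")
cache = ExLlamaCache(model)
generator = ExLlamaGenerator(model, tokenizer, cache)

print(generator.generate_simple("Hello, my name is", max_new_tokens=32))
```

For a 4096-token context instead, set `config.max_seq_len = 4096` and `config.compress_pos_emb = 2.0`.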