This is https://huggingface.co/ehartford/Wizard-Vicuna-30B-Uncensored merged with https://huggingface.co/kaiokendev/superhot-30b-8k-no-rlhf-test, then quantized to 4-bit with AutoGPTQ.
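
Since the SuperHOT repo ships a LoRA, a merge along these lines can be done with PEFT. This is a minimal sketch, not the exact script used here; the output directory name is made up, and the adapter path may need adjusting depending on the repo layout:

```python
# Sketch: fold the SuperHOT LoRA into the base model with PEFT.
# The output path "wizard-vicuna-30b-superhot-8k" is an example name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "ehartford/Wizard-Vicuna-30B-Uncensored",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the LoRA on top of the base model, then merge its weights in
# so the result is a plain fp16 checkpoint ready for quantization.
merged = PeftModel.from_pretrained(base, "kaiokendev/superhot-30b-8k-no-rlhf-test")
merged = merged.merge_and_unload()

merged.save_pretrained("wizard-vicuna-30b-superhot-8k")
AutoTokenizer.from_pretrained(
    "ehartford/Wizard-Vicuna-30B-Uncensored"
).save_pretrained("wizard-vicuna-30b-superhot-8k")
```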
There are two quantized versions. One is a plain 4-bit version with act-order only and no groupsize. The other is an experimental version using groupsize 128, act-order, and kaiokendev's ScaledLLamaAttention monkey patch applied *during* quantization, the idea being to help the calibration account for the new position scale. It seems to have worked: it improves perplexity by around 0.04 ppl over the unpatched quant. Maybe not worth the trouble, but it's better, so I'll put it up anyway.
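
For the curious, the patch boils down to position interpolation: the rotary embedding's positions are compressed by a factor of 4 so an 8k context maps onto the 2k range the model was trained on. Below is a rough sketch of applying that kind of patch before quantizing; the class is a simplified stand-in for kaiokendev's actual patch (get the real one from his repo), the calibration example is a placeholder, and it assumes a transformers version whose LlamaRotaryEmbedding still has this constructor/forward shape:

```python
# Sketch: scaled-RoPE monkey patch applied before AutoGPTQ quantization.
# Simplified stand-in for kaiokendev's patch; names are illustrative.
import torch
import transformers.models.llama.modeling_llama as llama

class ScaledRotaryEmbedding(torch.nn.Module):
    """Rotary embedding with positions multiplied by 0.25 (position
    interpolation), mapping an 8k context onto the original 2k range."""
    def __init__(self, dim, max_position_embeddings=2048, base=10000, device=None):
        super().__init__()
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float().to(device) / dim))
        self.register_buffer("inv_freq", inv_freq)
        # Cache 8k positions regardless of the config value, like the
        # original patch, and compress them by the scale factor.
        self.max_seq_len_cached = 8192
        self.scale = 1 / 4
        t = torch.arange(self.max_seq_len_cached, device=device, dtype=self.inv_freq.dtype)
        t = t * self.scale  # the interpolation step
        freqs = torch.einsum("i,j->ij", t, self.inv_freq)
        emb = torch.cat((freqs, freqs), dim=-1)
        self.register_buffer("cos_cached", emb.cos()[None, None, :, :], persistent=False)
        self.register_buffer("sin_cached", emb.sin()[None, None, :, :], persistent=False)

    def forward(self, x, seq_len=None):
        return (
            self.cos_cached[:, :, :seq_len, ...].to(x.dtype),
            self.sin_cached[:, :, :seq_len, ...].to(x.dtype),
        )

# The patch must be in place *before* the model is built, so the
# calibration forward passes during quantization see scaled positions.
llama.LlamaRotaryEmbedding = ScaledRotaryEmbedding

from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

# Experimental variant: groupsize 128 with act-order (desc_act).
# The plain variant would be BaseQuantizeConfig(bits=4, group_size=-1, desc_act=True).
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=True)
model = AutoGPTQForCausalLM.from_pretrained("wizard-vicuna-30b-superhot-8k", quantize_config)

tokenizer = AutoTokenizer.from_pretrained("wizard-vicuna-30b-superhot-8k")
examples = [tokenizer("Placeholder calibration text.", return_tensors="pt")]
model.quantize(examples)
model.save_quantized("wizard-vicuna-30b-superhot-8k-gptq")
```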