Commit d7bde7a by ProphetOfBostrom • Parent: f37cbe5
Commit message: "readme notice but i'm very sleepy please correct my mistakes for me thanks"
README.md CHANGED
@@ -9,7 +9,38 @@ tags:
- nsfw
- mergekit
- merge
- HQQ
- 2bit
library_name: transformers
---
## BagelMix-8x7B branch 2g16-4g64-HQQ
By [Undi95](https://huggingface.co/Undi95/BagelMix-8x7B)

#### (this readme has been written by a sleepy person. /disclaimer)
---

[The main branch uses the same quant config as last time, the reference one from mobius](https://huggingface.co/mobiuslabsgmbh/Mixtral-8x7B-v0.1-hf-attn-4bit-moe-2bit-HQQ).

The label refers to 2-bit linear (expert) layers with a group size of 16 (each group's quantisation metadata stored in 8 bits), and 4-bit attention layers with a group size of 64.
The actual bits per weight (bpw) therefore comes out higher than 2, in no small part because every group of 16 two-bit weights (4 bytes of payload) carries, I think, roughly another byte or two of per-group metadata.
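For a rough sense of that overhead, here is a back-of-the-envelope estimate (my own arithmetic, not something from the hqq source: it assumes one 8-bit zero point and one 8-bit scale per group and ignores any second-level grouping of that metadata, so treat the numbers as approximate upper bounds):

```python
# Rough bits-per-weight estimate for group-wise quantisation where each group
# stores an 8-bit zero point and an 8-bit scale (quant_zero=True, quant_scale=True).
# Second-level grouping of that metadata would shave a little more off.
def approx_bpw(nbits: int, group_size: int, meta_bits: int = 8 + 8) -> float:
    return nbits + meta_bits / group_size

print(approx_bpw(2, 16))   # experts, 2g16   -> ~3.0 bpw
print(approx_bpw(4, 64))   # attention, 4g64 -> ~4.25 bpw
```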
From what I can gather of hqq's source code, the gate/router network isn't quantised at all (because it's tiny and very important).

That reasoning has led me to experiment with taking bits away from the expert/linear layers and spending them on the attention layers instead.

I've already got a slightly heavier model with 2g16 experts and 8g512 attention (not sure how meaningful a group size of 512 is at 8 bits, but whatever).
Its config looks like this, and it is *not the model on the main branch*:
```python
from hqq.core.quantize import BaseQuantizeConfig

# Experimental config. NOTE: THE MAIN BRANCH USES nbits=4, group_size=64 FOR ATTENTION!
attn_prams = BaseQuantizeConfig(nbits=8, group_size=512, quant_zero=True, quant_scale=True)
attn_prams['scale_quant_params']['group_size'] = 512  # was 256; as I understand it, the group size used when the scales themselves are quantised
experts_params = BaseQuantizeConfig(nbits=2, group_size=16, quant_zero=True, quant_scale=True)
```
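For context, in the mobiuslabs reference script the two configs get attached to Mixtral's layers via hqq's per-layer tags, roughly like this (a sketch based on the public hqq/mobiuslabs examples, shown here with the main-branch 4g64/2g16 values; the tag names are from memory, so double-check the included script for the exact calls):

```python
from hqq.core.quantize import BaseQuantizeConfig

# Main-branch settings: 4-bit g64 attention, 2-bit g16 experts.
attn_prams = BaseQuantizeConfig(nbits=4, group_size=64, quant_zero=True, quant_scale=True)
experts_params = BaseQuantizeConfig(nbits=2, group_size=16, quant_zero=True, quant_scale=True)

quant_config = {}
# Attention projections get the higher-bit config...
for proj in ('q_proj', 'k_proj', 'v_proj', 'o_proj'):
    quant_config[f'self_attn.{proj}'] = attn_prams
# ...and the expert MLP weights get the 2-bit config; the MoE gate/router is left unquantised.
for w in ('w1', 'w2', 'w3'):
    quant_config[f'block_sparse_moe.experts.{w}'] = experts_params

# quant_config is then handed to hqq's quantisation step in the script.
```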
Again, this is not what you're downloading if you grab this repo right now: I want to see whether I can actually keep the bpw down first.
These will be uploaded as alternate branches of this repo if they seem worth doing.
I might also fiddle with 2g32 or even 3g128 or such for the experts, given their most delectable sparseness.
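Using the same rough arithmetic as above, those would land somewhere around:

```python
# nbits + (8-bit zero + 8-bit scale) / group_size, same caveats as before.
print(2 + 16 / 32)    # 2g32 experts  -> 2.5 bpw
print(3 + 16 / 128)   # 3g128 experts -> 3.125 bpw
```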
### You could also use the included Python script (and a big swap partition) to make these yourself; it's just the one from mobiuslabs.
### PS: read Sleeper Agents (2024/01) :-)
---
# BagelMix