
Exllama v2 Quantizations of Buttercup-4x7B-V2-laser

Using turboderp's ExLlamaV2 v0.0.13 for quantization.

The "main" branch only contains the measurement.json, download one of the other branches for the model (see below)

Original model: https://huggingface.co/Kquant03/Buttercup-4x7B-V2-laser

Each branch contains an individual bits-per-weight quantization, with the main branch containing only the measurement.json for further conversions.

| Branch | Bits | lm_head bits | VRAM (4k) | VRAM (16k) | VRAM (32k) | Description |
| ------ | ---- | ------------ | --------- | ---------- | ---------- | ----------- |
| 8_0 | 8.0 | 8.0 | 24.8 GB | 26.3 GB | 28.3 GB | Maximum quality that ExLlamaV2 can produce, near unquantized performance. |
| 6_5 | 6.5 | 8.0 | 20.3 GB | 21.8 GB | 23.8 GB | Near unquantized performance at vastly reduced size, recommended. |
| 5_0 | 5.0 | 6.0 | 15.8 GB | 17.3 GB | 19.3 GB | Slightly lower quality vs 6.5. |
| 4_25 | 4.25 | 6.0 | 14.0 GB | 15.5 GB | 17.5 GB | GPTQ-equivalent bits per weight, slightly higher quality; great for 16 GB cards with 16k context. |
| 3_5 | 3.5 | 6.0 | 11.3 GB | 12.8 GB | 14.8 GB | Lower quality, not recommended; only suitable for 12 GB cards. |
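
Once a branch is downloaded, it can be loaded with the exllamav2 Python package. The sketch below is adapted from the library's example scripts and is only a rough guide: the model_dir path (assuming the 6_5 branch was downloaded to Buttercup-4x7B-V2-laser-exl2-6_5), the sampling settings, and the prompt are all placeholder assumptions you should adjust.

```python
# Minimal generation sketch with the exllamav2 Python API (v0.0.13-era examples).
# model_dir is an assumption -- point it at whichever branch you downloaded.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "Buttercup-4x7B-V2-laser-exl2-6_5"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # cache is allocated as layers load
model.load_autosplit(cache)                # split weights across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, 200))
```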

Download instructions

With git:

git clone --single-branch --branch 6_5 https://huggingface.co/bartowski/Buttercup-4x7B-V2-laser-exl2

With huggingface hub (credit to TheBloke for instructions):

pip3 install huggingface-hub

To download the main branch (only useful if you just want the measurement.json) to a folder called Buttercup-4x7B-V2-laser-exl2:

mkdir Buttercup-4x7B-V2-laser-exl2
huggingface-cli download bartowski/Buttercup-4x7B-V2-laser-exl2 --local-dir Buttercup-4x7B-V2-laser-exl2 --local-dir-use-symlinks False

To download from a different branch, add the --revision parameter:

Linux:

mkdir Buttercup-4x7B-V2-laser-exl2-6_5
huggingface-cli download bartowski/Buttercup-4x7B-V2-laser-exl2 --revision 6_5 --local-dir Buttercup-4x7B-V2-laser-exl2-6_5 --local-dir-use-symlinks False

Windows (which apparently doesn't like _ in folders sometimes?):

mkdir Buttercup-4x7B-V2-laser-exl2-6.5
huggingface-cli download bartowski/Buttercup-4x7B-V2-laser-exl2 --revision 6_5 --local-dir Buttercup-4x7B-V2-laser-exl2-6.5 --local-dir-use-symlinks False
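
If you'd rather script the download than use the CLI, the same thing can be done from Python with huggingface_hub's snapshot_download. The sketch below mirrors the 6_5 command above; the local_dir folder name is just an example.

```python
# Python equivalent of the huggingface-cli commands above.
# repo_id and revision come from this card; local_dir is an example folder name.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="bartowski/Buttercup-4x7B-V2-laser-exl2",
    revision="6_5",                       # pick the branch/quant you want
    local_dir="Buttercup-4x7B-V2-laser-exl2-6_5",
    local_dir_use_symlinks=False,         # copy real files instead of symlinks
)
```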