This repo contains AWQ model files for [Mistral AI's Mixtral 8X7B v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1).

**MIXTRAL AWQ**

This is a Mixtral AWQ model.

For AutoAWQ inference, please install AutoAWQ from source.
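
A minimal sketch of a from-source install followed by basic AutoAWQ inference; the repo id and generation settings below are illustrative assumptions, not confirmed specifics:

```python
# Install AutoAWQ from source first; a released wheel may not yet include
# Mixtral support:
#   git clone https://github.com/casper-hansen/AutoAWQ
#   cd AutoAWQ && pip install -e .

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_name_or_path = "TheBloke/Mixtral-8x7B-v0.1-AWQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoAWQForCausalLM.from_quantized(
    model_name_or_path,
    fuse_layers=True,   # fuse layers for faster inference
    safetensors=True,   # the model files in this repo are safetensors
)

tokens = tokenizer("The meaning of life is", return_tensors="pt").input_ids.cuda()
output = model.generate(tokens, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
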
Support via Transformers is coming via this PR: https://github.com/huggingface/transformers/pull/27950, which should be merged to Transformers `main` very soon.
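
Once that PR is merged, loading should work through the standard Transformers AWQ path. A sketch, assuming the same repo id as above and that `autoawq` is installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Mixtral-8x7B-v0.1-AWQ"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Transformers reads the AWQ quantization_config from the model's
# config.json, so no extra quantization arguments are needed here.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The meaning of life is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
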
Support via vLLM and TGI has not yet been confirmed.

### About AWQ

AWQ is an efficient, accurate and blazing-fast low-bit weight quantization method, currently supporting 4-bit quantization. Compared to GPTQ, it offers faster Transformers-based inference with equivalent or better quality than the most commonly used GPTQ settings.
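
To make the 4-bit setting concrete, here is a minimal sketch of quantizing a model with AutoAWQ; the configuration values and output path are illustrative assumptions, not the exact settings used to produce the files in this repo:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "mistralai/Mixtral-8x7B-v0.1"
quant_path = "mixtral-8x7b-awq"  # hypothetical output directory

# Illustrative AWQ settings: 4-bit weights, group size 128, GEMM kernels.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

# Load the full-precision model, calibrate on AutoAWQ's default dataset,
# and rewrite the weights to 4-bit.
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```
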
AWQ models are currently supported on Linux and Windows, with NVIDIA GPUs only. macOS users: please use GGUF models instead.

AWQ models are supported by (note that not all of these may support Mixtral models yet):

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui) - using Loader: AutoAWQ
- [vLLM](https://github.com/vllm-project/vllm) - version 0.2.2 or later is required for support of all model types (see the sketch after this list).
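
As noted above, vLLM support for Mixtral AWQ has not yet been confirmed, but for AWQ models that vLLM does support, loading generally looks like this minimal sketch (the repo id is again an assumption):

```python
from vllm import LLM, SamplingParams

# quantization="awq" selects vLLM's AWQ kernels for this model.
llm = LLM(model="TheBloke/Mixtral-8x7B-v0.1-AWQ", quantization="awq")

sampling_params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["The meaning of life is"], sampling_params)
print(outputs[0].outputs[0].text)
```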