---
license: mit
---

This is the INT4 Llama-3-8b model quantized with per-channel QQQ. QQQ is an innovative and hardware-optimized W4A8 quantization solution. For more details, please refer to our code [repo](https://github.com/HandH1998/QQQ) and our [paper](https://arxiv.org/pdf/2406.09904).
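
To illustrate what per-channel weight quantization means for the W4 part, here is a minimal, hypothetical sketch in PyTorch: each output channel of a weight matrix gets its own scale, and the weights are rounded to signed 4-bit integers. This is only an illustration of the general technique, not the actual QQQ implementation (which also handles A8 activation quantization and hardware-optimized kernels); the function names here are made up for the example.

```python
# Illustrative per-channel symmetric INT4 weight quantization.
# NOTE: this is a simplified sketch, not the QQQ method itself.
import torch


def quantize_weight_per_channel_int4(w: torch.Tensor):
    """Quantize a 2-D weight [out_channels, in_channels] per output channel."""
    qmax = 7  # use the symmetric signed range [-7, 7]
    # One scale per output channel, derived from that channel's absolute maximum.
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax).to(torch.int8)
    return q, scale


def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate FP weight from INT4 values and per-channel scales."""
    return q.float() * scale


if __name__ == "__main__":
    w = torch.randn(8, 16)
    q, scale = quantize_weight_per_channel_int4(w)
    print("max abs reconstruction error:", (w - dequantize(q, scale)).abs().max().item())
```

For the exact quantization procedure, packed weight format, and inference kernels used by this checkpoint, please follow the instructions in the QQQ repo linked above.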