Update README.md
README.md
CHANGED
@@ -37,6 +37,8 @@ library_name: diffusers
[Figure: teaser]
SVDQuant is a post-training quantization technique for 4-bit weights and activations that maintains visual fidelity well. On the 12B FLUX.1-dev, it achieves a 3.6× memory reduction compared to the BF16 model. By eliminating CPU offloading, it offers an 8.7× speedup over the 16-bit model on a 16GB laptop 4090 GPU, and is 3× faster than the NF4 W4A16 baseline. On PixArt-Σ, it demonstrates significantly superior visual quality over other W4A4 or even W4A8 baselines. "E2E" means the end-to-end latency, including the text encoder and VAE decoder.
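
To put the 3.6× figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes the reduction applies to weight storage only and that 1 GB = 1e9 bytes; neither assumption is stated in the README. The helper `weight_memory_gb` is a hypothetical name used only for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (assumes 1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 12e9          # "12B FLUX.1-dev", as quoted above
REPORTED_REDUCTION = 3.6   # memory reduction quoted above

bf16_gb = weight_memory_gb(NUM_PARAMS, 16)       # 16-bit baseline
ideal_int4_gb = weight_memory_gb(NUM_PARAMS, 4)  # if every weight were exactly 4 bits
reported_gb = bf16_gb / REPORTED_REDUCTION       # footprint implied by the 3.6x figure

# Average bits per parameter implied by the quoted reduction
# (assumption: the whole reduction comes from weight storage).
effective_bits = 16 / REPORTED_REDUCTION

print(f"BF16 weights:       {bf16_gb:.1f} GB")
print(f"Ideal 4-bit:        {ideal_int4_gb:.1f} GB")
print(f"Implied by 3.6x:    {reported_gb:.1f} GB")
print(f"Effective bits/param: {effective_bits:.2f}")
```

Under these assumptions the implied footprint sits somewhat above the ideal 4-bit size. One plausible reading, not confirmed by the README, is that some data such as scales or unquantized layers remains at higher precision.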

+ `svdq-int4-flux.1-schnell` is an INT4-quantized version of [`FLUX.1-schnell`](https://huggingface.co/black-forest-labs/FLUX.1-schnell), which can generate an image based on a text description.
+
## Method
#### Quantization Method -- SVDQuant