Lmxyy committed
Commit de8c4d9 · verified · 1 Parent(s): ec06d5f

Update README.md

Files changed (1):
  1. README.md +62 -29
README.md CHANGED
@@ -1,26 +1,23 @@
  ---
  license: other
- license_name: attribution-noncommercial-4.0-international
- license_link: LICENSE
  tags:
- - text
- - images
  - text-to-image
  language:
  - en
- source_datasets:
- - sDCI
- task_categories:
- - text-to-image
- dataset_info:
-   features:
-   - name: filename
-     dtype: string
-   - name: image
-     dtype: image
-   - name: prompt
-     dtype: string
- arxiv: 2411.05007
  ---

  <p align="center" style="border-radius: 10px">
@@ -34,9 +31,54 @@ arxiv: 2411.05007
  <a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a>&ensp;
  <a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
  </div>
- This is the [sDCI](https://arxiv.org/abs/2312.08578) dataset used in [SVDQuant](https://hanlab.mit.edu/blog/svdquant) for benchmarking.

- If you find this dataset useful or relevant to your research, please cite

  ```bibtex
  @article{
@@ -46,13 +88,4 @@ If you find this dataset useful or relevant to your research, please cite
  journal={arXiv preprint arXiv:2411.05007},
  year={2024}
  }
-
- @inproceedings{urbanek2024picture,
- title={A picture is worth more than 77 text tokens: Evaluating clip-style models on dense captions},
- author={Urbanek, Jack and Bordes, Florian and Astolfi, Pietro and Williamson, Mary and Sharma, Vasu and Romero-Soriano, Adriana},
- booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
- pages={26700--26709},
- year={2024}
- }
- ```
-
 
  ---
  license: other
+ license_name: flux-1-dev-non-commercial-license
  tags:
  - text-to-image
+ - SVDQuant
+ - FLUX.1-dev
+ - INT4
+ - FLUX.1
+ - Diffusion
+ - Quantization
+ - LoRA
  language:
  - en
+ base_model:
+ - mit-han-lab/svdq-int4-flux.1-dev
+ pipeline_tag: text-to-image
+ datasets:
+ - mit-han-lab/svdquant-datasets
+ library_name: diffusers
  ---

  <p align="center" style="border-radius: 10px">
  <a href='https://hanlab.mit.edu/projects/svdquant'>[Website]</a>&ensp;
  <a href='https://hanlab.mit.edu/blog/svdquant'>[Blog]</a>
  </div>
+ ![teaser](https://github.com/mit-han-lab/nunchaku/raw/refs/heads/main/assets/lora.jpg)
+ SVDQuant seamlessly integrates with off-the-shelf LoRAs without requiring re-quantization. When applying LoRAs, it matches the image quality of the original 16-bit FLUX.1-dev.
+
+ ## Model Description
+
+ <div>
+ This repository contains a converted LoRA collection for SVDQuant INT4 FLUX.1-dev. The LoRA styles include <a href="https://huggingface.co/XLabs-AI/flux-RealismLora">Realism</a>,
+ <a href="https://huggingface.co/aleksa-codes/flux-ghibsky-illustration">Ghibsky Illustration</a>,
+ <a href="https://huggingface.co/alvdansen/sonny-anime-fixed">Anime</a>,
+ <a href="https://huggingface.co/Shakker-Labs/FLUX.1-dev-LoRA-Children-Simple-Sketch">Children Sketch</a>, and
+ <a href="https://huggingface.co/linoyts/yarn_art_Flux_LoRA">Yarn Art</a>.
+ </div>
+
+ ## Usage
+
+ ### Diffusers
+
+ Please follow the instructions in [mit-han-lab/nunchaku](https://github.com/mit-han-lab/nunchaku) to set up the environment. Then you can run the model with
+
+ ```python
+ import torch
+
+ from nunchaku.pipelines import flux as nunchaku_flux
+
+ pipeline = nunchaku_flux.from_pretrained(
+     "black-forest-labs/FLUX.1-dev",
+     torch_dtype=torch.bfloat16,
+     qmodel_path="mit-han-lab/svdq-int4-flux.1-dev",  # download from Hugging Face
+ ).to("cuda")
+ pipeline.transformer.nunchaku_update_params("mit-han-lab/svdquant-models/svdq-flux.1-dev-lora-anime.safetensors")
+ pipeline.transformer.nunchaku_set_lora_scale(1)
+ image = pipeline("a dog wearing a wizard hat", num_inference_steps=28, guidance_scale=3.5).images[0]
+ image.save("example.png")
+ ```
+
+ ### ComfyUI
+
+ Work in progress.
+
+ ## Limitations
+
+ - The model only runs on NVIDIA GPUs with architectures sm_86 (Ampere: RTX 3090, A6000), sm_89 (Ada: RTX 4090), and sm_80 (A100). See this [issue](https://github.com/mit-han-lab/nunchaku/issues/1) for more details.
+ - You may observe slight differences from the BF16 models in fine details.
+
+ ### Citation

+ If you find this model useful or relevant to your research, please cite

  ```bibtex
  @article{

  journal={arXiv preprint arXiv:2411.05007},
  year={2024}
  }
+ ```
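
The sm_80/sm_86/sm_89 restriction in the Limitations section can be checked before loading the model. A minimal sketch: it assumes the usual mapping of `sm_XY` names to PyTorch capability tuples `(X, Y)` and uses the standard `torch.cuda.get_device_capability` query; the supported set is taken from the bullet above.

```python
# Compute capabilities listed as supported in the model card's Limitations:
# sm_80 (A100), sm_86 (Ampere: RTX 3090, A6000), sm_89 (Ada: RTX 4090).
SUPPORTED_SM = {(8, 0), (8, 6), (8, 9)}


def is_supported_capability(cap) -> bool:
    """Return True if a (major, minor) compute capability is in the supported list."""
    return tuple(cap) in SUPPORTED_SM


try:
    import torch

    if torch.cuda.is_available():
        # (major, minor) of the current GPU, e.g. (8, 9) for an RTX 4090.
        print(is_supported_capability(torch.cuda.get_device_capability(0)))
except ImportError:
    pass  # torch not installed; the pure-Python check above still works
```

This only gates on architecture; it does not verify that the nunchaku kernels themselves are installed correctly.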
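
For intuition on why LoRAs can be applied without re-quantizing: per the SVDQuant paper (arXiv:2411.05007), each weight matrix is split into a high-precision low-rank branch plus a low-bit quantized residual. A toy NumPy sketch of that decomposition follows; the rank, the symmetric 15-level grid standing in for INT4, and the round-to-nearest quantizer are illustrative choices of mine, not the paper's actual kernels.

```python
import numpy as np


def svdquant_approx(W: np.ndarray, rank: int = 4, levels: int = 15) -> np.ndarray:
    """Toy SVDQuant-style reconstruction: low-rank branch + quantized residual."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # High-precision low-rank branch absorbs the dominant components.
    low_rank = (U[:, :rank] * S[:rank]) @ Vt[:rank]
    residual = W - low_rank
    # Uniform symmetric quantization of the residual (15 levels ~ signed 4-bit).
    half = levels // 2
    scale = max(float(np.abs(residual).max()) / half, 1e-12)
    q = np.clip(np.round(residual / scale), -half, half)
    return low_rank + q * scale


rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64))
W_hat = svdquant_approx(W)
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

Because the low-rank branch stays in high precision, a LoRA update can be folded into that branch while the quantized residual is left untouched, which is the property the card's "without requiring re-quantization" claim relies on.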