Files changed (5)
  1. README.txt +82 -0
  2. README_ZH.txt +81 -0
  3. downloadfile(1).txt +37 -0
  4. downloadfile(2).txt +37 -0
  5. downloadfile.bin +3 -0
README.txt ADDED
@@ -0,0 +1,82 @@
+ ---
+ license: other
+ license_name: flux-1-dev-non-commercial-license
+ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
+ language:
+ - en
+ base_model: black-forest-labs/FLUX.1-dev
+ library_name: diffusers
+ tags:
+ - Text-to-Image
+ - FLUX
+ - Stable Diffusion
+ pipeline_tag: text-to-image
+ ---
+
+ <div style="display: flex; justify-content: center; align-items: center;">
+ <img src="./images/images_alibaba.png" alt="alibaba" style="width: 20%; height: auto; margin-right: 5%;">
+ <img src="./images/images_alimama.png" alt="alimama" style="width: 20%; height: auto;">
+ </div>
+
+ [Chinese README](./README_ZH.md)
+
+ This repository provides an 8-step distilled LoRA for the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) model, released by the AlimamaCreative Team.
+
+ # Description
+ This checkpoint is an 8-step distilled LoRA trained on the FLUX.1-dev model. We use a multi-head discriminator to improve distillation quality. The model can be used for text-to-image generation, the inpainting ControlNet, and other FLUX-related models. The recommended settings are guidance_scale=3.5 and lora_scale=1. A lower-step version will be released later.
+
+ - Text-to-Image.
+
+ ![](./images/T2I.png)
+
+ - With [alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta](https://huggingface.co/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta). Our distilled LoRA adapts well to the inpainting ControlNet, and the accelerated results closely follow the original output.
+
+ ![](./images/inpaint.png)
+
+ # How to use
+ ## diffusers
+ This model can be used directly with diffusers:
+
+ ```python
+ import torch
+ from diffusers.pipelines import FluxPipeline
+
+ model_id = "black-forest-labs/FLUX.1-dev"
+ adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"
+
+ # Load the FLUX.1-dev base pipeline in bfloat16
+ pipe = FluxPipeline.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16
+ )
+ pipe.to("cuda")
+
+ # Load the 8-step turbo LoRA and fuse it into the transformer weights
+ pipe.load_lora_weights(adapter_id)
+ pipe.fuse_lora()
+
+ prompt = "A DSLR photo of a shiny VW van that has a cityscape painted on it. A smiling sloth stands on grass in front of the van and is wearing a leather jacket, a cowboy hat, a kilt and a bowtie. The sloth is holding a quarterstaff and a big book."
+ # 8 inference steps with the recommended guidance scale
+ image = pipe(
+     prompt=prompt,
+     guidance_scale=3.5,
+     height=1024,
+     width=1024,
+     num_inference_steps=8,
+     max_sequence_length=512).images[0]
+ ```
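+
+ The example above fuses the LoRA at its default strength. If you prefer not to fuse it, you can skip `pipe.fuse_lora()` and pass the recommended lora_scale=1 at call time instead. The variant below is a minimal sketch that continues from the example above; it assumes your diffusers version reads the LoRA scale from `joint_attention_kwargs`, which is an assumption about the library behavior rather than something stated in this README:
+
+ ```python
+ # Hypothetical variant (continues from the example above: pipe, adapter_id and prompt
+ # are already defined). Keep the LoRA unfused and pass the LoRA scale per call.
+ pipe.load_lora_weights(adapter_id)   # load without calling pipe.fuse_lora()
+ image = pipe(
+     prompt=prompt,
+     guidance_scale=3.5,
+     height=1024,
+     width=1024,
+     num_inference_steps=8,
+     max_sequence_length=512,
+     joint_attention_kwargs={"scale": 1.0},  # recommended lora_scale=1
+ ).images[0]
+ ```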
+
+ ## comfyui
+
+ - T2I turbo workflow: [click here](./workflows/t2I_flux_turbo.json)
+ - Inpainting ControlNet turbo workflow: [click here](./workflows/alimama_flux_inpainting_turbo_8step.json)
+
+
+ # Training Details
+
+ The model is trained on 1M images from open-source and internal sources, filtered for an aesthetic score of 6.3+ and a resolution greater than 800. We use adversarial training to improve quality. Our method freezes the original FLUX.1-dev transformer as the discriminator backbone and adds multiple discriminator heads to every transformer layer. We fix the guidance scale at 3.5 during training and use a time shift of 3.
+
+ Mixed precision: bf16
+
+ Learning rate: 2e-5
+
+ Batch size: 64
+
+ Image size: 1024x1024
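+
+ The adversarial training code is not part of this repository. Purely as an illustration of the multi-head discriminator idea described above, a minimal sketch might look like the following; the hidden size, layer count, head architecture, and pooling are hypothetical choices, not the released configuration:
+
+ ```python
+ import torch
+ import torch.nn as nn
+
+ class LayerDiscriminatorHead(nn.Module):
+     """Hypothetical small head attached to the hidden states of one frozen transformer block."""
+     def __init__(self, hidden_dim: int):
+         super().__init__()
+         self.net = nn.Sequential(
+             nn.LayerNorm(hidden_dim),
+             nn.Linear(hidden_dim, hidden_dim // 4),
+             nn.SiLU(),
+             nn.Linear(hidden_dim // 4, 1),  # one real/fake logit per token
+         )
+
+     def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
+         # hidden_states: (batch, seq_len, hidden_dim) from one backbone layer
+         return self.net(hidden_states).mean(dim=1)  # pool over tokens -> (batch, 1)
+
+ class MultiHeadDiscriminator(nn.Module):
+     """Hypothetical: one trainable head per layer of the frozen FLUX.1-dev backbone."""
+     def __init__(self, num_layers: int, hidden_dim: int):
+         super().__init__()
+         self.heads = nn.ModuleList(LayerDiscriminatorHead(hidden_dim) for _ in range(num_layers))
+
+     def forward(self, per_layer_hidden_states: list[torch.Tensor]) -> torch.Tensor:
+         # per_layer_hidden_states: one (batch, seq_len, hidden_dim) tensor per backbone layer,
+         # collected from the frozen transformer (e.g. via forward hooks)
+         logits = [head(h) for head, h in zip(self.heads, per_layer_hidden_states)]
+         return torch.cat(logits, dim=1)  # (batch, num_layers)
+ ```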
README_ZH.txt ADDED
@@ -0,0 +1,81 @@
+ ---
+ license: other
+ license_name: flux-1-dev-non-commercial-license
+ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
+ language:
+ - en
+ base_model: black-forest-labs/FLUX.1-dev
+ library_name: diffusers
+ tags:
+ - Text-to-Image
+ - FLUX
+ - Stable Diffusion
+ pipeline_tag: text-to-image
+ ---
+
+ <div style="display: flex; justify-content: center; align-items: center;">
+ <img src="./images/images_alibaba.png" alt="alibaba" style="width: 20%; height: auto; margin-right: 5%;">
+ <img src="./images/images_alimama.png" alt="alimama" style="width: 20%; height: auto;">
+ </div>
+
+ This repository contains an 8-step distilled version of the [FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev) model, developed by the Alimama Creative Team.
+
+ # Description
+
+ This model is an 8-step distilled LoRA based on the FLUX.1-dev model. We use a specially designed discriminator to improve distillation quality. The model can be used for T2I, the inpainting ControlNet, and other FLUX-related models. The recommended settings are guidance_scale=3.5 and lora_scale=1. A lower-step version will be released later.
+
+ - Text-to-Image.
+
+ ![](./images/T2I.png)
+
+ - With [alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta](https://huggingface.co/alimama-creative/FLUX.1-dev-Controlnet-Inpainting-Beta). Our model adapts well to the inpainting ControlNet and keeps the results close to the original output.
+
+ ![](./images/inpaint.png)
+
+ # How to use
+ ## diffusers
+ This model can be used directly with diffusers:
+
+ ```python
+ import torch
+ from diffusers.pipelines import FluxPipeline
+
+ model_id = "black-forest-labs/FLUX.1-dev"
+ adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"
+
+ pipe = FluxPipeline.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16
+ )
+ pipe.to("cuda")
+
+ pipe.load_lora_weights(adapter_id)
+ pipe.fuse_lora()
+
+ prompt = "A DSLR photo of a shiny VW van that has a cityscape painted on it. A smiling sloth stands on grass in front of the van and is wearing a leather jacket, a cowboy hat, a kilt and a bowtie. The sloth is holding a quarterstaff and a big book."
+ image = pipe(
+     prompt=prompt,
+     guidance_scale=3.5,
+     height=1024,
+     width=1024,
+     num_inference_steps=8,
+     max_sequence_length=512).images[0]
+ ```
+
+ ## comfyui
+
+ - T2I turbo workflow: [click here](./workflows/t2I_flux_turbo.json)
+ - Inpainting ControlNet turbo workflow: [click here](./workflows/alimama_flux_inpainting_turbo_8step.json)
+
+
+ # Training Details
+
+ The model is trained on 1M images from public datasets and internal sources, with aesthetic scores of 6.3+ and resolutions greater than 800. We use adversarial training to improve quality: our method freezes the original FLUX.1-dev transformer as the discriminator's feature extractor and adds discriminator head networks to every transformer layer. During training, we fix the guidance scale at 3.5 and use a time shift of 3.
+
+ Mixed precision: bf16
+
+ Learning rate: 2e-5
+
+ Batch size: 64
+
+ Training resolution: 1024x1024
downloadfile(1).txt ADDED
@@ -0,0 +1,37 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ T2I.png filter=lfs diff=lfs merge=lfs -text
+ inpaint.png filter=lfs diff=lfs merge=lfs -text
downloadfile(2).txt ADDED
@@ -0,0 +1,37 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ T2I.png filter=lfs diff=lfs merge=lfs -text
+ inpaint.png filter=lfs diff=lfs merge=lfs -text
downloadfile.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b02ff7dc7382030157947c4a14d46be97252f39bdcfd067229218b04ef04fca
+ size 6148