---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
tags:
- Text-to-Image
- ControlNet
- Diffusers
- Stable Diffusion
base_model: black-forest-labs/FLUX.1-dev
---
# FLUX.1-dev Controlnet
## Diffusers version
Until the next Diffusers PyPI release, please install Diffusers from source and apply [this PR](xxxxxx) to use this model.
TODO: update this section once the new version is released.
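To confirm that an installed copy of Diffusers already includes the FLUX ControlNet classes, a quick import check can help. The following is a minimal sketch, not an official check: the helper name `flux_controlnet_available` is hypothetical, and the module paths are assumed to match the ones used in the demo on this page (they may differ in other Diffusers versions).

```python
import importlib


def flux_controlnet_available() -> bool:
    """Return True if the installed Diffusers exposes the classes used in this README.

    Hypothetical helper; the module paths are assumed from the demo code.
    """
    try:
        pipeline_mod = importlib.import_module(
            "diffusers.pipelines.flux.pipeline_flux_controlnet"
        )
        model_mod = importlib.import_module("diffusers.models.controlnet_flux")
    except ImportError:
        # Diffusers is missing, or this version does not ship these modules.
        return False
    return hasattr(pipeline_mod, "FluxControlNetPipeline") and hasattr(
        model_mod, "FluxControlNetModel"
    )


if __name__ == "__main__":
    print("FLUX ControlNet support found:", flux_controlnet_available())
```

If this prints `False`, the installed Diffusers most likely predates the required changes.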
## Checkpoint
Training the Union ControlNet requires a significant amount of compute.
The current release is an alpha checkpoint that has not been fully trained.
The beta version is currently in training.
We have run ablation studies that validate the code.
We are open-sourcing the alpha version solely to support the rapid growth of the open-source community and the Flux ecosystem;
bad cases are common at this stage (please accept our apologies).
Note that even a fully trained Union model may not perform as well as specialized models, for example on pose control.
However, as training progresses, we expect the performance of the Union model to continue to approach that of specialized models.
## Control Mode
| Control Mode | Description | Current Model Validity |
|:------------:|:-----------:|:-----------:|
|0|canny|🟢high|
|1|tile|🟢high|
|2|depth|🟡medium|
|3|blur|🟢high|
|4|pose|🔴low|
|5|gray|🔴low|
|6|lq|🟢high|
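Because `control_mode` is passed to the pipeline as an integer, it can be convenient to look it up by name. The sketch below simply copies the indices from the table above; `CONTROL_MODES` and `control_mode_id` are hypothetical names for illustration and are not part of Diffusers or this repository.

```python
# Mode names and indices copied from the Control Mode table.
CONTROL_MODES = {
    "canny": 0,
    "tile": 1,
    "depth": 2,
    "blur": 3,
    "pose": 4,
    "gray": 5,
    "lq": 6,
}


def control_mode_id(name: str) -> int:
    """Look up the integer control mode for a mode name (hypothetical helper)."""
    key = name.strip().lower()
    if key not in CONTROL_MODES:
        valid = ", ".join(sorted(CONTROL_MODES))
        raise ValueError(f"Unknown control mode {name!r}; expected one of: {valid}")
    return CONTROL_MODES[key]


print(control_mode_id("canny"))  # 0
print(control_mode_id("LQ"))     # 6
```

The returned value can then be passed as `control_mode` when calling the pipeline.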
| Canny | Tile |
|:------------:|:------------:|
|<img src="./images/image_demo_canny.jpg" width = "300" />|<img src="./images/image_demo_tile.jpg" width = "300" />|
### canny
<img src="./images/image_demo_canny.jpg" width = "300" />
### tile
<img src="./images/image_demo_tile.jpg" width = "300" />
### depth
<img src="./images/image_demo_depth.jpg" width = "300" />
### blur
<img src="./images/image_demo_blur.jpg" width = "300" />
### pose
<img src="./images/image_demo_pose.jpg" width = "300" />
### gray
<img src="./images/image_demo_gray.jpg" width = "300" />
### low quality
<img src="./images/image_demo_lq.jpg" width = "300" />
## Demo
```python
import torch
from diffusers.utils import load_image
from diffusers.pipelines.flux.pipeline_flux_controlnet import FluxControlNetPipeline
from diffusers.models.controlnet_flux import FluxControlNetModel
# Load the Union ControlNet and attach it to the base FLUX.1-dev pipeline
base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model = 'InstantX/FLUX.1-dev-Controlnet-Union-alpha'
controlnet = FluxControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")
# Image settings
width, height = 1024, 1024
controlnet_conditioning_scale = 0.6
seed = 2024
# Canny example (control_mode 0 in the table above)
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/canny.jpg")
prompt = "A girl in city, 25 years old, cool, futuristic."
control_mode = 0
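image = pipe(
    prompt,
    control_image=control_image,
    control_mode=control_mode,
    width=width,
    height=height,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    num_inference_steps=28,
    guidance_scale=3.5,
    # Seed the generator so the result is reproducible
    generator=torch.manual_seed(seed),
).images[0]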
image.save("image.jpg")
```