F go #4
by Cletrason - opened
- README.md +6 -55
- config.json +0 -47
- control_v2p_sd15_mediapipe_face.full.ckpt +0 -3
- control_v2p_sd15_mediapipe_face.pth +0 -3
- control_v2p_sd15_mediapipe_face.safetensors +0 -3
- control_v2p_sd15_mediapipe_face.yaml +0 -79
- diffusion_pytorch_model.bin +0 -3
- diffusion_pytorch_model.fp16.bin +0 -3
- diffusion_pytorch_model.fp16.safetensors +0 -3
- diffusion_pytorch_model.safetensors +0 -3
- diffusion_sd15/config.json +0 -42
- diffusion_sd15/diffusion_pytorch_model.bin +0 -3
- diffusion_sd15/diffusion_pytorch_model.fp16.bin +0 -3
- diffusion_sd15/diffusion_pytorch_model.fp16.safetensors +0 -3
- gradio_face2image.py +2 -2
- control_v2p_sd21_mediapipe_face.yaml → models/cldm_v21.yaml +0 -0
- control_v2p_sd21_mediapipe_face.full.ckpt → models/controlnet_sd21_laion_face_v2_full.ckpt +0 -0
- control_v2p_sd21_mediapipe_face.pth → models/controlnet_sd21_laion_face_v2_pruned.pth +0 -0
- control_v2p_sd21_mediapipe_face.safetensors → models/controlnet_sd21_laion_face_v2_pruned.safetensors +0 -0
README.md
CHANGED
````diff
@@ -1,19 +1,16 @@
 ---
-language:
-- en
-thumbnail:
+language:
+- en
+thumbnail: ""
 tags:
 - controlnet
 - laion
 - face
 - mediapipe
-
-license: openrail
-base_model: stabilityai/stable-diffusion-2-1-base
+license: "openrail"
 datasets:
 - LAION-Face
 - LAION
-pipeline_tag: image-to-image
 ---
 
 # ControlNet LAION Face Dataset
@@ -107,58 +104,12 @@ python ./train_laion_face_sd15.py
 We have provided `gradio_face2image.py`. Update the following two lines to point them to your trained model.
 
 ```
-model = create_model('./models/cldm_v21.yaml').cpu() # If you fine-
+model = create_model('./models/cldm_v21.yaml').cpu() # If you fine-tuned on SD2.1 base, this does not need to change.
 model.load_state_dict(load_state_dict('./models/control_sd21_openpose.pth', location='cuda'))
 ```
 
 The model has some limitations: while it is empirically better at tracking gaze and mouth poses than previous attempts, it may still ignore controls. Adding details to the prompt like, "looking right" can abate bad behavior.
 
-## 🧨 Diffusers
-
-It is recommended to use the checkpoint with [Stable Diffusion 2.1 - Base](stabilityai/stable-diffusion-2-1-base) as the checkpoint has been trained on it.
-Experimentally, the checkpoint can be used with other diffusion models such as dreamboothed stable diffusion.
-
-To use with Stable Diffusion 1.5, insert `subfolder="diffusion_sd15"` into the from_pretrained arguments. A v1.5 half-precision variant is provided but untested.
-
-1. Install `diffusers` and related packages:
-```
-$ pip install diffusers transformers accelerate
-```
-
-2. Run code:
-```py
-from PIL import Image
-import numpy as np
-import torch
-from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
-from diffusers.utils import load_image
-
-image = load_image(
-    "https://huggingface.co/CrucibleAI/ControlNetMediaPipeFace/resolve/main/samples_laion_face_dataset/family_annotation.png"
-)
-
-# Stable Diffusion 2.1-base:
-controlnet = ControlNetModel.from_pretrained("CrucibleAI/ControlNetMediaPipeFace", torch_dtype=torch.float16, variant="fp16")
-pipe = StableDiffusionControlNetPipeline.from_pretrained(
-    "stabilityai/stable-diffusion-2-1-base", controlnet=controlnet, safety_checker=None, torch_dtype=torch.float16
-)
-# OR
-# Stable Diffusion 1.5:
-controlnet = ControlNetModel.from_pretrained("CrucibleAI/ControlNetMediaPipeFace", subfolder="diffusion_sd15")
-pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None)
-
-pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
-
-# Remove if you do not have xformers installed
-# see https://huggingface.co/docs/diffusers/v0.13.0/en/optimization/xformers#installing-xformers
-# for installation instructions
-pipe.enable_xformers_memory_efficient_attention()
-pipe.enable_model_cpu_offload()
-
-image = pipe("a happy family at a dentist advertisement", image=image, num_inference_steps=30).images[0]
-image.save('./images.png')
-```
-
 
 # License:
 
@@ -209,4 +160,4 @@ Sample images for this document were obtained from Unsplash and are CC0.
 }
 ```
 
-This project was made possible by Crucible AI.
+This project was made possible by Crucible AI.
````
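The removed "🧨 Diffusers" section notes that the SD 1.5 weights live under `subfolder="diffusion_sd15"` and that a half-precision v1.5 variant is provided but untested. A minimal sketch of that untested load path, assuming the fp16 files deleted below remain available on the Hub and that the `variant="fp16"` / `subfolder="diffusion_sd15"` combination resolves them:

```py
# Sketch only: the subfolder + fp16-variant combination is an assumption based on the
# removed README text and the diffusion_sd15 fp16 files deleted below; untested per the README.
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler

controlnet = ControlNetModel.from_pretrained(
    "CrucibleAI/ControlNetMediaPipeFace",
    subfolder="diffusion_sd15",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None, torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
```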
config.json
DELETED
````diff
@@ -1,47 +0,0 @@
-{
-  "_class_name": "ControlNetModel",
-  "_diffusers_version": "0.15.0.dev0",
-  "_name_or_path": "/home/patrick_huggingface_co/temp_control",
-  "act_fn": "silu",
-  "attention_head_dim": [
-    5,
-    10,
-    20,
-    20
-  ],
-  "block_out_channels": [
-    320,
-    640,
-    1280,
-    1280
-  ],
-  "class_embed_type": null,
-  "conditioning_embedding_out_channels": [
-    16,
-    32,
-    96,
-    256
-  ],
-  "controlnet_conditioning_channel_order": "rgb",
-  "cross_attention_dim": 1024,
-  "down_block_types": [
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "DownBlock2D"
-  ],
-  "downsample_padding": 1,
-  "flip_sin_to_cos": true,
-  "freq_shift": 0,
-  "in_channels": 4,
-  "layers_per_block": 2,
-  "mid_block_scale_factor": 1,
-  "norm_eps": 1e-05,
-  "norm_num_groups": 32,
-  "num_class_embeds": null,
-  "only_cross_attention": false,
-  "projection_class_embeddings_input_dim": null,
-  "resnet_time_scale_shift": "default",
-  "upcast_attention": false,
-  "use_linear_projection": true
-}
````
control_v2p_sd15_mediapipe_face.full.ckpt
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:a2a71953d7372d5585899b44693a7532ebbf80c091108ae2b8987ca93cc2dac2
-size 8601300183
````
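The deleted weight files in this PR are Git LFS pointer files: each records only the sha256 oid and byte size of the real checkpoint. A minimal sketch (not part of the diff, stdlib only, hypothetical local filename) of verifying a downloaded checkpoint against the pointer above:

```py
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so multi-GB checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

path = "control_v2p_sd15_mediapipe_face.full.ckpt"  # hypothetical local download
assert os.path.getsize(path) == 8601300183
assert sha256_of(path) == "a2a71953d7372d5585899b44693a7532ebbf80c091108ae2b8987ca93cc2dac2"
```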
control_v2p_sd15_mediapipe_face.pth
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:2f2ccead3a8c0b9fbf9cad7b8eaa29834983ced916c766a92fb84db34ff29e43
-size 1445239863
````
control_v2p_sd15_mediapipe_face.safetensors
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:5be501156709895f0b14a7ec76faae7cf0a105f76895252a2c69db541629628f
-size 1445154814
````
control_v2p_sd15_mediapipe_face.yaml
DELETED
````diff
@@ -1,79 +0,0 @@
-model:
-  target: cldm.cldm.ControlLDM
-  params:
-    linear_start: 0.00085
-    linear_end: 0.0120
-    num_timesteps_cond: 1
-    log_every_t: 200
-    timesteps: 1000
-    first_stage_key: "jpg"
-    cond_stage_key: "txt"
-    control_key: "hint"
-    image_size: 64
-    channels: 4
-    cond_stage_trainable: false
-    conditioning_key: crossattn
-    monitor: val/loss_simple_ema
-    scale_factor: 0.18215
-    use_ema: False
-    only_mid_control: False
-
-    control_stage_config:
-      target: cldm.cldm.ControlNet
-      params:
-        image_size: 32 # unused
-        in_channels: 4
-        hint_channels: 3
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_heads: 8
-        use_spatial_transformer: True
-        transformer_depth: 1
-        context_dim: 768
-        use_checkpoint: True
-        legacy: False
-
-    unet_config:
-      target: cldm.cldm.ControlledUnetModel
-      params:
-        image_size: 32 # unused
-        in_channels: 4
-        out_channels: 4
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_heads: 8
-        use_spatial_transformer: True
-        transformer_depth: 1
-        context_dim: 768
-        use_checkpoint: True
-        legacy: False
-
-    first_stage_config:
-      target: ldm.models.autoencoder.AutoencoderKL
-      params:
-        embed_dim: 4
-        monitor: val/rec_loss
-        ddconfig:
-          double_z: true
-          z_channels: 4
-          resolution: 256
-          in_channels: 3
-          out_ch: 3
-          ch: 128
-          ch_mult:
-          - 1
-          - 2
-          - 4
-          - 4
-          num_res_blocks: 2
-          attn_resolutions: []
-          dropout: 0.0
-        lossconfig:
-          target: torch.nn.Identity
-
-    cond_stage_config:
-      target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
````
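The deleted yaml above is the cldm config for the SD 1.5 variant (`context_dim: 768`, matching the 768-wide SD 1.5 text encoder, versus 1024 for SD 2.1). A minimal sketch, mirroring the loading pattern used in `gradio_face2image.py` and assuming the SD 1.5 yaml/pth pair is still available locally and the ControlNet repo's `cldm.model` helpers are importable:

```py
# Sketch only: load the SD 1.5 ControlNet from the deleted yaml/pth pair; paths are assumptions.
from cldm.model import create_model, load_state_dict

model = create_model('./control_v2p_sd15_mediapipe_face.yaml').cpu()
model.load_state_dict(load_state_dict('./control_v2p_sd15_mediapipe_face.pth', location='cuda'))
model = model.cuda()
```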
diffusion_pytorch_model.bin
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:36dcd318d499df44b35432599a1b70f598e7bb42b479e4e67d4adf7b7e87e87d
-size 1457051321
````
diffusion_pytorch_model.fp16.bin
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:7f70c38860e0d1fcd0f5ed38bc34e61c7337b9001bed57f7bff6eba6471406f0
-size 728596455
````
diffusion_pytorch_model.fp16.safetensors
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:02b3a8e04154b4c3d11f5210217f0dbf3fac8612d62d015cd059f2b9fe4c3364
-size 728496846
````
diffusion_pytorch_model.safetensors
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:a683e98e2427fd6242edc9af6620708f2f8fc84bfc049fafe549e350f8d42d73
-size 1456953564
````
diffusion_sd15/config.json
DELETED
````diff
@@ -1,42 +0,0 @@
-{
-  "_class_name": "ControlNetModel",
-  "_diffusers_version": "0.15.0.dev0",
-  "_name_or_path": "/home/josephcatrambone/ControlNet/models",
-  "act_fn": "silu",
-  "attention_head_dim": 8,
-  "block_out_channels": [
-    320,
-    640,
-    1280,
-    1280
-  ],
-  "class_embed_type": null,
-  "conditioning_embedding_out_channels": [
-    16,
-    32,
-    96,
-    256
-  ],
-  "controlnet_conditioning_channel_order": "rgb",
-  "cross_attention_dim": 768,
-  "down_block_types": [
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "CrossAttnDownBlock2D",
-    "DownBlock2D"
-  ],
-  "downsample_padding": 1,
-  "flip_sin_to_cos": true,
-  "freq_shift": 0,
-  "in_channels": 4,
-  "layers_per_block": 2,
-  "mid_block_scale_factor": 1,
-  "norm_eps": 1e-05,
-  "norm_num_groups": 32,
-  "num_class_embeds": null,
-  "only_cross_attention": false,
-  "projection_class_embeddings_input_dim": null,
-  "resnet_time_scale_shift": "default",
-  "upcast_attention": null,
-  "use_linear_projection": false
-}
````
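The key difference between this deleted config and the root `config.json` above is `cross_attention_dim`: 768 here matches the SD 1.5 text encoder, while 1024 in the root config matches SD 2.1-base. A minimal sketch (not part of the diff, stdlib only, hypothetical local path) of picking the intended base model from that field:

```py
import json

# Text-encoder width -> compatible base model; the 1024/768 mapping follows the removed README.
BASE_BY_WIDTH = {
    1024: "stabilityai/stable-diffusion-2-1-base",  # root config.json
    768: "runwayml/stable-diffusion-v1-5",          # diffusion_sd15/config.json
}

with open("diffusion_sd15/config.json") as f:  # hypothetical local checkout
    cfg = json.load(f)

print(BASE_BY_WIDTH[cfg["cross_attention_dim"]])  # -> runwayml/stable-diffusion-v1-5
```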
diffusion_sd15/diffusion_pytorch_model.bin
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:f63de389f776b75bb11f10487a187573aea84f9a51debd08f314bd084e7fb362
-size 1445254969
````
diffusion_sd15/diffusion_pytorch_model.fp16.bin
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:0c37b3dd41e956160909129b50f84fd938116550727b491192cbdbe6f896cd7b
-size 722696633
````
diffusion_sd15/diffusion_pytorch_model.fp16.safetensors
DELETED
````diff
@@ -1,3 +0,0 @@
-version https://git-lfs.github.com/spec/v1
-oid sha256:9fb50465b4fd7e15f0dc7df8031767e57309cfda2917082485bcf6c11bedb540
-size 722598642
````
gradio_face2image.py
CHANGED
````diff
@@ -13,8 +13,8 @@ from laion_face_common import generate_annotation
 from share import *
 
 
-model = create_model('./
-model.load_state_dict(load_state_dict('./
+model = create_model('./models/cldm_v21.yaml').cpu()
+model.load_state_dict(load_state_dict('./models/controlnet_face_condition_epoch_4_0percent.ckpt', location='cuda'))
 model = model.cuda()
 ddim_sampler = DDIMSampler(model) # ControlNet _only_ works with DDIM.
 
````
control_v2p_sd21_mediapipe_face.yaml → models/cldm_v21.yaml
RENAMED
File without changes

control_v2p_sd21_mediapipe_face.full.ckpt → models/controlnet_sd21_laion_face_v2_full.ckpt
RENAMED
File without changes

control_v2p_sd21_mediapipe_face.pth → models/controlnet_sd21_laion_face_v2_pruned.pth
RENAMED
File without changes

control_v2p_sd21_mediapipe_face.safetensors → models/controlnet_sd21_laion_face_v2_pruned.safetensors
RENAMED
File without changes
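For reference, a minimal sketch (not part of the diff) of the two `gradio_face2image.py` loading lines pointed at the renamed checkpoints under `models/`; using the pruned `.pth` here is an assumption, and any of the renamed v2 checkpoints should follow the same pattern:

```py
# Sketch only, assuming the ControlNet repo helpers (cldm.model) are importable and the
# renamed files above are checked out locally under models/.
from cldm.model import create_model, load_state_dict

model = create_model('./models/cldm_v21.yaml').cpu()
model.load_state_dict(load_state_dict('./models/controlnet_sd21_laion_face_v2_pruned.pth', location='cuda'))
model = model.cuda()
```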