raulc0399 committed on
Commit
1e4a04f
1 Parent(s): 65372ba

Create README.md

  1. README.md +97 -0
README.md ADDED
@@ -0,0 +1,97 @@
+ ---
+ license: other
+ license_name: flux-1-dev-non-commercial-license
+ license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
+ datasets:
+ - raulc0399/open_pose_controlnet
+ language:
+ - en
+ pipeline_tag: text-to-image
+ tags:
+ - Stable Diffusion
+ - image-generation
+ - Flux
+ - diffusers
+ - controlnet
+ ---
+
+ # openpose controlnet for flux.dev
+
+ ## inference
+
+ an openpose controlnet for flux-dev, trained on the dataset https://huggingface.co/datasets/raulc0399/open_pose_controlnet
+
+ the controlnet model was trained for the XLabs-AI x-flux pipeline: https://github.com/XLabs-AI/x-flux
+
+ to install the pipeline, run the following commands:
+
+ ```
+ git clone https://github.com/XLabs-AI/x-flux.git
+ cd x-flux
+ python3 -m venv xflux_env
+ source xflux_env/bin/activate
+ pip install -r requirements.txt
+ ```
+
+ to run the pipeline with controlnet:
+
+ ```
+ python3 main.py \
+   --prompt "person enjoying a day at the park, full hd, cinematic" \
+   --image ~/open_pose_controlnet_dataset/validation_images/pose/3_pose_1024.jpg --control_type openpose \
+   --local_path ./model.safetensors \
+   --use_controlnet --model_type flux-dev \
+   --width 1024 --height 1024 --timestep_to_start_cfg 2 \
+   --num_steps 50 --true_gs 4 --guidance 4 \
+   --save_path ~/gen_imgs
+ ```
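when generating several images, it can help to build the command above in a small script. the following is only a sketch: the function name `build_xflux_command` and its parameters are made up for illustration, while the flags and values are copied from the command above. it only assembles the argument list and does not run anything.

```python
import shlex
from pathlib import Path


def build_xflux_command(prompt, control_image,
                        weights="./model.safetensors",
                        save_path="~/gen_imgs", size=1024):
    """Return the argv list for the main.py call shown above.

    Hypothetical helper: only the flags and their values come from the README.
    """
    return [
        "python3", "main.py",
        "--prompt", prompt,
        # expand "~" here because no shell does it when argv is passed as a list
        "--image", str(Path(control_image).expanduser()),
        "--control_type", "openpose",
        "--local_path", weights,
        "--use_controlnet",
        "--model_type", "flux-dev",
        "--width", str(size), "--height", str(size),
        "--timestep_to_start_cfg", "2",
        "--num_steps", "50", "--true_gs", "4", "--guidance", "4",
        "--save_path", str(Path(save_path).expanduser()),
    ]


if __name__ == "__main__":
    cmd = build_xflux_command(
        "person enjoying a day at the park, full hd, cinematic",
        "~/open_pose_controlnet_dataset/validation_images/pose/3_pose_1024.jpg",
    )
    # print a copy-pasteable shell command
    print(shlex.join(cmd))
```

to actually start the generation, the list could be passed to `subprocess.run(cmd, check=True)` from inside the x-flux directory with the virtual environment active.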
+
+ if the control image has already been preprocessed (for example, it is already an openpose skeleton), comment out line 146 of src/flux/xflux_pipeline.py, which creates the annotator:
+ ```
+ # self.annotator = Annotator(control_type, self.other_device)
+ ```
+
+ ## training
+
+ ```
+ oxen clone https://hub.oxen.ai/raulc/open_pose_controlnet_dataset
+ git clone https://github.com/raulc0399/x-flux.git
+ cd x-flux
+ git checkout open_pose_training
+ python3 -m venv xflux_env
+ source xflux_env/bin/activate
+ pip install -r requirements.txt
+ huggingface-cli login
+ accelerate config
+ mkdir images
+ rsync -r ~/open_pose_controlnet_dataset/train/images/ images/
+ cp train_configs/test_openpose_controlnet.yaml train_configs/openpose_controlnet.yaml
+ accelerate launch train_flux_deepspeed_controlnet.py --config "train_configs/openpose_controlnet.yaml"
+ ```
+ note 1: review train_configs/openpose_controlnet.yaml and adjust it to your setup before starting the training
+
+ note 2: use rsync to copy the images; cp does not work with that many files
+
+ note 3: the oxen repo already contains the caption files as json, the format the training script expects
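regarding note 2: the README does not say why cp fails. one common cause with very large directories is the shell's argument-length limit when a glob such as `images/*` is expanded, but that is an assumption. where rsync is not available, copying the files one at a time avoids such a limit. a minimal sketch, with a hypothetical helper `copy_images` that, unlike `rsync -r`, only handles a flat directory and skips subdirectories:

```python
import os
import shutil


def copy_images(src_dir, dst_dir):
    """Copy every regular file in src_dir into dst_dir, one file at a time.

    Hypothetical helper, not part of x-flux. Returns the number of files copied.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = 0
    # scandir streams directory entries instead of building one huge list
    with os.scandir(src_dir) as entries:
        for entry in entries:
            if entry.is_file():
                shutil.copy2(entry.path, os.path.join(dst_dir, entry.name))
                copied += 1
    return copied


if __name__ == "__main__":
    source = os.path.expanduser("~/open_pose_controlnet_dataset/train/images")
    print(copy_images(source, "images"))
```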
+
+ ## results
+
+ using these 2 images:
+
+ ![control image 1](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/2_pose_1024.jpg "control image 1")
+ ![control image 2](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/3_pose_1024.jpg "control image 2")
+
+ with these prompts:
+
+ - "two friends sitting by each other enjoying a day at the park, full hd, cinematic"
+ - "person enjoying a day at the park, full hd, cinematic"
+
+ resulted in these images:
+
+ ![result image 1](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/prev_result_1_100.png "result image 1")
+ ![result image 2](https://huggingface.co/raulc0399/flux_dev_openpose_controlnet/resolve/main/prev_result_0_100.png "result image 2")
+
+
+ ## license
+
+ the weights fall under the [FLUX.1 [dev] Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md)