Commit: f7c5396
Parent(s): 5aca2b0
update

Changed files:
- README.md (+67 -20)
- assets/class-level/bear.gif (+3 -0)
- assets/class-level/car-1.gif (+3 -0)
- assets/class-level/husky.gif (+3 -0)
- assets/class-level/pig.gif (+3 -0)
- assets/class-level/posche.gif (+3 -0)
- assets/class-level/tennis.gif (+3 -0)
- assets/class-level/tennis_1cls.gif (+3 -0)
- assets/class-level/tennis_3cls.gif (+3 -0)
- assets/class-level/tiger.gif (+3 -0)
- assets/class-level/wolf.gif (+3 -0)
- assets/{bear_weight.gif → vis/bear_weight.gif} (+0 -0)
- config/part_level/adding_new_object/run_two_man/{running_spider_polar_sunglass.yaml → spider_polar_sunglass.yaml} (+0 -0)
- test.sh (+1 -1)
README.md
CHANGED
@@ -108,32 +108,20 @@ python image_util/sample_video2frames.py --video_path 'your video path' --output
 We segment videos using our ReLER lab's [SAM-Track](https://github.com/z-x-yang/Segment-and-Track-Anything). We suggest using `app.py` in SAM-Track in `gradio` mode to manually select which region of the video you want to edit. We also provide a script `image_util/process_webui_mask.py` to convert the masks from the SAM-Track output path to the VideoGrain input path.


-## 
+## 🔥🔥🔥 VideoGrain Editing

-### Inference
-
-**🔛prepare your config**
-
-VideoGrain is a training-free framework. To run VideoGrain on your video, modify `./config/demo_config.yaml` based on your needs:
-
-1. Replace your pretrained model path and controlnet path in your config. you can change the control_type to `dwpose` or `depth_zoe` or `depth`(midas).
-2. Prepare your video frames and layout masks (edit regions) using SAM-Track or SAM2 in dataset config.
-3. Change the `prompt`, and extract each `local prompt` in the editing prompts. the local prompt order should be same as layout masks order.
-4. Your can change flatten resolution with 1->64, 2->16, 4->8. (commonly, flatten at 64 worked best)
-5. To ensure temporal consistency, you can set `use_pnp: True` and `inject_step:5/10`. (Note: pnp>10 steps will be bad for multi-regions editing)
-6. If you want to visualize the cross attn weight, set `vis_cross_attn: True`
-7. If you want to cluster DDIM Inversion spatial temporal video feature, set `cluster_inversion_feature: True`
-
-**😍Editing your video**
+### 🎨 Inference
+You can reproduce the instance + part-level results in our teaser by running:

 ```bash
 bash test.sh
 #or
-CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config
+CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml
 ```

-
+For the other instance-, part-, and class-level results on the VideoGrain project page and in the teaser, we provide all the data (video frames and layout masks) and the corresponding configs; the results are shown in [🚀Multi-Grained Video Editing Results](#multi-grained-video-editing-results).

+<details><summary>The result is saved to `./result`. (Click for the directory structure)</summary>
 ```
 result
 ├── run_two_man
@@ -150,6 +138,28 @@ result
 ```
 </details>

+
+## Editing guidance for YOUR Video
+### 🔛 Prepare your config
+
+VideoGrain is a training-free framework. To run VideoGrain on your own video, modify `./config/demo_config.yaml` to your needs:
+
+1. Replace the pretrained model path and ControlNet path in your config. You can set `control_type` to `dwpose`, `depth_zoe`, or `depth` (MiDaS).
+2. Prepare your video frames and layout masks (edit regions) with SAM-Track or SAM2, and set them in the dataset config.
+3. Change the `prompt` and extract each `local prompt` from the editing prompt. The local prompt order must match the layout mask order.
+4. You can change the flatten resolution: 1 -> 64, 2 -> 16, 4 -> 8 (flattening at 64 usually works best).
+5. To improve temporal consistency, you can set `use_pnp: True` and `inject_step: 5` or `10`. (Note: more than 10 PnP steps hurts multi-region editing.)
+6. To visualize the cross-attention weights, set `vis_cross_attn: True`.
+7. To cluster the DDIM inversion spatial-temporal video features, set `cluster_inversion_feature: True`.
+
+### 😍 Editing your video
+
+```bash
+bash test.sh
+#or
+CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config /path/to/the/config
+```
+
 ## 🚀Multi-Grained Video Editing Results

 ### 🌈 Multi-Grained Definition
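For reference, here is a minimal sketch of what a config covering steps 1-7 above might look like. It is only an illustration: the key names, nesting, and example values (e.g. `pretrained_model_path`, `controlnet_path`, the `dataset` block, `flatten_resolution`) are assumptions, so check them against `./config/demo_config.yaml` before use.

```yaml
# Hypothetical VideoGrain config sketch assembled from the options above.
# Key names and nesting are assumptions; ./config/demo_config.yaml is authoritative.
pretrained_model_path: ./ckpt/stable-diffusion-v1-5   # assumed checkpoint location
controlnet_path: ./ckpt/controlnet                    # assumed ControlNet location
control_type: dwpose              # or depth_zoe / depth (MiDaS)

dataset:                          # assumed block: where your frames and masks live
  video_path: ./data/run_two_man                      # extracted video frames
  layout_mask_dir: ./data/run_two_man/layout_masks    # edit-region masks from SAM-Track / SAM2

prompt: "a Spider-Man and a polar bear are running in the forest"   # illustrative editing prompt
local_prompts:                    # one per layout mask, in the same order as the masks
  - "a Spider-Man"
  - "a polar bear"

flatten_resolution: 1             # assumed key; 1 -> 64, 2 -> 16, 4 -> 8 (64 usually works best)
use_pnp: True                     # plug-and-play injection for temporal consistency
inject_step: 5                    # keep at 5-10; more than 10 hurts multi-region editing
vis_cross_attn: False             # True to visualize cross-attention weights
cluster_inversion_feature: False  # True to cluster DDIM inversion features
```

With such a file in place, the `bash test.sh` / `accelerate launch test.py --config ...` commands above pick all of these options up from a single YAML file.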
@@ -207,7 +217,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level
 </tr>
 </table>

-## 🕺
+## 🕺 Part-level Video Editing
 You can get part-level video editing results using the following command:
 ```bash
 CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modification/man_text_message/blue_shirt.yaml
@@ -246,6 +256,43 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modi
 <td width=15% style="text-align:center;">superman</td>
 <td width=15% style="text-align:center;">superman + sunglasses</td>
 </tr>
+</table>
+
+## 🥳 Class-level Video Editing
+You can get class-level video editing results using the following command:
+```bash
+CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/class_level/wolf/wolf.yaml
+```
+
+<table class="center">
+<tr>
+<td><img src="assets/class-level/wolf.gif"></td>
+<td><img src="assets/class-level/pig.gif"></td>
+<td><img src="assets/class-level/husky.gif"></td>
+<td><img src="assets/class-level/bear.gif"></td>
+<td><img src="assets/class-level/tiger.gif"></td>
+</tr>
+<tr>
+<td width=15% style="text-align:center;">input</td>
+<td width=15% style="text-align:center;">pig</td>
+<td width=15% style="text-align:center;">husky</td>
+<td width=15% style="text-align:center;">bear</td>
+<td width=15% style="text-align:center;">tiger</td>
+</tr>
+<tr>
+<td><img src="assets/class-level/tennis.gif"></td>
+<td><img src="assets/class-level/tennis_1cls.gif"></td>
+<td><img src="assets/class-level/tennis_3cls.gif"></td>
+<td><img src="assets/class-level/car-1.gif"></td>
+<td><img src="assets/class-level/posche.gif"></td>
+</tr>
+<tr>
+<td width=15% style="text-align:center;">input</td>
+<td width=15% style="text-align:center;">iron man</td>
+<td width=15% style="text-align:center;">Batman + snow court + iced wall</td>
+<td width=15% style="text-align:center;">input</td>
+<td width=15% style="text-align:center;">Porsche</td>
+</tr>
 </table>


@@ -284,7 +331,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level/
 <td><img src="assets/soely_edit/input.gif"></td>
 <td><img src="assets/vis/edit.gif"></td>
 <td><img src="assets/vis/spiderman_weight.gif"></td>
-<td><img src="assets/bear_weight.gif"></td>
+<td><img src="assets/vis/bear_weight.gif"></td>
 <td><img src="/assets/vis/cherry_weight.gif"></td>
 </tr>
 <tr>
assets/class-level/bear.gif
ADDED (Git LFS)

assets/class-level/car-1.gif
ADDED (Git LFS)

assets/class-level/husky.gif
ADDED (Git LFS)

assets/class-level/pig.gif
ADDED (Git LFS)

assets/class-level/posche.gif
ADDED (Git LFS)

assets/class-level/tennis.gif
ADDED (Git LFS)

assets/class-level/tennis_1cls.gif
ADDED (Git LFS)

assets/class-level/tennis_3cls.gif
ADDED (Git LFS)

assets/class-level/tiger.gif
ADDED (Git LFS)

assets/class-level/wolf.gif
ADDED (Git LFS)
assets/{bear_weight.gif → vis/bear_weight.gif}
RENAMED (file unchanged)

config/part_level/adding_new_object/run_two_man/{running_spider_polar_sunglass.yaml → spider_polar_sunglass.yaml}
RENAMED (file unchanged)
test.sh
CHANGED
@@ -1,2 +1,2 @@
 export CUDA_VISIBLE_DEVICES=0
-accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/
+accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml