XiangpengYang committed on
Commit
f7c5396
·
1 Parent(s): 5aca2b0
README.md CHANGED
@@ -108,32 +108,20 @@ python image_util/sample_video2frames.py --video_path 'your video path' --output
108
  We segment videos using our ReLER lab's [SAM-Track](https://github.com/z-x-yang/Segment-and-Track-Anything). We suggest using `app.py` in SAM-Track in `gradio` mode to manually select the region of the video you want to edit. We also provide a script, `image_util/process_webui_mask.py`, to convert masks from the SAM-Track path to the VideoGrain path.
109
 
110
 
111
- ## 🔥 VideoGrain Editing
112
 
113
- ### Inference
114
-
115
- **🔛prepare your config**
116
-
117
- VideoGrain is a training-free framework. To run VideoGrain on your video, modify `./config/demo_config.yaml` based on your needs:
118
-
119
- 1. Replace the pretrained model path and ControlNet path in your config. You can set `control_type` to `dwpose`, `depth_zoe`, or `depth` (MiDaS).
120
- 2. Prepare your video frames and layout masks (edit regions) using SAM-Track or SAM2 in the dataset config.
121
- 3. Change the `prompt` and extract each `local prompt` from the editing prompt. The local prompts must be in the same order as the layout masks.
122
- 4. You can change the flatten resolution with 1->64, 2->16, 4->8. (Flattening at 64 commonly works best.)
123
- 5. To ensure temporal consistency, you can set `use_pnp: True` and `inject_step:5/10`. (Note: more than 10 PnP steps degrades multi-region editing.)
124
- 6. To visualize the cross-attention weights, set `vis_cross_attn: True`.
125
- 7. To cluster the DDIM inversion spatial-temporal video features, set `cluster_inversion_feature: True`.
126
-
127
- **😍Editing your video**
128
 
129
  ```bash
130
  bash test.sh
131
  #or
132
- CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config /path/to/the/config
133
  ```
134
 
135
- <details><summary>The result is saved at `./result`. (Click for directory structure)</summary>
136
 
 
137
  ```
138
  result
139
  ├── run_two_man
@@ -150,6 +138,28 @@ result
150
  ```
151
  </details>
152
 
153
  ## 🚀Multi-Grained Video Editing Results
154
 
155
  ### 🌈 Multi-Grained Definition
@@ -207,7 +217,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level
207
  </tr>
208
  </table>
209
 
210
- ## 🕺 Part-level Video Editing
211
  You can get part-level video editing results, using the following command:
212
  ```bash
213
  CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modification/man_text_message/blue_shirt.yaml
@@ -246,6 +256,43 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modi
246
  <td width=15% style="text-align:center;">superman </td>
247
  <td width=15% style="text-align:center;">superman + sunglasses</td>
248
  </tr>
249
  </table>
250
 
251
 
@@ -284,7 +331,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level/
284
  <td><img src="assets/soely_edit/input.gif"></td>
285
  <td><img src="assets/vis/edit.gif"></td>
286
  <td><img src="assets/vis/spiderman_weight.gif"></td>
287
- <td><img src="assets/bear_weight.gif"></td>
288
  <td><img src="/assets/vis/cherry_weight.gif"></td>
289
  </tr>
290
  <tr>
 
108
  We segment videos using our ReLER lab's [SAM-Track](https://github.com/z-x-yang/Segment-and-Track-Anything). We suggest using `app.py` in SAM-Track in `gradio` mode to manually select the region of the video you want to edit. We also provide a script, `image_util/process_webui_mask.py`, to convert masks from the SAM-Track path to the VideoGrain path.
109
 
110
 
111
+ ## 🔥🔥🔥 VideoGrain Editing
112
 
113
+ ### 🎨 Inference
114
+ You can reproduce the instance- and part-level results in our teaser by running:
115
 
116
  ```bash
117
  bash test.sh
118
  #or
119
+ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml
120
  ```
121
 
122
+ For the other instance-, part-, and class-level results on the VideoGrain project page or in the teaser, we provide all the data (video frames and layout masks) and the corresponding configs needed to reproduce them. The results are shown in [🚀Multi-Grained Video Editing Results](#multi-grained-video-editing-results).
123
 
124
+ <details><summary>The result is saved at `./result`. (Click for directory structure)</summary>
125
  ```
126
  result
127
  ├── run_two_man
 
138
  ```
139
  </details>
140
 
141
+
142
+ ## Editing guidance for YOUR Video
143
+ ### 🔛 Prepare your config
144
+
145
+ VideoGrain is a training-free framework. To run VideoGrain on your video, modify `./config/demo_config.yaml` based on your needs:
146
+
147
+ 1. Replace the pretrained model path and ControlNet path in your config. You can set `control_type` to `dwpose`, `depth_zoe`, or `depth` (MiDaS).
148
+ 2. Prepare your video frames and layout masks (edit regions) using SAM-Track or SAM2 in the dataset config.
149
+ 3. Change the `prompt` and extract each `local prompt` from the editing prompt. The local prompts must be in the same order as the layout masks.
150
+ 4. You can change the flatten resolution with 1->64, 2->16, 4->8. (Flattening at 64 commonly works best.)
151
+ 5. To ensure temporal consistency, you can set `use_pnp: True` and `inject_step:5/10`. (Note: more than 10 PnP steps degrades multi-region editing.)
152
+ 6. To visualize the cross-attention weights, set `vis_cross_attn: True`.
153
+ 7. To cluster the DDIM inversion spatial-temporal video features, set `cluster_inversion_feature: True`.
154
+
155
+ ### 😍 Editing your video
156
+
157
+ ```bash
158
+ bash test.sh
159
+ #or
160
+ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config /path/to/the/config
161
+ ```
162
+
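When several configs need to be run one after another, the launch command above can be wrapped in a small loop. The following is only a sketch, not part of the repository: the `run_configs` function and the `DRY_RUN` variable are hypothetical helpers introduced for illustration, and the config paths are the ones that appear in the commands of this README. By default it only prints the commands, so nothing is launched until `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Sketch: run several VideoGrain configs in sequence.
# run_configs and DRY_RUN are hypothetical helpers, not part of the repository.
set -euo pipefail

# Config paths copied from the commands shown in this README.
configs=(
  config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml
  config/part_level/modification/man_text_message/blue_shirt.yaml
  config/class_level/wolf/wolf.yaml
)

run_configs() {
  local cfg
  for cfg in "$@"; do
    if [[ "${DRY_RUN:-1}" == "1" ]]; then
      # Dry run (the default): print the command instead of executing it.
      printf 'CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config %s\n' "$cfg"
    else
      # Real run: same command as in the README, one config at a time.
      CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config "$cfg"
    fi
  done
}

run_configs "${configs[@]}"
```

Run it from the repository root with `DRY_RUN=0` to actually launch the edits. Because of `set -e`, the loop stops at the first config that fails.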
163
  ## 🚀Multi-Grained Video Editing Results
164
 
165
  ### 🌈 Multi-Grained Definition
 
217
  </tr>
218
  </table>
219
 
220
+ ## 🕺 Part-level Video Editing
221
  You can get part-level video editing results, using the following command:
222
  ```bash
223
  CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modification/man_text_message/blue_shirt.yaml
 
256
  <td width=15% style="text-align:center;">superman </td>
257
  <td width=15% style="text-align:center;">superman + sunglasses</td>
258
  </tr>
259
+ </table>
260
+
261
+ ## 🥳 Class-level Video Editing
262
+ You can get class-level video editing results, using the following command:
263
+ ```bash
264
+ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/class_level/wolf/wolf.yaml
265
+ ```
266
+
267
+ <table class="center">
268
+ <tr>
269
+ <td><img src="assets/class-level/wolf.gif"></td>
270
+ <td><img src="assets/class-level/pig.gif"></td>
271
+ <td><img src="assets/class-level/husky.gif"></td>
272
+ <td><img src="assets/class-level/bear.gif"></td>
273
+ <td><img src="assets/class-level/tiger.gif"></td>
274
+ </tr>
275
+ <tr>
276
+ <td width=15% style="text-align:center;">input</td>
277
+ <td width=15% style="text-align:center;">pig</td>
278
+ <td width=15% style="text-align:center;">husky</td>
279
+ <td width=15% style="text-align:center;">bear</td>
280
+ <td width=15% style="text-align:center;">tiger</td>
281
+ </tr>
282
+ <tr>
283
+ <td><img src="assets/class-level/tennis.gif"></td>
284
+ <td><img src="assets/class-level/tennis_1cls.gif"></td>
285
+ <td><img src="assets/class-level/tennis_3cls.gif"></td>
286
+ <td><img src="assets/class-level/car-1.gif"></td>
287
+ <td><img src="assets/class-level/posche.gif"></td>
288
+ </tr>
289
+ <tr>
290
+ <td width=15% style="text-align:center;">input</td>
291
+ <td width=15% style="text-align:center;">iron man</td>
292
+ <td width=15% style="text-align:center;">Batman + snow court + iced wall</td>
293
+ <td width=15% style="text-align:center;">input </td>
294
+ <td width=15% style="text-align:center;">Porsche</td>
295
+ </tr>
296
  </table>
297
 
298
 
 
331
  <td><img src="assets/soely_edit/input.gif"></td>
332
  <td><img src="assets/vis/edit.gif"></td>
333
  <td><img src="assets/vis/spiderman_weight.gif"></td>
334
+ <td><img src="assets/vis/bear_weight.gif"></td>
335
  <td><img src="/assets/vis/cherry_weight.gif"></td>
336
  </tr>
337
  <tr>
assets/class-level/bear.gif ADDED

Git LFS Details

  • SHA256: 29be8413f7278c1d266357d13cd295fb05722ead8d4ed6703b7c738e8b59c3fd
  • Pointer size: 132 Bytes
  • Size of remote file: 2.39 MB
assets/class-level/car-1.gif ADDED

Git LFS Details

  • SHA256: 72acea1c5d5097e2e878f339a72f0b8cb0f293fd1bca71f6a65a9e9344474519
  • Pointer size: 132 Bytes
  • Size of remote file: 1.09 MB
assets/class-level/husky.gif ADDED

Git LFS Details

  • SHA256: 842375b1c6bcd1a37cc0c16fd0161af1e2c3e946d05bd0a98012735208a32273
  • Pointer size: 132 Bytes
  • Size of remote file: 2.29 MB
assets/class-level/pig.gif ADDED

Git LFS Details

  • SHA256: 6797aa3ed46daac96e62bcd844836e638e3ffe4685df97b247bbc2a2c7074b04
  • Pointer size: 132 Bytes
  • Size of remote file: 1.76 MB
assets/class-level/posche.gif ADDED

Git LFS Details

  • SHA256: 329c13d401fcc62cee9d0632857024a55d5865c752cf8228e34bb2b9afdf039c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.08 MB
assets/class-level/tennis.gif ADDED

Git LFS Details

  • SHA256: b97b2d87eba4706b75038034585defb78439e81a740e7d2881b447f640352239
  • Pointer size: 132 Bytes
  • Size of remote file: 2.91 MB
assets/class-level/tennis_1cls.gif ADDED

Git LFS Details

  • SHA256: bad43c22e6e29d67809b5fb7b6fd43a1a8578b0a44268da05bf6f6fedd3f1ca8
  • Pointer size: 132 Bytes
  • Size of remote file: 2.92 MB
assets/class-level/tennis_3cls.gif ADDED

Git LFS Details

  • SHA256: cc1527e1cf680339e8f2d9b8929bb2058d1b0aed6d5f89bd024182ec257d02b9
  • Pointer size: 132 Bytes
  • Size of remote file: 3.32 MB
assets/class-level/tiger.gif ADDED

Git LFS Details

  • SHA256: 6af763e70a53f116c7fa81ebbea927ec5ebefc46d73edbd864bed75e96f0ad54
  • Pointer size: 132 Bytes
  • Size of remote file: 2.75 MB
assets/class-level/wolf.gif ADDED

Git LFS Details

  • SHA256: 651458a1ebc192a73a482f03e8d2961f694892b56a9e83eb844439ac0ba314fc
  • Pointer size: 132 Bytes
  • Size of remote file: 2.59 MB
assets/{bear_weight.gif → vis/bear_weight.gif} RENAMED
File without changes
config/part_level/adding_new_object/run_two_man/{running_spider_polar_sunglass.yaml → spider_polar_sunglass.yaml} RENAMED
File without changes
test.sh CHANGED
@@ -1,2 +1,2 @@
1
  export CUDA_VISIBLE_DEVICES=0
2
- accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/running_spider_polar_sunglass.yaml
 
1
  export CUDA_VISIBLE_DEVICES=0
2
+ accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml