Update README.md
README.md
CHANGED
@@ -1,22 +1,226 @@
https://github.com/user-attachments/assets/f72f287d-f848-4982-8f91-43c49d037007
10 |
|
11 |
## 🧰 Models
|
12 |
|
13 |
|Model|Resolution|GPU Mem. & Inference Time (A100, ddim 50steps)|Checkpoint|
|
14 |
|:---------|:---------|:--------|:--------|
|
15 |
|ToonCrafter_512|320x512| TBD (`perframe_ae=True`)|[Hugging Face](https://huggingface.co/Doubiiu/ToonCrafter/blob/main/model.ckpt)|
|
16 |
-
|SketchEncoder|TBD| TBD |[Hugging Face](https://huggingface.co/Doubiiu/ToonCrafter/blob/main/sketch_encoder.ckpt)|
|
17 |
|
18 |
|
19 |
-
Currently, ToonCrafter can support generating videos of up to 16 frames with a resolution of 512x320. The inference time can be reduced by using fewer DDIM steps.
|
20 |
|
21 |
|
22 |
|
@@ -31,11 +235,35 @@ pip install -r requirements.txt
|
|
31 |
|
32 |
|
33 |
## 💫 Inference
34 |
|
35 |
-
### 1. Local Gradio demo
|
36 |
-
1. Download pretrained ToonCrafter_512 and put the model.ckpt in checkpoints/tooncrafter_512_interp_v1/model.ckpt.
|
37 |
-
2. Download pretrained SketchEncoder and put the model.ckpt in control_models/sketch_encoder.ckpt.
|
38 |
|
|
|
|
|
|
|
39 |
```bash
|
40 |
python gradio_app.py
|
41 |
```
1 |
+
---
|
2 |
+
title: ToonCrafter
|
3 |
+
emoji: 😻
|
4 |
+
colorFrom: purple
|
5 |
+
colorTo: purple
|
6 |
+
sdk: gradio
|
7 |
+
sdk_version: 4.31.5
|
8 |
+
app_file: gradio_app.py
|
9 |
+
pinned: false
|
10 |
+
license: mit
|
11 |
+
---
|
12 |
|
13 |
+
## ___***ToonCrafter: Generative Cartoon Interpolation***___
|
14 |
+
<!-- ![](./assets/logo_long.png#gh-light-mode-only){: width="50%"} -->
|
15 |
+
<!-- ![](./assets/logo_long_dark.png#gh-dark-mode-only=100x20) -->
|
16 |
+
<div align="center">
|
17 |
|
|
|
18 |
|
19 |
|
20 |
+
</div>
|
21 |
+
|
22 |
+
## 🔆 Introduction
|
23 |
+
|
24 |
+
⚠️ Please check our [disclaimer](#disc) first.
|
25 |
+
|
26 |
+
🤗 ToonCrafter can interpolate two cartoon images by leveraging the pre-trained image-to-video diffusion priors. Please check our project page and paper for more information. <br>
|
27 |
+
|
28 |
+
|
29 |
+
|
30 |
+
|
31 |
+
|
32 |
+
|
33 |
+
|
34 |
+
### 1.1 Showcases (512x320)
|
35 |
+
<table class="center">
|
36 |
+
<tr style="font-weight: bolder;text-align:center;">
|
37 |
+
<td>Input starting frame</td>
|
38 |
+
<td>Input ending frame</td>
|
39 |
+
<td>Generated video</td>
|
40 |
+
</tr>
|
41 |
+
<tr>
|
42 |
+
<td>
|
43 |
+
<img src=assets/72109_125.mp4_00-00.png width="250">
|
44 |
+
</td>
|
45 |
+
<td>
|
46 |
+
<img src=assets/72109_125.mp4_00-01.png width="250">
|
47 |
+
</td>
|
48 |
+
<td>
|
49 |
+
<img src=assets/00.gif width="250">
|
50 |
+
</td>
|
51 |
+
</tr>
|
52 |
+
|
53 |
+
|
54 |
+
<tr>
|
55 |
+
<td>
|
56 |
+
<img src=assets/Japan_v2_2_062266_s2_frame1.png width="250">
|
57 |
+
</td>
|
58 |
+
<td>
|
59 |
+
<img src=assets/Japan_v2_2_062266_s2_frame3.png width="250">
|
60 |
+
</td>
|
61 |
+
<td>
|
62 |
+
<img src=assets/03.gif width="250">
|
63 |
+
</td>
|
64 |
+
</tr>
|
65 |
+
<tr>
|
66 |
+
<td>
|
67 |
+
<img src=assets/Japan_v2_1_070321_s3_frame1.png width="250">
|
68 |
+
</td>
|
69 |
+
<td>
|
70 |
+
<img src=assets/Japan_v2_1_070321_s3_frame3.png width="250">
|
71 |
+
</td>
|
72 |
+
<td>
|
73 |
+
<img src=assets/02.gif width="250">
|
74 |
+
</td>
|
75 |
+
</tr>
|
76 |
+
<tr>
|
77 |
+
<td>
|
78 |
+
<img src=assets/74302_1349_frame1.png width="250">
|
79 |
+
</td>
|
80 |
+
<td>
|
81 |
+
<img src=assets/74302_1349_frame3.png width="250">
|
82 |
+
</td>
|
83 |
+
<td>
|
84 |
+
<img src=assets/01.gif width="250">
|
85 |
+
</td>
|
86 |
+
</tr>
|
87 |
+
</table>
|
88 |
+
|
89 |
+
### 1.2 Sparse sketch guidance
|
90 |
+
<table class="center">
|
91 |
+
<tr style="font-weight: bolder;text-align:center;">
|
92 |
+
<td>Input starting frame</td>
|
93 |
+
<td>Input ending frame</td>
|
94 |
+
<td>Input sketch guidance</td>
|
95 |
+
<td>Generated video</td>
|
96 |
+
</tr>
|
97 |
+
<tr>
|
98 |
+
<td>
|
99 |
+
<img src=assets/72105_388.mp4_00-00.png width="200">
|
100 |
+
</td>
|
101 |
+
<td>
|
102 |
+
<img src=assets/72105_388.mp4_00-01.png width="200">
|
103 |
+
</td>
|
104 |
+
<td>
|
105 |
+
<img src=assets/06.gif width="200">
|
106 |
+
</td>
|
107 |
+
<td>
|
108 |
+
<img src=assets/07.gif width="200">
|
109 |
+
</td>
|
110 |
+
</tr>
|
111 |
+
|
112 |
+
<tr>
|
113 |
+
<td>
|
114 |
+
<img src=assets/72110_255.mp4_00-00.png width="200">
|
115 |
+
</td>
|
116 |
+
<td>
|
117 |
+
<img src=assets/72110_255.mp4_00-01.png width="200">
|
118 |
+
</td>
|
119 |
+
<td>
|
120 |
+
<img src=assets/12.gif width="200">
|
121 |
+
</td>
|
122 |
+
<td>
|
123 |
+
<img src=assets/13.gif width="200">
|
124 |
+
</td>
|
125 |
+
</tr>
|
126 |
+
|
127 |
+
|
128 |
+
</table>
|
129 |
+
|
130 |
+
|
131 |
+
### 2. Applications
|
132 |
+
#### 2.1 Cartoon Sketch Interpolation (see project page for more details)
|
133 |
+
<table class="center">
|
134 |
+
<tr style="font-weight: bolder;text-align:center;">
|
135 |
+
<td>Input starting frame</td>
|
136 |
+
<td>Input ending frame</td>
|
137 |
+
<td>Generated video</td>
|
138 |
+
</tr>
|
139 |
+
|
140 |
+
<tr>
|
141 |
+
<td>
|
142 |
+
<img src=assets/frame0001_10.png width="250">
|
143 |
+
</td>
|
144 |
+
<td>
|
145 |
+
<img src=assets/frame0016_10.png width="250">
|
146 |
+
</td>
|
147 |
+
<td>
|
148 |
+
<img src=assets/10.gif width="250">
|
149 |
+
</td>
|
150 |
+
</tr>
|
151 |
+
|
152 |
+
|
153 |
+
<tr>
|
154 |
+
<td>
|
155 |
+
<img src=assets/frame0001_11.png width="250">
|
156 |
+
</td>
|
157 |
+
<td>
|
158 |
+
<img src=assets/frame0016_11.png width="250">
|
159 |
+
</td>
|
160 |
+
<td>
|
161 |
+
<img src=assets/11.gif width="250">
|
162 |
+
</td>
|
163 |
+
</tr>
|
164 |
+
|
165 |
+
</table>
|
166 |
+
|
167 |
+
|
168 |
+
#### 2.2 Reference-based Sketch Colorization
|
169 |
+
<table class="center">
|
170 |
+
<tr style="font-weight: bolder;text-align:center;">
|
171 |
+
<td>Input sketch</td>
|
172 |
+
<td>Input reference</td>
|
173 |
+
<td>Colorization results</td>
|
174 |
+
</tr>
|
175 |
+
|
176 |
+
<tr>
|
177 |
+
<td>
|
178 |
+
<img src=assets/04.gif width="250">
|
179 |
+
</td>
|
180 |
+
<td>
|
181 |
+
<img src=assets/frame0001_05.png width="250">
|
182 |
+
</td>
|
183 |
+
<td>
|
184 |
+
<img src=assets/05.gif width="250">
|
185 |
+
</td>
|
186 |
+
</tr>
|
187 |
+
|
188 |
+
|
189 |
+
<tr>
|
190 |
+
<td>
|
191 |
+
<img src=assets/08.gif width="250">
|
192 |
+
</td>
|
193 |
+
<td>
|
194 |
+
<img src=assets/frame0001_09.png width="250">
|
195 |
+
</td>
|
196 |
+
<td>
|
197 |
+
<img src=assets/09.gif width="250">
|
198 |
+
</td>
|
199 |
+
</tr>
|
200 |
+
|
201 |
+
</table>
|
202 |
+
|
203 |
+
|
204 |
+
|
205 |
+
|
206 |
+
|
207 |
+
|
208 |
+
|
209 |
+
## 📝 Changelog
|
210 |
+
- [ ] Add sketch control and colorization function.
|
211 |
+
- __[2024.05.29]__: 🔥🔥 Release code and model weights.
|
212 |
+
- __[2024.05.28]__: Launch the project page and update the arXiv preprint.
|
213 |
+
<br>
|
214 |
+
|
215 |
|
216 |
## 🧰 Models
|
217 |
|
218 |
|Model|Resolution|GPU Mem. & Inference Time (A100, ddim 50steps)|Checkpoint|
|
219 |
|:---------|:---------|:--------|:--------|
|
220 |
|ToonCrafter_512|320x512| TBD (`perframe_ae=True`)|[Hugging Face](https://huggingface.co/Doubiiu/ToonCrafter/blob/main/model.ckpt)|
|
|
|
221 |
|
222 |
|
223 |
+
Currently, ToonCrafter supports generating videos of up to 16 frames at a resolution of 512x320. Inference time can be reduced by using fewer DDIM steps.
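
To illustrate why fewer DDIM steps shorten inference, here is a minimal sketch of a uniformly strided DDIM schedule. It is not ToonCrafter's own sampler code: the function name `ddim_schedule` and the 1000 training timesteps are assumptions chosen for illustration. Each entry in the schedule corresponds to one denoiser pass, so halving the step count roughly halves the denoising work (encoding and decoding costs stay the same).

```python
def ddim_schedule(num_train_steps: int, num_ddim_steps: int) -> list[int]:
    """Pick an evenly strided subset of training timesteps, highest first.

    Hypothetical helper for illustration; one entry = one denoiser pass.
    """
    if not 0 < num_ddim_steps <= num_train_steps:
        raise ValueError("num_ddim_steps must be in 1..num_train_steps")
    stride = num_train_steps // num_ddim_steps
    # Truncate in case the stride does not divide the range evenly.
    steps = list(range(0, num_train_steps, stride))[:num_ddim_steps]
    return steps[::-1]


# Assumed 1000 training timesteps; compare the default 50 steps with 25.
full = ddim_schedule(1000, 50)
fast = ddim_schedule(1000, 25)
print(len(full), len(fast))  # number of denoiser passes for each setting
```

The trade-off is that a coarser schedule skips more timesteps per update, which can lower output quality, so the step count is worth tuning per use case.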
|
224 |
|
225 |
|
226 |
|
|
|
235 |
|
236 |
|
237 |
## 💫 Inference
|
238 |
+
### 1. Command line
|
239 |
+
|
240 |
+
Download the pretrained ToonCrafter_512 checkpoint and place `model.ckpt` at `checkpoints/tooncrafter_512_interp_v1/model.ckpt`.
|
241 |
+
```bash
|
242 |
+
sh scripts/run.sh
|
243 |
+
```
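
The checkpoint placement step can also be scripted. The following is a hedged sketch, not part of the repository: the helper names `checkpoint_ready` and `fetch_checkpoint` are hypothetical, and the download path assumes the `huggingface_hub` package is installed. The repo id and filename are taken from the checkpoint link in the Models table.

```python
from pathlib import Path

# Target path from the instruction above; repo id and filename from the Models table link.
CKPT_PATH = Path("checkpoints/tooncrafter_512_interp_v1/model.ckpt")
REPO_ID = "Doubiiu/ToonCrafter"
FILENAME = "model.ckpt"


def checkpoint_ready(path: Path = CKPT_PATH) -> bool:
    """Return True if the checkpoint file exists and is non-empty."""
    return path.is_file() and path.stat().st_size > 0


def fetch_checkpoint(path: Path = CKPT_PATH) -> Path:
    """Download the checkpoint into place only if it is missing.

    Hypothetical helper; assumes huggingface_hub is installed.
    """
    if not checkpoint_ready(path):
        # Imported lazily so the existence check works without the package.
        from huggingface_hub import hf_hub_download

        path.parent.mkdir(parents=True, exist_ok=True)
        # With local_dir set, the file is written to <local_dir>/<filename>.
        hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=str(path.parent))
    return path
```

Calling `fetch_checkpoint()` once before `sh scripts/run.sh` should leave the file where the script expects it; if the file is already present, nothing is downloaded.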
|
244 |
|
|
|
|
|
|
|
245 |
|
246 |
+
### 2. Local Gradio demo
|
247 |
+
|
248 |
+
Download the pretrained model and place it in the directory given in the command-line instructions above.
|
249 |
```bash
|
250 |
python gradio_app.py
|
251 |
```
|
252 |
+
|
253 |
+
|
254 |
+
|
255 |
+
|
256 |
+
|
257 |
+
|
258 |
+
<!-- ## 🤝 Community Support -->
|
259 |
+
|
260 |
+
|
261 |
+
|
262 |
+
<a name="disc"></a>
|
263 |
+
## 📢 Disclaimer
|
264 |
+
Calm down. Our framework opens up the era of generative cartoon interpolation, but due to the variety of the generative video prior, the success rate is not guaranteed.
|
265 |
+
|
266 |
+
⚠️ This is an open-source research exploration, not a commercial product. It can't meet all your expectations.
|
267 |
+
|
268 |
+
This project strives to impact the domain of AI-driven video generation positively. Users are granted the freedom to create videos using this tool, but they are expected to comply with local laws and utilize it responsibly. The developers do not assume any responsibility for potential misuse by users.
|
269 |
+
****
|