Update README.md
README.md
@@ -1,3 +1,78 @@
---
license: apache-2.0
---

# <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span>: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback

## 4-step Text-to-video Generation

<table class="center">
<tr>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0273.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0054.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0262.mp4" type="video/mp4"></video>
</td>
</tr>
<tr>
<td style="text-align:center;" width="320">With the style of low-poly game art, A majestic, white horse gallops gracefully across a moonlit beach.</td>
<td style="text-align:center;" width="320">medium shot of Christine, a beautiful 25-year-old brunette resembling Selena Gomez, anxiously looking up as she walks down a New York street, cinematic style</td>
<td style="text-align:center;" width="320">a cartoon pig playing his guitar, Andrew Warhol style</td>
</tr>
</table>

<table class="center">
<tr>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0023.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0021.mp4" type="video/mp4"></video>
</td>
<td>
<video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0064.mp4" type="video/mp4"></video>
</td>
</tr>
<tr>
<td style="text-align:center;" width="320">a dog wearing vr goggles on a boat</td>
<td style="text-align:center;" width="320">Pikachu snowboarding</td>
<td style="text-align:center;" width="320">a girl floating underwater </td>
</tr>
</table>

## Model description 🚀

This repository contains `unet_lora.pt`, which turns [VideoCrafter2](https://ailab-cvc.github.io/videocrafter2/) into our <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2). <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2) achieves fast, high-quality text-to-video (T2V) generation. On [VBench](https://vchitect.github.io/VBench-project/), 4-step generation from <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2) even outperforms proprietary systems, including [Gen-2](https://research.runwayml.com/gen2) and [Pika](https://pika.art/). Please refer to our [GitHub repo](https://github.com/Ji4chenLi/t2v-turbo) for detailed instructions.

## Usage 👓

*This checkpoint is obtained by merging the UNet [LoRA weights](https://huggingface.co/jiachenli-ucsb/T2V-Turbo-VC2) into the UNet of [VideoCrafter2](https://huggingface.co/VideoCrafter/VideoCrafter2/). The checkpoint here is therefore also released under the Apache-2.0 license.*
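
For readers unfamiliar with LoRA merging, the note above can be illustrated with a minimal sketch of the general technique: a low-rank update is added to a base weight matrix. This is not the script used to produce this checkpoint. The names `merge_lora`, `lora_up`, `lora_down`, and `scale` are hypothetical, and the actual scaling factor and tensor layout used for T2V-Turbo are not stated here.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two dense matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight: Matrix, lora_up: Matrix, lora_down: Matrix, scale: float = 1.0) -> Matrix:
    """Return weight + scale * (lora_up @ lora_down) without mutating the inputs.

    Assumption: this is the generic LoRA merge rule, not the repository's own code.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x2 base weight with a rank-1 update.
base = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape (2, 1)
down = [[0.5, 0.5]]    # shape (1, 2)
merged = merge_lora(base, up, down, scale=1.0)
print(merged)  # [[1.5, 0.5], [1.0, 2.0]]
```

Once merged, the low-rank factors are no longer needed at inference time, which is why a single merged checkpoint file suffices.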

You first need to clone our [GitHub repo](https://github.com/Ji4chenLi/t2v-turbo). The following code loads the checkpoint.
```py
from omegaconf import OmegaConf

from utils.common_utils import load_model_checkpoint
from utils.utils import instantiate_from_config

# Build the base VideoCrafter2 model from the inference config.
config = OmegaConf.load("configs/inference_t2v_512_v2.0.yaml")
model_config = config.pop("model", OmegaConf.create())
pretrained_t2v = instantiate_from_config(model_config)

# Rebuild the UNet with time_cond_proj_dim set to 256, then swap it into the model.
unet_config = model_config["params"]["unet_config"]
unet_config["params"]["time_cond_proj_dim"] = 256
unet = instantiate_from_config(unet_config)
pretrained_t2v.model.diffusion_model = unet

# Load the merged T2V-Turbo (VC2) weights.
pretrained_t2v = load_model_checkpoint(pretrained_t2v, "checkpoints/t2v_turbo_vc2.pt")
```

## Misuse, Malicious Use and Excessive Use 📖
Our model is meant for research purposes.

- It is prohibited to generate content that is demeaning or harmful to people or their environment, culture, religion, etc.
- It is prohibited to generate pornographic, violent, or gory content.
- It is prohibited to generate erroneous or false information.