jiachenli-ucsb/T2V-Turbo-VC2-Merged

jiachenli-ucsb committed on Jun 3, 2024

Commit

97213f1

1 Parent(s): 88890c5

Update README.md

README.md +78 -3

@@ -1,3 +1,78 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+---
+# <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span>: Breaking the Quality Bottleneck of Video Consistency Model with Mixed Reward Feedback
+## 4-step Text-to-video Generation
+<table class="center">
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0273.mp4" type="video/mp4"></video>
+  </td>
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0054.mp4" type="video/mp4"></video>
+  </td>
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0262.mp4" type="video/mp4"></video>
+  </td>
+  <tr>
+  <td style="text-align:center;" width="320">With the style of low-poly game art, A majestic, white horse gallops gracefully across a moonlit beach.</td>
+  <td style="text-align:center;" width="320">medium shot of Christine, a beautiful 25-year-old brunette resembling Selena Gomez, anxiously looking up as she walks down a New York street, cinematic style</td>
+  <td style="text-align:center;" width="320">a cartoon pig playing his guitar, Andrew Warhol style</td>
+  </tr>
+</table>
+<table class="center">
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0023.mp4" type="video/mp4"></video>
+  </td>
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0021.mp4" type="video/mp4"></video>
+  </td>
+  <td>
+  <video poster="" autoplay controls muted loop height="150%" playbackRate=2.0><source src="https://rg-lcd.s3.amazonaws.com/t2v-turbo-demo/t2v-turbo-vc2/4steps/0064.mp4" type="video/mp4"></video>
+  </td>
+  <tr>
+  <td style="text-align:center;" width="320">a dog wearing vr goggles on a boat</td>
+  <td style="text-align:center;" width="320">Pikachu snowboarding</td>
+  <td style="text-align:center;" width="320">a girl floating underwater </td>
+  </tr>
+</table>
+## Model description 🚀
+This repository contains `unet_lora.pt`, which turns [VideoCrafter2](https://ailab-cvc.github.io/videocrafter2/) into our <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2). <span style="font-family: 'Courier New', monospace; font-weight: bold">T2V-Turbo</span> (VC2) achieves both fast and high-quality T2V generation. On [VBench](https://vchitect.github.io/VBench-project/), its 4-step generation even outperforms proprietary systems, including [Gen-2](https://research.runwayml.com/gen2) and [Pika](https://pika.art/). Please refer to our [GitHub repo](https://github.com/Ji4chenLi/t2v-turbo) for detailed instructions.
+## Usage 👓
+*This checkpoint is obtained by merging the UNet [LoRA weights](https://huggingface.co/jiachenli-ucsb/T2V-Turbo-VC2) into the UNet of [VideoCrafter2](https://huggingface.co/VideoCrafter/VideoCrafter2/). Therefore, the checkpoint here is also released under the apache-2.0 license.*
+First clone our [GitHub repo](https://github.com/Ji4chenLi/t2v-turbo). The following code, run from the repository root, loads the checkpoint.
+```py
+from omegaconf import OmegaConf
+
+from utils.common_utils import load_model_checkpoint
+from utils.utils import instantiate_from_config
+
+# Build the base text-to-video model from the inference config in the repo.
+config = OmegaConf.load("configs/inference_t2v_512_v2.0.yaml")
+model_config = config.pop("model", OmegaConf.create())
+pretrained_t2v = instantiate_from_config(model_config)
+
+# Re-create the UNet with a 256-dim time-conditioning projection and swap it in.
+unet_config = model_config["params"]["unet_config"]
+unet_config["params"]["time_cond_proj_dim"] = 256
+unet = instantiate_from_config(unet_config)
+pretrained_t2v.model.diffusion_model = unet
+
+# Load the merged T2V-Turbo (VC2) weights into the model.
+pretrained_t2v = load_model_checkpoint(pretrained_t2v, "checkpoints/t2v_turbo_vc2.pt")
+```
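The usage note above says this checkpoint was produced by merging LoRA weights into the VideoCrafter2 UNet. The sketch below illustrates what such a merge means in general for a single linear layer: the low-rank update is folded into the dense weight so that no adapter is needed at inference time. It is not the authors' merge script. The function name `merge_lora`, the matrix names, the shapes, and the `scale` factor are all assumptions chosen for illustration, and NumPy stands in for the actual tensors.

```python
import numpy as np


def merge_lora(base_weight, lora_down, lora_up, scale=1.0):
    """Fold a low-rank LoRA update into a dense weight matrix.

    Hypothetical helper for illustration only.
    base_weight: (out_dim, in_dim)
    lora_down:   (rank, in_dim)
    lora_up:     (out_dim, rank)
    """
    return base_weight + scale * (lora_up @ lora_down)


# Toy shapes (assumed): one 8x6 layer with a rank-2 adapter.
rng = np.random.default_rng(0)
out_dim, in_dim, rank = 8, 6, 2
W = rng.normal(size=(out_dim, in_dim))
A = rng.normal(size=(rank, in_dim))   # LoRA "down" projection
B = rng.normal(size=(out_dim, rank))  # LoRA "up" projection
x = rng.normal(size=(in_dim,))

# Output with the adapter applied as a separate branch.
y_adapter = W @ x + 0.5 * (B @ (A @ x))

# Output with the adapter folded into the weight once, ahead of time.
W_merged = merge_lora(W, A, B, scale=0.5)
y_merged = W_merged @ x

# Both paths give the same result up to floating-point error.
assert np.allclose(y_adapter, y_merged)
```

Because the merged weight has the same shape as the original, a merged checkpoint can be loaded with the ordinary checkpoint-loading path, as in the snippet above, without any LoRA-specific code.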
+## Misuse, Malicious Use and Excessive Use 📖
+Our model is meant for research purposes.
+- It is prohibited to generate content that is demeaning or harmful to people or their environment, culture, religion, etc.
+- It is prohibited to generate pornographic, violent, or gory content.
+- It is prohibited to generate erroneous or false information.