---
license: apache-2.0
---

## Latte: Latent Diffusion Transformer for Video Generation

This repo contains the pre-trained text-to-video generation weights for our paper exploring latent diffusion models with transformers (Latte). You can find more visualizations on our [project page](https://maxin-cn.github.io/latte_project/).
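
To pull the weights programmatically, here is a minimal sketch, assuming the `huggingface_hub` and `torch` packages are installed (the filename matches the updated checkpoint listed in the news below):

```python
import torch
from huggingface_hub import hf_hub_download

# Download the LatteT2V checkpoint from this repo and load it on CPU.
# The exact layout of the loaded object depends on how the checkpoint was saved.
ckpt_path = hf_hub_download(repo_id="maxin-cn/Latte", filename="t2v_v20240523.pt")
state_dict = torch.load(ckpt_path, map_location="cpu")
```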

## News
- (🔥 New) May 23, 2024. 💥 The updated LatteT2V model is released [here](https://huggingface.co/maxin-cn/Latte/blob/main/t2v_v20240523.pt). If you want to use the updated model to generate images directly, please make sure `video_length=1`, `enable_temporal_attentions=True`, and `enable_vae_temporal_decoder=False` in [t2v_sample.yaml](configs/t2v/t2v_sample.yaml); see the sketch after this list.

- (🔥 New) Mar. 20, 2024. 💥 An updated LatteT2V model is coming soon, stay tuned!

- (🔥 New) Feb. 24, 2024. 💥 We are very grateful that researchers and developers like our work. We will keep updating our LatteT2V model, hoping our efforts help the community grow. Our Latte [discord](https://discord.gg/RguYqhVU92) channel has been created for discussion, and contributions are welcome.

- (🔥 New) Jan. 9, 2024. 💥 An updated LatteT2V model initialized with [PixArt-α](https://github.com/PixArt-alpha/PixArt-alpha) is released; the checkpoint can be found [here](https://huggingface.co/maxin-cn/Latte/resolve/main/t2v.pt?download=true).

- (🔥 New) Oct. 31, 2023. 💥 The training and inference code is released. All checkpoints (including FaceForensics, SkyTimelapse, UCF101, and Taichi-HD) can be found [here](https://huggingface.co/maxin-cn/Latte/tree/main). In addition, the LatteT2V inference code is provided.
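
For the single-image settings called out in the May 23 entry, here is a hedged sketch of the override, assuming the repo's YAML configs can be read and written with `omegaconf` (an assumption; editing [t2v_sample.yaml](configs/t2v/t2v_sample.yaml) by hand works just as well):

```python
from omegaconf import OmegaConf

# Load the sampling config and override the three keys named in the news item.
config = OmegaConf.load("configs/t2v/t2v_sample.yaml")
config.video_length = 1                     # a single frame, i.e. an image
config.enable_temporal_attentions = True    # keep temporal attention enabled
config.enable_vae_temporal_decoder = False  # decode with the image VAE
# Hypothetical output path; point the inference script at this file.
OmegaConf.save(config, "configs/t2v/t2v_sample_image.yaml")
```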

## Contact Us
**Yaohui Wang**: [wangyaohui@pjlab.org.cn](mailto:wangyaohui@pjlab.org.cn)
**Xin Ma**: [xin.ma1@monash.edu](mailto:xin.ma1@monash.edu)

## Citation
If you find this work useful for your research, please consider citing it.
```bibtex
@article{ma2024latte,
  title={Latte: Latent Diffusion Transformer for Video Generation},
  author={Ma, Xin and Wang, Yaohui and Jia, Gengyun and Chen, Xinyuan and Liu, Ziwei and Li, Yuan-Fang and Chen, Cunjian and Qiao, Yu},
  journal={arXiv preprint arXiv:2401.03048},
  year={2024}
}
```

## Acknowledgments
Latte has been greatly inspired by the following amazing works and teams: [DiT](https://github.com/facebookresearch/DiT) and [PixArt-α](https://github.com/PixArt-alpha/PixArt-alpha). We thank all the contributors for open-sourcing their work.