Flowing from Words to Pixels: A Framework for Cross-Modality Evolution Paper • 2412.15213 • Published 1 day ago
The effectiveness of MAE pre-pretraining for billion-scale pretraining Paper • 2303.13496 • Published Mar 23, 2023
Revisiting Weakly Supervised Pre-Training of Visual Perception Models Paper • 2201.08371 • Published Jan 20, 2022
Self-supervised Pretraining of Visual Features in the Wild Paper • 2103.01988 • Published Mar 2, 2021
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning Paper • 2311.10709 • Published Nov 17, 2023