Cosmos Tokenizer Collection • A suite of image and video tokenizers • 13 items • Updated 2 days ago • 36
Generalized Gaussian Model for Learned Image Compression Paper • 2411.19320 • Published Nov 28, 2024 • 1
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token Paper • 2412.06676 • Published about 1 month ago • 9
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model Paper • 2411.17459 • Published Nov 26, 2024 • 10
Occam's Razor for Self Supervised Learning: What is Sufficient to Learn Good Representations? Paper • 2406.10743 • Published Jun 15, 2024 • 1
Wavelet Latent Diffusion (WaLa): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings Paper • 2411.08017 • Published Nov 12, 2024 • 11
Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models Paper • 2411.07126 • Published Nov 11, 2024 • 28
AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation Paper • 2411.04967 • Published Nov 7, 2024 • 1
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Paper • 2410.13863 • Published Oct 17, 2024 • 37
You Don't Need Data-Augmentation in Self-Supervised Learning Paper • 2406.09294 • Published Jun 13, 2024 • 1
JPEG-LM: LLMs as Image Generators with Canonical Codec Representations Paper • 2408.08459 • Published Aug 15, 2024 • 45
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels Paper • 2406.09415 • Published Jun 13, 2024 • 50
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8, 2024 • 156