---
license: mit
---
# 🌿 Vanilla Autoregressive Models are Scalable Image Generators

We introduce 🌿 Vanilla, a new family of image generation models that applies the next-token prediction paradigm of large language models to the visual generation domain. It is an affirmative answer to the question of whether vanilla autoregressive models, without inductive biases on visual signals, can achieve state-of-the-art image generation performance when scaled properly. We reexamine the design spaces of image tokenizers and image generation models, along with their scalability properties.

This repo hosts Vanilla's checkpoints.

For more details and tutorials, see https://github.com/FoundationVision/Vanilla