license: mit

🌿 Vanilla Autoregressive Models are Scalable Image Generators

We introduce 🌿 Vanilla, a new family of image generation models that applies the next-token prediction paradigm of large language models to the visual generation domain. It is an affirmative answer to the question of whether vanilla autoregressive models, without inductive biases on visual signals, can achieve state-of-the-art image generation performance if scaled properly. We reexamine the design spaces of image tokenizers and image generation models, along with their scalability properties.
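
As a rough illustration of next-token prediction applied to images (not the code behind the hosted checkpoints), the sketch below generates a grid of discrete image tokens one at a time. The `toy_logits` function is a hypothetical stand-in for a trained model, and the raster ordering, grid size, and vocabulary size are assumptions chosen only for this example.

```python
import math
import random


def toy_logits(prefix, vocab_size):
    """Hypothetical stand-in for a trained autoregressive model.

    Returns one logit per codebook entry, conditioned on the tokens so far.
    """
    last = prefix[-1] if prefix else 0
    # Favor tokens close to the previous one so the output depends on the prefix.
    return [-abs(t - last) / 2.0 for t in range(vocab_size)]


def sample_next(logits, rng):
    """Sample a token index from softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - m) for l in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate_image_tokens(height, width, vocab_size, seed=0):
    """Generate a height x width grid of image tokens, one token at a time."""
    rng = random.Random(seed)
    tokens = []
    for _ in range(height * width):
        logits = toy_logits(tokens, vocab_size)
        tokens.append(sample_next(logits, rng))
    # Reshape the flat sequence into rows (assumed raster order).
    return [tokens[r * width:(r + 1) * width] for r in range(height)]


grid = generate_image_tokens(height=4, width=4, vocab_size=16)
```

In a full pipeline, an image tokenizer's decoder would typically map such a token grid back to pixels; that step is omitted here.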

This repo hosts Vanilla's checkpoints.

For more details or tutorials, see https://github.com/FoundationVision/Vanilla