jadechoghari committed on
Commit 1dff7ce · verified · 1 Parent(s): d983752

Update README.md

  1. README.md +75 -4
README.md CHANGED
@@ -1,5 +1,76 @@
- ---
- library_name: diffusers
- ---

- This will be the Hugging Face implementation of MAR
+ ---
+ library_name: diffusers
+ license: mit
+ ---

+ # Autoregressive Image Generation without Vector Quantization
+
+ ## About
+ MAR is an autoregressive image generation model that does not require vector quantization.
+ Instead of relying on discrete tokens, the model operates in a continuous-valued space and uses a diffusion process to model the per-token probability distribution.
+ Training with a Diffusion Loss function lets the model generate high-quality images while keeping the speed advantages of autoregressive sequence modeling.
+ Because no discrete tokenizer is needed, the approach also applies to continuous-valued domains beyond image synthesis.
+ It is based on the paper [Autoregressive Image Generation without Vector Quantization](https://arxiv.org/abs/2406.11838).
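To make the per-token Diffusion Loss idea concrete, here is a minimal sketch in plain Python. It follows the standard denoising-diffusion objective: noise a continuous token `x` at timestep `t`, ask a noise predictor conditioned on a vector `z` to recover the noise, and penalize the squared error. The names `make_alpha_bar`, `noise_token`, `diffusion_loss`, and `predict_noise`, the linear noise schedule, and the toy values are all assumptions made for illustration; this is not the code shipped in this repository.

```python
import math
import random


def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return the cumulative products alpha_bar_t for a linear beta schedule.

    The linear schedule and its endpoints are illustrative assumptions.
    """
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def noise_token(x, eps, alpha_bar_t):
    """Forward diffusion: x_t = sqrt(alpha_bar_t) * x + sqrt(1 - alpha_bar_t) * eps."""
    a = math.sqrt(alpha_bar_t)
    b = math.sqrt(1.0 - alpha_bar_t)
    return [a * xi + b * ei for xi, ei in zip(x, eps)]


def diffusion_loss(x, z, t, alpha_bar, predict_noise, rng):
    """Squared L2 error between the sampled noise and the predicted noise.

    x: continuous-valued token (list of floats)
    z: conditioning vector produced by the autoregressive model
    predict_noise: hypothetical callable (x_t, t, z) -> predicted noise
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x]
    x_t = noise_token(x, eps, alpha_bar[t])
    eps_hat = predict_noise(x_t, t, z)
    return sum((e - eh) ** 2 for e, eh in zip(eps, eps_hat))


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bar = make_alpha_bar(1000)
    token = [0.5, -1.0, 2.0]  # toy continuous-valued token
    condition = [0.1, 0.2]    # toy conditioning vector z

    def zero_predictor(x_t, t, z):
        # Untrained stand-in that always predicts zero noise.
        return [0.0] * len(x_t)

    print(diffusion_loss(token, condition, 500, alpha_bar, zero_predictor, rng))
```

In a real model the predictor would be a small trained network and the loss would be averaged over many tokens and randomly sampled timesteps; the stand-in predictor here only shows the shape of the computation.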
+
+ ## Usage
+ Load the model through the Hugging Face `DiffusionPipeline`. You can optionally customize parameters such as the model type, the number of autoregressive steps, and the class labels.
+
+ ```python
+ from diffusers import DiffusionPipeline
+
+ # load the pretrained model
+ pipeline = DiffusionPipeline.from_pretrained("jadechoghari/mar", trust_remote_code=True, custom_pipeline="jadechoghari/mar")
+
+ # generate an image with the model
+ generated_image = pipeline(
+     model_type="mar_base",  # choose from 'mar_base', 'mar_large', or 'mar_huge'
+     seed=42,  # set a seed for reproducibility
+     num_ar_steps=64,  # number of autoregressive steps
+     class_labels=[207, 360, 388],  # provide valid ImageNet class labels
+     cfg_scale=4,  # classifier-free guidance scale
+     output_dir="./images",  # directory to save generated images
+ )
+
+ # display the generated image
+ generated_image.show()
+ ```
+
+ <p align="center">
+ <img src="https://github.com/LTH14/mar/raw/main/demo/visual.png" width="500">
+ </p>
+
+ The code above loads the model, generates images for the given class labels, and saves them to the directory given by `output_dir`.
+
+ We offer three pre-trained MAR models in `safetensors` format:
+ - `mar-base.safetensors`
+ - `mar-large.safetensors`
+ - `mar-huge.safetensors`
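Note that the `model_type` values in the usage example use underscores, while the checkpoint files use hyphens. The sketch below shows one way to map between the two and to check class labels before calling the pipeline. `checkpoint_filename` and `validate_class_labels` are hypothetical helpers, not part of this repository or of `diffusers`, and the default of 1000 classes assumes ImageNet-1k label indices.

```python
MODEL_TYPES = ("mar_base", "mar_large", "mar_huge")


def checkpoint_filename(model_type):
    """Map a model type such as 'mar_base' to its checkpoint file name."""
    if model_type not in MODEL_TYPES:
        raise ValueError(
            f"unknown model_type {model_type!r}; expected one of {MODEL_TYPES}"
        )
    return model_type.replace("_", "-") + ".safetensors"


def validate_class_labels(labels, num_classes=1000):
    """Return the labels as a list, or raise if any falls outside [0, num_classes).

    num_classes=1000 is an assumption (ImageNet-1k).
    """
    bad = [
        label
        for label in labels
        if not isinstance(label, int) or not 0 <= label < num_classes
    ]
    if bad:
        raise ValueError(f"invalid class labels: {bad}")
    return list(labels)


if __name__ == "__main__":
    print(checkpoint_filename("mar_large"))
    print(validate_class_labels([207, 360, 388]))
```

Checking inputs up front gives a clear error message instead of a failure somewhere inside the generation loop.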
48
+
49
+
50
+ <!-- <p align="center">
51
+ <img src="https://github.com/LTH14/mar/raw/main/demo/visual.png" width="720">
52
+ </p> -->
53
+
54
+ This is a Hugging Face Diffusers/GPU implementation of the paper [Autoregressive Image Generation without Vector Quantization](https://arxiv.org/abs/2406.11838)
55
+
56
+ The Official PyTorch Implementation is released in [this repository](https://github.com/LTH14/mar)
57
+
58
+ ```
59
+ @article{li2024autoregressive,
60
+ title={Autoregressive Image Generation without Vector Quantization},
61
+ author={Li, Tianhong and Tian, Yonglong and Li, He and Deng, Mingyang and He, Kaiming},
62
+ journal={arXiv preprint arXiv:2406.11838},
63
+ year={2024}
64
+ }
65
+ ```
+
+ ## Acknowledgements
+ We thank Congyue Deng and Xinlei Chen for helpful discussions. We thank Google TPU Research Cloud (TRC) for granting us access to TPUs, and Google Cloud Platform for supporting GPU resources.
+
+ A large portion of the code in this repo is based on [MAE](https://github.com/facebookresearch/mae), [MAGE](https://github.com/LTH14/mage) and [DiT](https://github.com/facebookresearch/DiT).
+
+ ## Contact
+
+ If you have any questions, feel free to contact me by email (tianhong@mit.edu). Enjoy!