# U-DiT Models
These are the official U-DiT models from our work "U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers". The models are trained on the ImageNet 256x256 dataset for the iteration counts listed below.

## Model Details
| Model Name | FLOPs (G) | Training Iters | FID |
|---|---|---|---|
| U-DiT-S | 6.04 | 400K | 31.51 |
| U-DiT-B | 22.22 | 400K | 16.64 |
| U-DiT-L | 85.00 | 400K | 10.08 |
| U-DiT-B | 22.22 | 1M | 12.87 |
| U-DiT-L | 85.00 | 1M | 7.54 |
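The FLOPs savings come from the idea in the paper's title: downsampling the token grid before self-attention. Below is a minimal NumPy sketch of a 2x2 token downsampler to illustrate the concept; the function name, shapes, and reshape scheme are illustrative assumptions, not the official implementation.

```python
import numpy as np

# Illustrative sketch (not the official U-DiT code): fold a (H, W, C) grid of
# tokens into a (H/2, W/2, 4C) grid, so attention runs over 4x fewer tokens
# with 4x wider features.
def downsample_tokens(tokens: np.ndarray) -> np.ndarray:
    H, W, C = tokens.shape
    assert H % 2 == 0 and W % 2 == 0, "token grid must be divisible by 2"
    t = tokens.reshape(H // 2, 2, W // 2, 2, C)   # split into 2x2 patches
    t = t.transpose(0, 2, 1, 3, 4)                # group each patch's 4 tokens
    return t.reshape(H // 2, W // 2, 4 * C)       # concatenate along channels

grid = np.arange(16 * 16 * 8).reshape(16, 16, 8).astype(np.float32)
down = downsample_tokens(grid)
print(down.shape)  # (8, 8, 32): 64 tokens instead of 256
```

Since self-attention cost grows quadratically with the number of tokens, quartering the token count in this way cuts the attention cost by roughly 16x, which is the kind of saving reflected in the FLOPs column above.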
## Citation
If you find this model useful, please cite:
```bibtex
@misc{tian2024udits,
  title={U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers},
  author={Yuchuan Tian and Zhijun Tu and Hanting Chen and Jie Hu and Chao Xu and Yunhe Wang},
  year={2024},
  eprint={2405.02730},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```