MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning
Abstract
Diffusion models have emerged as frontrunners in text-to-image generation for their impressive capabilities. Nonetheless, their fixed image resolution during training often leads to challenges in high-resolution image generation, such as semantic inaccuracies and object replication. This paper introduces MegaFusion, a novel approach that extends existing diffusion-based text-to-image generation models towards efficient higher-resolution generation without additional fine-tuning or extra adaptation. Specifically, we employ an innovative truncate and relay strategy to bridge the denoising processes across different resolutions, allowing for high-resolution image generation in a coarse-to-fine manner. Moreover, by integrating dilated convolutions and noise re-scheduling, we further adapt the model's priors for higher resolution. The versatility and efficacy of MegaFusion make it universally applicable to both latent-space and pixel-space diffusion models, along with other derivative models. Extensive experiments confirm that MegaFusion significantly boosts the capability of existing models to produce images of megapixels and various aspect ratios, while only requiring about 40% of the original computational cost.
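As a rough illustration of the coarse-to-fine idea in the abstract, the toy sketch below runs part of a denoising loop on a small grid, truncates it, upsamples the intermediate result, and relays it to a larger grid for the remaining steps. Everything in it is an assumption made for illustration: `toy_denoise_step`, `upsample_nearest`, `truncate_and_relay`, `relative_cost`, the grid sizes, and the step split are hypothetical and are not the authors' implementation. Dilated convolutions and noise re-scheduling are not modeled.

```python
import random

def toy_denoise_step(x, total_steps):
    """Hypothetical stand-in for one denoising step: shrink values slightly."""
    scale = 1.0 - 1.0 / (total_steps + 1)
    return [[v * scale for v in row] for row in x]

def upsample_nearest(x, size):
    """Nearest-neighbour upsampling of a square grid to size x size.
    Assumes size is an integer multiple of the current side length."""
    factor = size // len(x)
    out = []
    for row in x:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def truncate_and_relay(stages, seed=0):
    """stages: list of (resolution, num_steps), ordered coarse to fine.
    Each stage stops early (truncate) and hands its result to the
    next, higher-resolution stage (relay)."""
    rng = random.Random(seed)
    total_steps = sum(n for _, n in stages)
    res0 = stages[0][0]
    # Start from Gaussian noise at the coarsest resolution.
    x = [[rng.gauss(0.0, 1.0) for _ in range(res0)] for _ in range(res0)]
    for res, num_steps in stages:
        if len(x) != res:
            x = upsample_nearest(x, res)
        for _ in range(num_steps):
            x = toy_denoise_step(x, total_steps)
    return x

def relative_cost(stages):
    """Crude proxy: cost ~ pixels x steps, relative to running every
    step at the final resolution. Not a measurement of a real model."""
    final_res = stages[-1][0]
    total_steps = sum(n for _, n in stages)
    staged = sum(res * res * n for res, n in stages)
    return staged / (final_res * final_res * total_steps)

# Illustrative stage split (assumed, not taken from the paper).
stages = [(16, 20), (32, 15), (64, 15)]
image = truncate_and_relay(stages)
print(len(image), len(image[0]))
print(relative_cost(stages))
```

With this assumed split the pixel-count proxy is well below 1, which shows why spending the early steps at low resolution can reduce cost. The paper's reported figure depends on its own settings and models, not on this toy calculation.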
Community
Project Page: https://haoningwu3639.github.io/MegaFusion/
Paper: https://arxiv.org/abs/2408.11001/
Code: https://github.com/haoningwu3639/MegaFusion
We are standardizing our code and will open-source it gradually in the near future, so please stay tuned.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance (2024)
- ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance (2024)
- AccDiffusion: An Accurate Method for Higher-Resolution Image Generation (2024)
- UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks (2024)
- SpotDiffusion: A Fast Approach For Seamless Panorama Generation Over Time (2024)