ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion
Abstract
Diffusion models have revolutionized image editing but often generate images that violate physical laws, particularly the effects of objects on the scene, e.g., occlusions, shadows, and reflections. By analyzing the limitations of self-supervised approaches, we propose a practical solution centered on a counterfactual dataset. Our method involves capturing a scene before and after removing a single object, while minimizing other changes. By fine-tuning a diffusion model on this dataset, we can remove not only objects but also their effects on the scene. However, we find that applying this approach to photorealistic object insertion requires an impractically large dataset. To tackle this challenge, we propose bootstrap supervision: leveraging our object removal model trained on a small counterfactual dataset, we synthetically expand this dataset considerably. Our approach significantly outperforms prior methods in photorealistic object removal and insertion, particularly at modeling the effects of objects on the scene.
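The bootstrap supervision idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out how the synthetic pairs are built, so the following is one plausible reading rather than the authors' implementation: a trained removal model cleans each photo of an object and its effects, the object's pixels are pasted back onto the cleaned background, and the original photo serves as the insertion target. All names here (`InsertionPair`, `paste`, `bootstrap_insertion_pairs`, `toy_remover`) are hypothetical, and `toy_remover` is only a stand-in for a real fine-tuned diffusion model.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[float]]  # tiny grayscale image, row-major (illustration only)
Mask = List[List[int]]     # 1 where the object is, 0 elsewhere


@dataclass
class InsertionPair:
    model_input: Image  # object pasted on the cleaned background, without its effects
    target: Image       # original photo, containing the object's real shadows/reflections


def paste(background: Image, source: Image, mask: Mask) -> Image:
    """Copy masked pixels from `source` onto `background`."""
    return [
        [s if m else b for b, s, m in zip(b_row, s_row, m_row)]
        for b_row, s_row, m_row in zip(background, source, mask)
    ]


def bootstrap_insertion_pairs(
    photos: List[Image],
    masks: List[Mask],
    remove_object: Callable[[Image, Mask], Image],
) -> List[InsertionPair]:
    """Build synthetic insertion training pairs from unpaired photos.

    Assumption: `remove_object` is a removal model that erases both the
    object and its effects on the scene.
    """
    pairs = []
    for photo, mask in zip(photos, masks):
        background = remove_object(photo, mask)      # scene without object or effects
        composite = paste(background, photo, mask)   # object back in, effects missing
        pairs.append(InsertionPair(model_input=composite, target=photo))
    return pairs


def toy_remover(photo: Image, mask: Mask) -> Image:
    """Hypothetical stand-in: assumes a uniform background at the brightest value."""
    bg = max(max(row) for row in photo)
    return [[bg for _ in row] for row in photo]


# One-row scene: background 1.0, object 0.5, its shadow 0.8.
photo = [[1.0, 0.5, 0.8, 1.0]]
mask = [[0, 1, 0, 0]]
pairs = bootstrap_insertion_pairs([photo], [mask], toy_remover)
print(pairs[0].model_input)  # [[1.0, 0.5, 1.0, 1.0]]  (shadow gone)
print(pairs[0].target)       # [[1.0, 0.5, 0.8, 1.0]]  (shadow present)
```

In this reading, an insertion model trained on such pairs would learn to add the missing shadow, which is the effect the toy remover erased.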
Community
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos (2024)
- ReplaceAnything3D: Text-Guided 3D Scene Editing with Compositional Neural Radiance Fields (2024)
- Outline-Guided Object Inpainting with Diffusion Models (2024)
- Repositioning the Subject within Image (2024)
- LoMOE: Localized Multi-Object Editing via Multi-Diffusion (2024)
Fantastic paper! Eager to see which genius will bring it to life.
ObjectDrop: Revolutionizing Photorealistic Object Editing with Counterfactual Supervision
Links:
Subscribe: https://www.youtube.com/@Arxflix
Twitter: https://x.com/arxflix
LMNT (Partner): https://lmnt.com/
Excellent work! Do you have any plans to open-source the pre-trained weights or the dataset?