DDPM Project
This repository contains the implementation of Denoising Diffusion Probabilistic Models (DDPM).
Denoising Diffusion Probabilistic Models (DDPM) are a class of generative models that learn to generate data by reversing a diffusion process. This repository provides a comprehensive implementation of DDPM.
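As a rough illustration of the diffusion process that the model learns to reverse, the sketch below noises a data point in closed form: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. It assumes the linear beta schedule from the original DDPM paper (1e-4 to 0.02 over 1000 steps). The names `linear_betas`, `alpha_bars`, and `q_sample` are illustrative and are not taken from this repository, and plain Python lists stand in for image tensors to keep the example dependency-free.

```python
import math
import random


def linear_betas(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0, t, abar, rng=random):
    """Draw x_t ~ q(x_t | x_0) in one shot; returns (x_t, noise)."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(abar[t])
    noise_scale = math.sqrt(1.0 - abar[t])
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise


betas = linear_betas()
abar = alpha_bars(betas)
x0 = [0.5, -0.25, 1.0]  # a toy "image" with three pixels
x_early, _ = q_sample(x0, 0, abar)    # still almost identical to x0
x_late, _ = q_sample(x0, 999, abar)   # almost pure Gaussian noise
```

Training then amounts to asking the network to predict `noise` from `x_t` and `t`. In practice the same arithmetic would be done on batched tensors rather than lists.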
To install the necessary dependencies, run:
pip install -r requirements.txt
To train the model, use the following command:
python train.py
To generate samples, use:
python generate.py
To help explain the model and its workings, we are building a small game in which the player takes the role of the U-Net reverse-diffusion model and must denoise images whose noise consists of grids of lines.
A primitive version of the game is available at learndiffusion.vercel.app. To contribute to the game, check out the diffusion_game branch. A model showcase is also planned: the model's weights will be downloaded from the internet, the model's files installed, and everything loaded into a Gradio interface for direct inference on the Vercel site. An issue is open for this, so feel free to contribute changes.
Explanations and Mathematics
- slides from presentation :
- notes/explanations : HERE
- a cute lab talk ppt:
- Plato's allegory : <link to REPUBLIC>
- Original Paper : https://arxiv.org/pdf/2006.11239
- Improvement Paper : https://arxiv.org/abs/2102.09672
- Improvement by OpenAI : https://arxiv.org/pdf/2105.05233
- Stable Diffusion Paper : https://arxiv.org/abs/2112.10752
Papers for background
- UNET Paper for Biomedical Segmentation
- Autoencoder
- Variational Autoencoder
- Markov Hierarchical VAE
- Introductory Lectures on Diffusion Process
YouTube videos and courses
- Outliers
- Omar Jahil
PyTorch Implementation
Pretrained Weights
Weights for the model can be found in pretrained_weights.
For loading the pretrained weights:
import torch

model2 = SimpleUnet()
model2.load_state_dict(torch.load("/content/drive/MyDrive/Research Work/mlsa/DDPM/model_weights.pth", map_location="cpu"))
For making inferences (TODO: the sampling function still has errors, including boolean errors; issues will be opened for others to solve as an exercise if needed):
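import torch
from torchvision.utils import save_image

device = "cuda" if torch.cuda.is_available() else "cpu"
num_samples = 8  # Number of images to generate
image_size = (3, 32, 32)  # Example for CIFAR10
model2 = model2.to(device).eval()  # Model and noise must be on the same device
noise = torch.randn(num_samples, *image_size, device=device)

# Generate images by denoising
with torch.no_grad():
    generated_images = model2.sample(noise)

# Save the generated images
save_image(generated_images, "generated_images.png", nrow=4, normalize=True)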
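Since the sampling function is still marked TODO, here is a dependency-free sketch of the ancestral sampling loop from the original DDPM paper (Algorithm 2), for comparison. It is not this repository's `sample` method: `predict_noise` and `zero_predictor` are hypothetical stand-ins for the trained U-Net, the schedule is assumed to be the paper's linear one, the variance choice sigma_t^2 = beta_t is one of the two options discussed in the paper, and lists of floats stand in for image tensors.

```python
import math
import random


def ddpm_sample(predict_noise, x_T, betas, rng=random):
    """Run x_T -> x_0 with the DDPM ancestral sampler (sigma_t^2 = beta_t)."""
    # Cumulative products abar_t = prod_{s<=t} (1 - beta_s).
    abar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        abar.append(prod)

    x = list(x_T)
    for t in reversed(range(len(betas))):
        eps = predict_noise(x, t)  # model's estimate of the noise in x_t
        coef = betas[t] / math.sqrt(1.0 - abar[t])
        inv_sqrt_alpha = 1.0 / math.sqrt(1.0 - betas[t])
        mean = [inv_sqrt_alpha * (xi - coef * ei) for xi, ei in zip(x, eps)]
        if t > 0:
            # Add fresh Gaussian noise on every step except the final one.
            sigma = math.sqrt(betas[t])
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x


def zero_predictor(x, t):
    """Hypothetical placeholder for the U-Net: always predicts zero noise."""
    return [0.0 for _ in x]


num_steps = 1000
betas = [1e-4 + i * (0.02 - 1e-4) / (num_steps - 1) for i in range(num_steps)]
x_T = [random.gauss(0.0, 1.0) for _ in range(4)]  # start from pure noise
x_0 = ddpm_sample(zero_predictor, x_T, betas)
```

With the placeholder predictor the output is meaningless; the point is the update rule. One common source of bugs in such loops is the final step, where no noise should be added.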
Contributions are welcome! Please open an issue or submit a pull request.
Future Ideas
- Make the model ONNX-compatible for training and inference on Intel GPUs
- Build a Stable Diffusion text-to-image (Text2Img) model using a CLIP implementation
- Train the current model on a much larger dataset so that it generalizes better and captures more nuance