# Dreambooth for the inpainting model

This script was added by @thedarkzeno.

Please note that this script is not actively maintained. If you run into problems, you can open an issue and tag @thedarkzeno or @patil-suraj.
```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400
```
### Training with prior-preservation loss

Prior preservation is used to avoid overfitting and language drift. Refer to the Dreambooth paper to learn more about it. For prior preservation, we first generate images using the model with a class prompt and then use those images during training along with our own data.

According to the paper, it's recommended to generate `num_epochs * num_samples` images for prior preservation. 200-300 images work well for most cases.
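
As a rough illustration of the rule of thumb above, the arithmetic can be sketched in Python. The function name and the example values (5 instance images, 40 epochs) are hypothetical and not part of the training script; the result is the kind of value you would pass as `--num_class_images`.

```python
def num_prior_images(num_epochs: int, num_samples: int) -> int:
    """Rule-of-thumb count of class images: num_epochs * num_samples."""
    if num_epochs < 0 or num_samples < 0:
        raise ValueError("num_epochs and num_samples must be non-negative")
    return num_epochs * num_samples


# Hypothetical example: 5 instance images seen for 40 epochs.
print(num_prior_images(num_epochs=40, num_samples=5))  # 200
```
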
```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
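
To make the role of `--prior_loss_weight` concrete, here is a minimal sketch of how an instance loss and a prior loss could be combined. It assumes both terms are mean squared errors and that the total is the instance term plus the weighted prior term. The helper names `mse` and `combined_loss` are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    instance_pred: Sequence[float],
    instance_target: Sequence[float],
    class_pred: Sequence[float],
    class_target: Sequence[float],
    prior_loss_weight: float = 1.0,
) -> float:
    """Assumed form: instance loss + prior_loss_weight * prior loss."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss


# Toy numbers only: instance loss 0.5, prior loss 2.0, weight 0.5.
print(combined_loss([1.0, 2.0], [0.0, 2.0], [0.0, 0.0], [2.0, 0.0], 0.5))  # 1.5
```

With a weight of `1.0`, as in the command above, both terms contribute equally. A smaller weight lets the instance images dominate.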
### Training with gradient checkpointing and 8-bit optimizer

With the help of gradient checkpointing and the 8-bit optimizer from bitsandbytes, it's possible to train Dreambooth on a 16GB GPU.

To install `bitsandbytes`, please refer to this [readme](https://github.com/TimDettmers/bitsandbytes#requirements--installation).
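
Before launching a run with `--use_8bit_adam`, it can help to confirm that the package is visible to the Python environment that `accelerate` will use. The check below is a small sketch using only the standard library; the helper name `has_module` is hypothetical, and the check only confirms that the package can be located, not that its CUDA setup works.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if not has_module("bitsandbytes"):
    print("bitsandbytes not found; install it before passing --use_8bit_adam")
```
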
```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=2 --gradient_checkpointing \
  --use_8bit_adam \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
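
The command above pairs `--train_batch_size=1` with `--gradient_accumulation_steps=2`. As a small sketch, assuming gradients are accumulated across steps and processes before each optimizer update, the effective batch size can be computed as follows. The function name and the `num_processes` parameter are illustrative.

```python
def effective_batch_size(
    train_batch_size: int,
    gradient_accumulation_steps: int,
    num_processes: int = 1,
) -> int:
    """Samples contributing to one optimizer update (assumed relationship)."""
    if min(train_batch_size, gradient_accumulation_steps, num_processes) < 1:
        raise ValueError("all arguments must be >= 1")
    return train_batch_size * gradient_accumulation_steps * num_processes


# Values from the command above, on a single GPU.
print(effective_batch_size(train_batch_size=1, gradient_accumulation_steps=2))  # 2
```
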
### Fine-tune text encoder with the UNet

The script also allows you to fine-tune the `text_encoder` along with the `unet`. It's been observed experimentally that fine-tuning the `text_encoder` gives much better results, especially on faces.

Pass the `--train_text_encoder` argument to the script to enable training the `text_encoder`.

___Note: Training the text encoder requires more memory. With this option, training won't fit on a 16GB GPU; it needs at least 24GB of VRAM.___
```bash
export MODEL_NAME="runwayml/stable-diffusion-inpainting"
export INSTANCE_DIR="path-to-instance-images"
export CLASS_DIR="path-to-class-images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth_inpaint.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_text_encoder \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks dog" \
  --class_prompt="a photo of dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --use_8bit_adam \
  --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800
```
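
The memory guidance in this README can be summarized as a small decision helper. This is only a sketch: the function name `memory_flags` is hypothetical, and the 24GB threshold is taken from the note above rather than measured. It makes no claim about GPUs with less than 16GB, which this README does not cover.

```python
from typing import List


def memory_flags(vram_gb: float) -> List[str]:
    """Suggest optional flags from available VRAM, following the README's guidance."""
    # Memory-saving flags used in both the 16GB and the 24GB example commands.
    flags = ["--gradient_checkpointing", "--use_8bit_adam"]
    if vram_gb >= 24:
        # Text encoder training is stated to need at least 24GB of VRAM.
        flags.insert(0, "--train_text_encoder")
    return flags


print(" ".join(memory_flags(16)))
print(" ".join(memory_flags(24)))
```
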