metadata
license: creativeml-openrail-m
base_model: stabilityai/stable-diffusion-xl-base-1.0
dataset: lambdalabs/pokemon-blip-captions
tags:
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
- text-to-image
- diffusers
- lora
inference: false
These are LoRA adaptation weights for stabilityai/stable-diffusion-xl-base-1.0, fine-tuned on the lambdalabs/pokemon-blip-captions dataset.
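Training used madebyollin/sdxl-vae-fp16-fix in place of the stock SDXL VAE, because that VAE is built to run in fp16 without producing NaNs. Load the same VAE at inference time, as shown below.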
🧨 Diffusers Usage
import torch
from diffusers import DiffusionPipeline, AutoencoderKL
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", vae=vae, torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
pipe.load_lora_weights("sshh12/sdxl-lora-pokemon")
pipe.to("cuda")
prompt = "..."
image = pipe(prompt=prompt).images[0]
image
Training
MODEL_NAME="stabilityai/stable-diffusion-xl-base-1.0"
DATASET_NAME="lambdalabs/pokemon-blip-captions"
!accelerate launch train_text_to_image_lora_sdxl.py \
--pretrained_model_name_or_path="$MODEL_NAME" \
--pretrained_vae_model_name_or_path="madebyollin/sdxl-vae-fp16-fix" \
--dataset_name="$DATASET_NAME" \
--caption_column="text" \
--resolution=1024 \
--random_flip \
--mixed_precision="fp16" \
--use_8bit_adam \
--train_batch_size=1 \
--gradient_accumulation_steps=8 \
--num_train_epochs=200 \
--checkpointing_steps=500 \
--learning_rate=1e-04 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--seed=0 \
--validation_prompt="cute dragon creature" \
--enable_xformers_memory_efficient_attention \
--report_to="wandb"
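The flags above interact: `--train_batch_size=1` with `--gradient_accumulation_steps=8` gives an effective batch of 8 images per optimizer update, and `--num_train_epochs=200` sets the total number of updates. The sketch below works through that arithmetic for a single-process run. The helper name `training_schedule` is hypothetical, and the dataset size of 833 images is an assumption that is not stated in this card, so substitute the real count if it differs.

```python
import math


def training_schedule(
    num_examples,
    train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=200,
    checkpointing_steps=500,
):
    """Estimate optimizer updates and checkpoint count for a single-process run.

    Hypothetical helper: it assumes one process and that the last partial
    batch or accumulation window in each epoch still counts as a step.
    """
    # Images contributing to each optimizer update.
    effective_batch_size = train_batch_size * gradient_accumulation_steps
    # Dataloader batches per epoch (the last one may be partial).
    batches_per_epoch = math.ceil(num_examples / train_batch_size)
    # Optimizer updates per epoch after gradient accumulation.
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    total_updates = updates_per_epoch * num_train_epochs
    # One checkpoint each time the update counter hits a multiple of checkpointing_steps.
    num_checkpoints = total_updates // checkpointing_steps
    return {
        "effective_batch_size": effective_batch_size,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": total_updates,
        "num_checkpoints": num_checkpoints,
    }


# Assumed dataset size of 833 images; replace with the actual count.
schedule = training_schedule(num_examples=833)
print(schedule)
```

Under those assumptions the run performs 105 updates per epoch and 21,000 updates in total, which is useful for estimating how many intermediate checkpoints `--checkpointing_steps=500` will write to disk.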