metadata
license: creativeml-openrail-m
language:
- en
thumbnail: >-
https://huggingface.co/Norod78/Norod78/sd2-cartoon-blip/raw/main/example/Norod78/sd2-cartoon-blip-sample_tile-0.jpg
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
datasets:
- Norod78/cartoon-blip-captions
inference: true
Cartoon diffusion v2.0
Stable Diffusion v2.0 fine-tuned on images from various cartoon shows.
If you want more details on how to generate your own BLIP-captioned dataset, see this Colab.
Training was done using a slightly modified version of Hugging Face's text-to-image training example script.
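For readers building their own BLIP-captioned dataset, the captioning step can be sketched roughly as follows. This is not the author's Colab: the checkpoint name `Salesforce/blip-image-captioning-base`, the helper names `make_blip_captioner` and `write_metadata`, and the `metadata.jsonl` layout (the one read by the `datasets` "imagefolder" loader) are all assumptions chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable


def make_blip_captioner(model_id: str = "Salesforce/blip-image-captioning-base") -> Callable[[Path], str]:
    """Return a function that captions one image with BLIP (assumed checkpoint)."""
    # Imported lazily so write_metadata() below works without these packages.
    from PIL import Image
    from transformers import BlipForConditionalGeneration, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForConditionalGeneration.from_pretrained(model_id)

    def caption(image_path: Path) -> str:
        image = Image.open(image_path).convert("RGB")
        inputs = processor(images=image, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=30)
        return processor.decode(output_ids[0], skip_special_tokens=True)

    return caption


def write_metadata(image_dir: Path, caption_fn: Callable[[Path], str]) -> Path:
    """Write metadata.jsonl (file_name + text columns) next to the images."""
    metadata_path = image_dir / "metadata.jsonl"
    image_paths = sorted(
        p for p in image_dir.iterdir() if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    )
    with metadata_path.open("w", encoding="utf-8") as f:
        for path in image_paths:
            row = {"file_name": path.name, "text": caption_fn(path)}
            f.write(json.dumps(row) + "\n")
    return metadata_path
```

A call such as `write_metadata(Path("cartoon_frames"), make_blip_captioner())` (the folder name is hypothetical) would then produce a folder that `datasets.load_dataset("imagefolder", data_dir=...)` can read with a `text` caption column.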
About
Enter a text prompt to generate cartoon-style images.
AUTOMATIC1111 webui checkpoint
The main folder contains a .ckpt and a .yaml file. To use them with the AUTOMATIC1111 web UI, place both in the "stable-diffusion-webui/models/Stable-diffusion" folder.
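The copy step can also be scripted. The sketch below assumes the .ckpt and .yaml files have already been downloaded to a local folder; the function name `install_checkpoint` and both path arguments are hypothetical. File names are left unchanged, since the web UI is generally expected to pair the .yaml with the .ckpt by base name.

```python
import shutil
from pathlib import Path


def install_checkpoint(download_dir: Path, webui_root: Path) -> list[Path]:
    """Copy every .ckpt and .yaml file into the web UI's checkpoint folder."""
    target_dir = webui_root / "models" / "Stable-diffusion"
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(download_dir.iterdir()):
        if src.suffix in {".ckpt", ".yaml"}:
            # copy2 keeps timestamps; the file name is kept as-is
            copied.append(Path(shutil.copy2(src, target_dir / src.name)))
    return copied
```

For example, `install_checkpoint(Path("downloads"), Path("stable-diffusion-webui"))` returns the list of files it placed in the target folder.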
Sample code
from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
import torch
# Replace the default PNDM scheduler with K-LMS
lms = LMSDiscreteScheduler(
beta_start=0.00085,
beta_end=0.012,
beta_schedule="scaled_linear"
)
guidance_scale=8.5
steps=50
cartoon_model_path = "Norod78/sd2-cartoon-blip"
cartoon_pipe = StableDiffusionPipeline.from_pretrained(cartoon_model_path, scheduler=lms, torch_dtype=torch.float16)
cartoon_pipe.to("cuda")
def generate(prompt, file_prefix, samples, seed=42):
    # Seed PyTorch's global RNG before sampling
    torch.manual_seed(seed)
    # Append the same quality keywords to every prompt
    prompt += ", Very detailed, clean, high quality, sharp image"
    cartoon_images = cartoon_pipe([prompt] * samples, num_inference_steps=steps, guidance_scale=guidance_scale)["images"]
    for idx, image in enumerate(cartoon_images):
        image.save(f"{file_prefix}-{idx}-{seed}-sd2-cartoon-blip.jpg")
generate("An oil on canvas portrait of Snoop Dogg, Mark Ryden", "01_SnoopDog", 2, 777)
generate("A flemish baroque painting of Kermit from the muppet show", "02_KermitFlemishBaroque", 2, 42)
generate("Gal Gadot in Avatar", "03_GalGadotAvatar", 2, 777)
generate("Ninja turtles, Naoto Hattori", "04_TMNT", 2, 312)
generate("An anime town", "05_AnimeTown", 2, 777)
generate("Family guy taking selfies at the beach", "06_FamilyGuy", 2, 555)
generate("Pikachu as Rick and morty, Eric Wallis", "07_PikachuRnM", 2, 777)
generate("Pikachu as Spongebob, Eric Wallis", "08_PikachuSpongeBob", 2, 42)
generate("An oil painting of Miss. Piggy from the muppets as the Mona Lisa", "09_MsPiggyMonaLisa", 2, 42)
generate("Rick Sanchez in star wars, Dave Dorman", "10_RickStarWars", 2, 42)
generate("An paiting of Southpark with rainbow", "11_Southpark", 2, 777)
generate("An oil painting of Phineas and Pherb hamering on a new machine, Eric Wallis", "12_PhineasPherb", 2, 777)
generate("Bender, Saturno Butto", "13_Bender", 2, 777)
generate("A psychedelic image of Bojack Horseman", "14_Bojack", 2, 777)
generate("A movie poster for Gravity Falls Cthulhu stories", "15_GravityFalls", 2, 777)
generate("A vibrant oil painting portrait of She-Ra", "16_Shira", 2, 512)
Dataset and Training
Fine-tuned for 25,000 iterations from stabilityai/stable-diffusion-2-base on BLIP-captioned cartoon images, using a single A5000 GPU on my home desktop computer.
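As a rough illustration of what such a run might look like with the unmodified diffusers `train_text_to_image.py` example script, the sketch below assembles a launch command. Only the base model, the dataset name, and the 25,000 step count come from this card; the resolution, batch size, learning rate, and output directory are placeholder assumptions, and the author's modified script may have used different options. The helper name `build_training_command` is hypothetical.

```python
import shlex


def build_training_command(output_dir="sd2-cartoon-blip", max_train_steps=25_000):
    """Assemble an `accelerate launch` command line as a list of arguments."""
    args = {
        "--pretrained_model_name_or_path": "stabilityai/stable-diffusion-2-base",
        "--dataset_name": "Norod78/cartoon-blip-captions",
        "--resolution": "512",        # assumption: native size of the base model
        "--train_batch_size": "1",    # assumption: placeholder value
        "--learning_rate": "1e-5",    # assumption: placeholder value
        "--max_train_steps": str(max_train_steps),
        "--output_dir": output_dir,   # assumption: placeholder path
    }
    cmd = ["accelerate", "launch", "train_text_to_image.py"]
    for flag, value in args.items():
        cmd += [flag, value]
    return cmd


# Print the command so it can be reviewed before running it in a shell
print(shlex.join(build_training_command()))
```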
Trained by @Norod78