---
language:
  - en
thumbnail: https://staticassetbucket.s3.us-west-1.amazonaws.com/GOT_naruto.png
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
datasets:
  - lambdalabs/naruto-blip-captions
---

# Naruto diffusion

Stable Diffusion fine-tuned on Naruto by Lambda Labs.

Try the live text-to-naruto demo here!
If you want more details on how to train your own Stable Diffusion variants, see this example.

## About

Put in a text prompt and generate your own Naruto-style image!

**Game of Thrones to Naruto**

![pk0.jpg](pk0.jpg)

**Marvel to Naruto**

![pk1.jpg](pk1.jpg)

## Prompt engineering matters

We find that prompt engineering helps produce compelling and consistent Naruto-style portraits. For example, prompts such as `person_name ninja portrait` or `person_name in the style of Naruto` tend to produce results closer to the Naruto style, with the characteristic headband and other costume elements.

Here are a few examples of prompts with and without prompt engineering that illustrate this point.

**Bill Gates:**

![pk2.jpg](pk2.jpg)

*Without prompt engineering*

![pk3.jpg](pk3.jpg)

*With prompt engineering*

**A cute bunny:**

![pk4.jpg](pk4.jpg)

*Without prompt engineering*

![pk4.jpg](pk4.jpg)

*With prompt engineering*
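The prompt templates above are simple enough to script. Below is a minimal, hypothetical helper (`naruto_prompt` is not part of this repo) that wraps a subject name in the two templates suggested in this section:

```python
# Hypothetical helper applying the two prompt templates suggested above.
def naruto_prompt(name: str, style: str = "ninja portrait") -> str:
    """Wrap a subject name in a Naruto-style prompt template."""
    templates = {
        "ninja portrait": f"{name} ninja portrait",
        "naruto style": f"{name} in the style of Naruto",
    }
    return templates[style]

print(naruto_prompt("Bill Gates"))                    # Bill Gates ninja portrait
print(naruto_prompt("A cute bunny", "naruto style"))  # A cute bunny in the style of Naruto
```

The returned strings can be passed directly as the `prompt` in the usage snippet below.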

## Usage

To run the model locally:

```bash
pip install diffusers==0.3.0 transformers scipy ftfy
```

```python
import torch
from diffusers import StableDiffusionPipeline
from torch import autocast

pipe = StableDiffusionPipeline.from_pretrained("lambdalabs/sd-naruto-diffusers", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "Yoda"
scale = 10
n_samples = 4

# Sometimes the nsfw checker is confused by the Naruto images, you can disable
# it at your own risk here
disable_safety = False

if disable_safety:
    def null_safety(images, **kwargs):
        return images, False
    pipe.safety_checker = null_safety

with autocast("cuda"):
    images = pipe(n_samples * [prompt], guidance_scale=scale).images

for idx, im in enumerate(images):
    im.save(f"{idx:06}.png")
```
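Rather than saving each sample separately, it can be handy to view all of them at once. Here is a small sketch (the `image_grid` helper is an assumption, not part of the repo; it only uses Pillow, which the pipeline output already depends on) that tiles the generated images into a single grid image:

```python
from PIL import Image

def image_grid(images, cols):
    """Paste a list of same-sized PIL images into a cols-wide grid."""
    rows = (len(images) + cols - 1) // cols  # ceil division
    w, h = images[0].size
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, im in enumerate(images):
        grid.paste(im, ((i % cols) * w, (i // cols) * h))
    return grid

# e.g. with the snippet above: image_grid(images, cols=2).save("grid.png")
```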

## Model description

Trained on BLIP-captioned Naruto images using 2xA6000 GPUs on Lambda GPU Cloud for around 30,000 steps (about 12 hours, at a cost of about $20).
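As a back-of-the-envelope check on those figures (all values are the approximate ones quoted above, so the derived rates are approximate too):

```python
# Approximate training figures from the model description above.
steps, hours, cost_usd = 30_000, 12, 20

steps_per_second = steps / (hours * 3600)  # ~0.69 steps/s across 2xA6000
usd_per_hour = cost_usd / hours            # ~$1.67/hour for the instance

print(round(steps_per_second, 2), round(usd_per_hour, 2))
```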

## Links

Trained by Eole Cervenka after the work of Justin Pinkney (@Buntworthy) at Lambda Labs.