SigmaJourney / README.md
license: creativeml-openrail-m
base_model: PixArt-alpha/PixArt-Sigma-XL-2-1024-MS
tags:
  - stable-diffusion
  - stable-diffusion-diffusers
  - text-to-image
  - diffusers
  - full
inference: true
widget:
  - text: unconditional (blank prompt)
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_0_0.png
  - text: a woman sitting on the grass
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_1_0.png
  - text: a professional photo headshot of a man in studio lighting
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_2_0.png
  - text: a person holding a sign that reads 'SOON'
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_3_0.png
  - text: >-
      Alien marketplace, bizarre creatures, exotic goods, vibrant colors,
      otherworldly atmosphere
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_4_0.png
  - text: >-
      Child holding a balloon, happy expression, colorful balloons, sunny day,
      high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_5_0.png
  - text: >-
      a 4-panel comic strip showing an orange cat saying the words 'HELP' and
      'LASAGNA'
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_6_0.png
  - text: >-
      a hand is holding a comic book with a cover that reads 'The Adventures of
      Superhero'
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_7_0.png
  - text: >-
      Underground cave filled with crystals, glowing lights, reflective
      surfaces, fantasy environment, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_8_0.png
  - text: >-
      Bustling cyberpunk bazaar, vendors, neon signs, advanced tech, crowded,
      high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_9_0.png
  - text: >-
      Cyberpunk hacker in a dark room, neon glow, multiple screens, intense
      focus, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_10_0.png
  - text: >-
      a cybernetic anne of green gables with neural implant and bio mech
      augmentations
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_11_0.png
  - text: >-
      Post-apocalyptic cityscape, ruined buildings, overgrown vegetation, dark
      and gritty, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_12_0.png
  - text: >-
      Magical castle in a lush forest, glowing windows, fantasy architecture,
      high resolution, detailed textures
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_13_0.png
  - text: >-
      Ruins of an ancient temple in an enchanted forest, glowing runes, mystical
      creatures, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_14_0.png
  - text: >-
      Mystical forest, glowing plants, fairies, magical creatures, fantasy art,
      high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_15_0.png
  - text: >-
      Magical garden with glowing flowers, fairies, serene atmosphere, detailed
      plants, high resolution
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_16_0.png
  - text: >-
      Whimsical garden filled with fairies, magical plants, sparkling lights,
      serene atmosphere, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_17_0.png
  - text: >-
      Majestic dragon soaring through the sky, detailed scales, dynamic pose,
      fantasy art, high resolution
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_18_0.png
  - text: >-
      Fantasy world, floating islands in the sky, waterfalls, lush vegetation,
      detailed landscape, high resolution
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_19_0.png
  - text: >-
      Futuristic city skyline at night, neon lights, cyberpunk style, high
      contrast, sharp focus
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_20_0.png
  - text: >-
      Space battle scene, starships fighting, laser beams, explosions, cosmic
      background
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_21_0.png
  - text: >-
      Abandoned fairground at night, eerie rides, ghostly figures, fog, dark
      atmosphere, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_22_0.png
  - text: >-
      Spooky haunted mansion on a hill, dark and eerie, glowing windows, ghostly
      atmosphere, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_23_0.png
  - text: a hardcover physics textbook that is called PHYSICS FOR DUMMIES
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_24_0.png
  - text: >-
      Epic medieval battle, knights in armor, dynamic action, detailed
      landscape, high resolution
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_25_0.png
  - text: >-
      Bustling medieval market with merchants, knights, and jesters, vibrant
      colors, detailed
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_26_0.png
  - text: >-
      Cozy medieval tavern, warm firelight, adventurers drinking, detailed
      interior, rustic atmosphere
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_27_0.png
  - text: >-
      Futuristic city skyline at night, neon lights, cyberpunk style, high
      contrast, sharp focus
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_28_0.png
  - text: >-
      Forest with neon-lit trees, glowing plants, bioluminescence, surreal
      atmosphere, high detail
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_29_0.png
  - text: >-
      Bright neon sign in a busy city street, 'Open 24 Hours', bold typography,
      glowing lights
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_30_0.png
  - text: >-
      Retro diner sign, 'Joe's Diner', classic 1950s design, neon lights,
      weathered look
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_31_0.png
  - text: >-
      Vintage store sign with elaborate typography, 'Antique Shop',
      hand-painted, weathered look
    parameters:
      negative_prompt: blurry, cropped, ugly
    output:
      url: ./assets/image_32_0.png

pixart-training

This is a full-rank finetune of PixArt-alpha/PixArt-Sigma-XL-2-1024-MS.

No validation prompt was used during training.


Validation settings

  • CFG: 7.5
  • CFG Rescale: 0.0
  • Steps: 30
  • Sampler: euler
  • Seed: 42
  • Resolution: 1024

Note: The validation settings are not necessarily the same as the training settings.
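As a rough sketch, the validation settings listed above could be expressed as keyword arguments for a diffusers pipeline call. The argument names (`guidance_scale`, `num_inference_steps`, `width`, `height`) are the ones used in the Inference section of this card. The helper name `validation_call_kwargs` is hypothetical, a square image at the listed resolution is an assumption, and the sampler is left out because this card does not say which scheduler class it corresponds to.

```python
# Validation settings from this card, kept as plain data.
VALIDATION_SETTINGS = {
    "cfg": 7.5,
    "cfg_rescale": 0.0,
    "steps": 30,
    "sampler": "euler",
    "seed": 42,
    "resolution": 1024,
}


def validation_call_kwargs(settings: dict) -> dict:
    """Map the card's validation settings onto pipeline keyword arguments.

    Hypothetical helper: assumes a square image at the listed resolution.
    """
    return {
        "guidance_scale": settings["cfg"],
        "num_inference_steps": settings["steps"],
        "width": settings["resolution"],
        "height": settings["resolution"],
    }


kwargs = validation_call_kwargs(VALIDATION_SETTINGS)
print(kwargs)
```

The seed is not part of the returned dictionary; it would be applied through a `torch.Generator(...).manual_seed(42)` passed as `generator`, in the same way the Inference section seeds its generator.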

You can find some example images in the following gallery:

Prompt
A blonde sexy girl, wearing glasses at latex shirt and a blue beanie with a tattoo, blue and white, highly detailed, sublime, extremely beautiful, sharp focus, refined, cinematic, intricate, elegant, dynamic, rich deep colors, bright color, shining light, attractive, cute, pretty, background full, epic composition, dramatic atmosphere, radiant, professional, stunning
Negative Prompt
blurry, cropped, ugly
Prompt
a wizard with a glowing staff and a glowing hat, colorful magic, dramatic atmosphere, sharp focus, highly detailed, cinematic, original composition, fine detail, intricate, elegant, creative, color spread, shiny, amazing, symmetry, illuminated, inspired, pretty, attractive, artistic, dynamic background, relaxed, professional, extremely inspirational, beautiful, determined, cute, adorable, best
Negative Prompt
blurry, cropped, ugly
Prompt
girl in modern car, intricate, elegant, highly detailed, extremely complimentary colors, beautiful, glowing aesthetic, pretty, dramatic light, sharp focus, perfect composition, clear artistic color, calm professional background, precise, joyful, emotional, unique, cute, best, gorgeous, great delicate, expressive, thought, iconic, fine, awesome, creative, winning, charming, enhanced
Negative Prompt
blurry, cropped, ugly
Prompt
A girl stands amidst scattered glass shards, surrounded by a beautifully crafted and expansive world. The scene is depicted from a dynamic angle, emphasizing her determined expression. The background features vast landscapes with floating crystals and soft, glowing lights that create a mystical and grand atmosphere.
Negative Prompt
blurry, cropped, ugly
Prompt
A close-up shot of a beautiful girl in a serene world. She has white hair and is blindfolded, with a calm expression. Her hands are pressed together in a prayer pose, with fingers interlaced and palms touching. The background is softly blurred, enhancing her ethereal presence.
Negative Prompt
blurry, cropped, ugly

The text encoder was not trained. You may reuse the base model text encoder for inference.
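A minimal sketch of that reuse, assuming the base repository stores its T5 text encoder under a `text_encoder` subfolder and that `model_id` points at this finetune (the value below is the placeholder used in the Inference section). The function name is hypothetical, loading in bfloat16 is an assumption based on the training precision, and the heavy imports are deferred into the function so nothing is downloaded until it is called.

```python
BASE_MODEL = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
FINETUNE_ID = "pixart-training"  # placeholder: local path or Hub repo id of the finetune


def load_with_base_text_encoder(model_id: str = FINETUNE_ID, base_model: str = BASE_MODEL):
    """Load the finetuned pipeline but take the text encoder from the base model."""
    # Imported lazily: both libraries are large, and loading triggers downloads.
    import torch
    from diffusers import DiffusionPipeline
    from transformers import T5EncoderModel

    # Assumption: the base repo keeps its T5 encoder in a "text_encoder" subfolder.
    text_encoder = T5EncoderModel.from_pretrained(
        base_model, subfolder="text_encoder", torch_dtype=torch.bfloat16
    )
    # Components passed as keyword arguments replace the ones stored in the checkpoint.
    return DiffusionPipeline.from_pretrained(
        model_id, text_encoder=text_encoder, torch_dtype=torch.bfloat16
    )
```

Calling `load_with_base_text_encoder()` returns a pipeline that can be moved to a device and used as in the Inference section.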

Training settings

  • Training epochs: 5
  • Training steps: 5500
  • Learning rate: 8e-06
  • Effective batch size: 128
    • Micro-batch size: 32
    • Gradient accumulation steps: 4
    • Number of GPUs: 1
  • Prediction type: epsilon
  • Rescaled betas zero SNR: False
  • Optimizer: AdamW, stochastic bf16
  • Precision: Pure BF16
  • Xformers: Enabled
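The batch-size figures above are related by a simple product. A small sketch of that arithmetic, using only numbers from this list (the variable names are illustrative, and the total assumes every step processes a full batch):

```python
micro_batch_size = 32
gradient_accumulation_steps = 4
num_gpus = 1
training_steps = 5500

# Effective batch size: samples contributing to each optimizer step.
effective_batch_size = micro_batch_size * gradient_accumulation_steps * num_gpus

# Total samples processed, assuming every step uses a full batch.
samples_seen = effective_batch_size * training_steps

print(effective_batch_size)  # 128
print(samples_seen)          # 704000
```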

Datasets

mj-v6

  • Repeats: 0
  • Total number of images: 134144
  • Total number of aspect buckets: 1
  • Resolution: 1.0 megapixels
  • Cropped: False
  • Crop style: None
  • Crop aspect: None
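The dataset size can be related to the step count reported under Training settings. This is only a sketch: it assumes each image is seen exactly once per epoch (taking "Repeats: 0" to mean no extra repetitions) and that no images are dropped during bucketing.

```python
import math

total_images = 134144
effective_batch_size = 128  # from Training settings
training_steps = 5500       # from Training settings

# Optimizer steps needed for one full pass over the dataset.
steps_per_epoch = math.ceil(total_images / effective_batch_size)

# Number of passes over the data covered by the reported step count.
passes_over_data = training_steps / steps_per_epoch

print(steps_per_epoch)             # 1048
print(round(passes_over_data, 2))  # 5.25
```

Under these assumptions, 5500 steps cover slightly more than five passes over the data, which is consistent with five completed epochs.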

Inference

import torch
from diffusers import DiffusionPipeline

# Local path or Hub repo id of the finetuned checkpoint.
model_id = "pixart-training"
prompt = "An astronaut is riding a horse through the jungles of Thailand."
negative_prompt = "blurry, cropped, ugly"

# Pick the best available device once and reuse it below.
device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"

pipeline = DiffusionPipeline.from_pretrained(model_id)
pipeline.to(device)
image = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_inference_steps=30,
    generator=torch.Generator(device=device).manual_seed(1641421826),
    width=1152,
    height=768,
    guidance_scale=7.5,
    guidance_rescale=0.0,
).images[0]
image.save("output.png", format="PNG")