SDXL-512 / README.md
metadata
license: openrail++
tags:
  - text-to-image
  - stable-diffusion


Model Description

  • Developed by: Natural Synthetics Inc.
  • Model type: Diffusion-based text-to-image generative model
  • License: CreativeML Open RAIL++-M License
  • Model Description: This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L).
  • Resources for more information: Check out our GitHub Repository.
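
The description above can be illustrated with a minimal text-to-image sketch. It assumes the weights load through the `StableDiffusionXLPipeline` class from the `diffusers` library, which this card does not confirm. `MODEL_ID` is a hypothetical placeholder, the helper names are illustrative, and the 512-pixel default is inferred from the model name rather than stated here.

```python
from typing import Any, Dict

# Hypothetical placeholder: replace with the actual Hub repository id or a local path.
MODEL_ID = "your-namespace/SDXL-512"


def build_generation_kwargs(prompt: str, size: int = 512, steps: int = 30) -> Dict[str, Any]:
    """Collect keyword arguments for a StableDiffusionXLPipeline call.

    The 512-pixel default is an assumption inferred from the model name.
    """
    # SDXL pipelines require height and width to be divisible by 8.
    if size % 8 != 0:
        raise ValueError("height and width must be divisible by 8")
    return {
        "prompt": prompt,
        "height": size,
        "width": size,
        "num_inference_steps": steps,
    }


def generate(prompt: str, model_id: str = MODEL_ID):
    """Load the pipeline (assumed SDXL-compatible) and return one PIL image."""
    # Imported lazily so the helper above works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionXLPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionXLPipeline.from_pretrained(model_id, torch_dtype=dtype).to(device)
    # The two text encoders are exposed as pipe.text_encoder and pipe.text_encoder_2.
    return pipe(**build_generation_kwargs(prompt)).images[0]
```

With a real repository id in place of the placeholder, `generate("a lighthouse at dusk").save("sample.png")` would write a single image to disk. Keeping the argument construction separate from model loading lets the size and step settings be checked without downloading any weights.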

Limitations and Bias

Limitations

  • The model does not achieve perfect photorealism
  • The model cannot render legible text
  • The model struggles with more difficult tasks which involve compositionality, such as rendering an image corresponding to “A red cube on top of a blue sphere”
  • Faces and people in general may not be generated properly

Bias

While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.