## Usage
TL;DR: Enter a prompt or roll the `🎲` and press `Generate`.
### Prompting
Positive and negative prompts are embedded by [Compel](https://github.com/damian0815/compel) for weighting. See [syntax features](https://github.com/damian0815/compel/blob/main/doc/syntax.md) to learn more.
Use `+` or `-` to increase or decrease the weight of a token. The effect compounds when chained: `blue+` gives `blue` 1.1x the attention, `blue++` gives 1.1^2 (1.21x), and so on. Likewise, each `-` multiplies the weight by 0.9.
Groups of tokens can be weighted together by wrapping them in parentheses and multiplying by a float between 0 and 2. For example, `(masterpiece, best quality)1.2` will increase the weight of both `masterpiece` and `best quality` by 1.2x.
This is the same syntax used in [InvokeAI](https://invoke-ai.github.io/InvokeAI/features/PROMPTS/) and it differs from AUTOMATIC1111:
| Compel | AUTOMATIC1111 |
| ----------- | ------------- |
| `blue++` | `((blue))` |
| `blue--` | `[[blue]]` |
| `(blue)1.2` | `(blue:1.2)` |
| `(blue)0.8` | `(blue:0.8)` |
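The weighting arithmetic above can be sketched in a few lines. This is only an illustration of the multipliers, not Compel's actual parser; the function name `attention_weight` is hypothetical and it handles a single term at a time.

```python
import re


def attention_weight(term: str) -> tuple[str, float]:
    """Return (text, multiplier) for one Compel-style weighted term.

    Illustrative only: Compel's real parser handles nesting, blends, etc.
    """
    term = term.strip()

    # Explicit group weight, e.g. "(masterpiece, best quality)1.2"
    group = re.fullmatch(r"\((.+)\)(\d*\.?\d+)", term)
    if group:
        return group.group(1), float(group.group(2))

    # Each trailing "+" multiplies the weight by 1.1
    text = term.rstrip("+")
    if text != term:
        return text, 1.1 ** (len(term) - len(text))

    # Each trailing "-" multiplies the weight by 0.9 (no suffix gives 1.0)
    text = term.rstrip("-")
    return text, 0.9 ** (len(term) - len(text))


for example in ["blue", "blue++", "blue--", "(blue)0.8"]:
    text, weight = attention_weight(example)
    print(f"{example!r}: text={text!r}, weight={weight:.2f}")
```

Chaining compounds quickly: three `+` signs already give roughly 1.33x, so small adjustments usually go a long way.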
### Models
Each model checkpoint has a different aesthetic:
* [Comfy-Org/stable-diffusion-v1-5](https://huggingface.co/Comfy-Org/stable-diffusion-v1-5-archive): base
* [cyberdelia/CyberRealistic_V5](https://huggingface.co/cyberdelia/CyberRealistic): realistic
* [Lykon/dreamshaper-8](https://huggingface.co/Lykon/dreamshaper-8): general purpose (default)
* [fluently/Fluently-v4](https://huggingface.co/fluently/Fluently-v4): general purpose stylized
* [Linaqruf/anything-v3-1](https://huggingface.co/Linaqruf/anything-v3-1): anime
* [prompthero/openjourney-v4](https://huggingface.co/prompthero/openjourney-v4): Midjourney art style
* [SG161222/Realistic_Vision_V5](https://huggingface.co/SG161222/Realistic_Vision_V5.1_noVAE): realistic
* [XpucT/Deliberate_v6](https://huggingface.co/XpucT/Deliberate): general purpose stylized
### LoRA
Apply up to 2 LoRA (low-rank adaptation) adapters with adjustable strength:
* [Perfection Style](https://civitai.com/models/411088?modelVersionId=486099): attempts to improve aesthetics, use high strength
* [Detailed Style](https://civitai.com/models/421162?modelVersionId=486110): attempts to improve details, use low strength
> NB: The trigger words are automatically appended to the positive prompt for you.
### Embeddings
Select one or more [textual inversion](https://huggingface.co/docs/diffusers/en/using-diffusers/textual_inversion_inference) embeddings:
* [`fast_negative`](https://civitai.com/models/71961?modelVersionId=94057): all-purpose (default, **recommended**)
* [`cyberrealistic_negative`](https://civitai.com/models/77976?modelVersionId=82745): realistic add-on (for CyberRealistic)
* [`unrealistic_dream`](https://civitai.com/models/72437?modelVersionId=77173): realistic add-on (for RealisticVision)
> NB: The trigger token is automatically appended to the negative prompt for you.
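As a rough picture of what "automatically appended" means, the sketch below adds each selected embedding's trigger token to the end of the negative prompt. The helper name `append_trigger_tokens` is hypothetical, and using the embedding name itself as the token is an assumption; the app may format its tokens differently.

```python
def append_trigger_tokens(negative_prompt: str, tokens: list[str]) -> str:
    """Append trigger tokens to a negative prompt, skipping ones already present."""
    prompt = negative_prompt.strip()
    parts = [prompt] if prompt else []
    # Assumption: the trigger token is the embedding name as listed above
    parts += [token for token in tokens if token not in prompt]
    return ", ".join(parts)


print(append_trigger_tokens("blurry, watermark", ["fast_negative"]))
```

Because the token is added for you, typing it into the negative prompt yourself is unnecessary.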
### Styles
[Styles](https://huggingface.co/spaces/adamelliotfields/diffusion/blob/main/data/styles.json) are prompt templates that wrap your positive and negative prompts. They were originally derived from the [twri/sdxl_prompt_styler](https://github.com/twri/sdxl_prompt_styler) Comfy node, but have since been entirely rewritten.
Start by framing a simple subject like `portrait of a cat` or `landscape of a mountain range` and experiment.
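To illustrate how a template wraps a prompt, here is a minimal sketch. The `{prompt}` placeholder, the `positive`/`negative` keys, and the example style text are all assumptions for illustration; check `styles.json` for the real schema and wording.

```python
# Hypothetical style entry; the real templates live in styles.json
EXAMPLE_STYLE = {
    "positive": "anime artwork of {prompt}, vibrant colors",
    "negative": "{prompt}, photo, realistic",
}


def apply_style(style: dict, positive: str, negative: str = "") -> tuple[str, str]:
    """Substitute the user's prompts into a style's templates."""

    def wrap(template: str, text: str) -> str:
        # strip() removes a dangling ", " left behind by an empty prompt
        return template.replace("{prompt}", text.strip()).strip(", ")

    return wrap(style["positive"], positive), wrap(style["negative"], negative)


styled_positive, styled_negative = apply_style(EXAMPLE_STYLE, "portrait of a cat")
print(styled_positive)
print(styled_negative)
```

Since the template supplies most of the descriptive language, a short subject tends to be enough.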
#### Anime
The `Anime: *` styles work best with Dreamshaper. When using the anime-specific Anything model, you should use the `Anime: Anything` style with the following settings:
* Scheduler: `DEIS 2M` or `DPM++ 2M`
* Guidance: `10`
* Steps: `50`
Your subject should be a few simple tokens like `girl, brunette, blue eyes, armor, nebula, celestial`. Experiment with `Clip Skip` and `Karras`. Finish with the `Perfection Style` LoRA on a moderate setting and upscale.
### Scale
Rescale up to 4x using [Real-ESRGAN](https://github.com/xinntao/Real-ESRGAN) with weights from [ai-forever](ai-forever/Real-ESRGAN). Necessary for high-resolution images.
### Image-to-Image
The `Image-to-Image` settings allow you to provide input images for the initial latents, ControlNet, and IP-Adapter.
#### Strength
Initial image strength (known as _denoising strength_) is essentially how much the generation will differ from the input image. A value of `0` will be identical to the original, while `1` will be a completely new image. You may also want to increase the number of inference steps.
> 💡 Denoising strength only applies to the `Initial Image` input; it doesn't affect ControlNet or IP-Adapter.
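One reason to raise the step count is that image-to-image pipelines typically run only a fraction of the scheduled steps. The sketch below follows the calculation used by the Diffusers img2img pipeline; whether this app computes it exactly this way is an assumption, and the function name `effective_steps` is hypothetical.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Approximate number of denoising steps actually run for a given strength."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    # The input image is noised part-way, then only the remaining steps are run
    return min(int(num_inference_steps * strength), num_inference_steps)


for strength in (0.0, 0.25, 0.5, 1.0):
    print(f"strength={strength}: {effective_steps(40, strength)} of 40 steps")
```

Under this assumption, a strength of `0.25` with 40 steps runs only about 10 denoising steps, so doubling the step count restores some of the refinement.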
#### ControlNet
In [ControlNet](https://github.com/lllyasviel/ControlNet), the input image is used to get a feature map from an _annotator_. These are computer vision models used for tasks like edge detection and pose estimation. ControlNet models are trained to understand these feature maps. Read the [Diffusers docs](https://huggingface.co/docs/diffusers/using-diffusers/controlnet) to learn more.
Currently, the only annotator available is [Canny](https://huggingface.co/lllyasviel/control_v11p_sd15_canny) (edge detection).
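To give a feel for what an edge-detection annotator produces, here is a deliberately simplified sketch in pure Python. It thresholds the Sobel gradient magnitude, which is not full Canny (Canny adds smoothing, non-maximum suppression, and hysteresis thresholding). The function name `edge_map` and the threshold value are hypothetical.

```python
def edge_map(gray: list[list[int]], threshold: float = 100) -> list[list[int]]:
    """Return a binary edge map (0 or 255) from a 2D grid of grayscale values.

    Simplified stand-in for Canny: Sobel gradient magnitude plus one threshold.
    Border pixels are left at 0.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient (responds to vertical edges)
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient (responds to horizontal edges)
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# A tiny image with a dark left half and a bright right half
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
for row in edge_map(image):
    print(row)
```

The resulting white-on-black outline is the kind of feature map a Canny ControlNet is conditioned on, which is why the composition of the input image carries over to the output.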
#### IP-Adapter
In an image-to-image pipeline, the input image is used as the initial latent. With [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter), the input image is processed by a separate image encoder and the encoded features are used as conditioning along with the text prompt.
For capturing faces, enable `IP-Adapter Face` to use the full-face model. Use a high-quality input image that is mostly a face. You can generate fake portraits with Realistic Vision to experiment.
### Advanced
#### DeepCache
[DeepCache](https://github.com/horseee/DeepCache) caches features from the deeper UNet layers, refreshes them every `Interval` steps, and reuses them in between. Trade quality for speed:
* `1`: no caching (default)
* `2`: more quality
* `3`: balanced
* `4`: more speed
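The trade-off can be pictured by counting how many steps run the full UNet. The sketch assumes a uniform schedule in which step 0 and every `Interval`-th step after it refresh the cache; the function name `full_pass_steps` is hypothetical and DeepCache's internals may differ in detail.

```python
def full_pass_steps(num_steps: int, interval: int) -> list[int]:
    """Steps that run the full UNet; all other steps reuse cached deep features."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return [step for step in range(num_steps) if step % interval == 0]


for interval in (1, 2, 3, 4):
    full = full_pass_steps(30, interval)
    print(f"interval={interval}: {len(full)} full passes out of 30 steps")
```

With an interval of `1` every step is a full pass, which matches the "no caching" default.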
#### FreeU
[FreeU](https://github.com/ChenyangSi/FreeU) re-weights the contributions sourced from the UNet’s skip connections and backbone feature maps. Can sometimes improve image quality.
#### Clip Skip
When enabled, the last CLIP layer is skipped. Can sometimes improve image quality.
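Conceptually, skipping the last layer means taking the text encoder's output one layer earlier. The sketch below uses placeholder strings in place of real tensors; the function name `select_hidden_state` is hypothetical, and real pipelines may apply further processing (such as a final normalization) to the selected output.

```python
def select_hidden_state(hidden_states: list, clip_skip: bool = False):
    """Pick which text-encoder layer output is used for prompt conditioning."""
    # Index -1 is the final layer; -2 is the layer just before it
    return hidden_states[-2] if clip_skip else hidden_states[-1]


# Stand-ins for the outputs of a 12-layer text encoder
layers = [f"layer_{i}" for i in range(1, 13)]
print(select_hidden_state(layers))                  # final layer output
print(select_hidden_state(layers, clip_skip=True))  # penultimate layer output
```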
#### Tiny VAE
Enable [madebyollin/taesd](https://github.com/madebyollin/taesd) for near-instant latent decoding with a minor loss in detail. Useful for development.