---
license: openrail++
language:
- en
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- stable-diffusion-xl
inference: true
widget:
- text: >-
    face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1girl
- text: >-
    face focus, bishounen, masterpiece, best quality, 1boy, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1boy
library_name: diffusers
datasets:
- Linaqruf/animagine-datasets
---
<style>
.title-container {
display: flex;
justify-content: center;
align-items: center;
height: 100vh; /* Adjust this value to position the title vertically */
}
.title {
font-size: 3em;
text-align: center;
color: #333;
font-family: 'Helvetica Neue', sans-serif;
text-transform: uppercase;
letter-spacing: 0.1em;
padding: 0.5em 0;
background: transparent;
}
.title span {
background: -webkit-linear-gradient(45deg, #7ed56f, #28b485);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
}
.custom-table {
table-layout: fixed;
width: 100%;
border-collapse: collapse;
margin-top: 2em;
}
.custom-table td {
width: 50%;
vertical-align: top;
padding: 10px;
box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
}
.custom-image {
width: 100%;
height: auto;
object-fit: cover;
border-radius: 10px;
transition: transform .2s;
margin-bottom: 1em;
}
.custom-image:hover {
transform: scale(1.05);
}
</style>
<h1 class="title"><span>Animagine XL</span></h1>
<table class="custom-table">
<tr>
<td>
<a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image1.png">
<img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image1.png" alt="sample1">
</a>
<a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image3.png">
<img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image3.png" alt="sample3">
</a>
</td>
<td>
<a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image2.png">
<img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image2.png" alt="sample2">
</a>
<a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image4.png">
<img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image4.png" alt="sample4">
</a>
</td>
</tr>
</table>
<hr>
## Overview
**Animagine XL** is a high-resolution, latent text-to-image diffusion model. It was fine-tuned from Stable Diffusion XL 1.0 with a learning rate of `4e-7` over 27,000 global steps at a batch size of 16, on a curated dataset of high-quality anime-style images.
- Use it with the [`Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
- Use it with 🧨 [`diffusers`](https://huggingface.co/docs/diffusers/index)
- Use it with the [`ComfyUI`](https://github.com/comfyanonymous/ComfyUI) **(recommended)**

Like other anime-style Stable Diffusion models, it accepts Danbooru tags as prompts, e.g. _**face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck**_
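
A tag-based prompt like the one above is just a comma-separated list, so it can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: `build_prompt` and `QUALITY_TAGS` are hypothetical names chosen for illustration, and the default quality tags are simply the ones that appear in the example prompt.

```python
# Hypothetical helper: joins Danbooru-style tags into one prompt string.
# QUALITY_TAGS is an assumption taken from the example prompt above.
QUALITY_TAGS = ["masterpiece", "best quality"]


def build_prompt(tags, quality_tags=QUALITY_TAGS):
    """Return a comma-separated prompt with quality tags first and no duplicates."""
    seen = set()
    ordered = []
    for tag in list(quality_tags) + list(tags):
        tag = tag.strip()
        key = tag.lower()
        # Skip empty entries and tags that were already added.
        if tag and key not in seen:
            seen.add(key)
            ordered.append(tag)
    return ", ".join(ordered)


prompt = build_prompt(["face focus", "cute", "1girl", "green hair", "sweater"])
print(prompt)
```

Putting the quality tags first is a convention borrowed from the example prompts in this card; the order could be changed if a different emphasis works better.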
## Features
1. High-Resolution Images: The model was trained at a base resolution of 1024x1024, using the [NovelAI Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing) so that it could also be trained on non-square resolutions.
2. Anime-styled Generation: Based on the given text prompts, the model can create high-quality anime-styled images.
3. Fine-Tuned Diffusion Process: The model uses a fine-tuned diffusion process to produce high-quality, unique images.
<hr>
## Model Details
- **Developed by:** [Linaqruf](https://github.com/Linaqruf)
- **Model type:** Diffusion-based text-to-image generative model
- **Model Description:** This model can be used to generate and modify high-quality anime-themed images based on text prompts.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
<hr>
## How to Use:
- Download `Animagine XL` [here](https://huggingface.co/Linaqruf/animagine-xl/resolve/main/animagine-xl.safetensors). The model is in `.safetensors` format.
- Use Danbooru-style tags as the prompt rather than natural language; otherwise the results tend to look realistic instead of anime-styled.
- You can use any generic negative prompt, or the following suggested negative prompt, to guide the model towards more aesthetic generations:
```
lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
```
- Prepend the following tags to your prompt for more aesthetic results:
```
masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
```
- Use this cheat sheet to find the best resolution:
```
768 x 1344: Vertical (9:16)
915 x 1144: Portrait (4:5)
1024 x 1024: Square (1:1)
1182 x 886: Photo (4:3)
1254 x 836: Landscape (3:2)
1365 x 768: Widescreen (16:9)
1564 x 670: Cinematic (21:9)
```
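
Several entries in the cheat sheet (for example 915 x 1144) are not multiples of 8, and the SDXL pipeline in `diffusers` is expected to reject widths and heights that are not divisible by 8. The sketch below rounds each side to the nearest multiple of 8 before generation. `snap_resolution` and `CHEAT_SHEET` are hypothetical names used only for this illustration; other front ends may handle such sizes differently.

```python
# Resolutions copied from the cheat sheet above, as (width, height).
CHEAT_SHEET = {
    "Vertical (9:16)": (768, 1344),
    "Portrait (4:5)": (915, 1144),
    "Square (1:1)": (1024, 1024),
    "Photo (4:3)": (1182, 886),
    "Landscape (3:2)": (1254, 836),
    "Widescreen (16:9)": (1365, 768),
    "Cinematic (21:9)": (1564, 670),
}


def snap_resolution(width, height, multiple=8):
    """Round each side to the nearest multiple of `multiple` (ties round up)."""

    def snap(value):
        snapped = (value + multiple // 2) // multiple * multiple
        # Never return zero for very small inputs.
        return max(multiple, snapped)

    return snap(width), snap(height)


for name, (width, height) in CHEAT_SHEET.items():
    print(name, snap_resolution(width, height))
```

For instance, `snap_resolution(915, 1144)` returns `(912, 1144)`, which keeps the aspect ratio close to 4:5 while satisfying the divisibility requirement.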
<hr>
## 🧨 Diffusers
Make sure to upgrade diffusers to >= 0.18.2:
```
pip install diffusers --upgrade
```
In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as the invisible watermark package:
```
pip install invisible_watermark transformers accelerate safetensors
```
Running the pipeline: if you don't swap the scheduler, it runs with the default **EulerDiscreteScheduler**. In this example we swap it for **EulerAncestralDiscreteScheduler**:
```py
import torch
from diffusers.models import AutoencoderKL
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

model = "Linaqruf/animagine-xl"

# Load the SDXL VAE separately and pass it to the pipeline.
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

pipe = StableDiffusionXLPipeline.from_pretrained(
    model,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    vae=vae
)

# Replace the default EulerDiscreteScheduler with the ancestral variant.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

prompt = "face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=12,
    target_size=(1024,1024),
    original_size=(4096,4096),
    num_inference_steps=50
).images[0]

image.save("anime_girl.png")
```
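
To generate at one of the non-square sizes from the cheat sheet, or to make a result reproducible, the call can be parameterized. The sketch below is an illustration under stated assumptions: `generation_kwargs` and `seeded_generator` are hypothetical helpers, the `original_size` and `guidance_scale` defaults simply mirror the example above, and `diffusers` documents the SDXL size-conditioning tuples as `(height, width)`.

```python
def generation_kwargs(width, height, steps=50, guidance_scale=12.0):
    """Collect keyword arguments for a StableDiffusionXLPipeline call."""
    return {
        "width": width,
        "height": height,
        # Size conditioning is given as (height, width), not (width, height).
        "target_size": (height, width),
        # Same value as in the example above; treat it as a tunable setting.
        "original_size": (4096, 4096),
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }


def seeded_generator(seed, device="cuda"):
    """Return a torch.Generator with a fixed seed for repeatable sampling."""
    import torch  # imported lazily so generation_kwargs works without torch

    return torch.Generator(device=device).manual_seed(seed)


kwargs = generation_kwargs(768, 1344)
print(kwargs["target_size"])

# With `pipe`, `prompt`, and `negative_prompt` from the example above:
# image = pipe(
#     prompt,
#     negative_prompt=negative_prompt,
#     generator=seeded_generator(42),
#     **kwargs,
# ).images[0]
```

Reusing the same seed, prompt, and settings should give the same image on the same hardware and library versions; results may still differ across environments.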
<hr>
## Limitations
This model inherits the [limitations](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0#limitations) of Stable Diffusion XL 1.0.