---
license: openrail++
language:
- en
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- stable-diffusion-xl
inference: true
widget:
- text: >-
    face focus, cute, masterpiece, best quality, 1girl, green hair, sweater,
    looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1girl
- text: >-
    face focus, bishounen, masterpiece, best quality, 1boy, green hair, sweater,
    looking at viewer, upper body, beanie, outdoors, night, turtleneck
  example_title: example 1boy
library_name: diffusers
datasets:
- Linaqruf/animagine-datasets
---

# Animagine XL


## Overview

**Animagine XL** is a high-resolution, latent text-to-image diffusion model. It was fine-tuned from Stable Diffusion XL 1.0 with a learning rate of `4e-7` over 27000 global steps and a batch size of 16, on a curated dataset of high-quality anime-style images.

- Use it with the [`Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
- Use it with 🧨 [`diffusers`](https://huggingface.co/docs/diffusers/index)
- Use it with the [`ComfyUI`](https://github.com/comfyanonymous/ComfyUI) **(recommended)**

Like other anime-style Stable Diffusion models, it also supports Danbooru tags for generating images, e.g. _**face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck**_

## Features

1. High-Resolution Images: The model was trained at 1024x1024 resolution, using the [NovelAI Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing) so that it could also be trained at non-square resolutions.
2. Anime-styled Generation: Based on the given text prompts, the model can create high-quality anime-styled images.
3. Fine-Tuned Diffusion Process: The model uses a fine-tuned diffusion process to produce high-quality and unique image output.

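
To illustrate the idea behind the aspect ratio bucketing mentioned in the features above, here is a simplified sketch. It is not the NovelAI tool's actual code: the function names, the 64-pixel step, and the 512 to 2048 side limits are assumptions chosen for illustration. Only the 1024x1024 area budget comes from this card.

```python
def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose sides are multiples of `step`
    and whose area stays within `max_area`. All defaults except the
    1024x1024 area budget are illustrative assumptions."""
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of `step`) that keeps the area in budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
        width += step
    return sorted(buckets)


def assign_bucket(image_size, buckets):
    """Return the bucket whose aspect ratio is closest to the image's."""
    width, height = image_size
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


buckets = make_buckets()
square_bucket = assign_bucket((1000, 1000), buckets)
portrait_bucket = assign_bucket((600, 900), buckets)
```

In a training setup along these lines, each image would be resized and cropped to its assigned bucket, and batches would be drawn from a single bucket at a time so that every image in a batch shares the same shape.
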
## Model Details

- **Developed by:** [Linaqruf](https://github.com/Linaqruf)
- **Model type:** Diffusion-based text-to-image generative model
- **Model Description:** This is a model that can be used to generate and modify high quality anime-themed images based on text prompts.
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
- **Finetuned from model:** [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)

## How to Use

- Download `Animagine XL` [here](https://huggingface.co/Linaqruf/animagine-xl/resolve/main/animagine-xl.safetensors). The model is in `.safetensors` format.
- Use Danbooru-style tags as the prompt rather than natural language; otherwise the results will look realistic rather than anime-styled.
- You can use any generic negative prompt, or use the following suggested negative prompt to guide the model towards high-aesthetic generations:

  ```
  lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
  ```

- The following tags should also be prepended to prompts to get high-aesthetic results:

  ```
  masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
  ```

- Use this cheat sheet to find the best resolution:

  ```
  768 x 1344: Vertical (9:16)
  915 x 1144: Portrait (4:5)
  1024 x 1024: Square (1:1)
  1182 x 886: Photo (4:3)
  1254 x 836: Landscape (3:2)
  1365 x 768: Widescreen (16:9)
  1564 x 670: Cinematic (21:9)
  ```

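
The usage tips above can be sketched as a couple of small helpers. The names `closest_resolution`, `snap_to_multiple`, and `build_prompt` are hypothetical and not part of any library; the resolutions and quality tags are copied from this card. The rounding helper is included because the `diffusers` SDXL pipeline expects width and height divisible by 8, which several cheat sheet entries are not.

```python
# Resolutions from the cheat sheet, stored as (width, height).
RESOLUTIONS = {
    "Vertical (9:16)": (768, 1344),
    "Portrait (4:5)": (915, 1144),
    "Square (1:1)": (1024, 1024),
    "Photo (4:3)": (1182, 886),
    "Landscape (3:2)": (1254, 836),
    "Widescreen (16:9)": (1365, 768),
    "Cinematic (21:9)": (1564, 670),
}

# Quality tags the card suggests prepending to every prompt.
QUALITY_TAGS = (
    "masterpiece, best quality, illustration, beautiful detailed, "
    "finely detailed, dramatic light, intricate details"
)


def closest_resolution(aspect_ratio):
    """Pick the cheat sheet entry whose width/height ratio is nearest."""
    return min(
        RESOLUTIONS.values(),
        key=lambda wh: abs(wh[0] / wh[1] - aspect_ratio),
    )


def snap_to_multiple(size, multiple=8):
    """Round each side to the nearest multiple (hypothetical helper)."""
    return tuple(round(side / multiple) * multiple for side in size)


def build_prompt(tags):
    """Prepend the suggested quality tags to a list of Danbooru-style tags."""
    return QUALITY_TAGS + ", " + ", ".join(tags)


width, height = snap_to_multiple(closest_resolution(16 / 9))
prompt = build_prompt(["1girl", "green hair", "sweater", "outdoors", "night"])
```

The resulting `width`, `height`, and `prompt` can then be passed to whichever front end you use.
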
## 🧨 Diffusers

Make sure to upgrade diffusers to >= 0.18.2:

```
pip install diffusers --upgrade
```

In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as the invisible watermark:

```
pip install invisible_watermark transformers accelerate safetensors
```

Running the pipeline (if you don't swap the scheduler, it will run with the default **EulerDiscreteScheduler**; in this example we swap it for **EulerAncestralDiscreteScheduler**):

```py
import torch
from diffusers.models import AutoencoderKL
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

model = "Linaqruf/animagine-xl"

# Load the SDXL VAE separately and pass it to the pipeline.
vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")

pipe = StableDiffusionXLPipeline.from_pretrained(
    model,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
    vae=vae
)

# Swap the default scheduler, reusing its configuration.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')

prompt = "face focus, cute, masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, night, turtleneck"
negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=12,
    target_size=(1024,1024),
    original_size=(4096,4096),
    num_inference_steps=50
).images[0]

image.save("anime_girl.png")
```

## Limitations

This model inherits the [limitations](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0#limitations) of Stable Diffusion XL 1.0.