---
library_name: diffusers
---

# Mann-E FLUX[Dev] Edition

## How to use the model

### Install needed libraries

```
pip install git+https://github.com/huggingface/diffusers.git transformers==4.42.4 accelerate xformers peft sentencepiece protobuf -q
```

### Execution code

```python
import random

import numpy as np
import torch
from diffusers import AutoencoderTiny, DiffusionPipeline

dtype = torch.bfloat16
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the tiny autoencoder (TAEF1) and pass it to the pipeline as its VAE.
taef1 = AutoencoderTiny.from_pretrained("madebyollin/taef1", torch_dtype=dtype).to(device)
pipe = DiffusionPipeline.from_pretrained("mann-e/mann-e_flux", torch_dtype=dtype, vae=taef1).to(device)
torch.cuda.empty_cache()

# Pick a random seed so each run produces a different image.
MAX_SEED = np.iinfo(np.int32).max
seed = random.randint(0, MAX_SEED)
generator = torch.Generator().manual_seed(seed)

prompt = "an astronaut riding a horse"

# Generate one 720x1280 image and save it to disk.
image = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    num_inference_steps=10,
    width=720,
    height=1280,
    generator=generator,
    output_type="pil",
).images[0]
image.save("output.png")
```

## Tips and Tricks

1. Adding `mj-v6.1-style` to your prompts, especially cinematic and photorealistic ones, can noticeably improve the quality of the results. Give it a try.
2. The best `guidance_scale` is somewhere between 3.5 and 5.0.
3. Between 8 and 16 inference steps works very well.
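
The tips above can be sketched as a small helper that prepares the arguments before calling the pipeline. This is only an illustrative sketch: `recommended_settings`, `GenerationSettings`, and `STYLE_TAG` are hypothetical names that are not part of the model or of diffusers, and appending the tag after a comma is an assumption about how to add it to a prompt.

```python
from dataclasses import dataclass

# Style tag suggested in the tips; appending it after a comma is an assumption.
STYLE_TAG = "mj-v6.1-style"


@dataclass
class GenerationSettings:
    """Hypothetical container for the arguments passed to the pipeline."""
    prompt: str
    guidance_scale: float
    num_inference_steps: int


def clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


def recommended_settings(prompt, guidance_scale=3.5, num_inference_steps=10, add_style_tag=True):
    """Apply the tips: add the style tag and keep values in the suggested ranges."""
    prompt = prompt.strip()
    if add_style_tag and STYLE_TAG not in prompt:
        prompt = f"{prompt}, {STYLE_TAG}"
    return GenerationSettings(
        prompt=prompt,
        guidance_scale=clamp(float(guidance_scale), 3.5, 5.0),   # tip 2
        num_inference_steps=int(clamp(num_inference_steps, 8, 16)),  # tip 3
    )


settings = recommended_settings("an astronaut riding a horse", guidance_scale=7.0, num_inference_steps=30)
print(settings)
```

The resulting fields can then be passed to the pipeline call, for example `pipe(prompt=settings.prompt, guidance_scale=settings.guidance_scale, num_inference_steps=settings.num_inference_steps, ...)`.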