
Flux GameLabel Lora

This model is intended purely for research purposes, as a demonstration of the quality of data labeled by random video game players. It achieves its purpose (higher prompt adherence) but suffers from a variety of issues due to being fine-tuned on synthetic outputs.

The inference code below runs on a consumer GPU with 24 GB of VRAM. More details are in the paper: https://arxiv.org/abs/2409.19830

from diffusers import FlowMatchEulerDiscreteScheduler, AutoencoderKL
from diffusers.models.transformers.transformer_flux import FluxTransformer2DModel
from diffusers.pipelines.flux.pipeline_flux import FluxPipeline
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5TokenizerFast
import torch
from huggingface_hub import hf_hub_download
from torchao.quantization.quant_api import (
    quantize_,
    int8_weight_only
)
dtype = torch.bfloat16
flux_repo = "black-forest-labs/FLUX.1-schnell"
revision = "refs/pr/1"

# Tokenizers hold no weights, so they take no torch_dtype
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
tokenizer_2 = T5TokenizerFast.from_pretrained(flux_repo, subfolder="tokenizer_2", revision=revision)
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(flux_repo, subfolder="scheduler", revision=revision)
transformer = FluxTransformer2DModel.from_pretrained(flux_repo, subfolder="transformer", torch_dtype=dtype, revision=revision)
lora_file_path = hf_hub_download(repo_id = "Jonathan-Zhou/Flux-GameLabel-Lora", filename = "lora.safetensors")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14", torch_dtype=dtype)
text_encoder_2 = T5EncoderModel.from_pretrained(flux_repo, subfolder="text_encoder_2", torch_dtype=dtype, revision=revision)
vae = AutoencoderKL.from_pretrained(flux_repo, subfolder="vae", torch_dtype=dtype, revision=revision)


pipe = FluxPipeline(
    scheduler=scheduler,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    text_encoder_2=text_encoder_2,
    tokenizer_2=tokenizer_2,
    vae=vae,
    transformer=transformer,
)

# To compare the LoRA with the base model, comment out these two lines
pipe.load_lora_weights(lora_file_path, adapter_name="lora1")
pipe.fuse_lora()

# Quantization needed if run on a GPU with 24 GB VRAM
quantize_(transformer, int8_weight_only()) 
quantize_(text_encoder, int8_weight_only())
quantize_(text_encoder_2, int8_weight_only())
quantize_(vae, int8_weight_only())


pipe.to("cuda")
torch.cuda.empty_cache()
generator = torch.Generator().manual_seed(12345)
output = pipe(
    prompt="a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
    width=1024,
    height=1024,
    num_inference_steps=6,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=3.5,
)
image = output.images[0]
image.show()
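
The script above notes that quantization is needed on a 24 GB card. A rough back-of-the-envelope check of why is sketched below. The parameter counts are approximate assumptions for illustration (they are not stated in this model card), and the estimate covers weights only, ignoring activations, quantization scales, and framework overhead.

```python
# Rough weight-memory estimate for the pipeline components.
# All parameter counts are approximate ASSUMPTIONS, not measured values.
APPROX_PARAMS = {
    "transformer": 12.0e9,          # assumed: roughly 12B parameters
    "text_encoder_2 (T5)": 4.7e9,   # assumed: roughly 4.7B parameters
    "text_encoder (CLIP)": 0.12e9,  # assumed: roughly 0.12B parameters
    "vae": 0.08e9,                  # assumed: roughly 0.08B parameters
}

BYTES_BF16 = 2  # bfloat16 stores each weight in 2 bytes
BYTES_INT8 = 1  # int8 weight-only quantization stores each weight in 1 byte


def weight_gib(num_params, bytes_per_param):
    """Return the weight storage size in GiB for a given parameter count."""
    return num_params * bytes_per_param / 1024**3


def total_weight_gib(params, bytes_per_param):
    """Sum the weight storage size in GiB over all components."""
    return sum(weight_gib(n, bytes_per_param) for n in params.values())


if __name__ == "__main__":
    for name, n in APPROX_PARAMS.items():
        print(f"{name}: bf16 {weight_gib(n, BYTES_BF16):.1f} GiB, "
              f"int8 {weight_gib(n, BYTES_INT8):.1f} GiB")
    print(f"total bf16: {total_weight_gib(APPROX_PARAMS, BYTES_BF16):.1f} GiB")
    print(f"total int8: {total_weight_gib(APPROX_PARAMS, BYTES_INT8):.1f} GiB")
```

Under these assumptions, the bf16 weights alone exceed 24 GiB, while the int8 weights leave headroom for activations, which is consistent with the script quantizing every component before moving the pipeline to the GPU.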