---
license: apache-2.0
datasets:
- Jonathan-Zhou/GameLabel-10k
base_model:
- black-forest-labs/FLUX.1-schnell
pipeline_tag: text-to-image
---
# Flux GameLabel Lora

This model is intended purely for research purposes, as a demonstration of the quality of data labeled by random video game players. It achieves its purpose of higher prompt adherence, but suffers from a variety of issues due to being fine-tuned on synthetic outputs.

Inference code that runs on a 24 GB consumer GPU is below. More details are in the paper at [https://arxiv.org/abs/2409.19830](https://arxiv.org/abs/2409.19830).
```python
from diffusers import FlowMatchEulerDiscreteScheduler, AutoencoderKL
from diffusers.models.transformers.transformer_flux import FluxTransformer2DModel
from diffusers.pipelines.flux.pipeline_flux import FluxPipeline
from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5TokenizerFast
import torch
from huggingface_hub import hf_hub_download
from torchao.quantization.quant_api import (
    quantize_,
    int8_weight_only,
)

dtype = torch.bfloat16
flux_repo = "black-forest-labs/FLUX.1-schnell"
revision = "refs/pr/1"
# Load the Flux components individually so they can be quantized before use.
# (Tokenizers take no torch_dtype argument, so none is passed to them.)
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
tokenizer_2 = T5TokenizerFast.from_pretrained(flux_repo, subfolder="tokenizer_2", revision=revision)
scheduler = FlowMatchEulerDiscreteScheduler.from_pretrained(flux_repo, subfolder="scheduler", revision=revision)
transformer = FluxTransformer2DModel.from_pretrained(flux_repo, subfolder="transformer", torch_dtype=dtype, revision=revision)
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14", torch_dtype=dtype)
text_encoder_2 = T5EncoderModel.from_pretrained(flux_repo, subfolder="text_encoder_2", torch_dtype=dtype, revision=revision)
vae = AutoencoderKL.from_pretrained(flux_repo, subfolder="vae", torch_dtype=dtype, revision=revision)

# Fetch the LoRA weights from this repository
lora_file_path = hf_hub_download(repo_id="Jonathan-Zhou/Flux-GameLabel-Lora", filename="lora.safetensors")
# Assemble the pipeline from the components loaded above.
pipe = FluxPipeline(
    scheduler=scheduler,
    text_encoder=text_encoder,
    tokenizer=tokenizer,
    text_encoder_2=text_encoder_2,
    tokenizer_2=tokenizer_2,
    vae=vae,
    transformer=transformer,
)
# If you want to compare the LoRA with the base model, comment out these two lines
pipe.load_lora_weights(lora_file_path, adapter_name="lora1")
pipe.fuse_lora()
# Int8 weight-only quantization is needed to fit everything on a GPU with 24 GB of VRAM
quantize_(transformer, int8_weight_only())
quantize_(text_encoder, int8_weight_only())
quantize_(text_encoder_2, int8_weight_only())
quantize_(vae, int8_weight_only())

pipe.to("cuda")
torch.cuda.empty_cache()
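
# A possible alternative to quantization (an assumption, not part of the
# original recipe): diffusers' CPU offloading also reduces peak VRAM, at the
# cost of slower inference. If you try it, skip the quantize_ calls and the
# pipe.to("cuda") line above and instead call:
#   pipe.enable_sequential_cpu_offload()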

# Fixed seed so results are reproducible.
generator = torch.Generator().manual_seed(12345)
output = pipe(
    prompt="a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
    width=1024,
    height=1024,
    num_inference_steps=6,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=3.5,
)
image = output.images[0]
image.show()
```
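
To compare against the base FLUX.1-schnell model, the simplest route is to re-run the script with the two `load_lora_weights`/`fuse_lora` lines commented out, as noted in the code. The LoRA can also be removed in place; a minimal sketch, assuming the pipeline above is still in scope and that this runs *before* the `quantize_` calls (unfusing int8-quantized weights may not restore the base weights exactly):

```python
# Subtract the fused LoRA delta and drop the adapter, restoring base weights.
pipe.unfuse_lora()
pipe.unload_lora_weights()

# Re-run with the same seed so the two outputs are directly comparable.
generator = torch.Generator().manual_seed(12345)
base_image = pipe(
    prompt="a man showing off his cool new t shirt at the beach, a shark is jumping out of the water in the background",
    width=1024,
    height=1024,
    num_inference_steps=6,
    generator=generator,
    guidance_scale=3.5,
).images[0]
base_image.show()
```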