---
datasets:
- yuvalkirstain/pickapic_v2
library_name: diffusers
---

# Diffusion-KTO: Aligning Diffusion Models by Optimizing Human Utility


This model is fine-tuned from Stable Diffusion v1-5 on the Pick-a-Pic v2 dataset using KTO (Kahneman-Tversky Optimization).

### Usage

```python
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DiffusionPipeline

vae_path = model_name = "runwayml/stable-diffusion-v1-5"
device = "cuda"
weight_dtype = torch.float16

# Base VAE from Stable Diffusion v1-5
vae = AutoencoderKL.from_pretrained(
    vae_path,
    subfolder="vae",
)
# KTO-aligned UNet
unet = UNet2DConditionModel.from_pretrained(
    "jacklishufan/diffusion-kto",
    subfolder="unet",
)
pipeline = DiffusionPipeline.from_pretrained(
    model_name,
    vae=vae,
    unet=unet,
).to(device).to(weight_dtype)

result = pipeline(
    prompt="Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
    num_inference_steps=50,
    guidance_scale=7.0,
)
img = result.images[0]  # PIL image
```

A seeded, batched variant is sketched at the end of this card.

### Code

The code is available [here](https://github.com/jacklishufan/diffusion-kto).

### Citation

```
@misc{li2024aligning,
      title={Aligning Diffusion Models by Optimizing Human Utility},
      author={Shufan Li and Konstantinos Kallidromitis and Akash Gokul and Yusuke Kato and Kazuki Kozuka},
      year={2024},
      eprint={2404.04465},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```
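
### Reproducible sampling (sketch)

As a follow-up to the usage snippet above, the sketch below shows seeded, batched sampling. It reuses `pipeline` and `device` from that snippet; the seed value, batch size, and output file names are arbitrary illustrative choices, not part of the released code.

```python
# Minimal sketch, continuing from the Usage snippet above.
# The seed (0), batch size (4), and file names are arbitrary example choices.
generator = torch.Generator(device=device).manual_seed(0)

outputs = pipeline(
    prompt=["Self-portrait oil painting, a beautiful cyborg with golden hair, 8k"] * 4,
    num_inference_steps=50,
    guidance_scale=7.0,
    generator=generator,
)
for i, image in enumerate(outputs.images):
    image.save(f"diffusion_kto_sample_{i}.png")
```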