---
license: openrail++
library_name: diffusers
tags:
- text-to-image
- diffusers-training
- diffusers
- stable-diffusion-xl
- stable-diffusion-xl-diffusers
base_model: stabilityai/stable-diffusion-xl-base-1.0
---
# Margin-aware Preference Optimization for Aligning Diffusion Models without Reference
<div align="center">
<img src="assets/mapo_overview.jpg" width=750/>
</div><br>
We propose **MaPO**, a reference-free, sample-efficient, memory-friendly alignment technique for text-to-image diffusion models. For more details on the technique, please refer to our paper [here](https://arxiv.org/abs/2406.06424).
## Developed by
* Jiwoo Hong<sup>*</sup> (KAIST AI)
* Sayak Paul<sup>*</sup> (Hugging Face)
* Noah Lee (KAIST AI)
* Kashif Rasul (Hugging Face)
* James Thorne (KAIST AI)
* Jongheon Jeong (Korea University)
## Dataset
This model was fine-tuned from [Stable Diffusion XL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) on the [cartoon split of Pick-Style](https://huggingface.co/datasets/mapo-t2i/pick-style-cartoon).
## Training Code
Refer to our code repository [here](https://github.com/mapo-t2i/mapo).
## Inference
```python
from diffusers import DiffusionPipeline, AutoencoderKL, UNet2DConditionModel
import torch

sdxl_id = "stabilityai/stable-diffusion-xl-base-1.0"
vae_id = "madebyollin/sdxl-vae-fp16-fix"
unet_id = "mapo-t2i/mapo-pick-style-cartoon"

# Load the fp16-safe VAE and the MaPO fine-tuned UNet, then plug both into the SDXL base pipeline.
vae = AutoencoderKL.from_pretrained(vae_id, torch_dtype=torch.float16)
unet = UNet2DConditionModel.from_pretrained(unet_id, subfolder="unet", torch_dtype=torch.float16)
pipeline = DiffusionPipeline.from_pretrained(sdxl_id, vae=vae, unet=unet, torch_dtype=torch.float16).to("cuda")

prompt = "portrait of gorgeous cyborg with golden hair, high resolution"
image = pipeline(prompt=prompt, num_inference_steps=30).images[0]
```
For qualitative results, please visit our [project website](https://mapo-t2i.github.io/).
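To compare a few samples locally, it can help to fix the random seed for each generation. The sketch below is one possible way to do that; `generate_for_seeds` is a hypothetical helper (it is not part of `diffusers` or the MaPO repository), and it assumes the `pipeline` and `prompt` objects from the Inference snippet.

```python
import torch


def generate_for_seeds(pipeline, prompt, seeds, **pipeline_kwargs):
    """Hypothetical helper: run `pipeline` once per seed and return {seed: image}."""
    images = {}
    for seed in seeds:
        # A CPU generator seeded explicitly makes each run repeatable.
        generator = torch.Generator(device="cpu").manual_seed(seed)
        result = pipeline(prompt=prompt, generator=generator, **pipeline_kwargs)
        images[seed] = result.images[0]
    return images


# Example usage with the pipeline built in the Inference section (requires a GPU):
# images = generate_for_seeds(pipeline, prompt, seeds=[0, 1, 2], num_inference_steps=30)
# for seed, image in images.items():
#     image.save(f"cartoon_seed_{seed}.png")
```

Running the same seeds through the base SDXL pipeline (without the `unet` override) should give a like-for-like view of what the fine-tuning changes.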
## Citation
```bibtex
@misc{hong2024marginaware,
title={Margin-aware Preference Optimization for Aligning Diffusion Models without Reference},
author={Jiwoo Hong and Sayak Paul and Noah Lee and Kashif Rasul and James Thorne and Jongheon Jeong},
year={2024},
eprint={2406.06424},
archivePrefix={arXiv},
primaryClass={cs.CV,cs.LG}
}
```