gokay aydogan

gokaygokay

AI & ML interests

Vision Language Models

I've fine-tuned three PaliGemma image captioning models that generate prompts for text-to-image models. Their captions resemble the prompts we normally write for image generation models. I used the google/docci and google/imageinwords datasets for fine-tuning.

This one gives you longer captions.

gokaygokay/SD3-Long-Captioner

This one gives you medium-length captions.

gokaygokay/SD3-Long-Captioner-V2

And this one gives you shorter captions.

gokaygokay/SDXL-Captioner
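
If you want to try one of these from Python, here is a minimal sketch of how captioning might look with the `transformers` PaliGemma classes. The helper names (`pick_captioner`, `caption_image`), the `"caption en"` prompt, and the `max_new_tokens` value are assumptions for illustration, not something stated in this post, so check each model card for the recommended settings.

```python
# Model IDs are the three checkpoints listed in the post; the keys are
# illustrative labels for the caption length each one is described as giving.
CAPTIONERS = {
    "long": "gokaygokay/SD3-Long-Captioner",
    "medium": "gokaygokay/SD3-Long-Captioner-V2",
    "short": "gokaygokay/SDXL-Captioner",
}


def pick_captioner(length: str) -> str:
    """Return the model ID for a caption length (hypothetical helper)."""
    try:
        return CAPTIONERS[length]
    except KeyError:
        raise ValueError(
            f"unknown length {length!r}; expected one of {sorted(CAPTIONERS)}"
        ) from None


def caption_image(image_path: str, length: str = "long",
                  prompt: str = "caption en", max_new_tokens: int = 256) -> str:
    """Caption one image. The prompt and token budget are assumed defaults."""
    # Heavy imports are kept local so the module loads without them installed.
    import torch
    from PIL import Image
    from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

    model_id = pick_captioner(length)
    processor = AutoProcessor.from_pretrained(model_id)
    model = PaliGemmaForConditionalGeneration.from_pretrained(model_id).eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[-1]

    with torch.inference_mode():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                                do_sample=False)

    # Drop the prompt tokens so only the generated caption is decoded.
    return processor.decode(output[0][prompt_len:], skip_special_tokens=True)
```

Calling `caption_image("photo.jpg", length="short")` would download the SDXL captioner on first use and return a caption string; `photo.jpg` is a placeholder path.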