view article Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 2 days ago • 232
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24, 2024 • 189
view article Article Distilling from Dialogues: Finding Meaning in LLM Interactions By chansung • 17 days ago • 4
Remote VAE Inference Endpoints Collection Models and handler code used in https://huggingface.co/blog/remote_vae • 5 items • Updated 4 days ago • 4
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 22 days ago • 129
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated 2 days ago • 145
view article Article Let’s make a generation of amazing image generation models By burtenshaw and 4 others • Nov 26, 2024 • 33
view article Article Advanced Flux Dreambooth LoRA Training with 🧨 diffusers By linoyts and 1 other • Oct 21, 2024 • 34
CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion Paper • 2403.05121 • Published Mar 8, 2024 • 24
Enhancing Training Efficiency Using Packing with Flash Attention Paper • 2407.09105 • Published Jul 12, 2024 • 15
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Paper • 2408.06072 • Published Aug 12, 2024 • 39
Flan-T5 release Collection The Flan-T5 covers 4 checkpoints of different sizes each time. It also includes upgrades versions trained using Universal sampling • 7 items • Updated 2 days ago • 23