STIV: Scalable Text and Image Conditioned Video Generation Paper • 2412.07730 • Published 12 days ago • 68
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 3 days ago • 69
Arbitrary-steps Image Super-resolution via Diffusion Inversion Paper • 2412.09013 • Published 10 days ago • 10
Article Building an AI-powered search engine from scratch By as-cle-bert • 11 days ago • 8
Article Financial Analysis with Langchain and CrewAI Agents By herooooooooo • Jun 30 • 8
Flux.1 Tools Collection FLUX.1 Tools, a suite of models designed to add control and steerability to the base text-to-image model FLUX.1 • 6 items • Updated about 1 month ago • 13
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching Paper • 2407.03648 • Published Jul 4 • 17
Enhance Your Images Collection Some trending Gradio apps on Spaces that you can use to enhance/upscale your images for free. This collection will be kept up to date with new releases. • 7 items • Updated Aug 22 • 17
Gradio Spaces for Background Removal Collection Enhance your images by removing the background. Will ensure these Spaces are up and maintained for the community. • 5 items • Updated Aug 20 • 23
Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 Paper • 2408.05147 • Published Aug 9 • 38
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5 • 181
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24 • 179
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation Paper • 2406.07686 • Published Jun 11 • 14