PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated 21 days ago • 121
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters Paper • 2412.00174 • Published Nov 29, 2024 • 22
view article Article wHy DoNt YoU jUsT uSe ThE lLaMa ToKeNiZeR?? By catherinearnett • Sep 27, 2024 • 38
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated Sep 26, 2024 • 46
LVCD: Reference-based Lineart Video Colorization with Diffusion Models Paper • 2409.12960 • Published Sep 19, 2024 • 23
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 117
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 28 days ago • 637
view article Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 By manu • Jul 5, 2024 • 182
Sora Reference Papers Collection A collection of all papers referenced in OpenAI's "Video generation models as world simulators" technical report • openai.com/sora • 30 items • Updated Oct 3, 2024 • 52
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection Paper • 2403.03507 • Published Mar 6, 2024 • 183
Text-to-Image Base Models Collection All text-to-image open source base models, with their respective license • 28 items • Updated May 10, 2024 • 21
SALMONN: Towards Generic Hearing Abilities for Large Language Models Paper • 2310.13289 • Published Oct 20, 2023 • 17
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models Paper • 2402.06178 • Published Feb 9, 2024 • 13
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 205