DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published 21 days ago • 11
Granite 3.0 Language Models Collection A series of language models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 8 items • Updated 16 days ago • 96
Granite 3.1 Language Models Collection A series of language models with 128K context length trained by IBM licensed under Apache 2.0 license. • 8 items • Updated 16 days ago • 44
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated 15 days ago • 111
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 15 days ago • 75
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 21 days ago • 48
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated 21 days ago • 82
Llama 3.3 (All Versions) Collection Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 10 days ago • 26
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published 30 days ago • 119
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated 21 days ago • 121
Unsloth 4-bit Dynamic Quants Collection Unsloths Dynamic 4bit Quants selectively skips quantizing certain parameters; greatly improving accuracy while only using <10% more VRAM than BnB 4bit • 7 items • Updated 10 days ago • 21
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP modal more SOTA ever. • 10 items • Updated 23 days ago • 50