CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 75
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 27 items • Updated about 15 hours ago • 129
VideoLLaMA3 Collection Frontier Multimodal Foundation Models for Video Understanding • 13 items • Updated 8 days ago • 8
SmolVLM 256M & 500M Collection Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 9 days ago • 63
view article Article Train 400x faster Static Embedding Models with Sentence Transformers 18 days ago • 130
AceMath Collection We are releasing math instruction models, math reward models, general instruction models, all training datasets, and a math reward benchmark. • 11 items • Updated 15 days ago • 9
Deepseek V3 (All Versions) Collection Deepseek V3 - available in bf16, original, and GGUF formats, with support for 2, 3, 4, 5, 6 and 8-bit quantized versions. • 3 items • Updated about 15 hours ago • 29
Maya: An Instruction Finetuned Multilingual Multimodal Model Paper • 2412.07112 • Published Dec 10, 2024 • 27
Reasoning Datasets Collection Reasoning datasets that are trending 🔥 • 10 items • Updated 29 days ago • 24
Falcon3 Collection Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated 25 days ago • 80
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 130