Phi-4 (All Versions) Collection Microsoft's new Phi-4 model in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 4 items • Updated 1 day ago • 39
🧠Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 6 items • Updated 5 days ago • 31
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated about 2 hours ago • 145
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 1 day ago • 49
view article Article MiniMax-01 is Now Open-Source: Scaling Lightning Attention for the AI Agent Era By MiniMax-AI • 21 days ago • 40
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published 28 days ago • 253
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Paper • 2501.03218 • Published 30 days ago • 35
view article Article MedEmbed: Fine-Tuned Embedding Models for Medical / Clinical IR By abhinand • Oct 20, 2024 • 34
view article Article Llama can now see and run on your device - welcome Llama 3.2 Sep 25, 2024 • 182
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models Paper • 2409.12139 • Published Sep 18, 2024 • 12
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 48
view article Article Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models By isidentical • Aug 26, 2024 • 42
FocusLLM: Scaling LLM's Context by Parallel Decoding Paper • 2408.11745 • Published Aug 21, 2024 • 24
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024 • 51
view article Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 76
view article Article Welcome FalconMamba: The first strong attention-free 7B model Aug 12, 2024 • 108
view article Article Revolutionizing Video Transcription: Unveiling Gemma-2b-it and Langchain in the Era of Transformers By Andyrasika • Mar 12, 2024 • 3