view article Article Atlaset Dataset for Moroccan Darija: From Data Collection, Analysis, to Model Trainings By atlasia and 1 other • 3 days ago • 17
Gemstone Models Collection Our 22 open source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths from 256 to 3072 and 18 depths from 3 to 80. • 59 items • Updated 10 days ago • 5
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 20 days ago • 53
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 16 days ago • 44
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 13 days ago • 389
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 340
view article Article TerjamaBench: A Cultural Benchmark for English-Darija Machine Translation By imomayiz and 4 others • Jan 10 • 30
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 260
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 50
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 93
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions Paper • 2412.09596 • Published Dec 12, 2024 • 94