Awaker2.5-VL: Stably Scaling MLLMs with Parameter-Efficient Mixture of Experts Paper • 2411.10669 • Published Nov 16 • 10
Mini-Omni2: Towards Open-source GPT-4o with Vision, Speech and Duplex Capabilities Paper • 2410.11190 • Published Oct 15 • 20
Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17 • 55
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated Oct 15 • 146
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models Paper • 2410.07985 • Published Oct 10 • 28
🍓 Ichigo v0.3 Collection An experimental model family designed to train LLMs to understand sound natively. • 6 items • Updated Nov 11 • 17
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1 • 144
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 releases. • 15 items • Updated 16 days ago • 545
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning Paper • 2409.12183 • Published Sep 18 • 36
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned variants in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated 24 days ago • 436
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 224