MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 6 days ago • 262
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published 11 days ago • 77
AI Paper of the Day Collection A collection of papers that I think are interesting, one added each day • 273 items • Updated 2 days ago • 35
Byte Latent Transformer: Patches Scale Better Than Tokens Paper • 2412.09871 • Published Dec 13, 2024 • 89
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 139
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 225
Small Language Models: Survey, Measurements, and Insights Paper • 2409.15790 • Published Sep 24, 2024 • 1