The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • 2.14k
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025 • 274
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 140
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 59
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free Paper • 2410.10814 • Published Oct 14, 2024 • 50
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20, 2024 • 59
E5-V: Universal Embeddings with Multimodal Large Language Models Paper • 2407.12580 • Published Jul 17, 2024 • 40