Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution Paper • 2409.12191 • Published Sep 18 • 74
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding Paper • 2401.09149 • Published Jan 17 • 1
LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism Paper • 2406.18485 • Published Jun 26 • 2
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19 • 51
AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning Paper • 2311.00257 • Published Nov 1, 2023 • 8