TroL: Traversal of Layers for Large Language and Vision Models • Paper • 2406.12246 • Published 15 days ago • 33
A Closer Look into Mixture-of-Experts in Large Language Models • Paper • 2406.18219 • Published 7 days ago • 12