ReAct: Synergizing Reasoning and Acting in Language Models Paper • 2210.03629 • Published Oct 6, 2022
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Paper • 1810.04805 • Published Oct 11, 2018
Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts Paper • 2405.11273 • Published May 18, 2024
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality Paper • 2405.21060 • Published May 31, 2024