LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation • Paper • 2412.15188 • Published 3 days ago • 1
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval • Paper • 2412.14475 • Published 4 days ago • 47
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN • Paper • 2412.13795 • Published 4 days ago • 18
Compressed Chain of Thought: Efficient Reasoning Through Dense Representations • Paper • 2412.13171 • Published 5 days ago • 30
Emergence of Abstractions: Concept Encoding and Decoding Mechanism for In-Context Learning in Transformers • Paper • 2412.12276 • Published 6 days ago • 14
Byte Latent Transformer: Patches Scale Better Than Tokens • Paper • 2412.09871 • Published 10 days ago • 69
Apollo: An Exploration of Video Understanding in Large Multimodal Models • Paper • 2412.10360 • Published 9 days ago • 130
The Pitfalls of Memorization: When Memorization Hurts Generalization • Paper • 2412.07684 • Published 12 days ago • 1