BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (arXiv:1810.04805, published Oct 11, 2018)
Transformers Can Achieve Length Generalization But Not Robustly (arXiv:2402.09371, published Feb 14, 2024)
Triple-Encoders: Representations That Fire Together, Wire Together (arXiv:2402.12332, published Feb 19, 2024)