DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging • Paper 2402.02622 • Published Feb 4