ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions Paper • 2403.07392 • Published Mar 12, 2024