Align before Fuse: Vision and Language Representation Learning with Momentum Distillation • Paper 2107.07651 • Published Jul 16, 2021