DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion · 7 authors · Submitted by akhaliq
MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation · 6 authors · Submitted by akhaliq
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model · 10 authors · Submitted by akhaliq
Robotic Offline RL from Internet Videos via Value-Function Pre-Training · 9 authors · Submitted by akhaliq