Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention • Paper • 2502.11089 • Published 22 days ago • 141
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling • Paper • 2502.09509 • Published 25 days ago • 7
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents • Paper • 2502.11357 • Published 21 days ago • 9
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training • Paper • 2502.11196 • Published 22 days ago • 22
Learning Getting-Up Policies for Real-World Humanoid Robots • Paper • 2502.12152 • Published 21 days ago • 37