FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model Paper • 2410.13925 • Published Oct 17
SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields Paper • 2408.06697 • Published Aug 13
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding Paper • 2401.09340 • Published Jan 17