- SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features
  Paper • 2502.14786 • Published • 101
- Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
  Paper • 2502.14846 • Published • 13
- RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers
  Paper • 2502.14377 • Published • 10
Liu
Liudawp
AI & ML interests
None yet
Recent Activity
updated a collection 3 days ago: ai tech
updated a collection 3 days ago: ai tech
liked a model 5 days ago: microsoft/OmniParser-v2.0
Organizations
None yet
Collections
1
Models
None public yet
Datasets
None public yet