arxiv:2311.04901
Rui Sun
ThreeSR
AI & ML interests
Vision and Language Multimodal Learning, CV, NLP, LLM
Recent Activity
upvoted a paper 27 days ago
DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding
upvoted a paper about 2 months ago
Training-free Regional Prompting for Diffusion Transformers
upvoted a paper about 2 months ago
How Far is Video Generation from World Model: A Physical Law Perspective
Organizations
Papers: 1
Models: None public yet
Datasets: None public yet