Zesen Cheng
ClownRat
AI & ML interests
Multi-modal foundation models; segmentation, detection, and tracking
Recent Activity
upvoted a paper about 23 hours ago
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs
upvoted a paper about 23 hours ago
On the Compositional Generalization of Multimodal LLMs for Medical Imaging
upvoted a paper about 23 hours ago
Are Vision-Language Models Truly Understanding Multi-vision Sensor?