Jihan Yang

jihanyang

https://jihanyang.github.io/

AI & ML interests

Computer Vision, Multimodality, Embodied AI

jihanyang's activity

liked a dataset 13 days ago

ShareGPT4Video/ShareGPT4Video

Viewer • Updated Jul 8, 2024 • 40.2k • 2.33k • 187

liked a dataset 21 days ago

nyu-visionx/VSI-Bench

Viewer • Updated 22 days ago • 5.13k • 880 • 26

authored a paper 22 days ago

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published 23 days ago • 24

upvoted a paper 22 days ago

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published 23 days ago • 24

updated a dataset 24 days ago

nyu-visionx/VSI-Bench

Viewer • Updated 22 days ago • 5.13k • 880 • 26

updated 2 datasets 6 months ago

nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 1.67k • 32

jihanyang/RegionPLC_ScanNet200

Updated Jul 5, 2024 • 2

updated a model 7 months ago

jihanyang/RegionPLC

Updated Jun 28, 2024

liked a dataset 7 months ago

lmms-lab/Video-MME

Viewer • Updated Jul 4, 2024 • 2.7k • 10.7k • 32

upvoted a paper 7 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 59

authored a paper 7 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 59

liked a Space 7 months ago

🥇 MTEB Leaderboard

Space • Running on CPU Upgrade • 4.51k

liked a dataset 7 months ago

Enxin/MovieChat-1K-test

Preview • Updated Jun 11, 2024 • 235 • 10

liked a model 7 months ago

Salesforce/blip2-flan-t5-xxl

Image-Text-to-Text • Updated Nov 21, 2024 • 5.54k • 85

updated 2 datasets 7 months ago

jihanyang/V-IRL_Detection

Preview • Updated Jun 4, 2024 • 31

jihanyang/V-IRL_VLN

Updated Jun 4, 2024 • 79

New activity in jihanyang/RegionPLC 8 months ago

add caption files

#2 opened 8 months ago by jihanyang

add ckpts

#1 opened 8 months ago by jihanyang

upvoted a paper 8 months ago

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 101

liked a model 8 months ago

HuggingFaceM4/idefics2-8b

Image-Text-to-Text • Updated Oct 14, 2024 • 14.7k • 601