arxiv:2412.02611
Shijia Yang
shijiay
AI & ML interests
None yet
Recent Activity
upvoted a paper 11 days ago
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
authored a paper 11 days ago
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?
Organizations
None yet
models 27
shijiay/llava_clip224_stage1 • Image-Text-to-Text • Updated • 11
shijiay/llava_clip224_stage2 • Image-Text-to-Text • Updated • 18
shijiay/llava_dinov2_stage2 • Image-Text-to-Text • Updated • 12 • 1
shijiay/llava_clip_stage1 • Image-Text-to-Text • Updated • 20
shijiay/llava_clip_stage2 • Image-Text-to-Text • Updated • 11
shijiay/llava_openclip_stage1 • Image-Text-to-Text • Updated • 8
shijiay/llava_openclip_stage2 • Image-Text-to-Text • Updated • 13
shijiay/llava_siglip_stage1 • Image-Text-to-Text • Updated • 9
shijiay/llava_siglip_stage2 • Image-Text-to-Text • Updated • 10
shijiay/llava_sdim_stage1 • Image-Text-to-Text • Updated • 9
datasets
None public yet