Multimodal Benchmarking IR

university

AI & ML interests

None defined yet.

Recent Activity

zhangysk authored a paper 17 minutes ago

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

zhangysk authored a paper 17 minutes ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

zhangysk authored a paper 21 minutes ago

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

models

None public yet

datasets 1

MBEIR/M-BEIR_DEV

Updated Jan 3, 2024 • 1.07M • 154