Multimodal Benchmarking IR
university
AI & ML interests
None defined yet.
Recent Activity
zhangysk authored a paper, "SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models" (17 minutes ago)
zhangysk authored a paper, "SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines" (17 minutes ago)
zhangysk authored a paper, "Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?" (21 minutes ago)
Team members: 5
Models: None public yet
Datasets: 1
MBEIR/M-BEIR_DEV • Updated Jan 3, 2024 • 1.07M • 154