Xinyu Fang

nebulae09

FangXinyu-0913

nebulae09's activity

liked a Space 5 days ago

Open LMM Reasoning Leaderboard

A Leaderboard that demonstrates LMM reasoning capabilities

upvoted a paper 5 days ago

Are Your LLMs Capable of Stable Reasoning?

Paper • 2412.13147 • Published 5 days ago • 82

authored a paper 23 days ago

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Paper • 2411.15296 • Published about 1 month ago • 19

upvoted a paper 24 days ago

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Paper • 2411.15296 • Published about 1 month ago • 19

authored 2 papers about 1 month ago

VLMEvalKit: An Open-Source Toolkit for Evaluating Large Multi-Modality Models

Paper • 2407.11691 • Published Jul 16 • 13

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Paper • 2410.12405 • Published Oct 16 • 13

updated a Space about 1 month ago

Open VLM Video Leaderboard

VLMEvalKit Eval Results in video understanding benchmark

upvoted 2 papers 2 months ago

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Paper • 2410.16256 • Published Oct 21 • 58

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Paper • 2410.12405 • Published Oct 16 • 13

upvoted a collection 2 months ago

LLaVA-Video

Collection

Models focus on video understanding (previously known as LLaVA-NeXT-Video). • 6 items • Updated Oct 5 • 55

liked a model 2 months ago

rhymes-ai/Aria

Image-Text-to-Text • Updated 5 days ago • 18.6k • 598

New activity in allenai/Molmo-7B-D-0924 2 months ago

error when infer

#23 opened 2 months ago by liu00

upvoted a paper 2 months ago

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Paper • 2410.05363 • Published Oct 7 • 44

updated a dataset 2 months ago

opencompass/MMBench-Video

Preview • Updated Oct 9 • 383 • 7

liked a dataset 2 months ago

opencompass/MMBench-Video

Preview • Updated Oct 9 • 383 • 7

upvoted a paper 3 months ago

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24 • 41

liked a Space 3 months ago

Open VLM Video Leaderboard

VLMEvalKit Eval Results in video understanding benchmark

upvoted a paper 5 months ago

NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window?

Paper • 2407.11963 • Published Jul 16 • 43

upvoted 2 papers 6 months ago

MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning

Paper • 2406.17770 • Published Jun 25 • 18

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6 • 72