Model Type,Model,Language Model,Scene Understanding,Instance Identity,Instance Attributes,Instance Localization,Instance Counting,Spatial Relation,Instance Interaction,Visual Reasoning,Text Recognition,Avg. Img,Action Recognition,Action Prediction,Procedure Understanding,Avg. Video,Avg. All
LLM,Flan-T5,Flan-T5-XL,23.0,29.0,32.8,31.8,20.5,31.8,33.0,18.2,19.4,27.32,23.2,34.9,25.4,28.57,27.65
LLM,Vicuna,Vicuna-7B,23.4,30.7,29.7,30.9,30.8,28.6,29.8,18.5,13.4,28.16,27.3,34.5,23.8,29.47,28.5
LLM,LLaMA,LLaMA-7B,26.3,27.4,26.2,28.3,25.1,28.8,19.2,37.0,9.0,26.56,33.0,23.1,26.2,27.27,26.75
ImageLLM,BLIP-2,Flan-T5-XL,59.1,53.9,49.2,42.3,43.2,36.7,55.7,45.6,25.9,49.74,32.6,47.5,24.0,36.71,46.35
ImageLLM,InstructBLIP,Flan-T5-XL,60.3,58.5,63.4,40.6,58.4,38.7,51.6,45.9,25.9,57.8,33.1,49.1,27.1,38.31,52.73
ImageLLM,InstructBLIP-Vicuna,Vicuna-7B,60.2,58.9,65.6,43.6,57.2,40.3,52.6,47.7,43.5,58.76,34.5,49.6,23.1,38.05,53.37
ImageLLM,LLaVA,LLaMA-7B,42.7,34.9,33.5,28.4,41.9,30.8,27.8,46.8,27.7,36.96,29.7,21.4,19.1,23.76,33.52
ImageLLM,MiniGPT-4,Flan-T5-XL,56.3,49.2,45.8,37.9,45.3,32.6,47.4,57.1,11.8,47.4,38.2,24.5,27.1,29.89,42.84
ImageLLM,VPGTrans,LLaMA-7B,51.9,44.1,39.9,36.1,33.7,36.4,32.0,53.2,30.6,41.81,39.5,24.3,31.9,31.4,39.1
ImageLLM,MultiModal-GPT,LLaMA-7B,43.6,37.9,31.5,30.8,27.3,30.1,29.9,51.4,18.8,34.54,36.9,25.8,24.0,29.21,33.15
ImageLLM,Otter,LLaMA-7B,44.9,38.6,32.2,30.9,26.3,31.8,32.0,51.4,31.8,35.16,37.9,27.2,24.8,30.35,33.91
ImageLLM,OpenFlamingo,LLaMA-7B,43.9,38.1,31.3,30.1,27.3,30.6,29.9,50.2,20.0,34.51,37.2,25.4,24.2,29.25,33.14
ImageLLM,LLaMA-AdapterV2,LLaMA-7B,45.2,38.5,29.3,33.0,29.7,35.5,39.2,52.0,24.7,35.19,38.6,18.5,19.6,25.75,32.73
ImageLLM,GVT,Vicuna-7B,41.7,35.5,31.8,29.5,36.2,32.0,32.0,51.1,27.1,35.49,33.9,25.4,23.0,27.77,33.48
ImageLLM,mPLUG-Owl,LLaMA-7B,49.7,45.3,32.5,36.7,27.3,32.7,44.3,54.7,28.8,37.88,26.7,17.9,26.5,23.02,34.01
VideoLLM,VideoChat,Vicuna-7B,47.1,43.8,34.9,40.0,32.8,34.6,42.3,50.5,17.7,39.02,34.9,36.4,27.3,33.68,37.63
VideoLLM,Video-ChatGPT,LLaMA-7B,37.2,31.4,33.2,28.4,35.5,29.5,23.7,42.3,25.9,33.88,27.6,21.3,21.1,23.46,31.17
VideoLLM,Valley,LLaMA-13B,39.3,32.9,31.6,27.9,24.2,30.1,27.8,43.8,11.8,32.04,31.3,23.2,20.7,25.41,30.32