Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
- OpenGVLab/InternViT-6B-224px
  Image Feature Extraction • Updated • 1.74k • 16
- OpenGVLab/InternVL-14B-224px
  Image Feature Extraction • Updated • 1.9k • 28
- OpenGVLab/InternVL-Chat-V1-2-Plus
  Visual Question Answering • Updated • 383 • 31
- OpenGVLab/InternVL-Chat-V1-2
  Visual Question Answering • Updated • 539 • 12