Use InternVL2.5-78B in a modular fashion

by Walley - opened 15 days ago

Is it possible to use InternVL2.5-78B in a modular fashion? Specifically, I would like to attach the InternVL vision head for multimodal tasks and use only Qwen2.5 for text-only question answering.

czczup

OpenGVLab org 12 days ago

Yes, it’s possible. For multimodal inputs like images and videos, use the full 78B model. For pure text inputs, use only the language model part (Qwen2.5). See the Quick Start section in the README for details.
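
A minimal sketch of the routing described above, not an official recipe. It assumes the model's remote code exposes `chat(tokenizer, pixel_values, question, generation_config)` as in the README Quick Start, and that passing `pixel_values=None` is the pure-text path. The helper names `answer` and `load_internvl`, and the repo id `OpenGVLab/InternVL2_5-78B`, are assumptions for illustration.

```python
from typing import Any, Optional


def answer(model: Any, tokenizer: Any, question: str,
           pixel_values: Optional[Any] = None,
           max_new_tokens: int = 512) -> str:
    """Send one question through the model's chat interface.

    With pixel_values=None only the text path is used; with a pixel tensor
    the vision encoder output is fed to the language model as well.
    (Assumed behavior, based on the README Quick Start.)
    """
    generation_config = dict(max_new_tokens=max_new_tokens, do_sample=False)
    return model.chat(tokenizer, pixel_values, question, generation_config)


def load_internvl(path: str = "OpenGVLab/InternVL2_5-78B"):
    """Load the model and tokenizer (hypothetical helper; repo id assumed)."""
    # Heavy imports are kept local so `answer` can be reused without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        path,
        torch_dtype=torch.bfloat16,
        low_cpu_mem_usage=True,
        trust_remote_code=True,
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(
        path, trust_remote_code=True, use_fast=False
    )
    return model, tokenizer
```

Usage would look like `model, tokenizer = load_internvl()` followed by `answer(model, tokenizer, "Hello, who are you?")` for text, or `answer(model, tokenizer, "<image>\nDescribe the image.", pixel_values)` for an image that has been preprocessed as shown in the README. If you want the Qwen2.5 part as a standalone module, the remote code appears to keep it under `model.language_model`, but check the modeling file before relying on that attribute name.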

czczup changed discussion status to closed 1 day ago
