VITA-MLLM/VITA-1.5


Last commit: "Add model card" by nielsr (HF staff), 34ce5f2, verified, about 1 month ago


---
pipeline_tag: video-text-to-text
---

This repository contains the model from the paper "VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction".

Code: https://github.com/VITA-MLLM/VITA
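
The card does not say how to obtain the files, so the following is only a minimal sketch of one way to fetch them. It assumes the `huggingface_hub` package is installed and that the files are published under the repository id `VITA-MLLM/VITA-1.5` (taken from this page's title). The helper name `download_vita` is hypothetical and is not part of the VITA codebase. For running the model itself, refer to the GitHub repository linked above.

```python
# Assumed repository id, taken from the model page title (VITA-MLLM / VITA-1.5).
REPO_ID = "VITA-MLLM/VITA-1.5"


def download_vita(local_dir=None):
    """Download a snapshot of the model repository and return its local path.

    Hypothetical helper for illustration; not part of the VITA code.
    The import is deferred so that defining this function does not
    require huggingface_hub or network access.
    """
    from huggingface_hub import snapshot_download

    # snapshot_download fetches every file in the repo (or reuses the
    # local cache) and returns the path of the resulting folder.
    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_vita()` with no arguments would place the files in the default Hugging Face cache; passing `local_dir="./VITA-1.5"` would put them in that directory instead.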