Interact with Qwen2.5-VL-Chat model using text and files
Engage in multi-modal conversations with images and videos
Generate responses to video or image inputs