Visual Question Answering
Transformers
Safetensors
English
videollama2_mistral
text-generation
multimodal large language model
large video-language model
Inference Endpoints

How can I fine-tune the model?

#2
by vigneshwar472 - opened

I want to fine-tune the model checkpoint on a dense video captioning task, which plays a prominent role in my research. Please help me get started.
