mfarre HF staff committed on
Commit e083124 · verified · 1 Parent(s): f93dfb6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -43,7 +43,7 @@ SmolVLM2-256M-Video is a lightweight multimodal model designed to analyze video
 
 SmolVLM2 can be used for inference on multimodal (video / image / text) tasks where the input consists of text queries along with video or one or more images. Text and media files can be interleaved arbitrarily, enabling tasks like captioning, visual question answering, and storytelling based on visual content. The model does not support image or video generation.
 
-To fine-tune SmolVLM2 on a specific task, you can follow [the fine-tuning tutorial](UPDATE).
+To fine-tune SmolVLM2 on a specific task, you can follow [the fine-tuning tutorial](https://github.com/huggingface/smollm/blob/main/vision/finetuning/Smol_VLM_FT.ipynb).
 
 ## Evaluation
 