mfarre HF staff SaisExperiments committed on
Commit 54cb403 · verified · 1 Parent(s): 06ac4f8

SmolVLM2-500M-Video -> SmolVLM2-2.2B (#3)

- SmolVLM2-500M-Video -> SmolVLM2-2.2B (eccc2482edff325f6d227d57791385e8a196fb1d)


Co-authored-by: Sai <SaisExperiments@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ base_model:
 
  # SmolVLM2 2.2B
 
- SmolVLM2-500M-Video is a lightweight multimodal model designed to analyze video content. The model processes videos, images, and text inputs to generate text outputs - whether answering questions about media files, comparing visual content, or transcribing text from images. Despite its compact size, requiring only 5.2GB of GPU RAM for video inference, it delivers robust performance on complex multimodal tasks. This efficiency makes it particularly well-suited for on-device applications where computational resources may be limited.
+ SmolVLM2-2.2B is a lightweight multimodal model designed to analyze video content. The model processes videos, images, and text inputs to generate text outputs - whether answering questions about media files, comparing visual content, or transcribing text from images. Despite its compact size, requiring only 5.2GB of GPU RAM for video inference, it delivers robust performance on complex multimodal tasks. This efficiency makes it particularly well-suited for on-device applications where computational resources may be limited.
  ## Model Summary
 
  - **Developed by:** Hugging Face 🤗