Hugging Face
Models: 603
Sort: Trending
Active filters: image-text-to-text
deepseek-ai/deepseek-vl-7b-chat • Image-Text-to-Text • Updated Mar 15 • 9.14k • 220
llava-hf/llava-v1.6-34b-hf • Image-Text-to-Text • Updated Aug 16 • 20.2k • 68
llava-hf/llava-v1.6-vicuna-13b-hf • Image-Text-to-Text • Updated Aug 16 • 42.6k • 15
Xenova/moondream2 • Image-Text-to-Text • Updated May 17 • 62 • 18
xtuner/llava-llama-3-8b-v1_1 • Image-Text-to-Text • Updated Apr 28 • 220 • 118
xtuner/llava-llama-3-8b-v1_1-hf • Image-Text-to-Text • Updated Apr 28 • 1.79k • 23
xtuner/llava-llama-3-8b-v1_1-transformers • Image-Text-to-Text • Updated Apr 28 • 3.33k • 48
xtuner/llava-llama-3-8b-transformers • Image-Text-to-Text • Updated Apr 26 • 126 • 4
nullt3r/llava-llama-3-8b-v1_1-Q8_0-GGUF • Image-Text-to-Text • Updated Apr 27 • 16 • 1
qresearch/llama-3-vision-alpha-hf • Image-Text-to-Text • Updated Jul 21 • 714 • 57
HuggingFaceM4/idefics2-8b-chatty-AWQ • Image-Text-to-Text • Updated May 6 • 11 • 4
google/paligemma-3b-ft-ocrvqa-896-jax • Image-Text-to-Text • Updated Jul 19 • 4 • 2
google/paligemma-3b-ft-docvqa-224 • Image-Text-to-Text • Updated Jul 19 • 82 • 1
google/paligemma-3b-ft-ocrvqa-448 • Image-Text-to-Text • Updated Jul 19 • 85 • 5
google/paligemma-3b-ft-vqav2-448 • Image-Text-to-Text • Updated Jul 19 • 83 • 11
google/paligemma-3b-mix-224 • Image-Text-to-Text • Updated Jul 19 • 152k • 53
google/paligemma-3b-ft-refcoco-seg-896 • Image-Text-to-Text • Updated Jul 19 • 39 • 5
google/paligemma-3b-ft-docvqa-896 • Image-Text-to-Text • Updated Jul 19 • 1.63k • 4
google/paligemma-3b-ft-widgetcap-448 • Image-Text-to-Text • Updated Jul 19 • 12 • 3
google/paligemma-3b-pt-896 • Image-Text-to-Text • Updated Jul 19 • 65.7k • 106
google/paligemma-3b-pt-448 • Image-Text-to-Text • Updated Jul 19 • 17k • 23
tinyllava/TinyLLaVA-Phi-2-SigLIP-3.1B • Image-Text-to-Text • Updated May 18 • 2.23k • 12
mlx-community/paligemma-3b-mix-448-8bit • Image-Text-to-Text • Updated May 24 • 13 • 7
lamm-mit/Cephalo-Phi-3-vision-128k-4b-alpha • Image-Text-to-Text • Updated Jun 2 • 14 • 6
hiyouga/PaliGemma-3B-Chat-v0.1 • Image-Text-to-Text • Updated Jul 1 • 8 • 11
Reverb/Idefics2-8b-docVQA-finetuned • Image-Text-to-Text • Updated May 25 • 24 • 2
lamm-mit/Cephalo-Phi-3-vision-128k-4b-beta • Image-Text-to-Text • Updated Jun 2 • 171 • 2
OpenGVLab/Mini-InternVL-Chat-4B-V1-5 • Image-Text-to-Text • Updated 28 days ago • 1.97k • 57
lamm-mit/Cephalo-Phi-3-MoE-vision-128k-3x4b-beta • Image-Text-to-Text • Updated Jun 4 • 28 • 1
yifanzhang114/SliME-vicuna-13B • Image-Text-to-Text • Updated Jun 2 • 153 • 2