Hugging Face
Models: 5,703
Sort: Trending
Active filters: image-text-to-text
microsoft/OmniParser • Image-Text-to-Text • Updated 23 days ago • 11.3k • 1.34k
Xkev/Llama-3.2V-11B-cot • Image-Text-to-Text • Updated 3 days ago • 3.03k • 68
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Sep 30 • 2.25M • 984
microsoft/Florence-2-large • Image-Text-to-Text • Updated 9 days ago • 818k • 1.25k
stepfun-ai/GOT-OCR2_0 • Image-Text-to-Text • Updated Sep 18 • 617k • 1.21k
AskUI/PTA-1 • Image-Text-to-Text • Updated 5 days ago • 643 • 23
vikhyatk/moondream2 • Image-Text-to-Text • Updated 9 days ago • 222k • 724
openbmb/MiniCPM-V-2_6 • Image-Text-to-Text • Updated 9 days ago • 117k • 830
Qwen/Qwen2-VL-7B-Instruct • Image-Text-to-Text • Updated Sep 21 • 1.7M • 840
meta-llama/Llama-3.2-11B-Vision • Image-Text-to-Text • Updated Sep 27 • 95.2k • 362
Salesforce/blip-image-captioning-large • Image-to-Text • Updated Dec 7, 2023 • 2.21M • 1.17k
Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated Sep 21 • 933k • 280
Qwen/Qwen2-VL-72B-Instruct • Image-Text-to-Text • Updated Sep 21 • 79.5k • 172
mPLUG/DocOwl2 • Image-Text-to-Text • Updated Sep 27 • 2.69k • 56
microsoft/Phi-3.5-vision-instruct • Image-Text-to-Text • Updated Sep 26 • 1.01M • 576
meta-llama/Llama-3.2-90B-Vision-Instruct • Image-Text-to-Text • Updated Sep 30 • 301k • 273
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • Updated Oct 10 • 71.8k • 441
OpenFace-CQUPT/Human_LLaVA • Visual Question Answering • Updated 18 days ago • 848 • 33
unsloth/Llama-3.2-11B-Vision-Instruct-bnb-4bit • Image-Text-to-Text • Updated 3 days ago • 38.6k • 51
google/deplot • Visual Question Answering • Updated Sep 6, 2023 • 12.3k • 258
microsoft/trocr-base-handwritten • Image-to-Text • Updated May 27 • 695k • 338
openbmb/MiniCPM-Llama3-V-2_5 • Image-Text-to-Text • Updated Sep 25 • 37.1k • 1.37k
meta-llama/Llama-3.2-90B-Vision • Image-Text-to-Text • Updated Sep 27 • 6.2k • 106
OpenGVLab/InternVL2-8B-MPO • Image-Text-to-Text • Updated 6 days ago • 697 • 16
nlpconnect/vit-gpt2-image-captioning • Image-to-Text • Updated Feb 27, 2023 • 1.94M • 840
microsoft/git-base • Image-to-Text • Updated Apr 24, 2023 • 2.05M • 77
microsoft/Florence-2-base • Image-Text-to-Text • Updated 20 days ago • 347k • 180
OpenGVLab/InternVL2-2B • Image-Text-to-Text • Updated 3 days ago • 90.3k • 60
llava-hf/llava-onevision-qwen2-0.5b-ov-hf • Image-Text-to-Text • Updated 5 days ago • 79.4k • 17
unsloth/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated 3 days ago • 42.4k • 59