view post Post 1654 Reply 🤗 transformers pipelines now support vision language models for easy local inference 🫰🏻 h/t @yonigozlan for shipping this 🎩👏you can also use inference API to infer hosted vision LMs (via Python, JS and cURL) https://huggingface.co/docs/api-inference/en/tasks/image-text-to-text
Nov 15 Releases 🍂 microsoft/LLM2CLIP-EVA02-L-14-336 Zero-Shot Image Classification • Updated 5 days ago • 1.01k • 36 microsoft/LLM2CLIP-EVA02-B-16 Updated 11 days ago • 238 • 6 PleIAs/common_corpus Viewer • Updated 6 days ago • 397M • 31.2k • 147 Qwen/Qwen2.5-Coder-32B-Instruct Text Generation • Updated 4 days ago • 49.9k • • 886
Nov 1 Releases Running on Zero 64 🌖 LongVU facebook/MobileLLM-1B Text Generation • Updated 20 days ago • 9.02k • 108 Vision-CAIR/LongVU_Qwen2_7B Video-Text-to-Text • Updated 22 days ago • 1.18k • 55 Vision-CAIR/LongVU_Llama3_2_3B_img Updated 29 days ago • 99 • 6