3 18 19

Andrea Gemelli

andreagemelli

https://www.andreagemelli.me

AI & ML interests

Natural Language Processing, Computer Vision, Generative Models, Document Analysis

Recent Activity

upvoted an article 1 day ago

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

upvoted a collection 15 days ago

Qwen2.5-VL

liked a Space 17 days ago

huggingface/ai-deadlines

View all activity

Organizations

andreagemelli's activity

upvoted an article 1 day ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

2 days ago

• 206

upvoted a collection 15 days ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 8 items • Updated 18 days ago • 396

liked a Space 17 days ago

320

AI Deadlines

⚡

Schedule tasks efficiently using AI-generated deadlines

upvoted a collection 18 days ago

Comics Understanding

Collection

4 items • Updated 21 days ago • 3

liked a dataset 18 days ago

VLR-CVC/ComicsPAP

Viewer • Updated about 16 hours ago • 64.8k • 3.31k • 12

upvoted a collection 18 days ago

Phi-3

Collection

Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 26 items • Updated Jan 8 • 564

liked a model 18 days ago

BigDocs/BigDocs-Phi-3.5-instruct

Updated Jan 27 • 245 • 1

updated a model 20 days ago

andreagemelli/Phi-3.5-mini-thinking-function_calling-V0

Updated 20 days ago

published a model 21 days ago

andreagemelli/Phi-3.5-mini-thinking-function_calling-V0

Updated 20 days ago

reacted to burtenshaw's post with ❤️🚀 22 days ago

Post

7293

AGENTS + FINETUNING! This week Hugging Face learn has a whole pathway on finetuning for agentic applications. You can follow these two courses to get knowledge on levelling up your agent game beyond prompts:

1️⃣ New Supervised Fine-tuning unit in the NLP Course https://huggingface.co/learn/nlp-course/en/chapter11/1
2️⃣New Finetuning for agents bonus module in the Agents Course https://huggingface.co/learn/agents-course/bonus-unit1/introduction

Fine-tuning will squeeze everything out of your model for how you’re using it, more than any prompt.

2 replies

reacted to andito's post with 🚀 22 days ago

Post

1631

𝗜𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝘁𝗵𝗲 𝘄𝗼𝗿𝗹𝗱'𝘀 𝘀𝗺𝗮𝗹𝗹𝗲𝘀𝘁 𝘃𝗶𝘀𝗶𝗼𝗻 𝗹𝗮𝗻𝗴𝘂𝗮𝗴𝗲 𝗺𝗼𝗱𝗲𝗹!

We’re thrilled to share 𝗦𝗺𝗼𝗹𝗩𝗟𝗠 (256M & 500M)—the smallest Visual Language Models ever built. Think: running on <1GB of GPU memory—you can fine-tune it on your laptop and run it on your toaster!

Why It’s Game-Changing:
- 𝗢𝘂𝘁𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝘀 𝗟𝗮𝗿𝗴𝗲𝗿 𝗠𝗼𝗱𝗲𝗹𝘀: Even the 256M model surpasses our SOTA 80B-parameter model from just 17 months ago. Over 300x reduction!
𝗠𝗶𝗴𝗵𝘁𝘆 𝗘𝗳𝗳𝗶𝗰𝗶𝗲𝗻𝗰𝘆: The 256M version delivers 80% of our 2.2B model’s performance, and the 500M version hits 90%
𝗟𝗶𝗴𝗵𝘁𝗻𝗶𝗻𝗴-𝗙𝗮𝘀𝘁 𝗦𝗲𝗮𝗿𝗰𝗵: SmolVLM integrates with ColiPali for state-of-the-art retrieval speeds—on par with models 10x bigger. That means cheaper, faster indexing and real-world impact.

What’s New Under the Hood:
- 𝗡𝗲𝘄 𝗩𝗶𝘀𝗶𝗼𝗻 𝗘𝗻𝗰𝗼𝗱𝗲𝗿: Smaller overall size (400M -> 93M), but with higher resolution.
- 𝗛𝗶𝗴𝗵𝗲𝗿 𝗣𝗶𝘅𝗲𝗹𝘀/𝗧𝗼𝗸𝗲𝗻: 4096 vs. 1820—more efficient image processing.
- 𝗦𝗺𝗮𝗿𝘁 𝗧𝗼𝗸𝗲𝗻𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Faster training and a performance boost.

Check our blog: https://huggingface.co/blog/smolervlm
The models: HuggingFaceTB/smolvlm-256m-and-500m-6791fafc5bb0ab8acc960fb0
The demo: HuggingFaceTB/SmolVLM-256M-Demo