Alvaro Bartolome PRO

alvarobartt

AI & ML interests

machine learning @huggingface

Organizations

Microsoft, Hugging Face, Spaces-explorers, Hackathon Somos NLP 2023: Los LLMs hablan Español, SomosNLP, Hugging Test Lab, Open-Source AI Meetup, Hugging Face H4, Argilla, Blog-explorers, ZeroGPU Explorers, gg-hf, Argilla Explorers, MLX Community, distilabel-internal-testing, ORPO Explorers, Data Is Better Together, Social Post Explorers, Hugging Face Discord Community, LLHF, SLLHF, Hugging Quants, blhf, Argilla Warehouse, nltpt, IOPO Experiments, Google Cloud 🤝🏻 Hugging Face, Huggingface HUGS, Data Is Better Together Contributor, AI Starter Pack, Open R1, gg-hf-g, Multimodal AI agents

Posts 6

🔥 Agents can do anything! @microsoft Research just announced the release of Magma 8B!

Magma is a new Vision-Language Model (VLM) with 8B parameters, built for multimodal agents that handle complex interactions across virtual and real environments. It's also MIT licensed!

Magma comes with exciting new features:
- Uses the Set-of-Mark and Trace-of-Mark techniques for fine-tuning
- Leverages a large amount of unlabeled video data to learn spatial-temporal grounding and planning
- Generalizes strongly and can be fine-tuned for other agentic tasks
- Achieves SOTA results on multimodal benchmarks spanning UI navigation, robotic manipulation, image and video understanding, and spatial understanding and reasoning
- Generates goal-driven visual plans and actions for agentic use cases

Model: microsoft/Magma-8B
Technical Report: Magma: A Foundation Model for Multimodal AI Agents (2502.13130)

Articles 9

🤗 Serve any model with Inference Endpoints + Custom Handlers