Recent Activity

OS-Copilot's activity

Symbol-LLM 
posted an update about 1 month ago
🥳 Thrilled to introduce our recent work on bootstrapping VLMs for multi-modal chain-of-thought reasoning!

📕 Title: Vision-Language Models Can Self-Improve Reasoning via Reflection

🔗 Link: Vision-Language Models Can Self-Improve Reasoning via Reflection (2411.00855)

😇 Takeaways:

- We found that VLMs can self-improve reasoning performance through a reflection mechanism, and importantly, this approach can scale through test-time computing.

- Evaluations on comprehensive and diverse vision-language reasoning tasks are included!
numbmelon 
in OS-Copilot/OS-Atlas-Base-7B about 1 month ago

When will the data be open-sourced?

#2 opened about 2 months ago by gewuzhizhi
Symbol-LLM 
updated a Space about 1 month ago
Symbol-LLM 
posted an update about 2 months ago
🚀 Excited to introduce a new member of the OS-Copilot family: OS-Atlas, an open-source foundation action model for GUI agents.

📘 Paper: OS-ATLAS: A Foundation Action Model for Generalist GUI Agents (2410.23218)
🔗 Website: https://osatlas.github.io

😇 TL;DR: OS-Atlas offers:
1. State-of-the-Art GUI Grounding: Helps GUI agents accurately locate GUI elements.
2. Strong OOD Performance and Cross-platform Compatibility: Excels in out-of-domain agentic tasks across macOS, Windows, Linux, Android, and Web.
3. Complete Infrastructure for GUI Data Synthesis: You can easily build your own OS agent upon it!