Exciting to see open-source models thriving in the computer agent space! 🔥 I just built a demo for OS-ATLAS: A Foundation Action Model For Generalist GUI Agents — check it out here: maxiw/OS-ATLAS
The demo takes a screenshot and an instruction as input and predicts the corresponding bounding boxes.
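If you want to draw the predicted boxes on your own screenshot, here is a minimal sketch of the post-processing step. It assumes the model returns boxes as text like `(x1,y1),(x2,y2)` on a 0 to 1000 normalized scale; that format, the `scale` default, and the `parse_boxes` helper are assumptions for illustration, so check the model card for the actual output format before relying on it.

```python
import re

# Assumed output format: "(x1,y1),(x2,y2)" with coordinates normalized to 0..scale.
BOX_PATTERN = re.compile(r"\((\d+),\s*(\d+)\),\s*\((\d+),\s*(\d+)\)")


def parse_boxes(text, width, height, scale=1000):
    """Extract boxes from model output text and convert them to pixel coordinates.

    `parse_boxes` is a hypothetical helper, not part of the demo or the model's API.
    """
    boxes = []
    for match in BOX_PATTERN.finditer(text):
        x1, y1, x2, y2 = (int(v) for v in match.groups())
        boxes.append((
            round(x1 * width / scale),
            round(y1 * height / scale),
            round(x2 * width / scale),
            round(y2 * height / scale),
        ))
    return boxes


if __name__ == "__main__":
    # Example with a made-up model response for a 1920x1080 screenshot.
    response = "<|box_start|>(100,200),(300,400)<|box_end|>"
    print(parse_boxes(response, width=1920, height=1080))
```

The returned tuples are `(left, top, right, bottom)` in pixels, which is the order most drawing APIs expect for rectangles.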