ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities Paper • 2408.04682 • Published Aug 8, 2024 • 17
UI Agent Collection: a collection of algorithmic agents for user interfaces/interactions and program synthesis • 268 items • Updated about 21 hours ago • 45