MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published 7 days ago • 21
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents Paper • 2503.01935 • Published 7 days ago • 21 • 3
EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents Paper • 2502.09560 • Published 24 days ago • 33
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Oct 22, 2024 • 24
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 100
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs Paper • 2307.16789 • Published Jul 31, 2023 • 100