MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers • Paper • 2508.20453 • Published Aug 28 • 63
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use • Paper • 2509.24002 • Published 29 days ago • 166
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools • Paper • 2510.19286 • Published 5 days ago • 5