Agent,Model,Organization,Source,Easy,Medium,Hard,Average SR,Date
Operator,OpenAI Computer-Using Agent,OpenAI,OSU NLP,83.1,58.0,43.2,61.3,2025-3-22
SeeAct,gpt-4o-2024-08-06,OSU,OSU NLP,60.2,25.2,8.1,30.7,2025-3-22
Browser Use,gpt-4o-2024-08-06,Browser Use,OSU NLP,55.4,26.6,8.1,30.0,2025-3-22
Claude Computer Use 3.5,claude-3-5-sonnet-20241022,Anthropic,OSU NLP,56.6,20.3,14.9,29.0,2025-3-22
Agent-E,gpt-4o-2024-08-06,Emergence AI,OSU NLP,49.4,26.6,6.8,28.0,2025-3-22
Claude Computer Use 3.7 (w/o thinking),Claude-3-7-sonnet-20250219,Anthropic,OSU NLP,90.4,49.0,32.4,56.3,2025-4-20
ACT-1-20250703,o3-2025-04-16 and Claude-sonnet-4-20250514,Enhans,Enhans,65.1,46.2,23.0,45.7,2025-7-16
ACT-1-20250814,o3-2025-04-16 and Claude-sonnet-4-20250514,Enhans,Enhans,81.9,54.5,35.1,57.3,2025-8-23
Gemini 2.5 Computer Use,gemini-2.5-computer-use-preview-10-2025,Google DeepMind,Google DeepMind,77.1,71.3,55.4,69.0,2025-9-29