Metacognition Leaderboard 🧠 (Running, 17): Explore LLM metacognition rankings and submit models for evaluation
RoboCasa Kitchen Leaderboard 🍳 (Running, 18): Neutral aggregation of VLA success rates on RoboCasa Kitchen
FINAL-Bench Quantum Leaderboard ⚛ (Running, Agents, 27): Neutral quantum-method benchmark covering QEC decoders & more