MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning Paper • 2510.14958 • Published 14 days ago • 22
VoiceAssistant-Eval: Benchmarking AI Assistants across Listening, Speaking, and Viewing Paper • 2509.22651 • Published Sep 26 • 22
WebGen-Agent: Enhancing Interactive Website Generation with Multi-Level Feedback and Step-Level Reinforcement Learning Paper • 2509.22644 • Published Sep 26 • 20
DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving Paper • 2510.12796 • Published 16 days ago • 11
FLUX-Reason-6M & PRISM-Bench: A Million-Scale Text-to-Image Reasoning Dataset and Comprehensive Benchmark Paper • 2509.09680 • Published Sep 11 • 42