Concerto: Joint 2D-3D Self-Supervised Learning Emerges Spatial Representations Paper • 2510.23607 • Published 2 days ago • 157
QueST: Incentivizing LLMs to Generate Difficult Problems Paper • 2510.17715 • Published 9 days ago • 31
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 109
Harnessing Uncertainty: Entropy-Modulated Policy Gradients for Long-Horizon LLM Agents Paper • 2509.09265 • Published Sep 11 • 45
CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning Paper • 2508.20096 • Published Aug 27 • 36
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19 • 118
DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization Paper • 2508.14460 • Published Aug 20 • 82
FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction Paper • 2508.11987 • Published Aug 16 • 69
CodeEvo: Interaction-Driven Synthesis of Code-centric Data through Hybrid and Iterative Feedback Paper • 2507.22080 • Published Jul 25 • 9
Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters Paper • 2507.13618 • Published Jul 18 • 16
MUR: Momentum Uncertainty guided Reasoning for Large Language Models Paper • 2507.14958 • Published Jul 20 • 46
Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving Paper • 2507.06229 • Published Jul 8 • 74
DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation Paper • 2506.20639 • Published Jun 25 • 31
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios Paper • 2506.20279 • Published Jun 25 • 19
SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning Paper • 2506.08989 • Published Jun 10 • 14
Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published Jun 8 • 113