On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters • Paper • 2606.02437 • Published about 1 month ago • 236
3DCodeBench: Benchmarking Agentic Procedural 3D Modeling Via Code • Paper • 2606.01057 • Published May 31 • 8
SkillAdaptor: Self-Adapting Skills for LLM Agents from Trajectories • Paper • 2606.01311 • Published May 31 • 37
SOD: Step-wise On-policy Distillation for Small Language Model Agents • Paper • 2605.07725 • Published May 8 • 25
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards • Paper • 2605.21467 • Published May 20 • 207
Zero-Shot Sim-to-Real Robot Learning: A Dexterous Manipulation Study on Reactive Catching • Paper • 2605.09789 • Published May 10 • 6
Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding • Paper • 2605.07637 • Published May 12 • 20
AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning • Paper • 2605.00425 • Published May 8 • 23
Lightning Unified Video Editing via In-Context Sparse Attention • Paper • 2605.04569 • Published May 6 • 18
TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction • Paper • 2604.22880 • Published Apr 24 • 10
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published Apr 2 • 509
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published Apr 3 • 639
AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents • Paper • 2604.02947 • Published Apr 3 • 19
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence • Paper • 2603.28032 • Published Mar 30 • 344
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization • Paper • 2603.19835 • Published Mar 20 • 353
InCoder-32B: Code Foundation Model for Industrial Scenarios • Paper • 2603.16790 • Published Mar 17 • 312