VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published Sep 11, 2025 • 237
OpenVLA: An Open-Source Vision-Language-Action Model Paper • 2406.09246 • Published Jun 13, 2024 • 41
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks Paper • 2412.18194 • Published Dec 24, 2024 • 1
Diffusion LLMs Can Do Faster-Than-AR Inference via Discrete Diffusion Forcing Paper • 2508.09192 • Published Aug 8, 2025 • 30
OneTwoVLA: A Unified Vision-Language-Action Model with Adaptive Reasoning Paper • 2505.11917 • Published May 17, 2025 • 1
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning Paper • 2503.10480 • Published Mar 13, 2025 • 55
Improved Visual-Spatial Reasoning via R1-Zero-Like Training Paper • 2504.00883 • Published Apr 1, 2025 • 66
Show-o Turbo: Towards Accelerated Unified Multimodal Understanding and Generation Paper • 2502.05415 • Published Feb 8, 2025 • 22