GROOT-2: Weakly Supervised Multi-Modal Instruction Following Agents Paper • 2412.10410 • Published Dec 7, 2024
OpenHA: A Series of Open-Source Hierarchical Agentic Models in Minecraft Paper • 2509.13347 • Published Sep 13 • 1
Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents Paper • 2510.23691 • Published 4 days ago • 49
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 123
Class Incremental Learning via Likelihood Ratio Based Task Prediction Paper • 2309.15048 • Published Sep 26, 2023
MCU: A Task-centric Framework for Open-ended Agent Evaluation in Minecraft Paper • 2310.08367 • Published Oct 12, 2023 • 1
JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models Paper • 2311.05997 • Published Nov 10, 2023 • 37
Selecting Large Language Model to Fine-tune via Rectified Scaling Law Paper • 2402.02314 • Published Feb 4, 2024 • 2
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation Paper • 2403.05313 • Published Mar 8, 2024 • 9
Continual Training of Language Models for Few-Shot Learning Paper • 2210.05549 • Published Oct 11, 2022
OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents Paper • 2407.00114 • Published Jun 27, 2024 • 13
Uni-3DAR: Unified 3D Generation and Understanding via Autoregression on Compressed Spatial Tokens Paper • 2503.16278 • Published Mar 20 • 7
Generative Evaluation of Complex Reasoning in Large Language Models Paper • 2504.02810 • Published Apr 3 • 14
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 40
RAM: Towards an Ever-Improving Memory System by Learning from Communications Paper • 2404.12045 • Published Apr 18, 2024 • 2