Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence Paper • 2510.20470 • Published 4 days ago • 11
Search Self-play: Pushing the Frontier of Agent Capability without Supervision Paper • 2510.18821 • Published 5 days ago • 14
Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence Paper • 2510.20579 • Published 4 days ago • 45
KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints Paper • 2510.19316 • Published 5 days ago • 7
Directional Reasoning Injection for Fine-Tuning MLLMs Paper • 2510.15050 • Published 10 days ago • 10
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts Paper • 2510.19363 • Published 5 days ago • 54
ProCLIP: Progressive Vision-Language Alignment via LLM-based Embedder Paper • 2510.18795 • Published 5 days ago • 9
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation Paper • 2510.17354 • Published 7 days ago • 32
Train a Unified Multimodal Data Quality Classifier with Synthetic Data Paper • 2510.15162 • Published 10 days ago • 2
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning Paper • 2510.15444 • Published 10 days ago • 137
VISTA: A Test-Time Self-Improving Video Generation Agent Paper • 2510.15831 • Published 9 days ago • 18