ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models Paper • 2512.07843 • Published Nov 24 • 19
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation Paper • 2506.21876 • Published Jun 27 • 28
ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models Paper • 2505.13444 • Published May 19 • 17
COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning Paper • 2504.21850 • Published Apr 30 • 27
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation Paper • 2501.08617 • Published Jan 15 • 10
Learning Video Representations without Natural Videos Paper • 2410.24213 • Published Oct 31, 2024 • 16
Distill Visual Chart Reasoning Ability from LLMs to MLLMs Paper • 2410.18798 • Published Oct 24, 2024 • 21
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception Paper • 2407.08303 • Published Jul 11, 2024 • 19
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs Paper • 2406.18521 • Published Jun 26, 2024 • 29
TokenCompose: Grounding Diffusion with Token-level Supervision Paper • 2312.03626 • Published Dec 6, 2023 • 5