Zirui Wang

zwcolin

https://zwcolin.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a collection 7 days ago

authored a paper 13 days ago

Playful Agentic Robot Learning

upvoted a collection 7 days ago

Evals

8 items • Updated Feb 20 • 3

upvoted a paper 13 days ago

Playful Agentic Robot Learning

Paper • 2606.19419 • Published 15 days ago • 50

upvoted a paper 28 days ago

Stateful Visual Encoders for Vision-Language Models

Paper • 2606.04433 • Published 29 days ago • 8

upvoted a paper 5 months ago

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents

Paper • 2601.16973 • Published Jan 23 • 40

upvoted 2 papers 7 months ago

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Paper • 2512.07843 • Published Nov 24, 2025 • 22

Pillar-0: A New Frontier for Radiology Foundation Models

Paper • 2511.17803 • Published Nov 21, 2025 • 25

upvoted 3 papers about 1 year ago

Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation

Paper • 2506.21876 • Published Jun 27, 2025 • 28

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Paper • 2505.13444 • Published May 19, 2025 • 16

COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning

Paper • 2504.21850 • Published Apr 30, 2025 • 27

upvoted 3 papers over 1 year ago

RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation

Paper • 2501.08617 • Published Jan 15, 2025 • 10

Learning Video Representations without Natural Videos

Paper • 2410.24213 • Published Oct 31, 2024 • 16

Distill Visual Chart Reasoning Ability from LLMs to MLLMs

Paper • 2410.18798 • Published Oct 24, 2024 • 21

upvoted 2 papers almost 2 years ago

DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

Paper • 2407.08303 • Published Jul 11, 2024 • 19

MAVIS: Mathematical Visual Instruction Tuning

Paper • 2407.08739 • Published Jul 11, 2024 • 32

upvoted a paper about 2 years ago

CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Paper • 2406.18521 • Published Jun 26, 2024 • 31

upvoted a paper over 2 years ago

TokenCompose: Grounding Diffusion with Token-level Supervision

Paper • 2312.03626 • Published Dec 6, 2023 • 5