Zhongwen Xu

zhongwenxu

AI & ML interests

None yet

Recent Activity

upvoted a paper 10 days ago

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

upvoted a paper 27 days ago

LIMI: Less is More for Agency

upvoted a paper 27 days ago

A Survey of Reinforcement Learning for Large Reasoning Models

Organizations

None yet

upvoted a paper 10 days ago

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

Paper • 2510.12088 • Published 13 days ago • 4

upvoted 5 papers 27 days ago

upvoted 3 papers 28 days ago

UltraHorizon: Benchmarking Agent Capabilities in Ultra Long-Horizon Scenarios

Paper • 2509.21766 • Published Sep 26 • 23

Quantile Advantage Estimation for Entropy-Safe Reasoning

Paper • 2509.22611 • Published about 1 month ago • 117

Variational Reasoning for Language Models

Paper • 2509.22637 • Published about 1 month ago • 68

upvoted a paper about 1 month ago

FlowRL: Matching Reward Distributions for LLM Reasoning

Paper • 2509.15207 • Published Sep 18 • 110

upvoted a collection about 1 month ago

SPO

Collection

Single-stream Policy Optimization • 2 items • Updated Sep 17 • 2

upvoted a paper about 1 month ago

Single-stream Policy Optimization

Paper • 2509.13232 • Published Sep 16 • 33

upvoted a collection 2 months ago

Understanding Tool-Integrated Reasoning

Collection

The official models and datasets for the paper "Understanding Tool-Integrated Reasoning" • 5 items • Updated Aug 27 • 2

upvoted a paper 2 months ago

Understanding Tool-Integrated Reasoning

Paper • 2508.19201 • Published Aug 26 • 32

upvoted a paper 3 months ago

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Paper • 2507.19457 • Published Jul 25 • 28

upvoted a paper 6 months ago

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 84

upvoted a collection 6 months ago

IndustryCorpus2

Collection

Multilingual, multi-industry pretraining dataset • 35 items • Updated Feb 13 • 6

upvoted 2 papers 7 months ago

OpenCodeReasoning: Advancing Data Distillation for Competitive Coding

Paper • 2504.01943 • Published Apr 2 • 15

Agents Play Thousands of 3D Video Games

Paper • 2503.13356 • Published Mar 17 • 9
