Tao

Leitian

AI & ML interests

None yet

Recent Activity

authored a paper 17 days ago

The Era of Real-World Human Interaction: RL from User Conversations

upvoted a paper 6 days ago

What Limits Agentic Systems Efficiency?

Paper • 2510.16276 • Published 10 days ago • 3

upvoted a paper 12 days ago

DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation

Paper • 2510.09116 • Published 17 days ago • 94

upvoted 3 papers 18 days ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published 18 days ago • 40

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published 28 days ago • 18

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published 19 days ago • 30

upvoted a paper 27 days ago

LUMINA: Detecting Hallucinations in RAG System with Context-Knowledge Signals

Paper • 2509.21875 • Published Sep 26 • 9

upvoted a paper about 2 months ago

Jointly Reinforcing Diversity and Quality in Language Model Generations

Paper • 2509.02534 • Published Sep 2 • 25

upvoted a paper 2 months ago

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

Paper • 2508.18124 • Published Aug 25 • 48

upvoted 4 papers 3 months ago

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published Aug 11 • 47

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Paper • 2507.23751 • Published Jul 31 • 4

A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 97

Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination

Paper • 2507.10532 • Published Jul 14 • 88

upvoted a paper 6 months ago

Absolute Zero: Reinforced Self-play Reasoning with Zero Data

Paper • 2505.03335 • Published May 6 • 185