Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model Paper • 2501.02790 • Published Jan 6 • 9
Who's Your Judge? On the Detectability of LLM-Generated Judgments Paper • 2509.25154 • Published 27 days ago • 29
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning Paper • 2509.25760 • Published 27 days ago • 52
The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs Paper • 2510.09905 • Published 16 days ago • 6
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published 20 days ago • 92
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published 5 days ago • 101
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published 7 days ago • 83