Personalized Reasoning: Just-In-Time Personalization and Why LLMs Fail At It • Paper • 2510.00177 • Published Sep 30
PrefPalette: Personalized Preference Modeling with Latent Attributes • Paper • 2507.13541 • Published Jul 17
Spurious Rewards • Collection • Spurious Rewards: Rethinking Training Signals in RLVR • 14 items • Updated Jun 13