Ruizhe Li

rzdiversity

https://www.ruizhe.space/

AI & ML interests

Mechanistic Interpretability, Multimodal LLMs

Recent Activity

authored a paper 1 day ago

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

upvoted a paper 1 day ago

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

submitted a paper 1 day ago

Spurious Rewards Paradox: Mechanistically Understanding How RLVR Activates Memorization Shortcuts in LLMs

Organizations

None yet

rzdiversity's models

None public yet