Wenhan Ma

CuteNPC

https://github.com/CuteNPC

AI & ML interests

Large Language Model

Recent Activity

upvoted a paper 1 day ago

Stabilizing Reinforcement Learning with LLMs: Formulation and Practices

liked a model 16 days ago

Lansechen/deepseek-v2-lite-16b-chat-R1-Distill-bs17k-batch32

authored a paper about 1 month ago

Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Organizations

None yet

Papers 3

arxiv:2510.11370

arxiv:2506.03569

arxiv:2505.07608

Models 0

None public yet

Datasets 0

None public yet