arxiv:2502.03492
Xie Zhihui
AI & ML interests: None yet
Recent Activity
liked a model about 13 hours ago: MiniMaxAI/MiniMax-M2
upvoted a paper 18 days ago: The Alignment Waltz: Jointly Training Agents to Collaborate for Safety
upvoted a paper 18 days ago: Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense