arxiv:2505.12929
Zhihe Yang
zhyang2226
AI & ML interests
Trustworthy RL & Offline RL
Recent Activity
liked a model 13 days ago: tencent/HunyuanImage-3.0
liked a model 4 months ago: tencent/HunyuanVideo
authored a paper 4 months ago: Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key