GuoLiangTang
Tommy930
AI & ML interests
LLM, NLP, ML
Recent Activity
upvoted a paper about 11 hours ago
RODS: Reward-Driven Online Data Synthesis for Multi-Turn Tool-Use Agents
upvoted a paper about 11 hours ago
From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning
upvoted a paper about 11 hours ago
EfficientRollout: System-Aware Self-Speculative Decoding for RL Rollouts
Organizations
None yet