robin zhang
Chevolier
				AI & ML interests
None yet
Recent Activity
- updated a collection 6 days ago: LLM
- updated a collection 6 days ago: Multimodal
- updated a collection 6 days ago: Multimodal
Organizations
None yet
LLM
- Meta-Awareness Enhances Reasoning Models: Self-Alignment Reinforcement Learning (Paper • 2510.03259 • Published • 57)
- Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense (Paper • 2510.07242 • Published • 30)
- First Try Matters: Revisiting the Role of Reflection in Reasoning Models (Paper • 2510.08308 • Published • 24)
- Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward (Paper • 2510.03222 • Published • 45)
Multimodal
- MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization (Paper • 2510.08540 • Published • 108)
- Diffusion Transformers with Representation Autoencoders (Paper • 2510.11690 • Published • 159)
- Spotlight on Token Perception for Multimodal Reinforcement Learning (Paper • 2510.09285 • Published • 35)
- Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation (Paper • 2510.17354 • Published • 32)
Agent
Video Generation