SpectralPO
			Organization Card
		
This organization hosts all the models for the paper "Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO" (https://arxiv.org/abs/2505.11595).
Please cite:
@inproceedings{chen2025spectral,
  title = {Spectral Policy Optimization: Coloring your Incorrect Reasoning in {GRPO}},
  author = {Peter Chen and Xiaopeng Li and Ziniu Li and Xi Chen and Tianyi Lin},
  booktitle = {2nd AI for Math Workshop @ ICML 2025},
  year = {2025},
  url = {https://openreview.net/forum?id=IIBDElbi7s}
}
Models (27)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ-Ablation (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-GRPO
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-SPO
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-Qwen3-235B (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-DeepSeek-V3 (8B)
SpectralPO/DeepSeek-R1-Distill-Llama-8B-SPO (8B)
SpectralPO/DeepSeek-R1-Distill-Llama-8B-GRPO (8B)
SpectralPO/Qwen2.5-32B-Instruct-GRPO (33B)
SpectralPO/Qwen2.5-32B-Instruct-SPO (33B)
Datasets (0)
None public yet
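
The models listed on this page can presumably be loaded like any other causal language model on the Hub. The following is a minimal sketch, not an official usage recipe: it assumes the checkpoints work with the generic `AutoTokenizer` and `AutoModelForCausalLM` classes from the `transformers` library, which this page does not confirm. The helper names `repo_id` and `load_model` are hypothetical and introduced only for illustration.

```python
ORG = "SpectralPO"  # organization name as it appears in the model ids on this page


def repo_id(model_name: str, org: str = ORG) -> str:
    """Build the Hub repository id '<org>/<model_name>' (hypothetical helper)."""
    return f"{org}/{model_name}"


def load_model(model_name: str):
    """Download the tokenizer and weights for one of the listed models.

    Assumption: the checkpoint is compatible with the generic causal-LM
    classes in transformers. Calling this triggers a large download.
    """
    # Imported lazily so repo_id() stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    rid = repo_id(model_name)
    tokenizer = AutoTokenizer.from_pretrained(rid)
    model = AutoModelForCausalLM.from_pretrained(rid)
    return tokenizer, model
```

For example, `load_model("DeepSeek-R1-Distill-Llama-8B-SPO")` would fetch the repository `SpectralPO/DeepSeek-R1-Distill-Llama-8B-SPO`. The 32B checkpoints need substantially more memory than the 7B and 8B ones, so the smaller models are the more practical starting point.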