 
RLHFlow/ArmoRM-Llama3-8B-v0.1
Text Classification • 8B • Updated • 10.7k • 182
Reward models trained with the RLHFlow codebase (https://github.com/RLHFlow/RLHF-Reward-Modeling/)
 
				 
Note: Bradley-Terry reward model trained with the RLHFlow codebase
Note: Tech report that covers the Pairwise Preference Model
Note: Tech report for ArmoRM