Instruct-tuning impacts models differently across families! Qwen2.5-72B-Instruct excels on IFEval but struggles with MATH-Hard, while Llama-3.1-70B-Instruct avoids MATH performance loss! Why? Can they follow the format in examples? Compare models:
	open-llm-leaderboard/comparator
		
		
	