Was this model distilled using only SFT, or through a combination of SFT and RL?
#23 opened about 1 month ago by wizardII
	
Which LiveCodeBench version? v5? v6?
#22 opened 4 months ago by owao
	
During git clone: encountered 2 file(s) that may not have been copied correctly on Windows
#21 opened 4 months ago by guynich
	
Can this model not use function calling?
#20 opened 4 months ago by Fatzard
	
test_for_ds_r1_qwq_8b
#19 opened 4 months ago by JunZhangf
	
A quick test comparing R1-0528-Qwen3-8B with Phi-4
#17 opened 5 months ago by gptlocalhost
	
Tourist cities
#15 opened 5 months ago by lolisponce
	
Model collapse after SFT
#14 opened 5 months ago by Banjiuyufen · 3 comments
	
Vocab missing tool-related strings in chat template, poor performance with tools
#13 opened 5 months ago by mattjcly · 4 comments
	
Can you please release how you post-trained Qwen3 on DeepSeek?
#12 opened 5 months ago by ZeroWw · 2 comments
	
Tried it, but it is not as good as expected.
#11 opened 5 months ago by kk3dmax · 5 comments
	
Does the /no_think tag no longer work?
#10 opened 5 months ago by loong · ➕ 1 · 5 comments
	
Any plans for a Qwen3-32B model?
#9 opened 5 months ago by wanghf · 👍 14 · 7 comments
	
BTW, for programmers, the `Gemma` series is the best at helping you write comments, docstrings, and documents.
#8 opened 5 months ago by DOFOFFICIAL · 🔥 1 · 1 comment
	
DeepSeek-R1-Lite
#6 opened 5 months ago by Dampfinchen · 🔥 ❤️ 20 · 7 comments
	
generation_config.json is missing
#5 opened 5 months ago by Doctor-Chad-PhD · 👀 👍 6 · 2 comments
	
Model broken
#4 opened 5 months ago by sm54 · 👍 4 · 11 comments
	
Any plans for the Gemma series? ;-;
#2 opened 5 months ago by Nakdesu · ❤️ 4 · 4 comments
	
Any plans for a 30B-A3B model?
#1 opened 5 months ago by xxx777xxxASD · 🔥 31 · 7 comments