Zihan Wang
ZihanWang99
AI & ML interests
None yet

Organizations

Collections

MOE
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models • Paper • 2401.06066 • Published • 56
- Mixtral of Experts • Paper • 2401.04088 • Published • 159
- Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM • Paper • 2401.02994 • Published • 52
- LLM Augmented LLMs: Expanding Capabilities through Composition • Paper • 2401.02412 • Published • 38

reading comprehension

long context LLM
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models • Paper • 2401.06951 • Published • 26
- Extending LLMs' Context Window with 100 Samples • Paper • 2401.07004 • Published • 16
- Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon • Paper • 2401.03462 • Published • 27
- The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry • Paper • 2402.04347 • Published • 15

COT

Code Generation

LLM infer