samaffolter's Collections

AugmentedLearning
updated

 - What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
   Paper • 2312.15685 • Published • 16
 - mistralai/Mixtral-8x7B-Instruct-v0.1
   47B • Updated • 535k • 4.58k
 - microsoft/phi-2
   Text Generation • 3B • Updated • 704k • 3.41k
 - TinyLlama/TinyLlama-1.1B-Chat-v1.0
   Text Generation • 1B • Updated • 4.29M • 1.44k
 - Are Emergent Abilities in Large Language Models just In-Context Learning?
   Paper • 2309.01809 • Published • 3
 - Commonsense Knowledge Transfer for Pre-trained Language Models
   Paper • 2306.02388 • Published • 1
 - Schema-learning and rebinding as mechanisms of in-context learning and emergence
   Paper • 2307.01201 • Published • 2
 - Finding Neurons in a Haystack: Case Studies with Sparse Probing
   Paper • 2305.01610 • Published • 2
 - Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
   Paper • 2308.12066 • Published • 4
 - Experts Weights Averaging: A New General Training Scheme for Vision Transformers
   Paper • 2308.06093 • Published • 2
 - Multi-Head Adapter Routing for Cross-Task Generalization
   Paper • 2211.03831 • Published • 2
 - Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception
   Paper • 2305.06324 • Published • 1
 - Multimodal Foundation Models: From Specialists to General-Purpose Assistants
   Paper • 2309.10020 • Published • 41
 - MIMIC-IT: Multi-Modal In-Context Instruction Tuning
   Paper • 2306.05425 • Published • 11
 - Evaluation and Mitigation of Agnosia in Multimodal Large Language Models
   Paper • 2309.04041 • Published • 1
 - From Sparse to Soft Mixtures of Experts
   Paper • 2308.00951 • Published • 21