- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- Efficient Monotonic Multihead Attention
  Paper • 2312.04515 • Published • 8
- Contrastive Decoding Improves Reasoning in Large Language Models
  Paper • 2309.09117 • Published • 39
- Exploring Format Consistency for Instruction Tuning
  Paper • 2307.15504 • Published • 8
Collections
Collections including paper arxiv:2312.06134

- Mamba: Linear-Time Sequence Modeling with Selective State Spaces
  Paper • 2312.00752 • Published • 146
- Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis
  Paper • 2312.03491 • Published • 35
- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260

- Dissecting In-Context Learning of Translations in GPTs
  Paper • 2310.15987 • Published • 6
- Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
  Paper • 2309.08958 • Published • 2
- X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages
  Paper • 2305.04160 • Published • 2
- Ziya-VL: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning
  Paper • 2310.08166 • Published • 1

- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 55
- YAYI 2: Multilingual Open-Source Large Language Models
  Paper • 2312.14862 • Published • 15
- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3
- TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes
  Paper • 2311.10797 • Published

- Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models
  Paper • 2311.08692 • Published • 13
- DiLoCo: Distributed Low-Communication Training of Language Models
  Paper • 2311.08105 • Published • 16
- System 2 Attention (is something you might need too)
  Paper • 2311.11829 • Published • 44
- Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
  Paper • 2312.06134 • Published • 3