AugmentedLearning - a samaffolter Collection

AugmentedLearning

Updated Jan 1, 2024

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Paper • 2312.15685 • Published Dec 25, 2023 • 16
mistralai/Mixtral-8x7B-Instruct-v0.1

47B • Updated Jul 24 • 535k • 4.58k
microsoft/phi-2

Text Generation • 3B • Updated Apr 29, 2024 • 704k • 3.41k
TinyLlama/TinyLlama-1.1B-Chat-v1.0

Text Generation • 1B • Updated Mar 17, 2024 • 4.29M • 1.44k
Are Emergent Abilities in Large Language Models just In-Context Learning?

Paper • 2309.01809 • Published Sep 4, 2023 • 3
Commonsense Knowledge Transfer for Pre-trained Language Models

Paper • 2306.02388 • Published Jun 4, 2023 • 1
Schema-learning and rebinding as mechanisms of in-context learning and emergence

Paper • 2307.01201 • Published Jun 16, 2023 • 2
Finding Neurons in a Haystack: Case Studies with Sparse Probing

Paper • 2305.01610 • Published May 2, 2023 • 2
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference

Paper • 2308.12066 • Published Aug 23, 2023 • 4
Experts Weights Averaging: A New General Training Scheme for Vision Transformers

Paper • 2308.06093 • Published Aug 11, 2023 • 2
Multi-Head Adapter Routing for Cross-Task Generalization

Paper • 2211.03831 • Published Nov 7, 2022 • 2
Alternating Gradient Descent and Mixture-of-Experts for Integrated Multimodal Perception

Paper • 2305.06324 • Published May 10, 2023 • 1
Multimodal Foundation Models: From Specialists to General-Purpose Assistants

Paper • 2309.10020 • Published Sep 18, 2023 • 41
MIMIC-IT: Multi-Modal In-Context Instruction Tuning

Paper • 2306.05425 • Published Jun 8, 2023 • 11
Evaluation and Mitigation of Agnosia in Multimodal Large Language Models

Paper • 2309.04041 • Published Sep 7, 2023 • 1
From Sparse to Soft Mixtures of Experts

Paper • 2308.00951 • Published Aug 2, 2023 • 21