Untied Ulysses: Memory-Efficient Context Parallelism via Headwise Chunking Paper • 2602.21196 • Published Feb 24 • 7
Cartridges: Lightweight and general-purpose long context representations via self-study Paper • 2506.06266 • Published Jun 6, 2025 • 7
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published Jan 14, 2025 • 62
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 58
RedPajama: an Open Dataset for Training Large Language Models Paper • 2411.12372 • Published Nov 19, 2024 • 58
view post Post 2885 https://huggingface.co/organizations/nerdyface/share/xvWxWxYmYpCLqZlvNJEZbJHFsDITAicJAT 🚀 3 3 + Reply
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing Paper • 2206.15076 • Published Jun 30, 2022 • 5
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Paper • 2306.11698 • Published Jun 20, 2023 • 13
Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT Paper • 2402.07440 • Published Feb 12, 2024 • 1
Simple linear attention language models balance the recall-throughput tradeoff Paper • 2402.18668 • Published Feb 28, 2024 • 20
Just read twice: closing the recall gap for recurrent language models Paper • 2407.05483 • Published Jul 7, 2024
LoLCATs: On Low-Rank Linearizing of Large Language Models Paper • 2410.10254 • Published Oct 14, 2024 • 1
Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees Paper • 2110.03313 • Published Oct 7, 2021 • 1
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 59
Mixture-of-Agents Enhances Large Language Model Capabilities Paper • 2406.04692 • Published Jun 7, 2024 • 59
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 37
Petals: Collaborative Inference and Fine-tuning of Large Models Paper • 2209.01188 • Published Sep 2, 2022 • 1