Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Sep 18 • 94
LeetCodeDataset: A Temporal Dataset for Robust Evaluation and Efficient Training of Code LLMs Paper • 2504.14655 • Published Apr 20 • 20
PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published Apr 2 • 36
Article LLM Inference on Edge: A Fun and Easy Guide to run LLMs via React Native on your Phone! Mar 7 • 87
Article Efficient LLM Pretraining: Packed Sequences and Masked Attention By sirluk • Oct 7, 2024 • 55
PaSa: An LLM Agent for Comprehensive Academic Paper Search Paper • 2501.10120 • Published Jan 17 • 52
Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 36
Article Docmatix - a huge dataset for Document Visual Question Answering Jul 18, 2024 • 78
Article Rank-Stabilized LoRA: Unlocking the Potential of LoRA Fine-Tuning By damjan-k • Feb 20, 2024 • 29
Article Fine-tuning LLMs with Singular Value Decomposition By fractalego • Jun 2, 2024 • 13