A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published Jul 17 • 257
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 173
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 91 items • Updated Feb 28 • 114
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated Jul 21 • 544
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 650
Persian-Datasets Collection دیتاستهای متنوع برای آموزش و ارزیابی مدلهای فارسی؛ اعضا میتوانند دیتاستهای خود را به اشتراک بگذارند یا از منابع موجود بهره ببرند • 59 items • Updated Jan 27 • 7
ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning Paper • 2501.06590 • Published Jan 11 • 11
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding Paper • 2501.12380 • Published Jan 21 • 85
Product Catalog Generator Collection Product Catalog Generator for Persian products which is hosted by Basalam • 7 items • Updated Sep 7, 2024 • 8
Open LLM Leaderboard best models ❤️🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 65 items • Updated Mar 20 • 647