Collections
Collections including paper arxiv:2412.15115
-
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 376 -
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 625 -
meta-llama/Llama-4-Scout-17B-16E-Instruct
Image-Text-to-Text • 109B • Updated • 189k • 1.13k -
keras-io/GauGAN-Image-generation
Updated • 10 • 4
-
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper • 2501.08313 • Published • 298 -
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 420 -
Qwen2.5 Technical Report
Paper • 2412.15115 • Published • 376 -
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
Paper • 2404.14219 • Published • 257
-
Visual-RFT: Visual Reinforcement Fine-Tuning
Paper • 2503.01785 • Published • 84 -
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
Paper • 2503.01688 • Published • 21 -
Predictive Data Selection: The Data That Predicts Is the Data That Teaches
Paper • 2503.00808 • Published • 56 -
Chain of Draft: Thinking Faster by Writing Less
Paper • 2502.18600 • Published • 49