SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations Paper • 2512.14080 • Published 11 days ago • 5
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 6 items • Updated 3 days ago • 103
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 3 days ago • 81
Motif-Technologies/Motif-2-12.7B-Reasoning Text Generation • 13B • Updated 15 days ago • 638 • 33
Changelog Team & Enterprise Articles Now Featured on the Hugging Face Blog 19 days ago • 69
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices Paper • 2512.01374 • Published 26 days ago • 93
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23 • 276
Article Transformers v5: Simple model definitions powering the AI ecosystem +2 26 days ago • 254
INTELLECT-3 Collection INTELLECT-3: A 100B+ MoE trained with large-scale RL • 4 items • Updated 28 days ago • 11