Running • 72 • Unlocking On-Policy Distillation for Any Model Family • Apply on-policy distillation to any model family
Running on CPU Upgrade • Featured • 2.69k • The Smol Training Playbook • The secrets to building world-class LLMs
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated 17 days ago • 284k • 1.55k
yentinglin/Mistral-Small-24B-Instruct-2501-reasoning • Text Generation • 24B • Updated Apr 20 • 102 • 58
bartowski/DeepSeek-R1-Distill-Qwen-32B-abliterated-GGUF • Text Generation • Updated Jan 25 • 6.97k • 126