Active filters: gsm8k
rubenfb23/gpt2-gsm8k-cot-120b-distilled-lr2e5-acc4
Text Generation
• 0.1B • Updated
• 12
• 1
August4293/mistral_gsm8k_ssl_it1
Updated
August4293/mistral_gsm8k_ssl_it2
Updated
Text Generation
• Updated
• 12
mradermacher/Qwen-0.5B-GRPO-GGUF
0.5B • Updated
• 37
mradermacher/prem-1B-grpo-GGUF
Reinforcement Learning
• 1B • Updated
• 118
yeok/DeepScaleR-1.5B-Preview-GSM8K-Demo
2B • Updated
• 2
LahiruWije/Qwen2.5-0.5B-Instruct-GPRO-GSM8K
Question Answering
• 0.5B • Updated
eagle0504/qwen-2-5-3b-instruct-using-openai-gsm8k-gguf-data-enhanced-with-deepseek-v3-small
3B • Updated
• 296
eagle0504/qwen-2-5-3b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v3
3B • Updated
• 197
eagle0504/qwen-2-5-3b-instruct-using-openai-gsm8k-data-enhanced-with-deepseek-v4
3B • Updated
• 162
Text Generation
• Updated
• 2
• 1
koolkarni-Atharva10/Nano_R1
Reinforcement Learning
• Updated
Text Generation
• Updated
• 1
• 3
klei1/bleta-logjike-27b-gguf
27B • Updated
• 11
solarpunkin/OpenELM-450M-gsm8k-LoRA
darshjoshi16/phi2-lora-math
Makrrr/Qwen3-1.7B-GSM8K-GRPO-verl
Reinforcement Learning
• 2B • Updated
• 28
• 3
Text Generation
• 0.6B • Updated
• 5
• 2
shivs28/jee_nujan_mix_v2_base
Text Generation
• 2B • Updated
• 4
tahamajs/Qwen3-4B-GSM8k-GRPO-Unsloth
4B • Updated
• 3
tahamajs/gemma-3-1b-it-finetune-gsmk8
Text Generation
• 1.0B • Updated
• 8
TroglodyteDerivations/smol_lm_3b
Updated
safouaneelg/Apertus-8B-Instruct-2509-GSM8k-SFT
Text Generation
• 8B • Updated
• 2
kotekjedi/qwen3-32b-lora-jailbreak-detection-merged
Text Generation
• 33B • Updated
• 5
yassine-boua/olmo-gsm8k-finetuned
Text Generation
• Updated
kotekjedi/qwen3-32b-lora-jailbreak-detection-merged_v2
Text Generation
• 33B • Updated
• 1
mradermacher/qwen3-32b-lora-jailbreak-detection-merged_v2-GGUF
33B • Updated
• 66
karthik/verl-qwen2.5-0.5b-gsm8k-ppo-step360
Text Generation
• 0.5B • Updated
• 2
DeryFerd/Qwen2.5-Math-7B-Instruct-Distill-Phi2-2.5K-MixMath
Text Generation
• 3B • Updated
• 12
• 1