Qwen2.5-14B-Intuitor-MATH-1EPOCH

Description:

A version of Qwen2.5-14B fine-tuned with Intuitor on the MATH dataset for one epoch.
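A minimal loading sketch, assuming the checkpoint works with the standard Hugging Face `transformers` causal-LM classes and that `torch`, `transformers`, and `accelerate` are installed. The repository ID comes from this card. The `build_prompt` helper and its plain-text prompt format are hypothetical, since the card does not specify a prompt template, and `generate_solution` is an illustrative name rather than anything shipped with the model.

```python
MODEL_ID = "sunblaze-ucb/Qwen2.5-14B-Intuitor-MATH-1EPOCH"  # repository ID from this card


def build_prompt(problem: str) -> str:
    # Hypothetical plain-text prompt; the card does not document a format.
    return f"Problem: {problem}\nSolution:"


def generate_solution(problem: str, max_new_tokens: int = 512) -> str:
    # Heavy dependencies are imported here so that importing this module
    # does not require torch/transformers to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the BF16 tensor type listed on the card
        device_map="auto",           # assumption: accelerate is installed
    )

    inputs = tokenizer(build_prompt(problem), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_solution("What is the sum of the first 10 positive integers?")` downloads the weights on first use, so it needs enough disk space and accelerator memory for a 14B-class model.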

Citation

@article{zhao2025learning,
  title   = {Learning to Reason without External Rewards},
  author  = {Zhao, Xuandong and Kang, Zhewei and Feng, Aosong and Levine, Sergey and Song, Dawn},
  journal = {arXiv preprint arXiv:2505.19590},
  year    = {2025}
}


Safetensors

Model size: 15B params
Tensor type: BF16
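As a rough guide to what these figures imply, the sketch below estimates the memory needed for the weights alone, taking the rounded "15B params" from this card and 2 bytes per BF16 value. The function name `weight_memory_bytes` is illustrative, and the estimate excludes activations, the KV cache, and framework overhead.

```python
NUM_PARAMS = 15e9        # rounded "15B params" figure from the card
BYTES_PER_BF16 = 2       # bfloat16 stores each value in 16 bits


def weight_memory_bytes(num_params: float, bytes_per_param: int) -> float:
    # Memory for the weights only; runtime overhead is not included.
    return num_params * bytes_per_param


weight_bytes = weight_memory_bytes(NUM_PARAMS, BYTES_PER_BF16)
weight_gb = weight_bytes / 1e9       # decimal gigabytes
weight_gib = weight_bytes / 2**30    # binary gibibytes

print(f"Approximate weight memory: {weight_gb:.1f} GB ({weight_gib:.1f} GiB)")
```

Because the parameter count is rounded, the result is an approximation of the checkpoint size rather than an exact figure.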

Model tree for sunblaze-ucb/Qwen2.5-14B-Intuitor-MATH-1EPOCH

Base model: Qwen/Qwen2.5-14B
Finetuned: this model