InfiR2 Collection
The InfiR2 collection releases the full suite of FP8 checkpoints from our pipeline, including models from CPT, SFT, and RL.
Paper | GitHub | Project Website
We performed supervised fine-tuning of InfiR2-1.5B-base-FP8 in FP8 format in two stages, using the InfiAlign-SFT-72k and InfiAlign-SFT-165k datasets.
Training Recipe:
Hyperparameters:
| Parameter | Value |
|---|---|
| Batch Size | 64 |
| Learning Rate | 5e-5 |
| Minimum Learning Rate | 5e-6 |
| Weight Decay | 0.05 |
| Context Length | 32k |
The resulting model is the InfiR2-1.5B-Instruct-FP8.
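The hyperparameters above can be collected into a small configuration object, which makes the implied quantities easy to check. The sketch below is illustrative only: `SFTConfig`, `max_tokens_per_step`, and `cosine_lr` are hypothetical names, "32k" is assumed to mean 32,768 tokens, and the cosine decay shape is an assumption, since the table gives only the peak and minimum learning rates.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SFTConfig:
    """Hyperparameters copied from the table above (hypothetical container)."""
    batch_size: int = 64
    learning_rate: float = 5e-5
    min_learning_rate: float = 5e-6
    weight_decay: float = 0.05
    context_length: int = 32 * 1024  # assumption: "32k" means 32,768 tokens

    def max_tokens_per_step(self) -> int:
        # Upper bound, reached only if every sequence fills the context window.
        return self.batch_size * self.context_length


def cosine_lr(step: int, total_steps: int, cfg: SFTConfig) -> float:
    """Cosine decay from the peak to the minimum learning rate.

    Assumption: the source does not state the schedule shape or any warmup.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    span = cfg.learning_rate - cfg.min_learning_rate
    return cfg.min_learning_rate + 0.5 * span * (1.0 + math.cos(math.pi * progress))


cfg = SFTConfig()
print(cfg.max_tokens_per_step())
print(cosine_lr(0, 1000, cfg), cosine_lr(1000, 1000, cfg))
```

The step count of 1000 is a placeholder; under this assumed schedule the learning rate starts at the peak value and ends at the minimum, a tenfold reduction.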
The InfiR2 framework offers multiple model variants with different sizes and training strategies:
Below is the performance comparison of InfiR2-1.5B-Instruct-FP8 on reasoning benchmarks. Note: 'w. InfiAlign' denotes Supervised Fine-Tuning (SFT) using the InfiAlign dataset.
| Model | AIME 25 | AIME 24 | GPQA | LiveCodeBench v5 |
|---|---|---|---|---|
| Deepseek-Distill-Qwen-1.5B | 21.35 | 26.87 | 32.26 | 18.50 |
| Qwen2.5-1.5B-base (w. InfiAlign) | 14.58 | 10.52 | 28.98 | 12.99 |
| InfiR2-1.5B-Instruct-FP8 | 18.45 | 17.39 | 29.48 | 17.10 |
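To read the table as per-benchmark gaps rather than raw scores, the numbers can be loaded into a dictionary and subtracted. This is a minimal sketch; `scores` and `gains` are hypothetical names, and the values are copied directly from the table above.

```python
# Benchmark scores copied from the comparison table.
scores = {
    "Deepseek-Distill-Qwen-1.5B": {
        "AIME 25": 21.35, "AIME 24": 26.87, "GPQA": 32.26, "LiveCodeBench v5": 18.50,
    },
    "Qwen2.5-1.5B-base (w. InfiAlign)": {
        "AIME 25": 14.58, "AIME 24": 10.52, "GPQA": 28.98, "LiveCodeBench v5": 12.99,
    },
    "InfiR2-1.5B-Instruct-FP8": {
        "AIME 25": 18.45, "AIME 24": 17.39, "GPQA": 29.48, "LiveCodeBench v5": 17.10,
    },
}


def gains(model: str, baseline: str) -> dict:
    """Return the per-benchmark score difference (model minus baseline)."""
    return {
        bench: round(scores[model][bench] - scores[baseline][bench], 2)
        for bench in scores[model]
    }


for baseline in ("Qwen2.5-1.5B-base (w. InfiAlign)", "Deepseek-Distill-Qwen-1.5B"):
    print(baseline, gains("InfiR2-1.5B-Instruct-FP8", baseline))
```

On these numbers, InfiR2-1.5B-Instruct-FP8 scores higher than Qwen2.5-1.5B-base (w. InfiAlign) on all four benchmarks and lower than Deepseek-Distill-Qwen-1.5B on all four.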
```python
from vllm import LLM, SamplingParams

MODEL_NAME = "InfiX-ai/InfiR2-1.5B-Instruct-FP8"

prompt_text = "Briefly explain what a black hole is, and provide two interesting facts."

MAX_NEW_TOKENS = 256
TEMPERATURE = 0.8

# Load the checkpoint; dtype="auto" lets vLLM choose the dtype from the model config.
llm = LLM(
    model=MODEL_NAME,
    dtype="auto",
)

# A temperature above 0 enables sampling; n=1 returns one completion per prompt.
sampling_params = SamplingParams(
    n=1,
    temperature=TEMPERATURE,
    max_tokens=MAX_NEW_TOKENS,
)

# Format the request with the model's chat template before generation.
tokenizer = llm.get_tokenizer()
messages = [
    {"role": "user", "content": prompt_text}
]
prompt_formatted = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = llm.generate(
    prompt_formatted,
    sampling_params
)

generated_text = outputs[0].outputs[0].text
llm_response = generated_text.strip()

print("\n" + "="*70)
print(f"Prompt: \n{prompt_text}")
print("-" * 70)
print(f"(LLM Response): \n{llm_response}")
print("="*70)
```
```bash
# Create a directory for models
mkdir -p ./models

# Download InfiR2-1.5B-Instruct-FP8 model
huggingface-cli download --resume-download InfiX-ai/InfiR2-1.5B-Instruct-FP8 --local-dir ./models/InfiR2-1.5B-Instruct-FP8
```
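The same download can also be done from Python, which is convenient inside a script or notebook. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function; `REPO_ID`, `LOCAL_DIR`, and `download_model` are hypothetical names chosen to mirror the commands above.

```python
from pathlib import Path

REPO_ID = "InfiX-ai/InfiR2-1.5B-Instruct-FP8"
# Mirrors the --local-dir used in the CLI command.
LOCAL_DIR = Path("./models") / REPO_ID.split("/")[-1]


def download_model(repo_id: str = REPO_ID, local_dir: Path = LOCAL_DIR) -> str:
    """Download the full model repository and return its local path."""
    # Imported lazily so this module loads even where huggingface_hub is missing.
    from huggingface_hub import snapshot_download

    # Equivalent of `mkdir -p`: create the target directory if needed.
    local_dir.mkdir(parents=True, exist_ok=True)
    return snapshot_download(repo_id=repo_id, local_dir=str(local_dir))
```

Calling `download_model()` fetches the files over the network; the returned path can then be passed as the `model` argument to `LLM(...)` in place of the repository name.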
This model is intended for research and commercial use. Example use cases include:
The model should not be used for:
If you find our work useful, please cite:
```bibtex
@misc{wang2025infir2comprehensivefp8training,
      title={InfiR2: A Comprehensive FP8 Training Recipe for Reasoning-Enhanced Language Models},
      author={Wenjun Wang and Shuo Cai and Congkai Xie and Mingfa Feng and Yiming Zhang and Zhen Li and Kejing Yang and Ming Li and Jiannong Cao and Hongxia Yang},
      year={2025},
      eprint={2509.22536},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.22536},
}
```