FlashResearch-4B-Thinking
A 4B-parameter Qwen3 model distilled from Tongyi DeepResearch-30B A3B, optimized for web-scale “deep research” tasks and intended for inference with Alibaba-NLP/DeepResearch.

  • Base: Qwen/Qwen3-4B-Thinking-2507 (dense)
  • Teacher: Tongyi DeepResearch 30B A3B (MoE)
  • Method: SFT distillation on 33k curated deep-research examples
  • Dataset: flashresearch/FlashResearch-DS-33k
  • Primary Use: Fast, low-cost DeepResearch agent runs (browsing, multi-step reasoning, source-grounded answers)
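As a Qwen3 thinking variant, the model emits a reasoning trace before its final answer. A minimal sketch of post-processing such output, assuming the usual `</think>` delimiter used by Qwen3 thinking models (adjust to your chat template):

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Qwen3 thinking models emit a reasoning trace terminated by a
    </think> tag; everything after it is the user-facing answer.
    """
    marker = "</think>"
    if marker in text:
        reasoning, answer = text.split(marker, 1)
        return reasoning.replace("<think>", "").strip(), answer.strip()
    # No marker found: treat the whole output as the answer.
    return "", text.strip()
```

For example, `split_thinking("<think>check sources</think>Paris")` returns `("check sources", "Paris")`.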

Evaluation

No benchmark results are published yet (the model-index in the metadata below is empty).

Training Data

Fine-tuning used flashresearch/FlashResearch-DS-33k, a curated set of 33k deep-research examples distilled from the Tongyi DeepResearch-30B A3B teacher.

Inference with Alibaba-NLP/DeepResearch (Recommended)

This model is intended to be used directly with the DeepResearch repo.

1) Install & set up

```bash
git clone https://github.com/Alibaba-NLP/DeepResearch
cd DeepResearch
# Create an environment (example)
python -m venv .venv && source .venv/bin/activate
pip install -e .  # or pip install -r requirements.txt if provided
```

2) Point DeepResearch to this model

Edit the config to point DeepResearch at this model, for example:

```bash
MODEL_PATH=flashresearch/FlashResearch-4B-Thinking
```
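If the model path is wired through an environment variable (`MODEL_PATH` above is an example; the exact config key depends on your DeepResearch version), a hypothetical Python-side helper for resolving it with a sensible default:

```python
import os

# Hypothetical helper (not part of the DeepResearch repo): resolve the
# model path from the environment, falling back to this model's repo id.
def resolve_model_path(default: str = "flashresearch/FlashResearch-4B-Thinking") -> str:
    return os.environ.get("MODEL_PATH", default)
```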

Hardware notes

  • A single 12–16 GB GPU is enough for the 4B model in FP16/BF16; FP8 or INT4 quantization fits in even less VRAM. If you quantize, the summary model can run locally as well.
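For a rough sense of the numbers behind that note, the weight footprint alone is parameter count times bytes per parameter; KV cache and activations add overhead on top:

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate GPU memory (GiB) for model weights only."""
    return n_params * bytes_per_param / 1024**3

# ~4B parameters at common precisions (weights only):
for name, bytes_pp in [("FP16/BF16", 2.0), ("FP8", 1.0), ("INT4", 0.5)]:
    print(f"{name}: {weight_memory_gib(4e9, bytes_pp):.1f} GiB")
# FP16/BF16: ~7.5 GiB, FP8: ~3.7 GiB, INT4: ~1.9 GiB
```

This is why 12–16 GB comfortably covers FP16 inference once cache and runtime overhead are added.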

Acknowledgements

  • Qwen team for the base 4B architecture
  • Alibaba-NLP for DeepResearch
  • CheapResearch contributors for the 33k dataset

Citation

If you use this model, please cite:

@software{cheapresearch_thinking_2025,
  title        = {CheapResearch 4B Thinking},
  author       = {Artem Y.},
  year         = {2025},
  url          = {https://huggingface.co/flashresearch/FlashResearch-4B-Thinking}
}

And the dataset:

@dataset{cheapresearch_ds_33k,
  title        = {CheapResearch-DS-33k},
  author       = {Artem Y.},
  year         = {2025},
  url          = {https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k}
}

Changelog

  • v1.0.0 (2025-10-04) — First public release (33k distillation, DeepResearch-ready)

Model Card Metadata (Hugging Face)

---
language:
- en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- qwen
- deep-research
- browsing
- citation
- reasoning
- distillation
- agent
- vllm
- cheapresearch
datasets:
- flashresearch/FlashResearch-DS-33k
base_model:
- Qwen/Qwen3-4B-Thinking-2507
model-index:
- name: FlashResearch-4B-Thinking
  results: []
---