# FlashResearch-4B-Thinking
A 4B-parameter Qwen model distilled from Tongyi DeepResearch-30B A3B, optimized for web-scale “deep research” tasks and inference with Alibaba-NLP/DeepResearch.
- Base: Qwen3-4B-Thinking-2507 (dense)
- Teacher: Tongyi DeepResearch 30B A3B (MoE)
- Method: SFT distillation on 33k curated deep-research examples
- Dataset: flashresearch/FlashResearch-DS-33k
- Primary use: fast, low-cost DeepResearch agent runs (browsing, multi-step reasoning, source-grounded answers); see the quick-start sketch below
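For a quick standalone check outside DeepResearch, the checkpoint can be loaded with the standard Transformers text-generation flow. A minimal sketch, assuming the usual chat-template API; the prompt and generation settings are illustrative only, not official recommendations:

```python
# Minimal sketch: plain Transformers generation (no DeepResearch agent loop).
# Assumptions: a GPU with enough VRAM for FP16, `transformers` and `accelerate` installed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flashresearch/FlashResearch-4B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat prompt; the question is just an example.
messages = [{"role": "user", "content": "What are the main open problems in long-horizon web research agents?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```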
## Evaluation

No benchmark results are published yet.
## Training Data

- Primary dataset: flashresearch/FlashResearch-DS-33k (see the loading sketch below)
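A minimal sketch for inspecting the dataset with the `datasets` library; the split name and record layout are assumptions, not documented here:

```python
# Minimal sketch: load and inspect the distillation dataset.
# Assumption: a "train" split exists; column names are whatever the dataset defines.
from datasets import load_dataset

ds = load_dataset("flashresearch/FlashResearch-DS-33k", split="train")
print(ds)      # number of rows and column names
print(ds[0])   # one curated deep-research example
```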
## Inference with Alibaba-NLP/DeepResearch (Recommended)
This model is intended to be used directly with the DeepResearch repo.
### 1) Install & set up

```bash
git clone https://github.com/Alibaba-NLP/DeepResearch
cd DeepResearch

# Create an environment (example)
python -m venv .venv && source .venv/bin/activate
pip install -e .  # or: pip install -r requirements.txt, if provided
```
### 2) Point DeepResearch to this model

Edit the run configuration so it points to this model:

```bash
MODEL_PATH=flashresearch/FlashResearch-4B-Thinking
```
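Before wiring the model into DeepResearch, it can help to sanity-check the checkpoint with vLLM's offline Python API (the model is tagged for vLLM). A minimal sketch; the sampling parameters and prompt are illustrative, and DeepResearch itself may launch the model differently:

```python
# Minimal sketch: offline vLLM generation as a sanity check.
# Assumption: vLLM is installed and a compatible GPU is available.
from vllm import LLM, SamplingParams

llm = LLM(model="flashresearch/FlashResearch-4B-Thinking")
params = SamplingParams(temperature=0.6, max_tokens=512)

outputs = llm.generate(["Outline a research plan for comparing EU and US carbon pricing."], params)
print(outputs[0].outputs[0].text)
```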
## Hardware notes

- A single 12–16 GB GPU is enough for the 4B model in FP16; FP8/INT4 quantization reduces VRAM needs further, and with a quantized build the summary model can run locally as well (see the 4-bit loading sketch below).
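If VRAM is tight, a 4-bit load via bitsandbytes is one option. A minimal sketch, assuming a CUDA GPU and the `bitsandbytes` package; the quantization settings are illustrative and may trade off some answer quality:

```python
# Minimal sketch: optional 4-bit quantized load to reduce VRAM usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "flashresearch/FlashResearch-4B-Thinking"
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```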
## Acknowledgements
- Qwen team for the base 4B architecture
- Alibaba-NLP for DeepResearch
- FlashResearch contributors for the 33k dataset
## Citation

If you use this model, please cite:

```bibtex
@software{flashresearch_4b_thinking_2025,
  title  = {FlashResearch-4B-Thinking},
  author = {Artem Y.},
  year   = {2025},
  url    = {https://huggingface.co/flashresearch/FlashResearch-4B-Thinking}
}
```

And the dataset:

```bibtex
@dataset{flashresearch_ds_33k,
  title  = {FlashResearch-DS-33k},
  author = {Artem Y.},
  year   = {2025},
  url    = {https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k}
}
```
## Changelog
- v1.0.0 (2025-10-04) — First public release (33k distillation, DeepResearch-ready)
## Model Card Metadata (Hugging Face)

```yaml
---
language:
  - en
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
  - qwen
  - deep-research
  - browsing
  - citation
  - reasoning
  - distillation
  - agent
  - vllm
  - cheapresearch
datasets:
  - flashresearch/FlashResearch-DS-33k
base_model:
  - Qwen/Qwen3-4B-Thinking-2507
model-index:
  - name: FlashResearch-4B-Thinking
    results: []
---
```