---
license: mit
datasets:
- FreedomIntelligence/ALLaVA-4V
- FreedomIntelligence/ALLaVA-4V-Chinese
language:
- en
- zh
---

# Introduction

We trained an EAGLE3 draft model for Qwen2.5-VL-7B on 200k samples randomly selected from [FreedomIntelligence/ALLaVA-4V](https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V) and [FreedomIntelligence/ALLaVA-4V-Chinese](https://huggingface.co/datasets/FreedomIntelligence/ALLaVA-4V-Chinese), using [SpecForge](https://github.com/sgl-project/SpecForge/pull/102).

# Usage

Run inference with [sglang](https://github.com/sgl-project/sglang/pull/8801) and benchmark with [MMStar](https://github.com/sgl-project/SpecForge/pull/106).

Start the server:

```
python -m sglang.launch_server \
  --model-path Qwen/Qwen2.5-VL-7B-Instruct \
  --speculative-draft Rayzl/qwen2.5-vl-7b-eagle3-sgl \
  --trust-remote-code \
  --chat-template qwen2-vl \
  --chunked-prefill-size -1 \
  --cuda-graph-max-bs 1 \
  --speculative-algo EAGLE3 \
  --speculative-num-steps 4 \
  --speculative-eagle-topk 6 \
  --speculative-num-draft-tokens 24 \
  --tp 1 \
  --mem-fraction-static 0.7 \
  --host 0.0.0.0 \
  --port 8080
```

Run the benchmark:

```
python run_mmstar.py --host http://0.0.0.0 --port 8080 --parallel 1 --num-questions 100
```
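
Once the server is up, you can also query it directly through sglang's OpenAI-compatible API. Below is a minimal sketch using the `openai` Python client; the image URL, prompt, and `api_key` value are placeholders, and the endpoint assumes the server command above is running locally on port 8080:

```
# Minimal sketch: send a multimodal request to the sglang server via its
# OpenAI-compatible /v1 endpoint. The image URL and prompt are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://0.0.0.0:8080/v1", api_key="none")  # sglang ignores the key

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
                {"type": "text", "text": "Describe this image."},
            ],
        }
    ],
    temperature=0,
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Speculative decoding is transparent to the client: the EAGLE3 draft model only accelerates decoding on the server side, so requests and responses look the same as against a vanilla server.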