--- license: mit base_model: - Qwen/Qwen3-32B --- FP8-Dynamic quant to support Ampere cards Use following vllm command to run on 2x 3090 ```bash vllm serve khajaphysist/Qwen3-32B-FP8-Dynamic --enable-reasoning --reasoning-parser deepseek_r1 \ -tp 2 --gpu-memory-utilization 0.99 --disable-log-requests --enforce-eager --max-num-seqs 15 ```