Update README.md
README.md CHANGED
@@ -22,7 +22,7 @@ Base model: [Qwen/Qwen3-30B-A3B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507)
 CONTEXT_LENGTH=32768 # 262144
 
 vllm serve \
-
+    QuantTrio/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8 \
     --served-model-name Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8 \
     --enable-expert-parallel \
     --swap-space 16 \
@@ -57,8 +57,8 @@ vllm>=0.9.2
 ### 【Model Download】
 
 ```python
-from
-snapshot_download('
+from huggingface_hub import snapshot_download
+snapshot_download('QuantTrio/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8', cache_dir="your_local_path")
 ```
 
 ### 【Overview】
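For reference, once the corrected `vllm serve` command above is running, the deployment can be smoke-tested through vLLM's OpenAI-compatible endpoint. This is a minimal sketch, not part of the commit: it assumes the default host and port (`localhost:8000`) and no server-side API key; only the served model name comes from the diff above.

```python
# Minimal client-side check against the server started by `vllm serve` above.
# base_url assumes vLLM's defaults; adjust if --host/--port were overridden.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8",  # must match --served-model-name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```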
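Likewise, the corrected download snippet returns the local snapshot directory, which can be passed to `vllm serve` in place of the Hub repo id. A minimal sketch, assuming `your_local_path` remains the placeholder used in the README:

```python
# Download the quantized weights and capture the local snapshot path.
# "your_local_path" is the README's placeholder, not a real directory.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    "QuantTrio/Qwen3-30B-A3B-Instruct-2507-GPTQ-Int8",
    cache_dir="your_local_path",
)
print(local_dir)  # this path can replace the repo id in `vllm serve <path>`
```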