LG-AI-EXAONE committed
Commit 1b30fcd · 1 Parent(s): 59231af

Update vLLM support

Files changed (1): README.md (+12 -2)
@@ -194,13 +194,23 @@ You can run the TensorRT-LLM server by following steps:
 
 2. Run server with the configuration
 ```bash
-trtllm-serve serve [MODEL_PATH] --backend pytorch --extra_llm_api_options extra_llm_api_config.yaml
+trtllm-serve serve LGAI-EXAONE/EXAONE-4.0-32B --backend pytorch --extra_llm_api_options extra_llm_api_config.yaml
 ```
 
 For more details, please refer to [the documentation](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/core/exaone) of EXAONE from TensorRT-LLM.
 
+### vLLM
+
+vLLM officially supports EXAONE 4.0 models as of version `0.10.0`. You can run the vLLM server with the following command:
+
+```bash
+vllm serve LGAI-EXAONE/EXAONE-4.0-32B --enable-auto-tool-choice --tool-call-parser hermes --reasoning-parser qwen3
+```
+
+For more details, please refer to [the vLLM documentation](https://docs.vllm.ai/en/stable/).
+
 > [!NOTE]
-> Other inference engines including `vllm` and `sglang` don't support the EXAONE 4.0 officially now. We will update as soon as these libraries are updated.
+> Other inference engines, including `sglang`, do not officially support EXAONE 4.0 yet. We will update this section as soon as these libraries are updated.
 
 
 ## Performance
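Once the `vllm serve` command from this commit is running, vLLM exposes an OpenAI-compatible HTTP API. The sketch below builds a chat-completions request against that API using only the Python standard library. It assumes vLLM's default address (`localhost:8000`) and no API key; the actual network call is left commented out so you can adapt the URL to your deployment before sending.

```python
import json
from urllib import request

# Chat-completions payload; the model name must match the one
# passed to `vllm serve` (here, the EXAONE 4.0 32B checkpoint).
payload = {
    "model": "LGAI-EXAONE/EXAONE-4.0-32B",
    "messages": [
        {"role": "user", "content": "Summarize EXAONE 4.0 in one sentence."}
    ],
    "max_tokens": 256,
}

# Default vLLM server address (assumption -- adjust host/port as needed).
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is up to send the request and print the reply:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint follows the OpenAI schema, any OpenAI-compatible client library can be pointed at the same base URL instead of hand-building requests like this.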