Update README.md
--- a/README.md
+++ b/README.md
@@ -292,7 +292,8 @@ To speed up your inference, you can use the vLLM engine from [our repository](ht
 Make sure to switch to the `v0.9.2rc2_hyperclovax_vision_seed` branch.
 
 **Launch API server**:
-
+
+```bash
 pyenv virtualenv 3.10.2 .vllm
 pyenv activate .vllm
 sudo apt-get install -y kmod
@@ -317,7 +318,7 @@ pip install -U pynvml
 pip install timm av decord
 
 # Then launch api
-MODEL=
+MODEL=your/model/path
 export ATTENTION_BACKEND=FLASH_ATTN_VLLM_V1
 VLLM_USE_V1=1 VLLM_ATTENTION_BACKEND=${ATTENTION_BACKEND} CUDA_VISIBLE_DEVICES=0,1 python -m vllm.entrypoints.openai.api_server \
 --seed 20250525 \
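Once the server in the edited section is launched, it exposes an OpenAI-compatible HTTP API. Below is a minimal sketch of the request body such a server expects; the model path is the same hypothetical placeholder used in the diff, and the commented endpoint URL assumes vLLM's default port 8000.

```shell
# Build the JSON body for a chat request to the server launched above.
# MODEL is a hypothetical placeholder; replace it with the path you actually served.
MODEL=your/model/path
BODY='{"model": "'"${MODEL}"'", "messages": [{"role": "user", "content": "Hello!"}]}'
echo "$BODY"

# With the API server running, send the request to the OpenAI-compatible
# chat completions endpoint (vLLM listens on port 8000 by default):
# curl http://localhost:8000/v1/chat/completions \
#   -H 'Content-Type: application/json' -d "$BODY"
```

The `model` field must match the name the server registered at startup, which for a local checkpoint is typically the path passed via `--model`.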