suhmily committed (verified)
Commit ef54e62 · 1 Parent(s): 0ce9cf6

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED

@@ -43,7 +43,7 @@ InfLLM-V2-Long-Sparse-Base supports both dense attention inference and sparse at
  - Dense attention inference: vLLM, SGLang, Huggingface Transformers
  - Sparse attention inference: Huggingface Transformers, CPM.cu

- **To facilitate researches in sparse attention, we provide [InfLLM-V2 Kernels](https://github.com/OpenBMB/infllmv2_cuda_impl) and [CPM.cu, a high-performance CUDA implementation](https://github.com/OpenBMB/CPM.cu.git).**
+ **To facilitate researches in sparse attention, we provide [InfLLM-V2 Kernels](https://github.com/OpenBMB/infllmv2_cuda_impl) and [CPM.cu](https://github.com/OpenBMB/CPM.cu.git).**

  ### Inference with Transformers
  InfLLM-V2-Long-Sparse-Base requires `transformers>=4.56`.
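For reference, below is a minimal sketch of the dense-attention inference path with Hugging Face Transformers that the changed section describes. The repo id `openbmb/InfLLM-V2-Long-Sparse-Base` and the use of `trust_remote_code=True` are assumptions, not confirmed by this diff; only the `transformers>=4.56` requirement comes from the README text above.

```python
# Minimal dense-attention inference sketch for InfLLM-V2-Long-Sparse-Base.
# Assumptions (not confirmed by the diff): the checkpoint lives at
# "openbmb/InfLLM-V2-Long-Sparse-Base" and ships custom modeling code,
# hence trust_remote_code=True. Requires transformers>=4.56.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/InfLLM-V2-Long-Sparse-Base"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # take the dtype recorded in the checkpoint config
    device_map="auto",    # place weights across available devices
    trust_remote_code=True,
)

inputs = tokenizer(
    "InfLLM-V2 supports long-context inference by", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For the sparse-attention path, the linked InfLLM-V2 Kernels and CPM.cu repositories provide their own entry points.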