lordjimen committed (verified)
Commit: a5b6fb8
Parent(s): 87def42

Update README.md

Files changed (1):
  1. README.md (+0 −26)
README.md CHANGED

@@ -131,32 +131,6 @@ print(tokenizer.decode(outputs[0]))

  **Important Note:** Models based on Gemma 2 such as BgGPT-Gemma-2-2.6B-IT-v1.0 do not support flash attention. Using it results in degraded performance.

- ```python
- tokenizer = AutoTokenizer.from_pretrained(
-     "INSAIT-Institute/BgGPT-Gemma-2-27B-IT-v1.0",
-     use_default_system_prompt=False,
- )
-
- messages = [
-     {"role": "user", "content": "Кога е основан Софийският университет?"},
- ]
-
- input_ids = tokenizer.apply_chat_template(
-     messages,
-     return_tensors="pt",
-     add_generation_prompt=True,
-     return_dict=True
- )
-
- outputs = model.generate(
-     **input_ids,
-     generation_config=generation_params
- )
- print(tokenizer.decode(outputs[0]))
- ```
-
- **Important Note:** Models based on Gemma 2 such as BgGPT-Gemma-2-2.6B-IT-v1.0 do not support flash attention. Using it results in degraded performance.
-
  # Use with vLLM

  Example usage with vLLM:
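The flash-attention note retained by this commit says Gemma 2 based models degrade with flash attention. A minimal, hypothetical sketch of acting on that note when loading a model — the `attn_kwargs` helper is not from the README, and the selection rule is an assumption; the kwargs would be passed to `AutoModelForCausalLM.from_pretrained`:

```python
# Hypothetical helper (not from the README): choose attention kwargs
# so that Gemma 2 checkpoints avoid flash attention, per the note above.
def attn_kwargs(model_id: str) -> dict:
    # Gemma 2 models degrade under flash attention; keep default "eager".
    if "gemma-2" in model_id.lower():
        return {"attn_implementation": "eager"}
    # Assumption: other models may opt into flash attention if installed.
    return {"attn_implementation": "flash_attention_2"}

kwargs = attn_kwargs("INSAIT-Institute/BgGPT-Gemma-2-27B-IT-v1.0")
print(kwargs)  # {'attn_implementation': 'eager'}
```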