---
base_model:
- Qwen/Qwen2.5-Coder-1.5B
license: cc-by-nc-4.0
---
The code embedding model trained by Jina AI.
# Jina Code Embeddings: A Small but Performant Code Embedding Model
## Intended Usage & Model Info
`jina-code-embeddings` is an embedding model for code retrieval.
The model supports various types of code retrieval (text-to-code, code-to-code, code-to-text, code-to-completion) and technical question answering across 15+ programming languages.
Built on [Qwen/Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B), `jina-code-embeddings-1.5b` features:
- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, software development, machine learning, data science, and educational coding problems.
- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, and Technical QA, which can be selected at inference time.
- **Flexible embedding size**: dense embeddings are 1536-dimensional by default but can be truncated to as few as 128 dimensions with minimal performance loss.
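
To illustrate what truncating an embedding amounts to, here is a minimal sketch in plain Python. It keeps the leading components of a vector and rescales the result to unit length so that cosine similarity stays well behaved. The helper name `truncate_and_normalize` and the toy 8-dimensional vector are hypothetical; in practice the `truncate_dim` option of the `encode` function is the intended route, and whether the model renormalizes internally is an assumption here.

```python
import math


def truncate_and_normalize(embedding: list[float], dim: int) -> list[float]:
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    # Guard against an all-zero prefix to avoid division by zero.
    return [x / norm for x in head] if norm > 0 else head


# Toy 8-dimensional vector standing in for a 1536-dimensional embedding.
full = [0.5, -0.5, 0.5, 0.5, 0.1, -0.2, 0.3, 0.0]
short = truncate_and_normalize(full, 4)
print(len(short), short)
```

With real embeddings, `dim` would be one of the supported Matryoshka sizes (128, 256, 512, 1024, or 1536).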
Summary of features:
| Feature | Jina Code Embeddings 1.5B |
|------------|------------|
| Base Model | Qwen2.5-Coder-1.5B |
| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `code2completion`, `qa` |
| Model DType | BFloat16 |
| Max Sequence Length | 32768 |
| Embedding Vector Dimension | 1536 |
| Matryoshka dimensions | 128, 256, 512, 1024, 1536 |
| Pooling Strategy | Last-token pooling |
| Attention Mechanism | FlashAttention2 |
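
Last-token pooling, listed in the table above, means that the embedding of a sequence is taken from the hidden state of its final non-padding token. The following is a rough sketch of the idea using nested Python lists in place of tensors; the function name `last_token_pool` and the toy inputs are hypothetical and do not reflect the model's actual implementation.

```python
def last_token_pool(hidden_states, attention_mask):
    """Return, per sequence, the hidden vector of the last attended token.

    hidden_states: nested lists shaped [batch][seq_len][dim]
    attention_mask: nested lists shaped [batch][seq_len], 1 = real token, 0 = padding
    """
    pooled = []
    for states, mask in zip(hidden_states, attention_mask):
        # Index of the last position whose mask is 1 (works for left or right padding).
        last = max(i for i, m in enumerate(mask) if m == 1)
        pooled.append(states[last])
    return pooled


# Two toy sequences of length 3 with 2-dimensional hidden states.
hidden = [
    [[1.0, 1.0], [2.0, 2.0], [0.0, 0.0]],  # right-padded: last real token at index 1
    [[3.0, 3.0], [4.0, 4.0], [5.0, 5.0]],  # no padding: last real token at index 2
]
mask = [[1, 1, 0], [1, 1, 1]]
pooled = last_token_pool(hidden, mask)
print(pooled)
```

This sketch assumes every sequence has at least one real token; an all-zero mask row would raise a `ValueError`.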
## Usage
### Requirements
The following Python packages are required:
- `transformers>=4.53.0`
- `torch>=2.7.1`
### Optional / Recommended
- **flash-attention**: Installing [flash-attention](https://github.com/Dao-AILab/flash-attention) is recommended for improved inference speed and efficiency, but not mandatory.
- **sentence-transformers**: If you want to use the model via the `sentence-transformers` interface, install this package as well.
### Via transformers
```python
# !pip install "transformers>=4.53.0" "torch>=2.7.1"
from transformers import AutoModel
import torch

# Initialize the model
model = AutoModel.from_pretrained("jinaai/jina-code-embeddings-1.5b", trust_remote_code=True)
model.to("cuda")

# Configure truncate_dim, max_length, batch_size in the encode function if needed

# Encode query
query_embeddings = model.encode(
    ["print hello world in python"],
    task="nl2code",
    prompt_name="query",
)

# Encode passage
passage_embeddings = model.encode(
    ["print('Hello World!')"],
    task="nl2code",
    prompt_name="passage",
)
```
### Via sentence-transformers
```python
# !pip install "sentence-transformers>=5.0.0" "torch>=2.7.1"
import torch
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer(
    "jinaai/jina-code-embeddings-1.5b",
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        # Requires the flash-attn package; remove this entry to use the default attention
        "attn_implementation": "flash_attention_2",
        "device_map": "auto"
    }
)

# The queries and documents to embed
queries = [
    "print hello world in python",
    "initialize array of 5 zeros in c++"
]
documents = [
    "print('Hello World!')",
    "int arr[5] = {0, 0, 0, 0, 0};"
]

query_embeddings = model.encode(queries, prompt_name="nl2code_query")
document_embeddings = model.encode(documents, prompt_name="nl2code_document")

# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.8157, 0.1222],
#         [0.1201, 0.5500]])
```
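
For retrieval, the similarity matrix is typically turned into a ranking of documents per query. The sketch below shows one way to do that in plain Python, using the scores printed in the example above as a hard-coded nested list. The helper `rank_documents` is hypothetical; with a real tensor you would likely convert it first, for example with `similarity.tolist()`.

```python
def rank_documents(similarity_row: list[float]) -> list[int]:
    """Return document indices sorted from most to least similar."""
    return sorted(range(len(similarity_row)), key=lambda j: similarity_row[j], reverse=True)


queries = [
    "print hello world in python",
    "initialize array of 5 zeros in c++",
]
documents = [
    "print('Hello World!')",
    "int arr[5] = {0, 0, 0, 0, 0};",
]

# Scores copied from the example output above (rows = queries, columns = documents).
scores = [
    [0.8157, 0.1222],
    [0.1201, 0.5500],
]

# Index of the best-matching document for each query.
best = [rank_documents(row)[0] for row in scores]
for query, doc_index in zip(queries, best):
    print(f"{query!r} -> {documents[doc_index]!r}")
```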
## Training & Evaluation
Please refer to the jina-code-embeddings technical report for training details and benchmarks.
## Contact
Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.