---
base_model:
- Qwen/Qwen2.5-Coder-1.5B
license: cc-by-nc-4.0
---

Jina AI: Your Search Foundation, Supercharged!

The code embedding model trained by Jina AI.

# Jina Code Embeddings: A Small but Performant Code Embedding Model

## Intended Usage & Model Info

`jina-code-embeddings` is an embedding model for code retrieval. The model supports various types of code retrieval (text-to-code, code-to-code, code-to-text, code-to-completion) and technical question answering across 15+ programming languages.

Built on [Qwen/Qwen2.5-Coder-1.5B](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B), `jina-code-embeddings-1.5b` features:

- **Multilingual support** (15+ programming languages) and compatibility with a wide range of domains, including web development, software development, machine learning, data science, and educational coding problems.
- **Task-specific instruction prefixes** for NL2Code, Code2Code, Code2NL, Code2Completion, and Technical QA, which can be selected at inference time.
- **Flexible embedding size**: dense embeddings are 1536-dimensional by default but can be truncated to as few as 128 dimensions with minimal performance loss.

Summary of features:

| Feature | Jina Code Embeddings 1.5B |
|---------|---------------------------|
| Base Model | Qwen2.5-Coder-1.5B |
| Supported Tasks | `nl2code`, `code2code`, `code2nl`, `code2completion`, `qa` |
| Model DType | BFloat16 |
| Max Sequence Length | 32768 |
| Embedding Vector Dimension | 1536 |
| Matryoshka Dimensions | 128, 256, 512, 1024, 1536 |
| Pooling Strategy | Last-token pooling |
| Attention Mechanism | FlashAttention2 |

## Usage
### Requirements

The following Python packages are required:

- `transformers>=4.53.0`
- `torch>=2.7.1`

### Optional / Recommended

- **flash-attention**: Installing [flash-attention](https://github.com/Dao-AILab/flash-attention) is recommended for improved inference speed and efficiency, but it is not mandatory.
- **sentence-transformers**: If you want to use the model via the `sentence-transformers` interface, install this package as well.
### via transformers

```python
# !pip install transformers>=4.53.0 torch>=2.7.1
from transformers import AutoModel
import torch

# Initialize the model
model = AutoModel.from_pretrained("jinaai/jina-code-embeddings-1.5b", trust_remote_code=True)
model.to("cuda")

# Configure truncate_dim, max_length, batch_size in the encode function if needed

# Encode query
query_embeddings = model.encode(
    ["print hello world in python"],
    task="nl2code",
    prompt_name="query",
)

# Encode passage
passage_embeddings = model.encode(
    ["print('Hello World!')"],
    task="nl2code",
    prompt_name="passage",
)
```
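The snippet above encodes the query and the passage but does not compare them. Below is a minimal follow-up sketch for scoring, assuming `encode` returns array-like embeddings (NumPy arrays or torch tensors; adjust the conversion if your version returns something else):

```python
import torch
import torch.nn.functional as F

# Convert to float tensors; torch.as_tensor is a no-op for torch tensors
# and wraps NumPy arrays without copying where possible
q = torch.as_tensor(query_embeddings, dtype=torch.float32)
p = torch.as_tensor(passage_embeddings, dtype=torch.float32)

# Cosine similarity = dot product of L2-normalized vectors;
# rows correspond to queries, columns to passages
scores = F.normalize(q, dim=-1) @ F.normalize(p, dim=-1).T
print(scores)
```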
### via sentence-transformers

```python
# !pip install sentence_transformers>=5.0.0 torch>=2.7.1
import torch
from sentence_transformers import SentenceTransformer

# Load the model
model = SentenceTransformer(
    "jinaai/jina-code-embeddings-1.5b",
    model_kwargs={
        "torch_dtype": torch.bfloat16,
        "attn_implementation": "flash_attention_2",
        "device_map": "auto",
    },
)

# The queries and documents to embed
queries = [
    "print hello world in python",
    "initialize array of 5 zeros in c++",
]
documents = [
    "print('Hello World!')",
    "int arr[5] = {0, 0, 0, 0, 0};",
]

query_embeddings = model.encode(queries, prompt_name="nl2code_query")
document_embeddings = model.encode(documents, prompt_name="nl2code_document")

# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.8157, 0.1222],
#         [0.1201, 0.5500]])
```
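To take advantage of the Matryoshka dimensions listed in the feature table, `sentence-transformers` can truncate embeddings at load time via its `truncate_dim` argument. A minimal sketch, assuming the same model ID and prompt names as above (256 is one of the supported dimensions; the flash-attention kwarg is omitted here since that package is optional):

```python
import torch
from sentence_transformers import SentenceTransformer

# Load the model with 256-dimensional (truncated) embeddings
small_model = SentenceTransformer(
    "jinaai/jina-code-embeddings-1.5b",
    truncate_dim=256,
    model_kwargs={"torch_dtype": torch.bfloat16, "device_map": "auto"},
)

embeddings = small_model.encode(
    ["print hello world in python"], prompt_name="nl2code_query"
)
print(embeddings.shape)  # (1, 256) instead of the default (1, 1536)
```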
## Training & Evaluation

Please refer to the `jina-code-embeddings` technical report for training details and benchmarks.

## Contact

Join our [Discord community](https://discord.jina.ai) and chat with other community members about ideas.