---
license: cc-by-nc-4.0
base_model:
- Salesforce/SFR-Embedding-2_R
pipeline_tag: feature-extraction
---

## Model Card: APEX-Embedding-7B [NON-COMMERCIAL USE ONLY]

## [Fifth Dimension](https://www.fifthdimensionai.com/en-gb)

### Read our paper on arXiv here: [https://arxiv.org/abs/2410.18105](https://arxiv.org/abs/2410.18105)

### Model Overview

APEX-Embedding-7B is a 7-billion-parameter model optimized for **Factual Document Retrieval** in **Retrieval-Augmented Generation (RAG)** systems. During training, the model was enhanced using **Structured Entity Relationship Maps** and **Model-Aware Contrastive Sampling** to improve factual accuracy. The final model is highly effective at generating precise text embeddings, especially for industries that rely on large-scale document retrieval, such as legal, compliance, and real estate.

The model achieves **90.86%** rank@1 accuracy in our document retrieval evaluation, outperforming comparable models and ensuring reliable, accurate retrieval of relevant documents from large datasets.

### License & Disclaimer

License: **Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)**

This release is for non-commercial research purposes only and has been published in support of an academic paper. The model has been fine-tuned for real-world factual RAG tasks and has not been designed or evaluated for *all* embedding tasks, such as those found in MTEB or other benchmark datasets.

This model is provided 'as is' without any warranties, and the authors are not to be held responsible for any consequences resulting from its use. Users are responsible for ensuring their compliance with the license terms and local laws, as well as the model's technical suitability for their specific use case.

---

### How to Use

The following steps show how to generate embeddings and compare the similarity between queries and documents.

#### Environment Setup

First, install the necessary libraries:

```bash
pip install torch transformers peft accelerate numpy
pip install -U bitsandbytes
```

#### Loading the Model

Here is the code to load the model (a pre-quantised version of the base model is available in the `base` directory):

```python
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig
import torch

model_path = "5DAI/APEX-Embedding-7B-v0.1"
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load in 4-bit NF4 with bfloat16 compute to reduce memory usage
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)

# Note: quantized bitsandbytes models cannot be moved with .to(device);
# device_map places the weights on the GPU at load time instead.
model = AutoModel.from_pretrained(
    model_path,
    quantization_config=quantization_config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)
```

#### Generating Embeddings for Queries and Documents

When generating embeddings, use the following instruction prompt for both **queries** and **documents**:

```python
def get_embedding(text: str, model, tokenizer):
    prompt = f"Instruction: Please perform a RAG search based on the following. Text: {text}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=8192, padding=True, truncation=True).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    embedding = outputs.last_hidden_state[:, -1, :]  # Use last-token pooling
    embedding = torch.nn.functional.normalize(embedding, p=2, dim=1)  # L2-normalize
    return embedding.cpu().numpy()
```

The context window extends to 32K tokens, but best results are achieved when text is chunked by page (< 8192 tokens).
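If a document runs longer than a page, split it into chunks before embedding. Below is a minimal sketch of one way to do this, not part of the official pipeline: it assumes your parser already yields a list of page strings (the `pages` argument and the `embed_pages` helper are hypothetical), and it recursively halves any page that exceeds the token budget.

```python
# Hypothetical helper (not part of the model's API): embed a document
# page by page, splitting any page that exceeds the token budget.
def embed_pages(pages, model, tokenizer, max_tokens=8192):
    embeddings = []
    for page in pages:
        # Approximate length check; the instruction prefix adds a few tokens
        if len(tokenizer.encode(page)) > max_tokens:
            # Crude fallback: halve the oversized page and retry
            half = len(page) // 2
            embeddings += embed_pages([page[:half], page[half:]], model, tokenizer, max_tokens)
        else:
            embeddings.append(get_embedding(page, model, tokenizer)[0])
    return embeddings
```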
#### Cosine Similarity for Queries and Documents

To compare a **query** and a **document**, use the cosine similarity function:

```python
import numpy as np

def cosine_similarity(vecA: np.ndarray, vecB: np.ndarray) -> float:
    normA = np.linalg.norm(vecA)
    normB = np.linalg.norm(vecB)
    return np.dot(vecA, vecB) / (normA * normB) if normA > 0 and normB > 0 else 0.0
```

#### Example: Query vs. Document Embedding

Here's how to generate embeddings for a query and a document, and compare their similarity:

```python
query = "What are the legal requirements for property zoning in urban areas?"
document = "This document contains details about urban property zoning laws, including legal frameworks and compliance standards."

query_embedding = get_embedding(query, model, tokenizer)[0]
document_embedding = get_embedding(document, model, tokenizer)[0]

similarity = cosine_similarity(query_embedding, document_embedding)
print(f"Cosine similarity between query and document: {similarity}")
```

For ranking multiple candidate documents against a single query, see the sketch after the citation below.

---

### Citation

Please cite APEX-Embedding-7B as follows:

```
@misc{APEX-embedding-7b,
      title={APEX-Embedding-7B: Improving Embedding Accuracy for Document Retrieval Using Entity Relationship Maps and Model-Aware Contrastive Sampling},
      author={Thea Aviss},
      year={2024},
      url={https://arxiv.org/abs/2410.18105}
}
```
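---

#### Appendix: Ranking Multiple Documents

The example above scores a single query and document pair. In a retrieval setting you typically score one query against many stored document embeddings and keep the best match (this is what the rank@1 figure above measures). The sketch below is illustrative only: the `rank_documents` helper is hypothetical, not part of the model's API, and it assumes `doc_embeddings` was built with `get_embedding`.

```python
import numpy as np

def rank_documents(query_embedding: np.ndarray, doc_embeddings: list) -> list:
    # Embeddings from get_embedding are L2-normalized, so a dot
    # product is equivalent to cosine similarity.
    scores = [float(np.dot(query_embedding, d)) for d in doc_embeddings]
    # Return document indices sorted from most to least similar.
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Example usage (`documents` is your own list of page-level chunks):
# doc_embeddings = [get_embedding(d, model, tokenizer)[0] for d in documents]
# best = documents[rank_documents(query_embedding, doc_embeddings)[0]]  # rank@1 result
```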