SentenceTransformer based on BAAI/bge-base-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
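To illustrate what modules (1) and (2) do, here is a minimal pure-Python sketch on toy token vectors rather than real BERT outputs. With `pooling_mode_cls_token` enabled, pooling keeps the first token's vector, and Normalize rescales it to unit L2 length. The function names `cls_pool` and `l2_normalize` are hypothetical and not part of the library.

```python
import math


def cls_pool(token_embeddings):
    """Return a copy of the first ([CLS]) token's vector."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit L2 length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy input: 3 tokens with 4 dimensions each (the real model uses 768).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
sentence_embedding = l2_normalize(cls_pool(tokens))
print(sentence_embedding)
```

Because the output vectors have unit length, the cosine similarity of two embeddings equals their dot product.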
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/emb-bge-base-en-v1.5-squad-6-epochs")
# Run inference
sentences = [
'Why are herds of deer starting to enter residential areas in London?',
"Herds of red and fallow deer also roam freely within much of Richmond and Bushy Park. A cull takes place each November and February to ensure numbers can be sustained. Epping Forest is also known for its fallow deer, which can frequently be seen in herds to the north of the Forest. A rare population of melanistic, black fallow deer is also maintained at the Deer Sanctuary near Theydon Bois. Muntjac deer, which escaped from deer parks at the turn of the twentieth century, are also found in the forest. While Londoners are accustomed to wildlife such as birds and foxes sharing the city, more recently urban deer have started becoming a regular feature, and whole herds of fallow and white-tailed deer come into residential areas at night to take advantage of the London's green spaces.",
"Herds of red and fallow deer also roam freely within much of Richmond and Bushy Park. A cull takes place each November and February to ensure numbers can be sustained. Epping Forest is also known for its fallow deer, which can frequently be seen in herds to the north of the Forest. A rare population of melanistic, black fallow deer is also maintained at the Deer Sanctuary near Theydon Bois. Muntjac deer, which escaped from deer parks at the turn of the twentieth century, are also found in the forest. While Londoners are accustomed to wildlife such as birds and foxes sharing the city, more recently urban deer have started becoming a regular feature, and whole herds of fallow and white-tailed deer come into residential areas at night to take advantage of the London's green spaces.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
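For semantic search, a common pattern is to score candidate passages against a query and sort by score. The sketch below shows only that ranking step on toy, already-normalized vectors. `rank_passages` is a hypothetical helper; in practice the vectors would come from `model.encode(...)`.

```python
def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs sorted by descending score."""
    scored = [(i, dot(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy unit vectors standing in for real embeddings.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]
ranking = rank_passages(query, passages)
print(ranking)
```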
Evaluation
Metrics
Triplet
- Dataset: gooqa-dev
- Evaluated with TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.4156 |
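As a rough illustration of the metric, cosine accuracy on triplets is the fraction of (question, context, negative) triplets in which the question embedding is more similar to the context than to the negative. The sketch below computes it on toy vectors. The helper names are hypothetical, and the strict comparison is an assumption about tie handling.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_cosine_accuracy(triplets):
    """Fraction of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(1 for a, p, n in triplets if cosine(a, p) > cosine(a, n))
    return correct / len(triplets)


toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted as correct
    ([1.0, 0.0], [0.0, 1.0], [1.0, 0.1]),  # negative is closer: counted as wrong
]
accuracy = triplet_cosine_accuracy(toy_triplets)
print(accuracy)
```

Read this way, the reported 0.4156 means that about 41.6% of the dev triplets are ordered correctly.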
Training Details
Training Dataset
Unnamed Dataset
- Size: 44,287 training samples
- Columns: question, context, and negative
- Approximate statistics based on the first 1000 samples:
| | question | context | negative |
|---|---|---|---|
| type | string | string | string |
| details | min: 7 tokens, mean: 14.36 tokens, max: 34 tokens | min: 31 tokens, mean: 150.89 tokens, max: 512 tokens | min: 31 tokens, mean: 156.54 tokens, max: 510 tokens |
- Samples:
| question | context | negative |
|---|---|---|
| Who is the artist that drew grapes so lifelike, birds flew up and pecked at them? | Hellenistic art saw a turn from the idealistic, perfected, calm and composed figures of classical Greek art to a style dominated by realism and the depiction of emotion (pathos) and character (ethos). The motif of deceptively realistic naturalism in art (aletheia) is reflected in stories such as that of the painter Zeuxis, who was said to have painted grapes that seemed so real that birds came and pecked at them. The female nude also became more popular as epitomized by the Aphrodite of Cnidos of Praxiteles and art in general became more erotic (e.g. Leda and the Swan and Scopa's Pothos). The dominant ideals of Hellenistic art were those of sensuality and passion. | The quail is a small to medium-sized, cryptically coloured bird. In its natural environment, it is found in bushy places, in rough grassland, among agricultural crops, and in other places with dense cover. It feeds on seeds, insects, and other small invertebrates. Being a largely ground-dwelling, gregarious bird, domestication of the quail was not difficult, although many of its wild instincts are retained in captivity. It was known to the Egyptians long before the arrival of chickens and was depicted in hieroglyphs from 2575 BC. It migrated across Egypt in vast flocks and the birds could sometimes be picked up off the ground by hand. These were the common quail (Coturnix coturnix), but modern domesticated flocks are mostly of Japanese quail (Coturnix japonica) which was probably domesticated as early as the 11th century AD in Japan. They were originally kept as songbirds, and they are thought to have been regularly used in song contests. |
| Which character helps Link get Ganondorf off of his horse? | Ganondorf then revives, and Midna teleports Link and Zelda outside the castle so she can hold him off with the Fused Shadows. However, as Hyrule Castle collapses, it is revealed that Ganondorf was victorious as he crushes Midna's helmet. Ganondorf engages Link on horseback, and, assisted by Zelda and the Light Spirits, Link eventually knocks Ganondorf off his horse and they duel on foot before Link strikes down Ganondorf and plunges the Master Sword into his chest. With Ganondorf dead, the Light Spirits not only bring Midna back to life, but restore her to her true form. After bidding farewell to Link and Zelda, Midna returns home before destroying the Mirror of Twilight with a tear to maintain balance between Hyrule and the Twilight Realm. Near the end, as Hyrule Castle is rebuilt, Link is shown leaving Ordon Village heading to parts unknown. | After gaining the Master Sword, Link is cleansed of the magic that kept him in wolf form, obtaining the Shadow Crystal. Now able to use it to switch between both forms at will, Link is led by Midna to the Mirror of Twilight located deep within the Gerudo Desert, the only known gateway between the Twilight Realm and Hyrule. However, they discover that the mirror is broken. The Sages there explain that Zant tried to destroy it, but he was only able to shatter it into fragments; only the true ruler of the Twili can completely destroy the Mirror of Twilight. They also reveal that they used it a century ago to banish Ganondorf, the Gerudo leader who attempted to steal the Triforce, to the Twilight Realm when executing him failed. Assisted by an underground resistance group they meet in Castle Town, Link and Midna set out to retrieve the missing shards of the Mirror, defeating those they infected. Once the portal has been restored, Midna is revealed to be the true ruler of the Twilight Realm... |
| What is the Judaeo-Spanish language developed by Sephardic Jews who migrated to the Iberian peninsula? | For centuries, Jews worldwide have spoken the local or dominant languages of the regions they migrated to, often developing distinctive dialectal forms or branches that became independent languages. Yiddish is the Judæo-German language developed by Ashkenazi Jews who migrated to Central Europe. Ladino is the Judæo-Spanish language developed by Sephardic Jews who migrated to the Iberian peninsula. Due to many factors, including the impact of the Holocaust on European Jewry, the Jewish exodus from Arab and Muslim countries, and widespread emigration from other Jewish communities around the world, ancient and distinct Jewish languages of several communities, including Judæo-Georgian, Judæo-Arabic, Judæo-Berber, Krymchak, Judæo-Malayalam and many others, have largely fallen out of use. | Jews are often identified as belonging to one of two major groups: the Ashkenazim and the Sephardim. Ashkenazim, or "Germanics" (Ashkenaz meaning "Germany" in Hebrew), are so named denoting their German Jewish cultural and geographical origins, while Sephardim, or "Hispanics" (Sefarad meaning "Spain/Hispania" or "Iberia" in Hebrew), are so named denoting their Spanish/Portuguese Jewish cultural and geographic origins. The more common term in Israel for many of those broadly called Sephardim, is Mizrahim (lit. "Easterners", Mizrach being "East" in Hebrew), that is, in reference to the diverse collection of Middle Eastern and North African Jews who are often, as a group, referred to collectively as Sephardim (together with Sephardim proper) for liturgical reasons, although Mizrahi Jewish groups and Sephardi Jews proper are ethnically distinct. |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
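The loss configuration above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the library's implementation. For each question, every context and negative in the batch is treated as a candidate, cosine similarities are multiplied by `scale` (20.0), and cross-entropy is taken against the index of the matching context. The helper names and toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mnr_loss(anchors, positives, negatives, scale=20.0):
    """Mean cross-entropy over scaled cosine scores with in-batch negatives."""
    candidates = positives + negatives  # positive for anchor i sits at index i
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum_exp = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
negatives = [[0.5, 0.5], [0.5, 0.5]]

loss_aligned = mnr_loss(anchors, positives, negatives)
loss_swapped = mnr_loss(anchors, positives[::-1], negatives)  # wrong pairing
print(loss_aligned, loss_swapped)
```

The correctly paired batch yields a much smaller loss than the swapped one, which is the behaviour the training objective rewards.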
Evaluation Dataset
Unnamed Dataset
- Size: 5,000 evaluation samples
- Columns: question, context, and negative_1
- Approximate statistics based on the first 1000 samples:
| | question | context | negative_1 |
|---|---|---|---|
| type | string | string | string |
| details | min: 7 tokens, mean: 14.44 tokens, max: 36 tokens | min: 32 tokens, mean: 150.57 tokens, max: 512 tokens | min: 32 tokens, mean: 149.45 tokens, max: 431 tokens |
- Samples:
| question | context | negative_1 |
|---|---|---|
| Who was the artist of the first song used? | The song "Tom's Diner" by Suzanne Vega was the first song used by Karlheinz Brandenburg to develop the MP3. Brandenburg adopted the song for testing purposes, listening to it again and again each time refining the scheme, making sure it did not adversely affect the subtlety of Vega's voice. | Influences also came to her from the art world, most notably through the works of Mexican artist Frida Kahlo. The music video of the song "Bedtime Story" featured images inspired by the paintings of Kahlo and Remedios Varo. Madonna is also a collector of Tamara de Lempicka's Art Deco paintings and has included them in her music videos and tours. Her video for "Hollywood" (2003) was an homage to the work of photographer Guy Bourdin; Bourdin's son subsequently filed a lawsuit for unauthorised use of his father's work. Pop artist Andy Warhol's use of sadomasochistic imagery in his underground films were reflected in the music videos for "Erotica" and "Deeper and Deeper". |
| what is SONET? | For customers with more demanding requirements (such as medium-to-large businesses, or other ISPs) can use higher-speed DSL (such as single-pair high-speed digital subscriber line), Ethernet, metropolitan Ethernet, gigabit Ethernet, Frame Relay, ISDN Primary Rate Interface, ATM (Asynchronous Transfer Mode) and synchronous optical networking (SONET). | The name Ashkenazi derives from the biblical figure of Ashkenaz, the first son of Gomer, son of Khaphet, son of Noah, and a Japhetic patriarch in the Table of Nations (Genesis 10). The name of Gomer has often been linked to the ethnonym Cimmerians. Biblical Ashkenaz is usually derived from Assyrian Aškūza (cuneiform Aškuzai/Iškuzai), a people who expelled the Cimmerians from the Armenian area of the Upper Euphrates, whose name is usually associated with the name of the Scythians. The intrusive n in the Biblical name is likely due to a scribal error confusing a waw ו with a nun נ. |
| The Wii version makes use of what kind of sensors? | The GameCube and Wii versions feature several minor differences in their controls. The Wii version of the game makes use of the motion sensors and built-in speaker of the Wii Remote. The speaker emits the sounds of a bowstring when shooting an arrow, Midna's laugh when she gives advice to Link, and the series' trademark "chime" when discovering secrets. The player controls Link's sword by swinging the Wii Remote. Other attacks are triggered using similar gestures with the Nunchuk. Unique to the GameCube version is the ability for the player to control the camera freely, without entering a special "lookaround" mode required by the Wii; however, in the GameCube version, only two of Link's secondary weapons can be equipped at a time, as opposed to four in the Wii version.[g] | The GameCube and Wii versions feature several minor differences in their controls. The Wii version of the game makes use of the motion sensors and built-in speaker of the Wii Remote. The speaker emits the sounds of a bowstring when shooting an arrow, Midna's laugh when she gives advice to Link, and the series' trademark "chime" when discovering secrets. The player controls Link's sword by swinging the Wii Remote. Other attacks are triggered using similar gestures with the Nunchuk. Unique to the GameCube version is the ability for the player to control the camera freely, without entering a special "lookaround" mode required by the Wii; however, in the GameCube version, only two of Link's secondary weapons can be equipped at a time, as opposed to four in the Wii version.[g] |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- num_train_epochs: 6
- warmup_ratio: 0.1
- fp16: True
- batch_sampler: no_duplicates
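The non-default settings listed above could be expressed as training arguments roughly as follows. This is a configuration sketch and not the author's actual training script; the `output_dir` value is a hypothetical placeholder, and `fp16=True` assumes hardware that supports half-precision training.

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/emb-bge-base-squad",  # hypothetical path, not from the card
    eval_strategy="steps",
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    num_train_epochs=6,
    warmup_ratio=0.1,
    fp16=True,  # assumes a GPU with fp16 support
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # no duplicate samples within a batch
)
```

The `no_duplicates` sampler matters for MultipleNegativesRankingLoss because other samples in the batch act as negatives, so a repeated text would be penalized as a false negative.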
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 6
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss | gooqa-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.3534 |
| 0.2890 | 100 | 0.7342 | 0.8138 | 0.3798 |
| 0.5780 | 200 | 0.4625 | 0.7497 | 0.4052 |
| 0.8671 | 300 | 0.4131 | 0.7312 | 0.4156 |
| 1.1561 | 400 | 0.3127 | 0.7263 | 0.4142 |
| 1.4451 | 500 | 0.2591 | 0.7254 | 0.4132 |
| 1.7341 | 600 | 0.2576 | 0.7197 | 0.4124 |
| 2.0231 | 700 | 0.2503 | 0.7115 | 0.4160 |
| 2.3121 | 800 | 0.1465 | 0.7226 | 0.4166 |
| 2.6012 | 900 | 0.1546 | 0.7237 | 0.4172 |
| 2.8902 | 1000 | 0.1552 | 0.7217 | 0.4180 |
| 3.1792 | 1100 | 0.1201 | 0.7377 | 0.4138 |
| 3.4682 | 1200 | 0.1006 | 0.7381 | 0.4142 |
| 3.7572 | 1300 | 0.1032 | 0.7420 | 0.4126 |
| 4.0462 | 1400 | 0.1007 | 0.7415 | 0.4154 |
| 4.3353 | 1500 | 0.0756 | 0.7485 | 0.4182 |
| 4.6243 | 1600 | 0.0774 | 0.7466 | 0.4180 |
| 4.9133 | 1700 | 0.0791 | 0.7503 | 0.4170 |
| 5.2023 | 1800 | 0.07 | 0.7516 | 0.4240 |
| 5.4913 | 1900 | 0.0665 | 0.7556 | 0.4152 |
| 5.7803 | 2000 | 0.0653 | 0.7554 | 0.4138 |
| -1 | -1 | - | - | 0.4156 |
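The Epoch column in the log can be cross-checked against the dataset size and batch size reported earlier. The arithmetic below is a sketch that assumes a single device, `gradient_accumulation_steps` of 1, and a final partial batch that is kept; the `no_duplicates` sampler could shift the batch count slightly. Rounding the warmup step count up is also an assumption.

```python
import math

train_samples = 44_287   # training set size from the card
batch_size = 128         # per_device_train_batch_size
epochs = 6               # num_train_epochs
warmup_ratio = 0.1

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed to round up


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, total_steps, warmup_steps)
print(epoch_at(100), epoch_at(700), epoch_at(2000))
```

Under these assumptions, 346 steps per epoch reproduces the logged epoch values such as 2.0231 at step 700 and 5.7803 at step 2000.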
Framework Versions
- Python: 3.11.0
- Sentence Transformers: 4.0.1
- Transformers: 4.50.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}