metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:44283
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-base-en-v1.5
widget:
- source_sentence: What do detritivores eat?
sentences:
- >-
Many organisms (of which humans are prime examples) eat from multiple
levels of the food chain and, thus, make this classification
problematic. A carnivore may eat both secondary and tertiary consumers,
and its prey may itself be difficult to classify for similar reasons.
Organisms showing both carnivory and herbivory are known as omnivores.
Even herbivores such as the giant panda may supplement their diet with
meat. Scavenging of carrion provides a significant part of the diet of
some of the most fearsome predators. Carnivorous plants would be very
difficult to fit into this classification, producing their own food but
also digesting anything that they may trap. Organisms that eat
detritivores or parasites would also be difficult to classify by such a
scheme.
- >-
In an ecosystem, predation is a biological interaction where a predator
(an organism that is hunting) feeds on its prey (the organism that is
attacked). Predators may or may not kill their prey prior to feeding on
them, but the act of predation often results in the death of the prey
and the eventual absorption of the prey's tissue through consumption.
Thus predation is often, though not always, carnivory. Other categories
of consumption are herbivory (eating parts of plants), fungivory (eating
parts of fungi), and detritivory (the consumption of dead organic
material (detritus)). All these consumption categories fall under the
rubric of consumer-resource systems. It can often be difficult to
separate various types of feeding behaviors. For example, some parasitic
species prey on a host organism and then lay their eggs on it for their
offspring to feed on it while it continues to live in or on its decaying
corpse after it has died. The key characteristic of predation however is
the predator's direct impact on the prey population. On the other hand,
detritivores simply eat dead organic material arising from the decay of
dead individuals and have no direct impact on the "donor" organism(s).
- >-
Buddhism is practiced by an estimated 488 million,[web 1] 495 million,
or 535 million people as of the 2010s, representing 7% to 8% of the
world's total population.
- source_sentence: >-
Along with Georgia, what American jurisdiction allowed people to be
executed for the rape of an adult prior to Coker?
sentences:
- >-
The engagement was not without controversy: Philip had no financial
standing, was foreign-born (though a British subject who had served in
the Royal Navy throughout the Second World War), and had sisters who had
married German noblemen with Nazi links. Marion Crawford wrote, "Some of
the King's advisors did not think him good enough for her. He was a
prince without a home or kingdom. Some of the papers played long and
loud tunes on the string of Philip's foreign origin." Elizabeth's mother
was reported, in later biographies, to have opposed the union initially,
even dubbing Philip "The Hun". In later life, however, she told
biographer Tim Heald that Philip was "an English gentleman".
- >-
In 1977, the Supreme Court's Coker v. Georgia decision barred the death
penalty for rape of an adult woman, and implied that the death penalty
was inappropriate for any offense against another person other than
murder. Prior to the decision, the death penalty for rape of an adult
had been gradually phased out in the United States, and at the time of
the decision, the State of Georgia and the U.S. Federal government were
the only two jurisdictions to still retain the death penalty for that
offense. However, three states maintained the death penalty for child
rape, as the Coker decision only imposed a ban on executions for the
rape of an adult woman. In 2008, the Kennedy v. Louisiana decision
barred the death penalty for child rape. The result of these two
decisions means that the death penalty in the United States is largely
restricted to cases where the defendant took the life of another human
being. The current federal kidnapping statute, however, may be exempt
because the death penalty applies if the victim dies in the
perpetrator's custody, not necessarily by his hand, thus stipulating a
resulting death, which was the wording of the objection. In addition,
the Federal government retains the death penalty for non-murder offenses
that are considered crimes against the state, including treason,
espionage, and crimes under military jurisdiction.
- >-
In 1977, the Supreme Court's Coker v. Georgia decision barred the death
penalty for rape of an adult woman, and implied that the death penalty
was inappropriate for any offense against another person other than
murder. Prior to the decision, the death penalty for rape of an adult
had been gradually phased out in the United States, and at the time of
the decision, the State of Georgia and the U.S. Federal government were
the only two jurisdictions to still retain the death penalty for that
offense. However, three states maintained the death penalty for child
rape, as the Coker decision only imposed a ban on executions for the
rape of an adult woman. In 2008, the Kennedy v. Louisiana decision
barred the death penalty for child rape. The result of these two
decisions means that the death penalty in the United States is largely
restricted to cases where the defendant took the life of another human
being. The current federal kidnapping statute, however, may be exempt
because the death penalty applies if the victim dies in the
perpetrator's custody, not necessarily by his hand, thus stipulating a
resulting death, which was the wording of the objection. In addition,
the Federal government retains the death penalty for non-murder offenses
that are considered crimes against the state, including treason,
espionage, and crimes under military jurisdiction.
- source_sentence: Testing revealed today's dogs trace back by how many years?
sentences:
- >-
The Schnellweg (en: expressway) system, a number of Bundesstraße roads,
forms a structure loosely resembling a large ring road together with A2
and A7. The roads are B 3, B 6 and B 65, called Westschnellweg (B6 on
the northern part, B3 on the southern part), Messeschnellweg (B3,
becomes A37 near Burgdorf, crosses A2, becomes B3 again, changes to B6
at Seelhorster Kreuz, then passes the Hanover fairground as B6 and
becomes A37 again before merging into A7) and Südschnellweg (starts out
as B65, becomes B3/B6/B65 upon crossing Westschnellweg, then becomes B65
again at Seelhorster Kreuz).
- >-
Although initially thought to have originated as a manmade variant of an
extant canid species (variously supposed as being the dhole, golden
jackal, or gray wolf), extensive genetic studies undertaken during the
2010s indicate that dogs diverged from an extinct wolf-like canid in
Eurasia 40,000 years ago. Being the oldest domesticated animal, their
long association with people has allowed dogs to be uniquely attuned to
human behavior, as well as thrive on a starch-rich diet which would be
inadequate for other canid species.
- >-
Although it is said that the "dog is man's best friend" regarding 17–24%
of dogs in the developed countries, in the developing world they are
feral, village or community dogs, with pet dogs uncommon. These live
their lives as scavengers and have never been owned by humans, with one
study showing their most common response when approached by strangers
was to run away (52%) or respond with aggression (11%). We know little
about these dogs, nor about the dogs that live in developed countries
that are feral, stray or are in shelters, yet the great majority of
modern research on dog cognition has focused on pet dogs living in human
homes.
- source_sentence: What kind of societies usually follow a regular daily schedule year-round?
sentences:
- >-
Industrialized societies generally follow a clock-based schedule for
daily activities that do not change throughout the course of the year.
The time of day that individuals begin and end work or school, and the
coordination of mass transit, for example, usually remain constant
year-round. In contrast, an agrarian society's daily routines for work
and personal conduct are more likely governed by the length of daylight
hours and by solar time, which change seasonally because of the Earth's
axial tilt. North and south of the tropics daylight lasts longer in
summer and shorter in winter, the effect becoming greater as one moves
away from the tropics.
- >-
Relatively insensitive film, with a correspondingly lower speed index,
requires more exposure to light to produce the same image density as a
more sensitive film, and is thus commonly termed a slow film. Highly
sensitive films are correspondingly termed fast films. In both digital
and film photography, the reduction of exposure corresponding to use of
higher sensitivities generally leads to reduced image quality (via
coarser film grain or higher image noise of other types). In short, the
higher the sensitivity, the grainier the image will be. Ultimately
sensitivity is limited by the quantum efficiency of the film or sensor.
- >-
Industrialized societies generally follow a clock-based schedule for
daily activities that do not change throughout the course of the year.
The time of day that individuals begin and end work or school, and the
coordination of mass transit, for example, usually remain constant
year-round. In contrast, an agrarian society's daily routines for work
and personal conduct are more likely governed by the length of daylight
hours and by solar time, which change seasonally because of the Earth's
axial tilt. North and south of the tropics daylight lasts longer in
summer and shorter in winter, the effect becoming greater as one moves
away from the tropics.
- source_sentence: How many people outside the UK were under British rule in 1945?
sentences:
- >-
20th Street starts at Avenue C, and 21st and 22nd Streets begin at First
Avenue. They all end at Eleventh Avenue. Travel on the last block of the
20th, 21st and 22nd Streets, between Tenth and Eleventh Avenues, is in
the opposite direction than it is on the rest of the respective street.
20th Street is very wide from the Avenue C to First Avenue.
- >-
By the start of the 20th century, Germany and the United States
challenged Britain's economic lead. Subsequent military and economic
tensions between Britain and Germany were major causes of the First
World War, during which Britain relied heavily upon its empire. The
conflict placed enormous strain on the military, financial and manpower
resources of Britain. Although the British Empire achieved its largest
territorial extent immediately after World War I, Britain was no longer
the world's pre-eminent industrial or military power. In the Second
World War, Britain's colonies in South-East Asia were occupied by
Imperial Japan. Despite the final victory of Britain and its allies, the
damage to British prestige helped to accelerate the decline of the
empire. India, Britain's most valuable and populous possession, achieved
independence as part of a larger decolonisation movement in which
Britain granted independence to most territories of the Empire. The
transfer of Hong Kong to China in 1997 marked for many the end of the
British Empire. Fourteen overseas territories remain under British
sovereignty. After independence, many former British colonies joined the
Commonwealth of Nations, a free association of independent states. The
United Kingdom is now one of 16 Commonwealth nations, a grouping known
informally as the Commonwealth realms, that share one monarch—Queen
Elizabeth II.
- >-
Though Britain and the empire emerged victorious from the Second World
War, the effects of the conflict were profound, both at home and abroad.
Much of Europe, a continent that had dominated the world for several
centuries, was in ruins, and host to the armies of the United States and
the Soviet Union, who now held the balance of global power. Britain was
left essentially bankrupt, with insolvency only averted in 1946 after
the negotiation of a $US 4.33 billion loan (US$56 billion in 2012) from
the United States, the last instalment of which was repaid in 2006. At
the same time, anti-colonial movements were on the rise in the colonies
of European nations. The situation was complicated further by the
increasing Cold War rivalry of the United States and the Soviet Union.
In principle, both nations were opposed to European colonialism. In
practice, however, American anti-communism prevailed over
anti-imperialism, and therefore the United States supported the
continued existence of the British Empire to keep Communist expansion in
check. The "wind of change" ultimately meant that the British Empire's
days were numbered, and on the whole, Britain adopted a policy of
peaceful disengagement from its colonies once stable, non-Communist
governments were available to transfer power to. This was in contrast to
other European powers such as France and Portugal, which waged costly
and ultimately unsuccessful wars to keep their empires intact. Between
1945 and 1965, the number of people under British rule outside the UK
itself fell from 700 million to five million, three million of whom were
in Hong Kong.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on BAAI/bge-base-en-v1.5
results:
- task:
type: triplet
name: Triplet
dataset:
name: gooqa dev
type: gooqa-dev
metrics:
- type: cosine_accuracy
value: 0.41920000314712524
name: Cosine Accuracy
SentenceTransformer based on BAAI/bge-base-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-base-en-v1.5
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/emb-bge-base-en-v1.5-squad-3-epochs")
# Run inference
sentences = [
'How many people outside the UK were under British rule in 1945?',
'Though Britain and the empire emerged victorious from the Second World War, the effects of the conflict were profound, both at home and abroad. Much of Europe, a continent that had dominated the world for several centuries, was in ruins, and host to the armies of the United States and the Soviet Union, who now held the balance of global power. Britain was left essentially bankrupt, with insolvency only averted in 1946 after the negotiation of a $US 4.33 billion loan (US$56 billion in 2012) from the United States, the last instalment of which was repaid in 2006. At the same time, anti-colonial movements were on the rise in the colonies of European nations. The situation was complicated further by the increasing Cold War rivalry of the United States and the Soviet Union. In principle, both nations were opposed to European colonialism. In practice, however, American anti-communism prevailed over anti-imperialism, and therefore the United States supported the continued existence of the British Empire to keep Communist expansion in check. The "wind of change" ultimately meant that the British Empire\'s days were numbered, and on the whole, Britain adopted a policy of peaceful disengagement from its colonies once stable, non-Communist governments were available to transfer power to. This was in contrast to other European powers such as France and Portugal, which waged costly and ultimately unsuccessful wars to keep their empires intact. Between 1945 and 1965, the number of people under British rule outside the UK itself fell from 700 million to five million, three million of whom were in Hong Kong.',
"By the start of the 20th century, Germany and the United States challenged Britain's economic lead. Subsequent military and economic tensions between Britain and Germany were major causes of the First World War, during which Britain relied heavily upon its empire. The conflict placed enormous strain on the military, financial and manpower resources of Britain. Although the British Empire achieved its largest territorial extent immediately after World War I, Britain was no longer the world's pre-eminent industrial or military power. In the Second World War, Britain's colonies in South-East Asia were occupied by Imperial Japan. Despite the final victory of Britain and its allies, the damage to British prestige helped to accelerate the decline of the empire. India, Britain's most valuable and populous possession, achieved independence as part of a larger decolonisation movement in which Britain granted independence to most territories of the Empire. The transfer of Hong Kong to China in 1997 marked for many the end of the British Empire. Fourteen overseas territories remain under British sovereignty. After independence, many former British colonies joined the Commonwealth of Nations, a free association of independent states. The United Kingdom is now one of 16 Commonwealth nations, a grouping known informally as the Commonwealth realms, that share one monarch—Queen Elizabeth II.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Triplet
- Dataset:
gooqa-dev - Evaluated with
TripletEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy | 0.4192 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 44,283 training samples
- Columns:
question,context, andnegative - Approximate statistics based on the first 1000 samples:
question context negative type string string string details - min: 6 tokens
- mean: 14.62 tokens
- max: 35 tokens
- min: 31 tokens
- mean: 152.26 tokens
- max: 512 tokens
- min: 32 tokens
- mean: 153.89 tokens
- max: 510 tokens
- Samples:
question context negative About how many cubic meters of wood was used in 1991 to make products like glulam, LVL, and structural composite lumber?These products include glued laminated timber (glulam), wood structural panels (including plywood, oriented strand board and composite panels), laminated veneer lumber (LVL) and other structural composite lumber (SCL) products, parallel strand lumber, and I-joists. Approximately 100 million cubic meters of wood was consumed for this purpose in 1991. The trends suggest that particle board and fiber board will overtake plywood.Engineered wood products, glued building products "engineered" for application-specific performance requirements, are often used in construction and industrial applications. Glued engineered wood products are manufactured by bonding together wood strands, veneers, lumber or other forms of wood fiber with glue to form a larger, more efficient composite structural unit.Which Eclogues discusses homosexual love?The biographical tradition asserts that Virgil began the hexameter Eclogues (or Bucolics) in 42 BC and it is thought that the collection was published around 39–38 BC, although this is controversial. The Eclogues (from the Greek for "selections") are a group of ten poems roughly modeled on the bucolic hexameter poetry ("pastoral poetry") of the Hellenistic poet Theocritus. After his victory in the Battle of Philippi in 42 BC, fought against the army led by the assassins of Julius Caesar, Octavian tried to pay off his veterans with land expropriated from towns in northern Italy, supposedly including, according to the tradition, an estate near Mantua belonging to Virgil. The loss of his family farm and the attempt through poetic petitions to regain his property have traditionally been seen as Virgil's motives in the composition of the Eclogues. This is now thought to be an unsupported inference from interpretations of the Eclogues. In Eclogues 1 and 9, Virgil indeed dramatizes the contra...Sometime after the publication of the Eclogues (probably before 37 BC), Virgil became part of the circle of Maecenas, Octavian's capable agent d'affaires who sought to counter sympathy for Antony among the leading families by rallying Roman literary figures to Octavian's side. Virgil came to know many of the other leading literary figures of the time, including Horace, in whose poetry he is often mentioned, and Varius Rufus, who later helped finish the Aeneid.What nation lies to the west of the Marshall Islands?The islands are located about halfway between Hawaii and Australia, north of Nauru and Kiribati, east of the Federated States of Micronesia, and south of the U.S. territory of Wake Island, to which it lays claim. The atolls and islands form two groups: the Ratak (sunrise) and the Ralik (sunset). The two island chains lie approximately parallel to one another, running northwest to southeast, comprising about 750,000 square miles (1,900,000 km2) of ocean but only about 70 square miles (180 km2) of land mass. Each includes 15 to 18 islands and atolls. The country consists of a total of 29 atolls and five isolated islands.The climate is hot and humid, with a wet season from May to November. Many Pacific typhoons begin as tropical storms in the Marshall Islands region, and grow stronger as they move west toward the Mariana Islands and the Philippines. - Loss:
MultipleNegativesRankingLosswith these parameters:{ "scale": 20.0, "similarity_fct": "cos_sim" }
Evaluation Dataset
Unnamed Dataset
- Size: 5,000 evaluation samples
- Columns:
question,context, andnegative_1 - Approximate statistics based on the first 1000 samples:
question context negative_1 type string string string details - min: 3 tokens
- mean: 14.61 tokens
- max: 52 tokens
- min: 28 tokens
- mean: 151.48 tokens
- max: 494 tokens
- min: 28 tokens
- mean: 151.7 tokens
- max: 512 tokens
- Samples:
question context negative_1 When did Simon Cowell announce he was no longer going to be a judge?In season eight, Latin Grammy Award-nominated singer–songwriter and record producer Kara DioGuardi was added as a fourth judge. She stayed for two seasons and left the show before season ten. Paula Abdul left the show before season nine after failing to agree terms with the show producers. Emmy Award-winning talk show host Ellen DeGeneres replaced Paula Abdul for that season, but left after just one season. On January 11, 2010, Simon Cowell announced that he was leaving the show to pursue introducing the American version of his show The X Factor to the USA for 2011. Jennifer Lopez and Steven Tyler joined the judging panel in season ten, but both left after two seasons. They were replaced by three new judges, Mariah Carey, Nicki Minaj and Keith Urban, who joined Randy Jackson in season 12. However both Carey and Minaj left after one season, and Randy Jackson also announced that he would depart the show after twelve seasons as a judge but would return as a mentor. Urban is the only judge...A special tribute to Simon Cowell was presented in the finale for his final season with the show. Many figures from the show's past, including Paula Abdul, made an appearance.When did Hodgson publish his DNA study?According to an autosomal DNA study by Hodgson et al. (2014), the Afro-Asiatic languages were likely spread across Africa and the Near East by an ancestral population(s) carrying a newly identified non-African genetic component, which the researchers dub the "Ethio-Somali". This Ethio-Somali component is today most common among Afro-Asiatic-speaking populations in the Horn of Africa. It reaches a frequency peak among ethnic Somalis, representing the majority of their ancestry. The Ethio-Somali component is most closely related to the Maghrebi non-African genetic component, and is believed to have diverged from all other non-African ancestries at least 23,000 years ago. On this basis, the researchers suggest that the original Ethio-Somali carrying population(s) probably arrived in the pre-agricultural period from the Near East, having crossed over into northeastern Africa via the Sinai Peninsula. The population then likely split into two branches, with one group heading westward toward ...Advances in understanding genes and inheritance continued throughout the 20th century. Deoxyribonucleic acid (DNA) was shown to be the molecular repository of genetic information by experiments in the 1940s to 1950s. The structure of DNA was studied by Rosalind Franklin using X-ray crystallography, which led James D. Watson and Francis Crick to publish a model of the double-stranded DNA molecule whose paired nucleotide bases indicated a compelling hypothesis for the mechanism of genetic replication. Collectively, this body of research established the central dogma of molecular biology, which states that proteins are translated from RNA, which is transcribed from DNA. This dogma has since been shown to have exceptions, such as reverse transcription in retroviruses. The modern study of genetics at the level of DNA is known as molecular genetics.What campus did Yale buy in 2008?Yale's central campus in downtown New Haven covers 260 acres (1.1 km2) and comprises its main, historic campus and a medical campus adjacent to the Yale-New Haven Hospital. In western New Haven, the university holds 500 acres (2.0 km2) of athletic facilities, including the Yale Golf Course. In 2008, Yale purchased the 136-acre (0.55 km2) former Bayer Pharmaceutical campus in West Haven, Connecticut, the buildings of which are now used as laboratory and research space. Yale also owns seven forests in Connecticut, Vermont, and New Hampshire—the largest of which is the 7,840-acre (31.7 km2) Yale-Myers Forest in Connecticut's Quiet Corner—and nature preserves including Horse Island.Yale's central campus in downtown New Haven covers 260 acres (1.1 km2) and comprises its main, historic campus and a medical campus adjacent to the Yale-New Haven Hospital. In western New Haven, the university holds 500 acres (2.0 km2) of athletic facilities, including the Yale Golf Course. In 2008, Yale purchased the 136-acre (0.55 km2) former Bayer Pharmaceutical campus in West Haven, Connecticut, the buildings of which are now used as laboratory and research space. Yale also owns seven forests in Connecticut, Vermont, and New Hampshire—the largest of which is the 7,840-acre (31.7 km2) Yale-Myers Forest in Connecticut's Quiet Corner—and nature preserves including Horse Island. - Loss:
MultipleNegativesRankingLosswith these parameters:{ "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
eval_strategy: stepsper_device_train_batch_size: 128per_device_eval_batch_size: 128warmup_ratio: 0.1fp16: Truebatch_sampler: no_duplicates
All Hyperparameters
Click to expand
overwrite_output_dir: Falsedo_predict: Falseeval_strategy: stepsprediction_loss_only: Trueper_device_train_batch_size: 128per_device_eval_batch_size: 128per_gpu_train_batch_size: Noneper_gpu_eval_batch_size: Nonegradient_accumulation_steps: 1eval_accumulation_steps: Nonetorch_empty_cache_steps: Nonelearning_rate: 5e-05weight_decay: 0.0adam_beta1: 0.9adam_beta2: 0.999adam_epsilon: 1e-08max_grad_norm: 1.0num_train_epochs: 3max_steps: -1lr_scheduler_type: linearlr_scheduler_kwargs: {}warmup_ratio: 0.1warmup_steps: 0log_level: passivelog_level_replica: warninglog_on_each_node: Truelogging_nan_inf_filter: Truesave_safetensors: Truesave_on_each_node: Falsesave_only_model: Falserestore_callback_states_from_checkpoint: Falseno_cuda: Falseuse_cpu: Falseuse_mps_device: Falseseed: 42data_seed: Nonejit_mode_eval: Falseuse_ipex: Falsebf16: Falsefp16: Truefp16_opt_level: O1half_precision_backend: autobf16_full_eval: Falsefp16_full_eval: Falsetf32: Nonelocal_rank: 0ddp_backend: Nonetpu_num_cores: Nonetpu_metrics_debug: Falsedebug: []dataloader_drop_last: Falsedataloader_num_workers: 0dataloader_prefetch_factor: Nonepast_index: -1disable_tqdm: Falseremove_unused_columns: Truelabel_names: Noneload_best_model_at_end: Falseignore_data_skip: Falsefsdp: []fsdp_min_num_params: 0fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}tp_size: 0fsdp_transformer_layer_cls_to_wrap: Noneaccelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed: Nonelabel_smoothing_factor: 0.0optim: adamw_torchoptim_args: Noneadafactor: Falsegroup_by_length: Falselength_column_name: lengthddp_find_unused_parameters: Noneddp_bucket_cap_mb: Noneddp_broadcast_buffers: Falsedataloader_pin_memory: Truedataloader_persistent_workers: Falseskip_memory_metrics: Trueuse_legacy_prediction_loop: Falsepush_to_hub: Falseresume_from_checkpoint: Nonehub_model_id: Nonehub_strategy: every_savehub_private_repo: Nonehub_always_push: Falsegradient_checkpointing: Falsegradient_checkpointing_kwargs: Noneinclude_inputs_for_metrics: Falseinclude_for_metrics: []eval_do_concat_batches: Truefp16_backend: autopush_to_hub_model_id: Nonepush_to_hub_organization: Nonemp_parameters:auto_find_batch_size: Falsefull_determinism: Falsetorchdynamo: Noneray_scope: lastddp_timeout: 1800torch_compile: Falsetorch_compile_backend: Nonetorch_compile_mode: Nonedispatch_batches: Nonesplit_batches: Noneinclude_tokens_per_second: Falseinclude_num_input_tokens_seen: Falseneftune_noise_alpha: Noneoptim_target_modules: Nonebatch_eval_metrics: Falseeval_on_start: Falseuse_liger_kernel: Falseeval_use_gather_object: Falseaverage_tokens_across_devices: Falseprompts: Nonebatch_sampler: no_duplicatesmulti_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | Validation Loss | gooqa-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.3536 |
| 0.2890 | 100 | 0.6663 | 0.7830 | 0.3922 |
| 0.5780 | 200 | 0.4374 | 0.7399 | 0.4064 |
| 0.8671 | 300 | 0.4048 | 0.7294 | 0.4086 |
| 1.1561 | 400 | 0.3149 | 0.7244 | 0.4136 |
| 1.4451 | 500 | 0.2378 | 0.7246 | 0.4182 |
| 1.7341 | 600 | 0.2358 | 0.7179 | 0.4158 |
| 2.0231 | 700 | 0.2338 | 0.7170 | 0.4240 |
| 2.3121 | 800 | 0.1602 | 0.7293 | 0.4148 |
| 2.6012 | 900 | 0.1595 | 0.7237 | 0.4230 |
| 2.8902 | 1000 | 0.1545 | 0.7229 | 0.4146 |
| -1 | -1 | - | - | 0.4192 |
Framework Versions
- Python: 3.11.0
- Sentence Transformers: 4.0.1
- Transformers: 4.50.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}