This is a sentence-transformers model finetuned from intfloat/e5-small-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
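The Pooling module above has only `pooling_mode_mean_tokens` enabled, so each sentence vector is the average of its token vectors. The following is a minimal sketch of that idea in plain Python, not the library's implementation; the function name `mean_pool` and the toy inputs are illustrative assumptions, and padding positions are assumed to be marked with 0 in the attention mask.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sentence, ignoring padding positions.

    token_embeddings: list of sentences, each a list of token vectors.
    attention_mask:   list of sentences, each a list of 0/1 flags (1 = real token).
    """
    pooled = []
    for vectors, mask in zip(token_embeddings, attention_mask):
        dim = len(vectors[0])
        kept = [v for v, m in zip(vectors, mask) if m]
        count = max(len(kept), 1)  # avoid division by zero for an all-padding row
        pooled.append([sum(v[d] for v in kept) / count for d in range(dim)])
    return pooled


# Toy example (hypothetical values): two real tokens and one padding token.
tokens = [[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]]
mask = [[1, 1, 0]]
pooled = mean_pool(tokens, mask)
print(pooled)  # [[2.0, 3.0]]
```

In the real model the token vectors are 384-dimensional and come from the Transformer module, but the averaging step follows the same pattern.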
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'query: How Much Isoflavones Do I Need to Reduce Ovarian Cancer Risk?',
'passsage: Dietary phytochemical compounds, including isoflavones and isothiocyanates, may inhibit cancer development but have not yet been examined in prospective epidemiologic studies of ovarian cancer. The authors have investigated the association between consumption of these and other nutrients and ovarian cancer risk in a prospective cohort study. Among 97,275 eligible women in the California Teachers Study cohort who completed the baseline dietary assessment in 1995–1996, 280 women developed invasive or borderline ovarian cancer by December 31, 2003. Multivariable Cox proportional hazards regression, with age as the timescale, was used to estimate relative risks and 95% confidence intervals; all statistical tests were two sided. Intake of isoflavones was associated with lower risk of ovarian cancer. Compared with the risk for women who consumed less than 1 mg of total isoflavones per day, the relative risk of ovarian cancer associated with consumption of more than 3 mg/day was 0.56 (95% confidence interval: 0.33, 0.96). Intake of isothiocyanates or foods high in isothiocyanates was not associated with ovarian cancer risk, nor was intake of macronutrients, antioxidant vitamins, or other micronutrients. Although dietary consumption of isoflavones may be associated with decreased ovarian cancer risk, most dietary factors are unlikely to play a major role in ovarian cancer development.',
"passsage: It is likely that plant food consumption throughout much of human evolution shaped the dietary requirements of contemporary humans. Diets would have been high in dietary fiber, vegetable protein, plant sterols and associated phytochemicals, and low in saturated and trans-fatty acids and other substrates for cholesterol biosynthesis. To meet the body's needs for cholesterol, we believe genetic differences and polymorphisms were conserved by evolution, which tended to raise serum cholesterol levels. As a result modern man, with a radically different diet and lifestyle, especially in middle age, is now recommended to take medications to lower cholesterol and reduce the risk of cardiovascular disease. Experimental introduction of high intakes of viscous fibers, vegetable proteins and plant sterols in the form of a possible Myocene diet of leafy vegetables, fruit and nuts, lowered serum LDL-cholesterol in healthy volunteers by over 30%, equivalent to first generation statins, the standard cholesterol-lowering medications. Furthermore, supplementation of a modern therapeutic diet in hyperlipidemic subjects with the same components taken as oat, barley and psyllium for viscous fibers, soy and almonds for vegetable proteins and plant sterol-enriched margarine produced similar reductions in LDL-cholesterol as the Myocene-like diet and reduced the majority of subjects' blood lipids concentrations into the normal range. We conclude that reintroduction of plant food components, which would have been present in large quantities in the plant based diets eaten throughout most of human evolution into modern diets can correct the lipid abnormalities associated with contemporary eating patterns and reduce the need for pharmacological interventions.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000, 0.8492, 0.0442],
# [ 0.8492, 1.0000, -0.0126],
# [ 0.0442, -0.0126, 1.0000]])
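The similarity matrix above can be used to rank passages for a query. As a rough illustration, assuming the scores are cosine similarities (the loss configuration in this card uses `cos_sim`, and cosine is, as far as I know, the library default), the sketch below ranks passage vectors against a query vector in plain Python. The helper names `cosine` and `rank_passages` and the 2-dimensional toy vectors are hypothetical stand-ins for real 384-dimensional embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for model.encode(...) output.
query_vec = [1.0, 0.0]
passage_vecs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_passages(query_vec, passage_vecs)
for index, score in ranking:
    print(index, round(score, 4))
```

With real embeddings, `query_vec` would be the row for the `query:` input and `passage_vecs` the rows for the passage inputs.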
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |

Samples:
| anchor | positive |
|---|---|
| query: Does Turmeric Help With Ulcers? | passsage: The purpose of this review is to summarize the pertinent literature published in the present era regarding the antiulcerogenic property of curcumin against the pathological changes in response to ulcer effectors (Helicobacter pylori infection, chronic ingestion of non-steroidal anti-inflammatory drugs, and exogenous substances). The gastrointestinal problems caused by different etiologies was observed to be associated with the alterations of various physiologic parameters such as reactive oxygen species, nitric oxide synthase, lipid peroxidation, and secretion of excessive gastric acid. Gastrointestinal ulcer results probably due to imbalance between the aggressive and the defensive factors. In 80% of the cases, gastric ulcer is caused primarily due to the use of non-steroidal anti-inflammatory category of drug, 10% by H. pylori, and about 8-10% by the intake of very spicy and fast food. Although a number of antiulcer drugs and cytoprotectants are available, all these drugs h... |
| query: What Are the Dangers of Environmental Toxins for Men's Reproductive Health? | passsage: Male reproductive disorders that are of interest from an environmental point of view include sexual dysfunction, infertility, cryptorchidism, hypospadias and testicular cancer. Several reports suggest declining sperm counts and increase of these reproductive disorders in some areas during some time periods past 50 years. Except for testicular cancer this evidence is circumstantial and needs cautious interpretation. However, the male germ line is one of the most sensitive tissues to the damaging effects of ionizing radiation, radiant heat and a number of known toxicants. So far occupational hazards are the best documented risk factors for impaired male reproductive function and include physical exposures (radiant heat, ionizing radiation, high frequency electromagnetic radiation), chemical exposures (some solvents as carbon disulfide and ethylene glycol ethers, some pesticides as dibromochloropropane, ethylendibromide and DDT/DDE, some heavy metals as inorganic lead and mercur... |
| query: Does Flaxseed Help Lower Blood Pressure? | passsage: Flaxseed contains ω-3 fatty acids, lignans, and fiber that together may provide benefits to patients with cardiovascular disease. Animal work identified that patients with peripheral artery disease may particularly benefit from dietary supplementation with flaxseed. Hypertension is commonly associated with peripheral artery disease. The purpose of the study was to examine the effects of daily ingestion of flaxseed on systolic (SBP) and diastolic blood pressure (DBP) in peripheral artery disease patients. In this prospective, double-blinded, placebo-controlled, randomized trial, patients (110 in total) ingested a variety of foods that contained 30 g of milled flaxseed or placebo each day over 6 months. Plasma levels of the ω-3 fatty acid α-linolenic acid and enterolignans increased 2- to 50-fold in the flaxseed-fed group but did not increase significantly in the placebo group. Patient body weights were not significantly different between the 2 groups at any time. SBP was ≈ 10 ... |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"mini_batch_size": 16,
"gather_across_devices": false
}
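To make the `scale` and `similarity_fct` parameters concrete, here is a simplified sketch of a multiple-negatives ranking objective: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by the scale, and cross-entropy pushes the matching positive to the top. This is an illustrative reimplementation in plain Python, not the library code; the function names are assumptions, and the gradient caching that the "Cached" variant adds (processing the batch in mini-batches of 16 to save memory) is deliberately left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (hypothetical 2-dimensional embeddings).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])  # correct pairs match
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])  # correct pairs mismatched
print(aligned, swapped)
```

The loss is close to zero when each anchor is most similar to its own positive and grows toward the scale value when an in-batch negative wins instead.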
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |

Samples:
| anchor | positive |
|---|---|
| query: Does Diet Reduce Alzheimer's Risk? | passsage: BACKGROUND: Numerous studies have investigated risk factors for Alzheimer disease (AD). However, at a recent National Institutes of Health State-of-the-Science Conference, an independent panel found insufficient evidence to support the association of any modifiable factor with risk of cognitive decline or AD. OBJECTIVE: To present key findings for selected factors and AD risk that led the panel to their conclusion. DATA SOURCES: An evidence report was commissioned by the Agency for Healthcare Research and Quality. It included English-language publications in MEDLINE and the Cochrane Database of Systematic Reviews from 1984 through October 27, 2009. Expert presentations and public discussions were considered. STUDY SELECTION: Study inclusion criteria for the evidence report were participants aged 50 years and older from general populations in developed countries; minimum sample sizes of 300 for cohort studies and 50 for randomized controlled trials; at least 2 years between ex... |
| query: Is Spam Bad For Your Diabetes? | passsage: Background: Fifty percent of American Indians (AIs) develop diabetes by age 55 y. Whether processed meat is associated with the risk of diabetes in AIs, a rural population with a high intake of processed meat (eg, canned meats in general, referred to as “spam”) and a high rate of diabetes, is unknown. Objective: We examined the associations of usual intake of processed meat with incident diabetes in AIs. Design: This prospective cohort study included AI participants from the Strong Heart Family Study who were free of diabetes and cardiovascular disease at baseline and who participated in a 5-y follow-up examination (n = 2001). Dietary intake was ascertained by using a Block food-frequency questionnaire at baseline. Incident diabetes was defined on the basis of 2003 American Diabetes Association criteria. Generalized estimating equations were used to examine the associations of dietary intake with incident diabetes. Results: We identified 243 incident cases of diabetes. In a c... |
| query: Is Vitamin D Good For Cancer Prevention? | passsage: Observational and ecological studies are generally used to determine the presence of effect of cancer risk-modifying factors. Researchers generally agree that environmental factors such as smoking, alcohol consumption, poor diet, lack of physical activity, and low serum 25-hdyroxyvitamin D levels are important cancer risk factors. This ecological study used age-adjusted incidence rates for 21 cancers for 157 countries (87 with high-quality data) in 2008 with respect to dietary supply and other factors, including per capita gross domestic product, life expectancy, lung cancer incidence rate (an index for smoking), and latitude (an index for solar ultraviolet-B doses). The factors found to correlate strongly with multiple types of cancer were lung cancer (direct correlation with 12 types of cancer), energy derived from animal products (direct correlation with 12 types of cancer, inverse with two), latitude (direct correlation with six types, inverse correlation with three), and... |
CachedMultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"mini_batch_size": 16,
"gather_across_devices": false
}
Non-default hyperparameters:

- eval_strategy: epoch
- per_device_train_batch_size: 128
- learning_rate: 2e-05
- num_train_epochs: 30
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 30
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 1.0 | 205 | 1.4561 | 0.0201 |
| 2.0 | 410 | 0.1947 | 0.0127 |
| 3.0 | 615 | 0.1403 | 0.0114 |
| 4.0 | 820 | 0.1137 | 0.0088 |
| 5.0 | 1025 | 0.0911 | 0.0080 |
| 6.0 | 1230 | 0.0818 | 0.0084 |
| 7.0 | 1435 | 0.0734 | 0.0078 |
| 8.0 | 1640 | 0.067 | 0.0076 |
| 9.0 | 1845 | 0.0607 | 0.0082 |
| 10.0 | 2050 | 0.0553 | 0.0073 |
| 11.0 | 2255 | 0.0526 | 0.0065 |
| 12.0 | 2460 | 0.0505 | 0.0066 |
| 13.0 | 2665 | 0.0505 | 0.0063 |
| 14.0 | 2870 | 0.0466 | 0.0060 |
| 15.0 | 3075 | 0.0448 | 0.0063 |
| 16.0 | 3280 | 0.045 | 0.0060 |
| 17.0 | 3485 | 0.0417 | 0.0057 |
| 18.0 | 3690 | 0.0395 | 0.0057 |
| 19.0 | 3895 | 0.0425 | 0.0065 |
| 20.0 | 4100 | 0.0379 | 0.0058 |
| 21.0 | 4305 | 0.0389 | 0.0055 |
| 22.0 | 4510 | 0.0394 | 0.0055 |
| 23.0 | 4715 | 0.0351 | 0.0057 |
| 24.0 | 4920 | 0.0359 | 0.0054 |
| 25.0 | 5125 | 0.0366 | 0.0054 |
| 26.0 | 5330 | 0.0356 | 0.0055 |
| 27.0 | 5535 | 0.0347 | 0.0055 |
| 28.0 | 5740 | 0.0346 | 0.0054 |
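The hyperparameters listed earlier combine a linear scheduler, `warmup_ratio: 0.1`, and a peak `learning_rate` of 2e-05, and the table shows 205 optimizer steps per epoch. The sketch below works out what learning rate that would imply at a given step. It is a plain reimplementation of a linear warmup and decay curve, not the Trainer's own code, and it assumes training was scheduled for the full 30 epochs (6150 steps) with the warmup length rounded to a whole number of steps.

```python
STEPS_PER_EPOCH = 205   # from the Step column: 205 steps at epoch 1.0
NUM_EPOCHS = 30         # num_train_epochs
PEAK_LR = 2e-05         # learning_rate
WARMUP_RATIO = 0.1      # warmup_ratio


def linear_schedule_lr(step, total_steps, warmup_ratio, peak_lr):
    """Learning rate after `step` optimizer steps: linear ramp from 0 to the peak
    during warmup, then linear decay back to 0 at the final step."""
    warmup_steps = round(total_steps * warmup_ratio)  # assumption: rounded warmup length
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # 6150
for epoch in (1, 3, 15, 30):
    step = epoch * STEPS_PER_EPOCH
    print(epoch, step, linear_schedule_lr(step, total_steps, WARMUP_RATIO, PEAK_LR))
```

Under these assumptions the warmup covers the first 615 steps, which is the first three epochs, and the rate then decays linearly for the rest of the run.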
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}