---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:78704
- loss:ListNetLoss
base_model: jhu-clsp/ettin-encoder-1b
datasets:
- microsoft/ms_marco
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- map
- mrr@10
- ndcg@10
model-index:
- name: CrossEncoder based on jhu-clsp/ettin-encoder-1b
  results:
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoMSMARCO R100
      type: NanoMSMARCO_R100
    metrics:
    - type: map
      value: 0.5989
      name: Map
    - type: mrr@10
      value: 0.5889
      name: Mrr@10
    - type: ndcg@10
      value: 0.6445
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNFCorpus R100
      type: NanoNFCorpus_R100
    metrics:
    - type: map
      value: 0.3535
      name: Map
    - type: mrr@10
      value: 0.5271
      name: Mrr@10
    - type: ndcg@10
      value: 0.3808
      name: Ndcg@10
  - task:
      type: cross-encoder-reranking
      name: Cross Encoder Reranking
    dataset:
      name: NanoNQ R100
      type: NanoNQ_R100
    metrics:
    - type: map
      value: 0.6692
      name: Map
    - type: mrr@10
      value: 0.6896
      name: Mrr@10
    - type: ndcg@10
      value: 0.7157
      name: Ndcg@10
  - task:
      type: cross-encoder-nano-beir
      name: Cross Encoder Nano BEIR
    dataset:
      name: NanoBEIR R100 mean
      type: NanoBEIR_R100_mean
    metrics:
    - type: map
      value: 0.5405
      name: Map
    - type: mrr@10
      value: 0.6018
      name: Mrr@10
    - type: ndcg@10
      value: 0.5804
      name: Ndcg@10
---
CrossEncoder based on jhu-clsp/ettin-encoder-1b
This is a Cross Encoder model finetuned from jhu-clsp/ettin-encoder-1b on the ms_marco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: jhu-clsp/ettin-encoder-1b
- Maximum Sequence Length: 7999 tokens
- Number of Output Labels: 1 label
- Training Dataset: ms_marco
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("kdhole/reranker-msmarco-v1.1-ettin-encoder-1b-listnet")
# Get scores for pairs of texts
pairs = [
['how do you measure a horse in hands', '1 A hand is equal to 4 inches or 10.2cms. 2 You should measure your horse from the point of the withers to the ground. 3 A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4 This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.'],
['how do you measure a horse in hands', '1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2 One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3 For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4 The result is 17 hands with 3 inches (7.6 cm) left over.'],
['how do you measure a horse in hands', 'Record the measurement. 1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.'],
['how do you measure a horse in hands', 'After you have measured your horse you will need to convert the results from inches to hands.. Horse height is correctly referred to by a unit of measurement known as a hand.. One hand is equal to four inches. The gray mare in the photo above is 58 inches from the ground to the top of her withers. When 58 is divided by 4, you have 14.5.'],
['how do you measure a horse in hands', '1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'how do you measure a horse in hands',
[
'1 A hand is equal to 4 inches or 10.2cms. 2 You should measure your horse from the point of the withers to the ground. 3 A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4 This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.',
'1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2 One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3 For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4 The result is 17 hands with 3 inches (7.6 cm) left over.',
'Record the measurement. 1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.',
'After you have measured your horse you will need to convert the results from inches to hands.. Horse height is correctly referred to by a unit of measurement known as a hand.. One hand is equal to four inches. The gray mare in the photo above is 58 inches from the ground to the top of her withers. When 58 is divided by 4, you have 14.5.',
'1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
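The list returned by `rank` is ordered by descending score. As a rough illustration of that output shape only, the sketch below builds the same kind of list from a plain sequence of scores. The helper `rank_from_scores` and the score values are hypothetical placeholders; they are not output of this model and not part of the sentence-transformers API.

```python
from typing import Dict, List, Sequence, Union


def rank_from_scores(scores: Sequence[float]) -> List[Dict[str, Union[int, float]]]:
    """Order document indices by descending score (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Placeholder scores for illustration only; real values come from model.predict(pairs).
placeholder_scores = [0.2, 1.5, -0.3, 0.9, 1.1]
ranked = rank_from_scores(placeholder_scores)
print([entry["corpus_id"] for entry in ranked])
```

Each `corpus_id` indexes into the list of documents passed in, so the top-ranked passage can be looked up by position in that list.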
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": true }
| Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
|---|---|---|---|
| map | 0.5989 (+0.1094) | 0.3535 (+0.0925) | 0.6692 (+0.2496) |
| mrr@10 | 0.5889 (+0.1114) | 0.5271 (+0.0272) | 0.6896 (+0.2629) |
| ndcg@10 | 0.6445 (+0.1041) | 0.3808 (+0.0558) | 0.7157 (+0.2151) |
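For readers unfamiliar with the reported metrics, the following is a minimal sketch of average precision, MRR@10 and NDCG@10 for a single query, given binary relevance labels listed in ranked order. It follows the standard textbook definitions; the evaluator's own handling of edge cases (for example positives missing from the candidate list) is not reproduced here, and the function names are illustrative.

```python
import math
from typing import Sequence


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of precision@rank taken at each relevant position."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mrr_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """Reciprocal rank of the first relevant item within the top k."""
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """DCG of the top k divided by the DCG of an ideal ordering."""
    def dcg(labels: Sequence[int]) -> float:
        return sum(label / math.log2(rank + 1) for rank, label in enumerate(labels, start=1))

    ideal_dcg = dcg(sorted(ranked_labels, reverse=True)[:k])
    return dcg(ranked_labels[:k]) / ideal_dcg if ideal_dcg else 0.0


# Example: relevant documents at ranks 2 and 4.
labels_in_ranked_order = [0, 1, 0, 1, 0]
print(average_precision(labels_in_ranked_order))
print(mrr_at_k(labels_in_ranked_order))
print(ndcg_at_k(labels_in_ranked_order))
```

Dataset-level values such as those in the table are obtained by averaging the per-query values over all queries.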
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_R100_mean
- Evaluated with CrossEncoderNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ], "rerank_k": 100, "at_k": 10, "always_rerank_positives": true }
| Metric | Value |
|---|---|
| map | 0.5405 (+0.1505) |
| mrr@10 | 0.6018 (+0.1338) |
| ndcg@10 | 0.5804 (+0.1250) |
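The mean values appear to be the unweighted average of the three per-dataset results. As a quick sanity check under that assumption, the sketch below recomputes the averages from the rounded per-dataset numbers reported in this card; small differences in the last digit are expected because the inputs are already rounded.

```python
# Per-dataset values copied from the reranking table (MSMARCO, NFCorpus, NQ).
per_dataset = {
    "map": [0.5989, 0.3535, 0.6692],
    "mrr@10": [0.5889, 0.5271, 0.6896],
    "ndcg@10": [0.6445, 0.3808, 0.7157],
}
# Mean values copied from the NanoBEIR table.
reported_mean = {"map": 0.5405, "mrr@10": 0.6018, "ndcg@10": 0.5804}


def unweighted_mean(values):
    """Plain arithmetic mean (assumed aggregation)."""
    return sum(values) / len(values)


max_abs_diff = 0.0
for name, values in per_dataset.items():
    recomputed = unweighted_mean(values)
    max_abs_diff = max(max_abs_diff, abs(recomputed - reported_mean[name]))
    print(f"{name}: recomputed={recomputed:.4f} reported={reported_mean[name]:.4f}")
```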
Training Details
Training Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 78,704 training samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
| | query | docs | labels |
|---|---|---|---|
| type | string | list | list |
| details | min: 11 characters; mean: 33.93 characters; max: 109 characters | min: 3 elements; mean: 6.50 elements; max: 10 elements | min: 3 elements; mean: 6.50 elements; max: 10 elements |
- Samples:
| query | docs | labels |
|---|---|---|
| Hemophilia is a group of different inherited blood-clotting disorders. Which is true about hemophilia | ['Hemophilia is a hereditary bleeding disorder caused by a deficiency in one of two blood clotting factors: factor VIII or factor IX. Several different gene abnormalities can cause the disorder. People bleed unexpectedly or after minor injuries. ', 'Hemophilia is an inherited bleeding disorder that almost always affects males. A person with hemophilia has low or non-existent levels of blood clotting protein called factor. Coagulation factor is necessary for the clotting mechanism in our bodies to work. There are 13 blood clotting proteins (coagulation factor) along with platelets and fibrin necessary for clotting blood. Factor IX deficiency usually only manifests in males. Hemophilia C: This person has low levels of or is missing completely factor 11 (Also called FXI or factor XI deficiency) Hemophilia C is 10 times rarer than type A. Factor XI deficiency is different because it can show up in both males and females.', 'Hemophilia is a rare hereditary (inherited) bleeding disorder in w... | [1, 0, 0, 0, 0, ...] |
| what is the meaning of nazia | ['Show similar names Show variant names. Name Nazia generally means Princess or Queen, is of Indian origin, Name Nazia is a Feminine (or Girl) name. Person with name Nazia are mainly Muslim by religion. Name Nazia belongs to rashi Vrushik (Scorpio) with dominant planet Mars (Mangal) ', "Nazia's are very outgoing once you get to meet her,she's also a undercover freak so you gotta watch her. Nazia's are unique you can tell by the name, she yurn for attention and always wants to be in a relationship. Nazia's never like to be alone they love to be around people. They are loyal so once you meet one keep them. Nazia's are good friends once you proove to them your not fake. When you meet Nazia, You'll Love her. A beautiful girl! The name means 'Pride' so she is hardworking to bring that status to her family. All Nazia's are fantastic and they don't open up easily so you will have to give them some time.", '(viewable to Premium Members only). Below is a brief analysis of the first name only. F... | [1, 0, 0, 0, 0, ...] |
| how injection moulding temperature affects polystyrene | ['But melt temperature also has an influence on the final molecular weight of the polymer in the moulded part[3,4]. Keywords: Polymer nanocomposites, nano kaolin clay, injection moulding, moulding temperature. influence on the behaviour of the polymer are the 1. Material is fed into a heated barrel, mixed, and forced into a mould cavity where it cools and hardens to the configuration of the cavity[13]. In injection moulding, moulding conditions have a significant influence on the final properties of the material regardless of the part design.', 'It is very easy to forget that plastic melts are not thermally stable over long periods at, or above, melt temperature. Equally, it is as easy to forget that the molten mass is not impervious to the effects of shear. Plastic Melts are not Newtonian in their behaviour. That is they do not react in a linear fashion when exposed to shearing of the melt or changes in temperature. A Newtonian melt would show a straight line graph when plotted for sh... | [1, 0, 0, 0, 0, ...] |
- Loss: ListNetLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "mini_batch_size": 16 }
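To make the objective concrete, here is a minimal pure-Python sketch of the ListNet top-one loss from Cao et al. (2007): the cross-entropy between a softmax over the relevance labels and a softmax over the predicted scores for one query's document list. It assumes raw scores are used directly (matching the Identity activation above) and that the target distribution is a softmax over the labels; the library's batching and padding logic is not reproduced, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def log_softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one list."""
    m = max(values)
    log_z = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_z for v in values]


def softmax(values: Sequence[float]) -> List[float]:
    return [math.exp(v) for v in log_softmax(values)]


def listnet_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Cross-entropy between top-one distributions of labels and scores."""
    target = softmax([float(label) for label in labels])
    log_pred = log_softmax(scores)
    return -sum(t * lp for t, lp in zip(target, log_pred))


# One query with 10 candidate documents, the first one relevant.
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
constant_scores = [0.0] * 10
print(listnet_loss(constant_scores, labels))  # equals ln(10) for any constant scores
```

With constant scores over 10 documents the loss is ln 10, roughly 2.3026, which is in the same range as the loss logged at step 1 in the training logs. Because the target is itself a soft distribution, the loss cannot reach zero; its floor is the entropy of that target.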
Evaluation Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 1,000 evaluation samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
| | query | docs | labels |
|---|---|---|---|
| type | string | list | list |
| details | min: 11 characters; mean: 34.24 characters; max: 101 characters | min: 3 elements; mean: 6.50 elements; max: 10 elements | min: 3 elements; mean: 6.50 elements; max: 10 elements |
- Samples:
| query | docs | labels |
|---|---|---|
| how do you measure a horse in hands | ['1 A hand is equal to 4 inches or 10.2cms. 2 You should measure your horse from the point of the withers to the ground. 3 A horse that is 61 inches tall is 15.1 hands or 15 hands and 1 inch or 15.1hh. 4 This is calculated using (61/4 = 15.25); the .25 is the decimal equivalent of one quarter and a quarter of 4 = 1; so 15.1hh.', '1 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 2 One hand equals 4 inches (10.2 cm), so divide the measurement by 4. 3 For example, if the horse measures 71 inches (180.3 cm), divide 71 by 4 inches. 4 The result is 17 hands with 3 inches (7.6 cm) left over.', 'Record the measurement. 1 If the horse measuring stick is being used, then the measurement can be recorded in hands immediately. 2 If a measuring tape is being used, conversion of the measurement from inches to hands is required. 3 One hand equals 4 inches (10.2 cm), so divide the measurement by 4.', 'After you have measured your horse you wi... | [1, 0, 0, 0, 0, ...] |
| where is amsterdam located | ["Amsterdam is located in the western Netherlands, in the province of North Holland. The river Amstel terminates in the city centre and connects to a large number of canals that eventually terminate in the IJ. Amsterdam is situated 2 metres below sea level. The surrounding land is flat as it is formed of large polders. Amsterdam's main attractions, including its historic canals, the Rijksmuseum, the Van Gogh Museum, Stedelijk Museum, Hermitage Amsterdam, Anne Frank House, Amsterdam Museum, its red-light district, and its many cannabis coffee shops draw more than 5 million international visitors annually.", 'The Netherlands is bordered by Belgium in the South, Germany in the East and the Northsea in the North and West. Amsterdam is located in the South of the province of North Holland: Amsterdam Facts. 1 Amsterdam is the largest city in the Netherlands. 2 Amsterdam is the capital of the Netherlands (while The Hague is the seat of government). 3 Amsterdam is the financial and cultural... | [1, 0, 0, 0, 0, ...] |
| what does affected mean | ['Effected means executed, produced, or brought about. For example, The dictatorial regime quickly effected changes to the constitution that restricted the freedom of the people. On the other hand, affected means made an impact on. It is the past tense of the verb form of affect, which means to impact.', 'Meaning of Affect and Effect. In order to understand the correct situation in which to use the word affect or effect, the first thing one must do is have a clear understanding of what each word means. 1 Affect is a verb. 2 It means to produce a change in or influence something. 3 Effect is a noun that can also be used as a verb.', "affect 2 is not used as a noun; as a verb it means “to pretend” or “to assume” (new students affecting a nonchalance they didn't feel). The verb effect means “to bring about, accomplish”: Her administration effected radical changes. The noun effect means “result, consequence”: the serious effects of the oil spill.", 'Affect means to have an influence on something. Affect is normally a verb. Effect is the result of an influence or change. Effect is normally a noun. They are related in t … hat when something affects something else, it produces an effect on it. The word affect has a noun meaning related to psychology and emotion. The word effect has a verb meaning, which is to create, bring about, or institute.', 'In order to understand the correct situation in which to use the word affect or effect, the first thing one must do is have a clear understanding of what each word means. 1 Affect is a verb. 2 It means to produce a change in or influence something. 3 Effect is a noun that can also be used as a verb.'] | [1, 0, 0, 0, 0] |
- Loss: ListNetLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "mini_batch_size": 16 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- seed: 12
- bf16: True
- load_best_model_at_end: True
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
|---|---|---|---|---|---|---|---|
| -1 | -1 | - | - | 0.0000 (-0.5404) | 0.2648 (-0.0602) | 0.0388 (-0.4618) | 0.1012 (-0.3541) |
| 0.0002 | 1 | 2.3028 | - | - | - | - | - |
| 0.0203 | 100 | 2.0955 | 2.0679 | 0.3022 (-0.2382) | 0.2808 (-0.0442) | 0.4762 (-0.0244) | 0.3531 (-0.1023) |
| 0.0407 | 200 | 2.0633 | 2.0643 | 0.5733 (+0.0329) | 0.3362 (+0.0112) | 0.6797 (+0.1790) | 0.5297 (+0.0743) |
| 0.0610 | 300 | 2.0738 | 2.0616 | 0.5738 (+0.0334) | 0.3480 (+0.0230) | 0.6018 (+0.1011) | 0.5079 (+0.0525) |
| 0.0813 | 400 | 2.0679 | 2.0617 | 0.5441 (+0.0036) | 0.3162 (-0.0088) | 0.6688 (+0.1681) | 0.5097 (+0.0543) |
| 0.1016 | 500 | 2.0702 | 2.0619 | 0.5566 (+0.0161) | 0.3423 (+0.0172) | 0.6932 (+0.1925) | 0.5307 (+0.0753) |
| 0.1220 | 600 | 2.0719 | 2.0602 | 0.5583 (+0.0179) | 0.3643 (+0.0392) | 0.7066 (+0.2060) | 0.5431 (+0.0877) |
| 0.1423 | 700 | 2.066 | 2.0600 | 0.5792 (+0.0388) | 0.3470 (+0.0219) | 0.6971 (+0.1965) | 0.5411 (+0.0857) |
| 0.1626 | 800 | 2.0704 | 2.0595 | 0.5980 (+0.0576) | 0.3493 (+0.0243) | 0.6749 (+0.1743) | 0.5407 (+0.0854) |
| 0.1830 | 900 | 2.0804 | 2.0596 | 0.6080 (+0.0675) | 0.3557 (+0.0307) | 0.6314 (+0.1307) | 0.5317 (+0.0763) |
| 0.2033 | 1000 | 2.0697 | 2.0590 | 0.5992 (+0.0587) | 0.3262 (+0.0012) | 0.7125 (+0.2119) | 0.5460 (+0.0906) |
| 0.2236 | 1100 | 2.0756 | 2.0597 | 0.6133 (+0.0729) | 0.3890 (+0.0639) | 0.6932 (+0.1926) | 0.5652 (+0.1098) |
| 0.2440 | 1200 | 2.0761 | 2.0592 | 0.5937 (+0.0533) | 0.3614 (+0.0363) | 0.6783 (+0.1776) | 0.5445 (+0.0891) |
| 0.2643 | 1300 | 2.0688 | 2.0587 | 0.5865 (+0.0461) | 0.3562 (+0.0312) | 0.6863 (+0.1856) | 0.5430 (+0.0876) |
| 0.2846 | 1400 | 2.0622 | 2.0588 | 0.6190 (+0.0786) | 0.3610 (+0.0360) | 0.6717 (+0.1710) | 0.5506 (+0.0952) |
| 0.3049 | 1500 | 2.0674 | 2.0589 | 0.6331 (+0.0926) | 0.3719 (+0.0469) | 0.7195 (+0.2189) | 0.5748 (+0.1195) |
| 0.3253 | 1600 | 2.0731 | 2.0590 | 0.6194 (+0.0790) | 0.3777 (+0.0527) | 0.6719 (+0.1713) | 0.5564 (+0.1010) |
| 0.3456 | 1700 | 2.0607 | 2.0589 | 0.5792 (+0.0388) | 0.3991 (+0.0740) | 0.6850 (+0.1843) | 0.5544 (+0.0991) |
| 0.3659 | 1800 | 2.0716 | 2.0593 | 0.6400 (+0.0996) | 0.3810 (+0.0560) | 0.7093 (+0.2087) | 0.5768 (+0.1214) |
| 0.3863 | 1900 | 2.065 | 2.0587 | 0.6490 (+0.1086) | 0.3732 (+0.0481) | 0.6862 (+0.1855) | 0.5694 (+0.1141) |
| 0.4066 | 2000 | 2.0716 | 2.0588 | 0.6336 (+0.0932) | 0.3676 (+0.0426) | 0.7023 (+0.2016) | 0.5678 (+0.1125) |
| 0.4269 | 2100 | 2.0755 | 2.0592 | 0.6227 (+0.0823) | 0.3789 (+0.0539) | 0.6523 (+0.1517) | 0.5513 (+0.0959) |
| 0.4472 | 2200 | 2.0621 | 2.0587 | 0.6296 (+0.0892) | 0.3543 (+0.0292) | 0.6721 (+0.1714) | 0.5520 (+0.0966) |
| 0.4676 | 2300 | 2.0733 | 2.0587 | 0.6452 (+0.1048) | 0.3677 (+0.0427) | 0.6939 (+0.1932) | 0.5689 (+0.1136) |
| 0.4879 | 2400 | 2.0735 | 2.0581 | 0.6360 (+0.0956) | 0.3491 (+0.0240) | 0.6830 (+0.1824) | 0.5560 (+0.1007) |
| 0.5082 | 2500 | 2.0681 | 2.0582 | 0.6328 (+0.0924) | 0.3443 (+0.0193) | 0.6792 (+0.1785) | 0.5521 (+0.0967) |
| 0.5286 | 2600 | 2.0741 | 2.0582 | 0.6618 (+0.1214) | 0.3536 (+0.0286) | 0.6812 (+0.1806) | 0.5655 (+0.1102) |
| 0.5489 | 2700 | 2.067 | 2.0587 | 0.6611 (+0.1207) | 0.3726 (+0.0476) | 0.6826 (+0.1819) | 0.5721 (+0.1167) |
| 0.5692 | 2800 | 2.0706 | 2.0579 | 0.6627 (+0.1223) | 0.3736 (+0.0486) | 0.6843 (+0.1836) | 0.5735 (+0.1182) |
| 0.5896 | 2900 | 2.0632 | 2.0580 | 0.6426 (+0.1022) | 0.3788 (+0.0538) | 0.6940 (+0.1933) | 0.5718 (+0.1164) |
| **0.6099** | **3000** | **2.0773** | **2.0582** | **0.6445 (+0.1041)** | **0.3808 (+0.0558)** | **0.7157 (+0.2151)** | **0.5804 (+0.1250)** |
| 0.6302 | 3100 | 2.071 | 2.0583 | 0.6354 (+0.0950) | 0.3810 (+0.0559) | 0.6792 (+0.1785) | 0.5652 (+0.1098) |
| 0.6505 | 3200 | 2.0678 | 2.0579 | 0.6224 (+0.0820) | 0.3753 (+0.0502) | 0.6622 (+0.1615) | 0.5533 (+0.0979) |
| 0.6709 | 3300 | 2.066 | 2.0577 | 0.6658 (+0.1254) | 0.3761 (+0.0510) | 0.6742 (+0.1735) | 0.5720 (+0.1166) |
| 0.6912 | 3400 | 2.065 | 2.0577 | 0.6525 (+0.1121) | 0.3750 (+0.0500) | 0.6760 (+0.1754) | 0.5678 (+0.1125) |
| 0.7115 | 3500 | 2.072 | 2.0580 | 0.6296 (+0.0892) | 0.3553 (+0.0303) | 0.6632 (+0.1625) | 0.5494 (+0.0940) |
| 0.7319 | 3600 | 2.065 | 2.0580 | 0.6223 (+0.0818) | 0.3638 (+0.0387) | 0.6762 (+0.1756) | 0.5541 (+0.0987) |
| 0.7522 | 3700 | 2.0633 | 2.0574 | 0.6400 (+0.0996) | 0.3718 (+0.0468) | 0.6643 (+0.1637) | 0.5587 (+0.1034) |
| 0.7725 | 3800 | 2.0655 | 2.0576 | 0.6476 (+0.1072) | 0.3882 (+0.0632) | 0.7001 (+0.1994) | 0.5786 (+0.1233) |
| 0.7928 | 3900 | 2.0703 | 2.0572 | 0.6385 (+0.0981) | 0.3848 (+0.0597) | 0.6705 (+0.1698) | 0.5646 (+0.1092) |
| 0.8132 | 4000 | 2.0741 | 2.0572 | 0.6266 (+0.0862) | 0.3614 (+0.0364) | 0.6759 (+0.1752) | 0.5546 (+0.0993) |
| 0.8335 | 4100 | 2.058 | 2.0574 | 0.6330 (+0.0925) | 0.3750 (+0.0500) | 0.6600 (+0.1593) | 0.5560 (+0.1006) |
| 0.8538 | 4200 | 2.0758 | 2.0574 | 0.6450 (+0.1046) | 0.3774 (+0.0524) | 0.6796 (+0.1789) | 0.5673 (+0.1120) |
| 0.8742 | 4300 | 2.0648 | 2.0572 | 0.6261 (+0.0857) | 0.3681 (+0.0430) | 0.6796 (+0.1789) | 0.5579 (+0.1025) |
| 0.8945 | 4400 | 2.0647 | 2.0573 | 0.6377 (+0.0973) | 0.3724 (+0.0473) | 0.6523 (+0.1517) | 0.5541 (+0.0988) |
| 0.9148 | 4500 | 2.0634 | 2.0570 | 0.6412 (+0.1008) | 0.3738 (+0.0488) | 0.6917 (+0.1911) | 0.5689 (+0.1136) |
| 0.9351 | 4600 | 2.0675 | 2.0570 | 0.6426 (+0.1022) | 0.3819 (+0.0569) | 0.6875 (+0.1869) | 0.5707 (+0.1153) |
| 0.9555 | 4700 | 2.061 | 2.0570 | 0.6428 (+0.1024) | 0.3884 (+0.0634) | 0.6929 (+0.1923) | 0.5747 (+0.1194) |
| 0.9758 | 4800 | 2.0652 | 2.0571 | 0.6462 (+0.1058) | 0.3892 (+0.0641) | 0.6933 (+0.1927) | 0.5763 (+0.1209) |
| 0.9961 | 4900 | 2.0636 | 2.0571 | 0.6489 (+0.1084) | 0.3896 (+0.0645) | 0.6889 (+0.1883) | 0.5758 (+0.1204) |
| -1 | -1 | - | - | 0.6445 (+0.1041) | 0.3808 (+0.0558) | 0.7157 (+0.2151) | 0.5804 (+0.1250) |
- The bold row denotes the saved checkpoint.
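The Epoch column can be related to the Step column with a little arithmetic. The sketch below assumes training on a single device (the device count is not stated in this card) and uses the dataset size, batch size and gradient accumulation setting listed above; under that assumption the computed epoch fractions match the logged ones.

```python
import math

train_samples = 78_704              # from the training dataset section
per_device_train_batch_size = 16    # from the hyperparameters
gradient_accumulation_steps = 1     # from the hyperparameters
num_devices = 1                     # assumption: not stated in this card

effective_batch_size = per_device_train_batch_size * num_devices * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_samples / effective_batch_size)


def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after a given optimizer step."""
    return step / steps_per_epoch


print(steps_per_epoch)               # 4919
print(round(epoch_at(3000), 4))      # 0.6099
```

Evaluation every 100 steps therefore corresponds to roughly 2% of the epoch, which is why the log contains 49 evaluation rows between step 100 and step 4900.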
Framework Versions
- Python: 3.9.18
- Sentence Transformers: 5.1.1
- Transformers: 4.56.2
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.1.1
- Tokenizers: 0.22.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ListNetLoss
@inproceedings{cao2007learning,
title={Learning to Rank: From Pairwise Approach to Listwise Approach},
author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
booktitle={Proceedings of the 24th international conference on Machine learning},
pages={129--136},
year={2007}
}