SentenceTransformer based on FacebookAI/roberta-large

This is a sentence-transformers model finetuned from FacebookAI/roberta-large on the all-nli dataset. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: FacebookAI/roberta-large
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: all-nli
  • Language: en

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'RobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
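
The Pooling block above reduces to plain mean pooling over token embeddings (the CLS, max, and weighted variants are all disabled). As a point of reference, the same computation can be reproduced with 🤗 Transformers directly; the snippet below is a minimal sketch of that standard recipe, not code shipped with this repository.

import torch
from transformers import AutoModel, AutoTokenizer

# The transformer weights live at the root of the model repository.
tokenizer = AutoTokenizer.from_pretrained("sobamchan/roberta-large-no-mrl")
model = AutoModel.from_pretrained("sobamchan/roberta-large-no-mrl")

sentences = ["A worker is looking out of a manhole."]
batch = tokenizer(sentences, padding=True, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 1024)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()     # (batch, seq_len, 1)
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embeddings.shape)  # torch.Size([1, 1024])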

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sobamchan/roberta-large-no-mrl")
# Run inference
sentences = [
    'A construction worker peeking out of a manhole while his coworker sits on the sidewalk smiling.',
    'A worker is looking out of a manhole.',
    'The workers are both inside the manhole.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8065, 0.5747],
#         [0.8065, 1.0000, 0.6756],
#         [0.5747, 0.6756, 1.0000]])
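
Because the similarity function is cosine, the same embeddings also work for semantic search, i.e. ranking a corpus against a query. A minimal sketch, reusing the model loaded above (the corpus sentences are illustrative):

query = "Someone is working underground."
corpus = [
    "A worker is looking out of a manhole.",
    "A woman drinks her coffee in a small cafe.",
    "Children smiling and waving at camera",
]

query_embedding = model.encode([query])
corpus_embeddings = model.encode(corpus)

# model.similarity applies the configured cosine similarity.
scores = model.similarity(query_embedding, corpus_embeddings)[0]
for sentence, score in sorted(zip(corpus, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.4f}  {sentence}")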

Evaluation

Metrics

Semantic Similarity

Metric           sts-dev  sts-test
pearson_cosine   0.7451   0.7100
spearman_cosine  0.7649   0.7351
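
The scores above are Pearson and Spearman correlations between the model's cosine similarities and human judgments on the STS benchmark dev and test splits. A hedged sketch of how such numbers are typically produced with the library's EmbeddingSimilarityEvaluator, reusing the model loaded in the Usage section — the dataset id and split follow the common sentence-transformers/stsb setup and are an assumption, not necessarily the exact evaluation script used here:

from datasets import load_dataset
from sentence_transformers import SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Assumption: the standard STS benchmark as packaged by sentence-transformers.
stsb = load_dataset("sentence-transformers/stsb", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    main_similarity=SimilarityFunction.COSINE,
    name="sts-test",
)
print(evaluator(model))  # includes sts-test_pearson_cosine and sts-test_spearman_cosine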

Training Details

Training Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 557,850 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
            anchor        positive      negative
    type    string        string        string
    min     7 tokens      6 tokens      6 tokens
    mean    10.38 tokens  12.8 tokens   13.4 tokens
    max     45 tokens     39 tokens     50 tokens
  • Samples:
    anchor: A person on a horse jumps over a broken down airplane.
    positive: A person is outdoors, on a horse.
    negative: A person is at a diner, ordering an omelette.

    anchor: Children smiling and waving at camera
    positive: There are children present
    negative: The kids are frowning

    anchor: A boy is jumping on skateboard in the middle of a red bridge.
    positive: The boy does a skateboarding trick.
    negative: The boy skates down the sidewalk.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768
        ],
        "matryoshka_weights": [
            1
        ],
        "n_dims_per_step": -1
    }
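
In code, this configuration amounts to wrapping MultipleNegativesRankingLoss in a MatryoshkaLoss with a single 768-dimension level. A minimal sketch using the library's loss classes, where model is the SentenceTransformer instance being trained:

from sentence_transformers import losses

inner_loss = losses.MultipleNegativesRankingLoss(model)
loss = losses.MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768],
    matryoshka_weights=[1],
    n_dims_per_step=-1,
)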
    

Evaluation Dataset

all-nli

  • Dataset: all-nli at d482672
  • Size: 6,584 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
            anchor        positive      negative
    type    string        string        string
    min     6 tokens      5 tokens      5 tokens
    mean    18.02 tokens  9.81 tokens   10.37 tokens
    max     66 tokens     29 tokens     29 tokens
  • Samples:
    anchor: Two women are embracing while holding to go packages.
    positive: Two woman are holding packages.
    negative: The men are fighting outside a deli.

    anchor: Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink.
    positive: Two kids in numbered jerseys wash their hands.
    negative: Two kids in jackets walk to school.

    anchor: A man selling donuts to a customer during a world exhibition event held in the city of Angeles
    positive: A man selling donuts to a customer.
    negative: A woman drinks her coffee in a small cafe.
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768
        ],
        "matryoshka_weights": [
            1
        ],
        "n_dims_per_step": -1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 15
  • warmup_ratio: 0.1
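
Taken together, these hyperparameters map onto the library's trainer API roughly as follows. This is a hedged sketch: the dataset loading, output directory, and evaluator wiring are assumptions, not the exact training script.

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses
from sentence_transformers.training_args import SentenceTransformerTrainingArguments

model = SentenceTransformer("FacebookAI/roberta-large")
dataset = load_dataset("sentence-transformers/all-nli", "triplet")

# MatryoshkaLoss over MultipleNegativesRankingLoss, as in the Training Dataset section.
loss = losses.MatryoshkaLoss(
    model, losses.MultipleNegativesRankingLoss(model), matryoshka_dims=[768]
)

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # assumption: the actual path is not given in the card
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=15,
    warmup_ratio=0.1,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["dev"],
    loss=loss,
)
trainer.train()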

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.5730 -
0.0287 500 2.658 0.6755 0.8371 -
0.0574 1000 0.8483 0.3792 0.8641 -
0.0860 1500 0.6459 0.3036 0.8668 -
0.1147 2000 0.57 0.2605 0.8709 -
0.1434 2500 0.5211 0.2503 0.8685 -
0.1721 3000 0.4947 0.2367 0.8770 -
0.2008 3500 0.4651 0.2154 0.8688 -
0.2294 4000 0.4386 0.2154 0.8716 -
0.2581 4500 0.4351 0.2128 0.8766 -
0.2868 5000 0.4189 0.2061 0.8751 -
0.3155 5500 0.3989 0.2028 0.8744 -
0.3442 6000 0.4064 0.1998 0.8768 -
0.3729 6500 0.4056 0.2044 0.8718 -
0.4015 7000 0.3975 0.1953 0.8644 -
0.4302 7500 0.3799 0.2079 0.8629 -
0.4589 8000 0.3562 0.2059 0.8656 -
0.4876 8500 0.3789 0.1991 0.8637 -
0.5163 9000 0.3812 0.1941 0.8666 -
0.5449 9500 0.3697 0.2086 0.8655 -
0.5736 10000 0.3529 0.2041 0.8649 -
0.6023 10500 0.3591 0.2099 0.8606 -
0.6310 11000 0.3479 0.2068 0.8588 -
0.6597 11500 0.3532 0.1941 0.8615 -
0.6883 12000 0.3444 0.1949 0.8633 -
0.7170 12500 0.3574 0.2172 0.8560 -
0.7457 13000 0.3581 0.2073 0.8460 -
0.7744 13500 0.3501 0.2108 0.8592 -
0.8031 14000 0.3397 0.2087 0.8554 -
0.8318 14500 0.3468 0.2164 0.8655 -
0.8604 15000 0.3414 0.2055 0.8537 -
0.8891 15500 0.3441 0.2222 0.8549 -
0.9178 16000 0.3457 0.2153 0.8561 -
0.9465 16500 0.3432 0.2215 0.8532 -
0.9752 17000 0.3315 0.2237 0.8541 -
1.0038 17500 0.3272 0.2292 0.8451 -
1.0325 18000 0.3015 0.2257 0.8472 -
1.0612 18500 0.3041 0.2224 0.8370 -
1.0899 19000 0.3016 0.2207 0.8411 -
1.1186 19500 0.3113 0.2331 0.8464 -
1.1472 20000 0.3274 0.2427 0.8393 -
1.1759 20500 0.3215 0.2405 0.8395 -
1.2046 21000 0.3268 0.2332 0.8505 -
1.2333 21500 0.3324 0.2339 0.8351 -
1.2620 22000 0.3128 0.2450 0.8423 -
1.2907 22500 0.3247 0.2546 0.8457 -
1.3193 23000 0.3432 0.2366 0.8429 -
1.3480 23500 0.3324 0.2496 0.8414 -
1.3767 24000 0.3301 0.2424 0.8451 -
1.4054 24500 0.331 0.2472 0.8431 -
1.4341 25000 0.3273 0.2829 0.8421 -
1.4627 25500 0.3768 0.2615 0.8395 -
1.4914 26000 0.3409 0.2745 0.8323 -
1.5201 26500 0.3249 0.2575 0.8317 -
1.5488 27000 0.339 0.2651 0.8303 -
1.5775 27500 0.3873 0.2624 0.8406 -
1.6061 28000 0.3376 0.2623 0.8335 -
1.6348 28500 0.3497 0.2707 0.8336 -
1.6635 29000 0.3332 0.2694 0.8421 -
1.6922 29500 0.3439 0.2632 0.8398 -
1.7209 30000 0.3469 0.2748 0.8397 -
1.7496 30500 0.3408 0.3037 0.8194 -
1.7782 31000 0.3313 0.2587 0.8308 -
1.8069 31500 0.35 0.2783 0.8315 -
1.8356 32000 0.3273 0.2645 0.8191 -
1.8643 32500 0.3409 0.2507 0.8396 -
1.8930 33000 0.338 0.2742 0.8338 -
1.9216 33500 0.3164 0.2675 0.8289 -
1.9503 34000 0.3334 0.2672 0.8365 -
1.9790 34500 0.3275 0.2773 0.8411 -
2.0077 35000 0.3323 0.2758 0.8302 -
2.0364 35500 0.2837 0.2711 0.8254 -
2.0650 36000 0.341 0.2609 0.8347 -
2.0937 36500 0.2828 0.2615 0.8298 -
2.1224 37000 0.299 0.2707 0.8228 -
2.1511 37500 0.2901 0.2846 0.8156 -
2.1798 38000 0.3316 0.2579 0.8275 -
2.2085 38500 0.2837 0.2720 0.8220 -
2.2371 39000 0.2844 0.2937 0.8139 -
2.2658 39500 0.3028 0.2967 0.8260 -
2.2945 40000 0.2857 0.2785 0.8235 -
2.3232 40500 0.2975 0.2648 0.8339 -
2.3519 41000 0.2881 0.2818 0.8163 -
2.3805 41500 0.3047 0.2811 0.8219 -
2.4092 42000 0.315 0.2915 0.8067 -
2.4379 42500 0.3044 0.2871 0.8124 -
2.4666 43000 0.3454 0.4459 0.7998 -
2.4953 43500 0.3065 0.2800 0.8248 -
2.5239 44000 0.3011 0.3524 0.7959 -
2.5526 44500 0.2923 0.2935 0.8167 -
2.5813 45000 0.3105 0.2752 0.8165 -
2.6100 45500 0.3029 0.2990 0.8139 -
2.6387 46000 0.3102 0.3041 0.8111 -
2.6674 46500 0.2992 0.2826 0.8169 -
2.6960 47000 0.2954 0.2656 0.8226 -
2.7247 47500 0.2939 0.2861 0.8014 -
2.7534 48000 0.2871 0.2799 0.8076 -
2.7821 48500 0.2878 0.2694 0.8128 -
2.8108 49000 0.2879 0.2790 0.8168 -
2.8394 49500 0.2759 0.2907 0.8162 -
2.8681 50000 0.2824 0.2829 0.8149 -
2.8968 50500 0.2835 0.2980 0.8198 -
2.9255 51000 0.2914 0.2934 0.8030 -
2.9542 51500 0.3028 0.2898 0.8149 -
2.9828 52000 0.2744 0.2873 0.8210 -
3.0115 52500 0.2674 0.2872 0.8225 -
3.0402 53000 0.2319 0.2849 0.8136 -
3.0689 53500 0.2411 0.3113 0.8129 -
3.0976 54000 0.2564 0.2783 0.8207 -
3.1263 54500 0.2508 0.2751 0.8201 -
3.1549 55000 0.2318 0.2748 0.8236 -
3.1836 55500 0.2587 0.2945 0.8007 -
3.2123 56000 0.2697 0.2882 0.8217 -
3.2410 56500 0.2535 0.2917 0.8179 -
3.2697 57000 0.25 0.2752 0.8173 -
3.2983 57500 0.2299 0.2946 0.8070 -
3.3270 58000 0.2418 0.2832 0.8207 -
3.3557 58500 0.25 0.2761 0.8154 -
3.3844 59000 0.2422 0.2763 0.8173 -
3.4131 59500 0.2598 0.2772 0.8183 -
3.4417 60000 0.2353 0.2828 0.8199 -
3.4704 60500 0.2362 0.2827 0.8154 -
3.4991 61000 0.231 0.2869 0.8040 -
3.5278 61500 0.2326 0.2862 0.7984 -
3.5565 62000 0.2424 0.2769 0.8225 -
3.5852 62500 0.2492 0.2691 0.8112 -
3.6138 63000 0.2344 0.2680 0.8070 -
3.6425 63500 0.2579 0.2736 0.8196 -
3.6712 64000 0.2294 0.2861 0.8165 -
3.6999 64500 0.2403 0.2744 0.8140 -
3.7286 65000 0.2406 0.2680 0.8119 -
3.7572 65500 0.2529 0.2703 0.8179 -
3.7859 66000 0.2464 0.2803 0.8157 -
3.8146 66500 0.2489 0.2709 0.8069 -
3.8433 67000 0.2492 0.2638 0.8202 -
3.8720 67500 0.2401 0.2813 0.8123 -
3.9006 68000 0.2487 0.2720 0.8140 -
3.9293 68500 0.2289 0.2700 0.8145 -
3.9580 69000 0.2371 0.2672 0.8266 -
3.9867 69500 0.239 0.2721 0.8218 -
4.0154 70000 0.2241 0.2669 0.8280 -
4.0441 70500 0.1993 0.2769 0.8083 -
4.0727 71000 0.2028 0.2719 0.8072 -
4.1014 71500 0.2019 0.2827 0.8126 -
4.1301 72000 0.232 0.2704 0.8093 -
4.1588 72500 0.2145 0.2763 0.8154 -
4.1875 73000 0.2233 0.2855 0.8125 -
4.2161 73500 0.2029 0.2732 0.8142 -
4.2448 74000 0.2114 0.2788 0.8063 -
4.2735 74500 0.1968 0.2824 0.8078 -
4.3022 75000 0.2015 0.2691 0.8144 -
4.3309 75500 0.2052 0.2784 0.7987 -
4.3595 76000 0.2162 0.2695 0.8106 -
4.3882 76500 0.2234 0.2618 0.8123 -
4.4169 77000 0.2074 0.2775 0.8170 -
4.4456 77500 0.2086 0.2794 0.8073 -
4.4743 78000 0.2076 0.2836 0.8069 -
4.5030 78500 0.2186 0.2613 0.8182 -
4.5316 79000 0.1963 0.2713 0.8128 -
4.5603 79500 0.2058 0.2828 0.8184 -
4.5890 80000 0.2103 0.2716 0.8123 -
4.6177 80500 0.2319 0.2869 0.8031 -
4.6464 81000 0.2156 0.2841 0.7948 -
4.6750 81500 0.2017 0.2814 0.7971 -
4.7037 82000 0.2171 0.2996 0.8040 -
4.7324 82500 0.2174 0.2771 0.8061 -
4.7611 83000 0.2061 0.2679 0.7892 -
4.7898 83500 0.2168 0.2817 0.7896 -
4.8184 84000 0.2123 0.2868 0.7827 -
4.8471 84500 0.2125 0.2782 0.7980 -
4.8758 85000 0.2032 0.2857 0.8017 -
4.9045 85500 0.2177 0.3339 0.7665 -
4.9332 86000 0.2049 0.2761 0.7998 -
4.9619 86500 0.1928 0.2857 0.8029 -
4.9905 87000 0.2098 0.2788 0.7886 -
5.0192 87500 0.1869 0.2714 0.7951 -
5.0479 88000 0.1765 0.2783 0.7931 -
5.0766 88500 0.1707 0.2867 0.8087 -
5.1053 89000 0.1732 0.2722 0.8106 -
5.1339 89500 0.1778 0.2673 0.8034 -
5.1626 90000 0.1746 0.2945 0.8036 -
5.1913 90500 0.1809 0.2710 0.7987 -
5.2200 91000 0.219 0.2871 0.8113 -
5.2487 91500 0.177 0.2949 0.7935 -
5.2773 92000 0.1846 0.2761 0.8098 -
5.3060 92500 0.1728 0.2838 0.7929 -
5.3347 93000 0.1786 0.2849 0.7999 -
5.3634 93500 0.1783 0.2770 0.8003 -
5.3921 94000 0.1764 0.2868 0.8032 -
5.4208 94500 0.1741 0.2880 0.7912 -
5.4494 95000 0.1755 0.2798 0.8094 -
5.4781 95500 0.1845 0.2822 0.7921 -
5.5068 96000 0.1761 0.2856 0.7984 -
5.5355 96500 0.1803 0.2714 0.8006 -
5.5642 97000 0.1748 0.2938 0.7972 -
5.5928 97500 0.181 0.2818 0.7904 -
5.6215 98000 0.1723 0.2773 0.8043 -
5.6502 98500 0.176 0.2783 0.8099 -
5.6789 99000 0.1718 0.2694 0.7979 -
5.7076 99500 0.1724 0.2738 0.8002 -
5.7362 100000 0.1823 0.2781 0.7712 -
5.7649 100500 0.1684 0.2677 0.7971 -
5.7936 101000 0.1706 0.2727 0.7934 -
5.8223 101500 0.1742 0.2898 0.7976 -
5.8510 102000 0.1699 0.2746 0.7794 -
5.8797 102500 0.1801 0.2697 0.7909 -
5.9083 103000 0.1792 0.2774 0.7920 -
5.9370 103500 0.1719 0.2618 0.7981 -
5.9657 104000 0.1806 0.2657 0.7990 -
5.9944 104500 0.1767 0.2914 0.7838 -
6.0231 105000 0.1854 0.2886 0.7900 -
6.0517 105500 0.1449 0.2889 0.7756 -
6.0804 106000 0.1433 0.2772 0.7974 -
6.1091 106500 0.1429 0.2909 0.7976 -
6.1378 107000 0.1385 0.2763 0.7934 -
6.1665 107500 0.1452 0.2920 0.7954 -
6.1951 108000 0.1463 0.2715 0.7904 -
6.2238 108500 0.1488 0.2839 0.7982 -
6.2525 109000 0.1506 0.2741 0.8023 -
6.2812 109500 0.1524 0.2835 0.8007 -
6.3099 110000 0.1443 0.2720 0.7975 -
6.3386 110500 0.152 0.2882 0.7896 -
6.3672 111000 0.142 0.2759 0.8041 -
6.3959 111500 0.1431 0.2841 0.8054 -
6.4246 112000 0.1406 0.2857 0.7917 -
6.4533 112500 0.1478 0.3215 0.7817 -
6.4820 113000 0.1523 0.2796 0.7851 -
6.5106 113500 0.148 0.2736 0.7870 -
6.5393 114000 0.1481 0.2835 0.7993 -
6.5680 114500 0.1387 0.2844 0.7914 -
6.5967 115000 0.1475 0.2798 0.7981 -
6.6254 115500 0.1463 0.2739 0.7940 -
6.6540 116000 0.1491 0.2739 0.7987 -
6.6827 116500 0.1537 0.2708 0.7965 -
6.7114 117000 0.143 0.2685 0.8018 -
6.7401 117500 0.1481 0.2654 0.7902 -
6.7688 118000 0.1461 0.2741 0.7928 -
6.7975 118500 0.1489 0.2719 0.7965 -
6.8261 119000 0.1503 0.2852 0.7849 -
6.8548 119500 0.1435 0.2729 0.7983 -
6.8835 120000 0.1432 0.2703 0.7924 -
6.9122 120500 0.1481 0.2694 0.7889 -
6.9409 121000 0.1514 0.2735 0.7968 -
6.9695 121500 0.1424 0.2671 0.7914 -
6.9982 122000 0.143 0.2626 0.8006 -
7.0269 122500 0.1287 0.2754 0.7856 -
7.0556 123000 0.1269 0.2748 0.7850 -
7.0843 123500 0.1225 0.2821 0.7807 -
7.1129 124000 0.1223 0.2753 0.7781 -
7.1416 124500 0.1253 0.2688 0.7972 -
7.1703 125000 0.1214 0.2737 0.7905 -
7.1990 125500 0.1208 0.2689 0.7926 -
7.2277 126000 0.127 0.2754 0.7923 -
7.2564 126500 0.1152 0.2715 0.7867 -
7.2850 127000 0.1183 0.2766 0.7792 -
7.3137 127500 0.1195 0.2786 0.7850 -
7.3424 128000 0.1195 0.2885 0.7763 -
7.3711 128500 0.1332 0.2796 0.7868 -
7.3998 129000 0.1217 0.2838 0.7840 -
7.4284 129500 0.1191 0.2711 0.7819 -
7.4571 130000 0.1234 0.2752 0.7744 -
7.4858 130500 0.1297 0.2663 0.7802 -
7.5145 131000 0.1238 0.2643 0.7878 -
7.5432 131500 0.1196 0.2752 0.7809 -
7.5718 132000 0.1164 0.2744 0.7780 -
7.6005 132500 0.1208 0.2682 0.7722 -
7.6292 133000 0.1319 0.2774 0.7811 -
7.6579 133500 0.1208 0.2705 0.7921 -
7.6866 134000 0.1336 0.2681 0.7804 -
7.7153 134500 0.1226 0.3096 0.7763 -
7.7439 135000 0.1293 0.2724 0.7763 -
7.7726 135500 0.1309 0.2707 0.7718 -
7.8013 136000 0.1218 0.2636 0.7799 -
7.8300 136500 0.1253 0.2805 0.7719 -
7.8587 137000 0.1198 0.2619 0.7924 -
7.8873 137500 0.1195 0.2788 0.7822 -
7.9160 138000 0.1264 0.2795 0.7794 -
7.9447 138500 0.1186 0.2687 0.7811 -
7.9734 139000 0.1173 0.2743 0.7758 -
8.0021 139500 0.1216 0.2658 0.7735 -
8.0307 140000 0.1008 0.2725 0.7985 -
8.0594 140500 0.1026 0.2752 0.7897 -
8.0881 141000 0.1031 0.2743 0.7885 -
8.1168 141500 0.1019 0.2623 0.7881 -
8.1455 142000 0.1034 0.2590 0.7870 -
8.1742 142500 0.0986 0.2714 0.7872 -
8.2028 143000 0.0946 0.2729 0.7872 -
8.2315 143500 0.1018 0.2799 0.7842 -
8.2602 144000 0.1029 0.2796 0.7837 -
8.2889 144500 0.1031 0.2760 0.7832 -
8.3176 145000 0.0979 0.2751 0.7863 -
8.3462 145500 0.0958 0.2726 0.7899 -
8.3749 146000 0.0945 0.2709 0.7898 -
8.4036 146500 0.0982 0.2726 0.7944 -
8.4323 147000 0.1048 0.2639 0.7820 -
8.4610 147500 0.1006 0.2630 0.7787 -
8.4896 148000 0.1092 0.2716 0.7771 -
8.5183 148500 0.1024 0.2676 0.7903 -
8.5470 149000 0.1038 0.2619 0.7891 -
8.5757 149500 0.1032 0.2596 0.7960 -
8.6044 150000 0.1022 0.2660 0.7862 -
8.6331 150500 0.103 0.2767 0.7863 -
8.6617 151000 0.1138 0.2657 0.7781 -
8.6904 151500 0.1071 0.2607 0.7884 -
8.7191 152000 0.098 0.2567 0.7900 -
8.7478 152500 0.1019 0.2670 0.7854 -
8.7765 153000 0.0972 0.2647 0.7851 -
8.8051 153500 0.1089 0.2715 0.7759 -
8.8338 154000 0.102 0.2799 0.7817 -
8.8625 154500 0.102 0.2796 0.7808 -
8.8912 155000 0.1055 0.2691 0.7860 -
8.9199 155500 0.0989 0.2636 0.7843 -
8.9485 156000 0.0978 0.2671 0.7800 -
8.9772 156500 0.1085 0.2760 0.7889 -
9.0059 157000 0.102 0.2715 0.7824 -
9.0346 157500 0.0829 0.2759 0.7798 -
9.0633 158000 0.0811 0.2728 0.7897 -
9.0920 158500 0.0849 0.2641 0.7757 -
9.1206 159000 0.0795 0.2570 0.7824 -
9.1493 159500 0.0914 0.2624 0.7725 -
9.1780 160000 0.089 0.2677 0.7727 -
9.2067 160500 0.0874 0.2682 0.7760 -
9.2354 161000 0.0843 0.2690 0.7756 -
9.2640 161500 0.0805 0.2677 0.7732 -
9.2927 162000 0.09 0.2792 0.7723 -
9.3214 162500 0.0842 0.2727 0.7731 -
9.3501 163000 0.0861 0.2647 0.7737 -
9.3788 163500 0.0881 0.2748 0.7769 -
9.4074 164000 0.0838 0.2644 0.7864 -
9.4361 164500 0.0822 0.2609 0.7712 -
9.4648 165000 0.0826 0.2609 0.7787 -
9.4935 165500 0.0839 0.2688 0.7743 -
9.5222 166000 0.0863 0.2600 0.7781 -
9.5509 166500 0.0865 0.2663 0.7800 -
9.5795 167000 0.0816 0.2517 0.7738 -
9.6082 167500 0.0801 0.2597 0.7774 -
9.6369 168000 0.0849 0.2550 0.7764 -
9.6656 168500 0.0821 0.2629 0.7727 -
9.6943 169000 0.0845 0.2696 0.7737 -
9.7229 169500 0.0846 0.2647 0.7686 -
9.7516 170000 0.0818 0.2700 0.7729 -
9.7803 170500 0.0878 0.2620 0.7699 -
9.8090 171000 0.0808 0.2623 0.7672 -
9.8377 171500 0.0777 0.2622 0.7704 -
9.8663 172000 0.0835 0.2606 0.7734 -
9.8950 172500 0.0791 0.2568 0.7771 -
9.9237 173000 0.0739 0.2676 0.7747 -
9.9524 173500 0.0831 0.2566 0.7753 -
9.9811 174000 0.0857 0.2711 0.7637 -
10.0098 174500 0.0711 0.2718 0.7784 -
10.0384 175000 0.0638 0.2612 0.7787 -
10.0671 175500 0.0655 0.2647 0.7781 -
10.0958 176000 0.0702 0.2622 0.7713 -
10.1245 176500 0.0698 0.2672 0.7784 -
10.1532 177000 0.074 0.2678 0.7844 -
10.1818 177500 0.0672 0.2575 0.7830 -
10.2105 178000 0.0685 0.2667 0.7746 -
10.2392 178500 0.0662 0.2650 0.7719 -
10.2679 179000 0.0685 0.2647 0.7743 -
10.2966 179500 0.0666 0.2584 0.7787 -
10.3252 180000 0.073 0.2567 0.7730 -
10.3539 180500 0.0678 0.2665 0.7676 -
10.3826 181000 0.074 0.2621 0.7727 -
10.4113 181500 0.0698 0.2580 0.7798 -
10.4400 182000 0.0729 0.2529 0.7729 -
10.4687 182500 0.0645 0.2548 0.7714 -
10.4973 183000 0.0644 0.2599 0.7742 -
10.5260 183500 0.0638 0.2597 0.7754 -
10.5547 184000 0.0656 0.2606 0.7699 -
10.5834 184500 0.0645 0.2576 0.7776 -
10.6121 185000 0.0639 0.2600 0.7730 -
10.6407 185500 0.0668 0.2580 0.7742 -
10.6694 186000 0.0641 0.2571 0.7765 -
10.6981 186500 0.0689 0.2592 0.7708 -
10.7268 187000 0.067 0.2537 0.7672 -
10.7555 187500 0.0626 0.2549 0.7759 -
10.7841 188000 0.0704 0.2678 0.7751 -
10.8128 188500 0.0616 0.2692 0.7718 -
10.8415 189000 0.0717 0.2583 0.7750 -
10.8702 189500 0.0679 0.2594 0.7722 -
10.8989 190000 0.064 0.2616 0.7713 -
10.9276 190500 0.0695 0.2667 0.7829 -
10.9562 191000 0.0703 0.2619 0.7834 -
10.9849 191500 0.0715 0.2564 0.7813 -
11.0136 192000 0.0594 0.2624 0.7820 -
11.0423 192500 0.0526 0.2616 0.7830 -
11.0710 193000 0.0595 0.2636 0.7799 -
11.0996 193500 0.0537 0.2571 0.7875 -
11.1283 194000 0.0589 0.2617 0.7810 -
11.1570 194500 0.052 0.2632 0.7825 -
11.1857 195000 0.0607 0.2609 0.7829 -
11.2144 195500 0.057 0.2712 0.7757 -
11.2430 196000 0.0587 0.2672 0.7790 -
11.2717 196500 0.0593 0.2585 0.7731 -
11.3004 197000 0.0589 0.2721 0.7706 -
11.3291 197500 0.0556 0.2656 0.7706 -
11.3578 198000 0.0622 0.2584 0.7741 -
11.3865 198500 0.0572 0.2695 0.7750 -
11.4151 199000 0.0586 0.2649 0.7755 -
11.4438 199500 0.0595 0.2671 0.7767 -
11.4725 200000 0.0563 0.2630 0.7707 -
11.5012 200500 0.0574 0.2642 0.7674 -
11.5299 201000 0.0542 0.2644 0.7740 -
11.5585 201500 0.0613 0.2605 0.7694 -
11.5872 202000 0.0593 0.2604 0.7712 -
11.6159 202500 0.0556 0.2628 0.7699 -
11.6446 203000 0.0524 0.2631 0.7728 -
11.6733 203500 0.0602 0.2705 0.7622 -
11.7019 204000 0.0582 0.2631 0.7739 -
11.7306 204500 0.0579 0.2573 0.7721 -
11.7593 205000 0.057 0.2558 0.7774 -
11.7880 205500 0.051 0.2597 0.7757 -
11.8167 206000 0.0559 0.2515 0.7711 -
11.8454 206500 0.0543 0.2566 0.7720 -
11.8740 207000 0.0517 0.2554 0.7763 -
11.9027 207500 0.0493 0.2598 0.7722 -
11.9314 208000 0.0567 0.2592 0.7691 -
11.9601 208500 0.0559 0.2618 0.7755 -
11.9888 209000 0.0503 0.2615 0.7804 -
12.0174 209500 0.0499 0.2648 0.7812 -
12.0461 210000 0.047 0.2614 0.7819 -
12.0748 210500 0.0511 0.2632 0.7717 -
12.1035 211000 0.0464 0.2660 0.7693 -
12.1322 211500 0.0484 0.2658 0.7719 -
12.1608 212000 0.0465 0.2676 0.7756 -
12.1895 212500 0.0478 0.2689 0.7696 -
12.2182 213000 0.0467 0.2564 0.7684 -
12.2469 213500 0.0435 0.2606 0.7674 -
12.2756 214000 0.048 0.2602 0.7701 -
12.3043 214500 0.0471 0.2641 0.7687 -
12.3329 215000 0.0473 0.2557 0.7683 -
12.3616 215500 0.0503 0.2560 0.7705 -
12.3903 216000 0.044 0.2607 0.7724 -
12.4190 216500 0.045 0.2579 0.7707 -
12.4477 217000 0.0473 0.2605 0.7679 -
12.4763 217500 0.049 0.2557 0.7693 -
12.5050 218000 0.0482 0.2604 0.7725 -
12.5337 218500 0.049 0.2553 0.7751 -
12.5624 219000 0.0448 0.2597 0.7667 -
12.5911 219500 0.0443 0.2550 0.7685 -
12.6197 220000 0.0489 0.2561 0.7706 -
12.6484 220500 0.0448 0.2573 0.7693 -
12.6771 221000 0.0492 0.2565 0.7645 -
12.7058 221500 0.0475 0.2638 0.7674 -
12.7345 222000 0.0467 0.2612 0.7709 -
12.7632 222500 0.0443 0.2589 0.7702 -
12.7918 223000 0.0485 0.2605 0.7720 -
12.8205 223500 0.0437 0.2556 0.7716 -
12.8492 224000 0.0442 0.2558 0.7703 -
12.8779 224500 0.0452 0.2589 0.7711 -
12.9066 225000 0.0472 0.2575 0.7715 -
12.9352 225500 0.0484 0.2595 0.7697 -
12.9639 226000 0.0432 0.2578 0.7663 -
12.9926 226500 0.0479 0.2613 0.7641 -
13.0213 227000 0.04 0.2661 0.7659 -
13.0500 227500 0.0397 0.2573 0.7703 -
13.0786 228000 0.039 0.2697 0.7699 -
13.1073 228500 0.0428 0.2649 0.7680 -
13.1360 229000 0.0376 0.2637 0.7680 -
13.1647 229500 0.0441 0.2656 0.7646 -
13.1934 230000 0.0405 0.2600 0.7668 -
13.2221 230500 0.0438 0.2660 0.7690 -
13.2507 231000 0.0404 0.2635 0.7677 -
13.2794 231500 0.0395 0.2594 0.7684 -
13.3081 232000 0.0404 0.2614 0.7711 -
13.3368 232500 0.0428 0.2617 0.7660 -
13.3655 233000 0.0396 0.2602 0.7675 -
13.3941 233500 0.0433 0.2582 0.7676 -
13.4228 234000 0.0375 0.2602 0.7648 -
13.4515 234500 0.0404 0.2616 0.7666 -
13.4802 235000 0.0395 0.2594 0.7655 -
13.5089 235500 0.0375 0.2635 0.7618 -
13.5375 236000 0.039 0.2619 0.7650 -
13.5662 236500 0.0421 0.2606 0.7628 -
13.5949 237000 0.0416 0.2642 0.7627 -
13.6236 237500 0.0391 0.2636 0.7636 -
13.6523 238000 0.04 0.2632 0.7638 -
13.6809 238500 0.0409 0.2587 0.7632 -
13.7096 239000 0.0386 0.2643 0.7600 -
13.7383 239500 0.0396 0.2616 0.7605 -
13.7670 240000 0.0415 0.2600 0.7652 -
13.7957 240500 0.0403 0.2599 0.7662 -
13.8244 241000 0.0377 0.2621 0.7646 -
13.8530 241500 0.0384 0.2600 0.7623 -
13.8817 242000 0.0392 0.2590 0.7617 -
13.9104 242500 0.0386 0.2588 0.7621 -
13.9391 243000 0.0405 0.2628 0.7616 -
13.9678 243500 0.0379 0.2562 0.7627 -
13.9964 244000 0.0384 0.2611 0.7616 -
14.0251 244500 0.0331 0.2611 0.7596 -
14.0538 245000 0.0372 0.2619 0.7609 -
14.0825 245500 0.0348 0.2646 0.7599 -
14.1112 246000 0.0369 0.2618 0.7610 -
14.1398 246500 0.0351 0.2630 0.7588 -
14.1685 247000 0.0314 0.2639 0.7608 -
14.1972 247500 0.0389 0.2599 0.7606 -
14.2259 248000 0.0397 0.2630 0.7617 -
14.2546 248500 0.0352 0.2613 0.7642 -
14.2833 249000 0.0375 0.2621 0.7654 -
14.3119 249500 0.0408 0.2630 0.7622 -
14.3406 250000 0.0313 0.2633 0.7640 -
14.3693 250500 0.0362 0.2635 0.7629 -
14.3980 251000 0.0368 0.2624 0.7640 -
14.4267 251500 0.0367 0.2647 0.7630 -
14.4553 252000 0.0352 0.2620 0.7646 -
14.4840 252500 0.0312 0.2640 0.7646 -
14.5127 253000 0.0316 0.2636 0.7643 -
14.5414 253500 0.0343 0.2619 0.7638 -
14.5701 254000 0.0363 0.2629 0.7635 -
14.5987 254500 0.0327 0.2619 0.7654 -
14.6274 255000 0.0364 0.2635 0.7659 -
14.6561 255500 0.035 0.2651 0.7640 -
14.6848 256000 0.0369 0.2652 0.7635 -
14.7135 256500 0.037 0.2651 0.7645 -
14.7422 257000 0.0341 0.2643 0.7644 -
14.7708 257500 0.0379 0.2639 0.7648 -
14.7995 258000 0.0331 0.2629 0.7645 -
14.8282 258500 0.0309 0.2628 0.7650 -
14.8569 259000 0.0319 0.2632 0.7651 -
14.8856 259500 0.0342 0.2640 0.7646 -
14.9142 260000 0.0344 0.2637 0.7648 -
14.9429 260500 0.0371 0.2637 0.7648 -
14.9716 261000 0.0364 0.2637 0.7649 -
-1 -1 - - - 0.7351

Framework Versions

  • Python: 3.13.0
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.4.1
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}