Sparse Encoder

This is a Sparse Encoder model trained on the json dataset with the sentence-transformers library. It maps sentences and paragraphs to a 50368-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: Sparse Encoder
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 50368 dimensions
  • Similarity Function: Dot Product
  • Training Dataset:
    • json

Full Model Architecture

SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertForEmbeddingsFusedMeanpool'})
  (1): SpladePooling({'pooling_strategy': 'mean', 'activation_function': 'log1p_relu', 'word_embedding_dimension': 50368})
)
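
The two modules above can be read as follows: a masked-language-model head produces one logit per vocabulary entry for each token, and the pooling layer turns those logits into a single 50368-dimensional vector. The sketch below illustrates that idea in plain Python. It assumes that log1p_relu means log(1 + max(0, x)) applied to each logit and that the mean strategy averages over non-padding tokens; the function names are hypothetical, and the actual SpladePooling implementation may differ in details such as ordering or masking.

```python
import math

def log1p_relu(x: float) -> float:
    """log(1 + max(0, x)): negative logits map to exactly 0."""
    return math.log1p(max(0.0, x))

def splade_mean_pool(token_logits, attention_mask):
    """Pool per-token vocabulary logits into one sparse vector.

    token_logits: list of rows, one row of vocab-size logits per token.
    attention_mask: list of 0/1 flags; padding tokens (0) are ignored.
    """
    vocab_size = len(token_logits[0])
    n_real = sum(attention_mask)
    if n_real == 0:
        return [0.0] * vocab_size
    pooled = [0.0] * vocab_size
    for row, keep in zip(token_logits, attention_mask):
        if not keep:
            continue
        for j, logit in enumerate(row):
            pooled[j] += log1p_relu(logit)
    return [v / n_real for v in pooled]

# Toy example: 3 tokens (the last one is padding), vocabulary of 4 entries.
logits = [
    [2.0, -1.0, 0.0, -3.0],
    [0.5, -2.0, 0.0, 1.0],
    [9.0, 9.0, 9.0, 9.0],  # padding, masked out
]
mask = [1, 1, 0]
vector = splade_mean_pool(logits, mask)
active = [j for j, v in enumerate(vector) if v > 0]
print(active)  # indices of the non-zero dimensions
```

Only vocabulary entries with at least one positive logit end up non-zero, which is where the sparsity of the output comes from.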

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("sparse_encoder_model_id")
# Run inference
queries = [
    "Ok, so if you want to step up your coffee game, you need to cut it out with the pre-ground beans and buy whole. And while you\u0027re at it, you might as well go with this burr grinder over a blade grinder. It\u0027s really not all that much more expensive.  So why a burr grinder? Better consistency. A blade grinder chops up the beans, but by nature can\u0027t really produce a consistent size for the grounds. And this will really mess with your coffee extraction. Maybe you\u0027re average coffee drinker wouldn\u0027t notice it, but then again, they\u0027re probably still buying pre-ground coffee. And you\u0027re not average, or you wouldn\u0027t be looking at this product,right?  Let\u0027s get one thing out of the way... this is by no stretch of the imagination a top end grinder. For that, you want to go with a conical, low speed grinder. And that\u0027s going to cost. A lot.  This grinder uses two flat serrated discs, a fixed distance apart, depending on how fine or course you want your grounds. Like I said, conical burrs are the preferred, but really disc grinders also do a fine job of crushing the beans. Basically, since the cones or discs are a fixed distance apart, the beans are crushed between them. The beans can\u0027t pass through to the collection bin until they are crushed to the size determined by the space between the cones/plates, and it\u0027s almost physically impossible for the beans to be crushed smaller (unlike a blade grinder, which can slice some beans to powder, and leave large chunks of others). I mean, that\u0027s just in case you were wondering how it all works.  Really where this unit suffers compared to the really high end models is the speed. This is a fast grinder. There\u0027s a potential that the speed can create enough friction to heat up the beans a little. Really, though, that\u0027s a small concern. 
The bigger issue is that the speed tends to create static, which causes a lot of mess from grounds clinging all over the machine.  You\u0027ll need to clean this thing. Often.  But, aside from that, this thing will give you a better brew. The ability to ground your beans as you need them will give you fresher coffee, and the consistent size of the grind will give you better extraction. And that\u0027s what counts, right?  Full disclosure: I\u0027m on my second unit. The first one apparently had a defect in the screw that held on one of the discs. It failed and the unit was non-functional. I hopped on Cuisinart\u0027s web site, found this model, and submitted a repair request. They immediately shipped out a new unit, as well as a shipping label to return the defective one. So despite the inconvenience of a bad unit, I have to say that the customer service was flawless.",
]
documents = [
    'UPDATE:  I\'m going to have to ding this a couple of stars due to the faulty electronics.  I put in the batteries (it takes 2) shortly before attending a Halloween party and at first all seemed well.  The "eyes" have two on modes:  Flashing and solid.  Both were working as expected (although I\'m not certain why anyone would opt for the flashing mode).  Because it\'s difficult to see when the lights are on, I never intended to leave them on the entire time.  So I switched them on when entering the party, then turned them off to socialize, only turning them on again if someone wanted a photo.  About half way into the party, the lights inexplicably came on into flashing mode.  I pressed the button and switched them to solid, and then pressed it again to switch them off.  About 2 minutes later, they came back on in flashing mode.  This time they wouldn\'t shut off.  I had to take out the batteries.  I had a spare set and tried putting those in, just in case my batteries were faulty.  Things seemed normal for about 15 minutes when the lights again came on into flashing mode.  Again it wouldn\'t switch off, and this time the batter compartment was slightly warm.  Taking this SECOND set of batteries out, I notice some dark heat discoloration.  It kind of makes me wonder what kind of a health hazard this thing might have been had I not caught that the switch circuit board was stressing the batteries.  Bottom line?  This product looks ok (and can look phenomenal with a bit of work), but don\'t trust the electronics in it.  They are dangerous.  ------------------------------  I\'ve read a couple of complaints about this product, so I wanted to address them:  1. Size.  Some reviews have said this runs small.  Taking note, I ordered the larger size, and I\'m glad that I did.  I think that had I gone with the smaller size (which is listed as more in the range of my hat size), I\'d have been very disappointed with the fit.  2. 
Distortion:  Many have complained about the packaging squishing the helmet.  It would seem that the manufacturer has been paying attention.  Mine arrived in box and had filler inside the helmet to try to retain the shape as much as possible.  There is *some* inevitable distortion from the box, but pretty minimal and easily corrected.  Overall, I have to say that I\'m pretty impressed with this product.  The plastic is flexible, but much sturdier than what I was expecting.  I do have to say, however, that my review and star rating is based on MANAGED expectations.  This is not a completely screen accurate product and it is not without its flaws.  But for what it is, and for the cost, I\'d say it\'s a reasonable costume piece either as is, or (as I intend) a base for further modification. Whether you intend to modify the helmet or not, one suggestion I have for everyone is to obtain some foam padding.  There are some pressure point areas that will likely become uncomfortable with extended wear.',
    'Okay, I like the Panasonic\'s Zs cameras. They\'re easy to get the hang of and I\'ve had no reliability issues. I started with a Panasonic Zs3. Later on I upgraded to a Zs7, then when I saw Amazon was selling a Zs9 which was the same as the Zs8 but with a stereo microphone and the price wasn\'t much over $100 (from Amazon Warehouse), I couldn\'t resist buying it for its 16x zoom. It\'s been a great camera. I love my Zs9 with it\'s long zoom despite it not having GPS (which I didn\'t care about) or HDMI out, and with a lower LCD screen resolution than the Zs7. I thought the screen clarity was fine and didn\'t miss the 460,000 dots on the Zs7.  Well, I couldn\'t resist this newer upgraded Zs. The Zs20 is my 4th Panasonic Lumix Zs camera (all of them purchased from Amazon), and I can say without reservation that this is the best of them all. I love the fast burst shooting made possible by the CMOS sensor.  The 16x zoom on the Zs8/9 is really great. I\'ve taken some fantastic close-up shots with it. Now I have a 20x zoom. Amazing. It\'s operates when shooting video too.  Canon\'s G series, by comparison, has a larger 1:1.7 sensor compared to the Zs20 1:2.33. But, the zoom is very limited on the Canon and the camera is too big to carry around in your pocket - and it cost twice as much (and the older unavailable G9 model is better than the later ones I hear.) Their competitor for this Panasonic is their 20x zoom model SX260SH with a similar 1:2.33 CMOS sensor. Professional reviewers are rating the Zs20 higher than the comparable Canon, which is slightly larger, heavier, and costs more. Everyone raves about Panasonic\'s Leica lenses, with good reason. The Zs20 is just an amazing camera for the price. It\'s even slightly smaller than the Zs 7,8/9/10 and easy to carry in your pocket in a thin stretch case. Like the Zs7 (which never cost less than about $250.00) it has the 460,000 dot bright LCD screen, HDMI out, and GPS. 
It also has more features than the Zs7 or 8/9/10.  Don\'t expect the price to get much lower. As I recall, the Zs10 (the real predecessor of the Zs20) price came down to about the Zs20 price today, or maybe at the very end, a few dollars less, but I wouldn\'t wait. Panasonic may refuse to lower the price any further. It\'s already so heavily discounted, and there is a limit to how low they\'ll go. If you want the absolute best deal, get a "used" one from Amazon Warehouse. It\'s hardly, if at all "used," and most likely brand new but returned for some reason and re-packaged. You have 30 days to send it back if there is anything wrong with it. You don\'t get the one year warranty however, so if that\'s important to you, but a new one. It\'s definitely worth the price. I already had an extra battery but you can buy a non-proprietary battery very inexpensively that lasts even longer than the battery that comes with the camera. I also had a wall charger (two - from my Zs7 and Zs9)), so I didn\'t have to buy that. I\'d say that\'s the only thing Panasonic let down on. It\'s worth getting a wall charger, but it should come with the camera. This is the first time it doesn\'t. Don\'t let it deter you.',
    "The Rite is a very good movie despite the horrible critic reviews.  Having missed it in theaters I rented it on Amzon's video on demand.  Like the Exorcism of Emily Rose, this is not a horror film.  If you are looking to be scared, this is not the movie for you. Though most horror films are more disgusting than scary.  The Rite begins with the character of who <PERSON> fed up with working in his father's mortuary. He wants to leave but can't pay for college. So he decides to join the seminary so as to get a college education for free, even though he personally believes in nothing.  When he graduates, <PERSON> attempts to resign, but the priest he sends his resignation to is not fooled and indirectly threatens to force him to pay back the cost of his education.  He doesn't do this out of spite, but because he sees something in <PERSON> that <PERSON> doesn't see in himself.  Reluctantly, <PERSON> agrees to go to Rome for two month to study exorcism with the notion that if when he gets back and still wants to resign, he can without repercussions.  While in Rome, <PERSON> challenges the professor of the exorcism course thereby publicly announcing his own doubts.  He is later sent to work with an older priest, Father <PERSON>, who is an exorcist and has been in the trade for some time. <PERSON> immediately shows <PERSON> a possessed pregnant girl as proof of the existence of the devil. They continue to work on saving the possessed girl, who later dies in the movie. I like the scene where they first meet.  Fr. <PERSON> doesn't waste time with chit chat, he just immediately gets into the matter. Also, <PERSON> meets a female reporter named <PERSON> while in Rome. They connect and become friends. At least the script writers decided to not put a romantic interest between them. <PERSON> does a good job of playing the professional reporter who connects with <PERSON> as she too searches for the truth and has something in common with him.  
As the movie progresses <PERSON> is forced to face his inner demons. The flashbacks in the movie let the viewer know that he lost his faith when his mother died, but had obviously hung on to a bit of it as he has spent his whole life searching for the truth about God and the Devil. <PERSON> does an excellent job playing the doubting <PERSON> as he is forced to face actual cases of possession and admit that maybe the devil is real. Never once did I think his performance was forced. He was able to display the character's inner turmoil and the fact that <PERSON> truly cares about these people that <PERSON> takes him to. At one point he challenges the older Fr. <PERSON> saying that the pregnant girl needs a psychiatrist more than a priest. Even though you can tell that even he is wondering if maybe she is possessed as she knows things about his past that no one knows about.  <PERSON> does a superb job as Father <PERSON>. When you first meet him he is a get down to business type and seems a bit off putting.  As the movie continues <PERSON> manages to demonstrate the emotional side of the character without it seeming cheesy or forced. There is a scene in which Father <PERSON> is clearly distraught over the young girl's death. It is a very emotional scene that even you will feel Fr. <PERSON>' pain as he breaks down and cries and pleads with <PERSON> to not let the devil destroy his soul like he destroyed the girl's. <PERSON> also does a great job portraying the demonic taking over of the older priest until the final exorcism scene at the end. In that scene <PERSON> puts a certain amount of class into it. Whereas in the Exorcist and in Emily Rose the final exorcism scene immediately starts with screaming and cussing and anything shocking that Hollywood can throw in; in The Rite, the final showdown with the devil starts off with the possessed Fr. <PERSON> displaying a certain amount of calm and class that only <PERSON> can provide. 
He gets in <PERSON> and <PERSON>'s heads instead on yelling insults at them.  Then, he slowly progresses into a more antagonistic character until he physically assaults <PERSON> and the female reporter.  Naturally the movie ends with <PERSON> being purified of his demons, which is good because I hate sad endings. <PERSON> takes his vows and maintains his newfound belief in God and the reality of the devil/evil. I like the parting scene between <PERSON> and Fr. <PERSON>. It's not overly sentimental, but there is a chemistry between the characters & actors where they have a newfound respect for each other as they have both witnessed something few ever see.  Like I said, the movie is not scary. I found it to be more of a spiritual journey type film that concerns <PERSON> and <PERSON> as they both face their own failings and learn to accept that evil does exist and only with God can it be fought. But the movie also does not ignore science. Even the character of <PERSON> admits that there are a lot of supposed possession cases where the person is truly mentally ill. The exorcist is there when science can't explain what's wrong with an individual. The movie does leave it up to you the viewer to decide if you believe in the devil, or  science. It does provide believable resolution for the characters as they accept what's happened and move on with their lives.  The quality of the movie was very good.  I rented the SD version on video on demand. It loaded quickly and there were no problems with playback. Not once did it look pixelated. It actually felt like I was watching a DVD.  I may buy this on DVD when the price goes down some.  I recommend this movie to anyone that likes a good drama and isn't expecting another Exorcist movie.  Understand that there is no pea soup, spinning heads, or masturbating 12 year old girls.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 50368] [3, 50368]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[296.5509, 289.7054, 288.2206]])
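
Because the similarity function is a dot product over mostly-zero vectors, only the dimensions that are active in both the query and the document contribute to the score. The following sketch shows this with sparse vectors stored as {dimension: weight} dictionaries. The helper names and toy weights are illustrative assumptions, not part of the Sentence Transformers API.

```python
def sparse_dot(query: dict, document: dict) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller dictionary for fewer lookups.
    if len(query) <= len(document):
        small, large = query, document
    else:
        small, large = document, query
    return sum(w * large[d] for d, w in small.items() if d in large)

def rank_documents(query: dict, documents: list) -> list:
    """Return (document_index, score) pairs sorted by descending score."""
    scores = [(i, sparse_dot(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy vectors; real embeddings from this model have 50368 dimensions.
query = {10: 1.5, 42: 0.5}
documents = [
    {7: 2.0},              # no overlap with the query
    {10: 2.0, 42: 1.0},    # overlaps on both dimensions
    {42: 3.0, 99: 1.0},    # overlaps on one dimension
]
ranking = rank_documents(query, documents)
print(ranking)
```

Note that dot-product scores are unbounded, so values such as those printed above are only meaningful relative to each other for the same query.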

Evaluation

Metrics

Sparse Information Retrieval

Metric                   Value
dot_accuracy@1           0.056
dot_accuracy@8           0.1574
dot_accuracy@50          0.3211
dot_accuracy@100         0.4101
dot_precision@1          0.056
dot_precision@8          0.0197
dot_precision@50         0.0064
dot_precision@100        0.0041
dot_recall@1             0.056
dot_recall@8             0.1574
dot_recall@50            0.3211
dot_recall@100           0.4101
dot_ndcg@10              0.107
dot_mrr@10               0.0873
dot_map@100              0.0953
query_active_dims        1024.0
query_sparsity_ratio     0.9797
corpus_active_dims       1024.0
corpus_sparsity_ratio    0.9797
avg_flops                828.1012
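
Several rows in this table are tied together by simple arithmetic. As a rough consistency check, assuming one relevant document per query (which is what the identical accuracy@k and recall@k values suggest) and assuming the sparsity ratio is the share of zero-valued dimensions out of 50368, the reported numbers can be reproduced as follows. The function names are hypothetical.

```python
VOCAB_DIM = 50368  # output dimensionality reported in the model description

def sparsity_ratio(active_dims: float, total_dims: int = VOCAB_DIM) -> float:
    """Fraction of dimensions that are zero."""
    return 1.0 - active_dims / total_dims

def precision_from_recall(recall_at_k: float, k: int) -> float:
    """With exactly one relevant document per query, precision@k = recall@k / k."""
    return recall_at_k / k

print(round(sparsity_ratio(1024.0), 4))  # 0.9797
for k, recall in [(1, 0.056), (8, 0.1574), (50, 0.3211), (100, 0.4101)]:
    print(k, round(precision_from_recall(recall, k), 4))
```

The loop prints 0.056, 0.0197, 0.0064 and 0.0041, matching the dot_precision rows.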

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 202,427 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
                anchor          positive
    type        string          string
    min         402 tokens      393 tokens
    mean        499.1 tokens    499.22 tokens
    max         512 tokens      512 tokens
  • Samples:
    anchor positive
    This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife.... When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp...
    The Good: Sawstop customer service is the best I have dealt with in years. When set up correctly it cuts sheet good like a dream. Only a panel saw would seem better. The adjustable stops are stout and easy to use. Great for repeat cuts. Sliding mechanism is very smooth The Bad: No postive stops - in my experience this borders on being a huge problem for two reasons. First, it is not easy to get the fence square to the blade if you want to be very accurate. On the best of days it takes me 5 minutes to get it close enough to make a 48" cut square. Without positive stops I have to square the sliding table fence every time it is bumped or removed. And, I remove it regularly as the sliding table fence sits close enough to the blade that almost all cuts over 48" using the regular saw fence demand the sliding table fence be removed or swung out of the way (if the cut is less than 48" the sliding table can be moved back with fence in place forming a little pocket to work within). The fence th... The Bad (yes there are a lot of bad things even with 5 stars): One of the worst written non fiction books I have ever owned. I really don't care if one of the authors clients liked a sauce so well that she would eat it over kitty litter. I don't care to read 100s of testaments to how good the recipes are (they are pretty good). I just want to get on with the book. Prove the recipes are not good. Don't spend all those pages trying to convince me. It even backfired. I was sure they were going to be terrible are reading all the testaments. Get ready to cook. A lot. Get ready to do a lot of dishes. Have to plan ahead. Have to make lunches the night before often. The flax seed breakfast takes some work and time. Can't just whip it up. If you run out without having already prepared more you will find yourself without a breakfast. Terribly organized. It is not sequential. 
You will read something and then find out later you were not suppose to do it when you did unless you read the entire b...
    I've had a Samsung WB850 and I still have a Fuji F900EXR. Both are megazoom pocket cameras. I'm sold on pocket megazooms for reasons that I explained in my review of the WB850. I've put my big DSLRs away only for stuff where I need the features of a big DSLR and those times are becoming rarer. I changed from the WB850 for one reason. It is SLOW between shots. Super camera but it just took too much time between shots AND I wanted a camera that would shoot in raw format. The Fuji is fast between shots and it shoots in raw BUT it won't let you run the camera and charge the battery in the camera through the USB port. Therefore I always had to carry extra batteries. After owning the Fuji for a few months, I found that I really did not need raw pictures. I just never used the raw file, only the jpg file. If you don't know what the raw format is, then you most likely don't need raw. I saw the 9700 in a store and was super impressed with the zoom. There is a BIG difference between the 21X me... The system works great! There are a few points that I would like to point out for installation. 1. Each controller will handle up to three doors or gates but you can add multiple controllers. I have 5 garage doors to control. I needed two controllers and three extra door sensors since each controller comes with one sensor. You must switch between controllers in the app to control each group of doors. The app does remember the last used controller. 2. The app will always default to Door #1 and you must swipe left or right to control Door #2 or #3. Therefore if there is one particular door that you use most, make that door #1. Each door can be labeled with a unique name and the name is what you will see in the app. I use the middle door mainly and I had to go back and rewire the middle door to the #1 terminal so that it would always show up as the default door and I did not need to swipe to select it. Name your controller the PERMANENT name that you want to call it initially. 
Once it...
  • Loss: model.SpladeMixedTopKLoss.SpladeMixedTopKLoss with these parameters:
    {
        "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score', gather_across_devices=False)",
        "document_regularizer_weight": 0.005,
        "query_regularizer_weight": 0.005,
        "document_regularizer_threshold": 256,
        "query_regularizer_threshold": 256
    }
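
To make the configuration above more concrete, here is a minimal pure-Python sketch of the two ingredients it names: an in-batch-negatives ranking loss over dot-product scores (scale 1.0) and a FLOPS-style sparsity regularizer in the sense of Paria et al., weighted by 0.005 for both queries and documents. SpladeMixedTopKLoss is a custom class whose code is not shown here, so the way it combines these terms, and the effect of the two regularizer thresholds of 256, are not modeled; the additive combination below is an assumption.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ranking_loss(queries, documents, scale=1.0):
    """Cross-entropy over in-batch scores: documents[i] is the positive for
    queries[i]; every other document in the batch serves as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [scale * dot(q, d) for d in documents]
        m = max(scores)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_sum_exp - scores[i]
    return total / len(queries)

def flops_regularizer(embeddings):
    """Sum over dimensions of the squared mean absolute activation."""
    n, dim = len(embeddings), len(embeddings[0])
    return sum((sum(abs(e[j]) for e in embeddings) / n) ** 2 for j in range(dim))

def combined_loss(queries, documents, query_weight=0.005, document_weight=0.005):
    # Assumed additive combination; the custom loss may differ.
    return (ranking_loss(queries, documents)
            + query_weight * flops_regularizer(queries)
            + document_weight * flops_regularizer(documents))

# Toy batch of two (anchor, positive) pairs in a 3-dimensional space.
queries = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
documents = [[2.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
print(combined_loss(queries, documents))
```

The regularizer penalizes dimensions that are active across many items in the batch, which pushes the model toward using fewer vocabulary entries per text.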
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • weight_decay: 0.0001
  • num_train_epochs: 2
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.075
  • save_only_model: True
  • bf16: True
  • dataloader_num_workers: 8
  • gradient_checkpointing: True
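
These settings, together with the dataset size of 202,427 samples and the listed learning_rate of 5e-05, determine the step counts that appear in the training logs. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, a warmup length rounded up from warmup_ratio, and the usual linear-warmup-then-cosine-decay shape; the function name is hypothetical.

```python
import math

NUM_SAMPLES = 202_427
BATCH_SIZE = 64        # per_device_train_batch_size, assuming one device
NUM_EPOCHS = 2
WARMUP_RATIO = 0.075
PEAK_LR = 5e-05        # learning_rate from the full hyperparameter list

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)  # the last partial batch is kept
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def cosine_lr(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(steps_per_epoch, total_steps, warmup_steps)  # 3163 6326 475
print(cosine_lr(warmup_steps))   # learning rate at the end of warmup
print(cosine_lr(total_steps))    # learning rate at the end of training
```

With 3163 steps per epoch, step 3160 corresponds to epoch 3160 / 3163 ≈ 0.9991, which agrees with the logged value.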

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0001
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.075
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: True
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss sparse-ir-eval_dot_ndcg@10
0.0063 20 787.2433 -
0.0126 40 440.0618 -
0.0190 60 271.8732 -
0.0253 80 159.8073 -
0.0316 100 115.2051 -
0.0379 120 68.2212 -
0.0443 140 37.2703 -
0.0506 160 21.9191 -
0.0569 180 13.6624 -
0.0632 200 6.6098 -
0.0696 220 4.905 -
0.0759 240 4.4195 -
0.0822 260 3.7145 -
0.0885 280 3.3807 -
0.0948 300 3.3389 -
0.1012 320 3.1854 -
0.1075 340 3.0942 -
0.1138 360 2.8802 -
0.1201 380 2.8744 -
0.1265 400 2.9323 -
0.1328 420 2.805 -
0.1391 440 2.7811 -
0.1454 460 2.7447 -
0.1518 480 2.7353 -
0.1581 500 2.6297 -
0.1644 520 2.6286 -
0.1707 540 2.6424 -
0.1770 560 2.5846 -
0.1834 580 2.6457 -
0.1897 600 2.4484 -
0.1960 620 2.5683 -
0.2023 640 2.5078 -
0.2087 660 2.4674 -
0.2150 680 2.4734 -
0.2213 700 2.3857 -
0.2276 720 2.3447 -
0.2340 740 2.3187 -
0.2403 760 2.4015 -
0.2466 780 2.3671 -
0.2529 800 2.2929 -
0.2592 820 2.3264 -
0.2656 840 2.7711 -
0.2719 860 2.2928 -
0.2782 880 2.1857 -
0.2845 900 2.1 -
0.2909 920 2.1598 -
0.2972 940 1.9527 -
0.3035 960 2.0608 -
0.3098 980 2.0235 -
0.3162 1000 1.9305 -
0.3225 1020 1.9598 -
0.3288 1040 1.9558 -
0.3351 1060 2.0087 -
0.3414 1080 1.9493 -
0.3478 1100 1.7575 -
0.3541 1120 1.7915 -
0.3604 1140 1.8282 -
0.3667 1160 1.774 -
0.3731 1180 1.7967 -
0.3794 1200 1.7661 -
0.3857 1220 1.7127 -
0.3920 1240 1.6856 -
0.3984 1260 1.737 -
0.4047 1280 1.7078 -
0.4110 1300 1.7971 -
0.4173 1320 1.6587 -
0.4236 1340 1.6127 -
0.4300 1360 1.5483 -
0.4363 1380 1.5743 -
0.4426 1400 1.6291 -
0.4489 1420 1.6277 -
0.4553 1440 1.5486 -
0.4616 1460 1.5393 -
0.4679 1480 1.5138 -
0.4742 1500 1.5601 -
0.4806 1520 1.5127 -
0.4869 1540 1.5186 -
0.4932 1560 1.4835 -
0.4995 1580 1.3831 -
0.5058 1600 1.5297 -
0.5122 1620 1.4104 -
0.5185 1640 1.3922 -
0.5248 1660 1.4043 -
0.5311 1680 1.4286 -
0.5375 1700 1.3533 -
0.5438 1720 1.3941 -
0.5501 1740 1.3218 -
0.5564 1760 1.3049 -
0.5628 1780 1.4483 -
0.5691 1800 1.3819 -
0.5754 1820 1.3073 -
0.5817 1840 1.3515 -
0.5880 1860 1.3165 -
0.5944 1880 1.2582 -
0.6007 1900 1.2801 -
0.6070 1920 1.2912 -
0.6133 1940 1.2768 -
0.6197 1960 1.2681 -
0.6260 1980 1.2818 -
0.6323 2000 1.2085 0.0834
0.6386 2020 1.2319 -
0.6450 2040 1.2843 -
0.6513 2060 1.2895 -
0.6576 2080 1.2754 -
0.6639 2100 1.3094 -
0.6702 2120 1.1937 -
0.6766 2140 1.2294 -
0.6829 2160 1.2211 -
0.6892 2180 1.3088 -
0.6955 2200 1.1989 -
0.7019 2220 1.2486 -
0.7082 2240 1.1296 -
0.7145 2260 1.1456 -
0.7208 2280 1.2594 -
0.7272 2300 1.1598 -
0.7335 2320 1.1291 -
0.7398 2340 1.1203 -
0.7461 2360 1.1708 -
0.7525 2380 1.175 -
0.7588 2400 1.2057 -
0.7651 2420 1.2125 -
0.7714 2440 1.2678 -
0.7777 2460 1.1447 -
0.7841 2480 1.2268 -
0.7904 2500 1.1557 -
0.7967 2520 1.1321 -
0.8030 2540 1.1172 -
0.8094 2560 1.1761 -
0.8157 2580 1.1746 -
0.8220 2600 1.1864 -
0.8283 2620 1.096 -
0.8347 2640 1.0784 -
0.8410 2660 1.1665 -
0.8473 2680 1.0553 -
0.8536 2700 1.0657 -
0.8599 2720 1.0973 -
0.8663 2740 1.0824 -
0.8726 2760 1.0886 -
0.8789 2780 1.1338 -
0.8852 2800 1.1033 -
0.8916 2820 1.0429 -
0.8979 2840 1.0102 -
0.9042 2860 1.1599 -
0.9105 2880 1.0423 -
0.9169 2900 1.0815 -
0.9232 2920 1.0804 -
0.9295 2940 1.1668 -
0.9358 2960 1.0606 -
0.9421 2980 1.0705 -
0.9485 3000 1.072 -
0.9548 3020 1.1239 -
0.9611 3040 1.112 -
0.9674 3060 1.0759 -
0.9738 3080 0.956 -
0.9801 3100 0.9945 -
0.9864 3120 1.0119 -
0.9927 3140 0.9965 -
0.9991 3160 1.1177 -
1.0054 3180 0.8884 -
1.0117 3200 0.9041 -
1.0180 3220 0.9367 -
1.0243 3240 0.8253 -
1.0307 3260 0.8637 -
1.0370 3280 0.8665 -
1.0433 3300 0.8306 -
1.0496 3320 0.8374 -
1.0560 3340 0.9326 -
1.0623 3360 0.8675 -
1.0686 3380 0.8846 -
1.0749 3400 0.8782 -
1.0813 3420 0.9058 -
1.0876 3440 0.8242 -
1.0939 3460 0.8406 -
1.1002 3480 0.8854 -
1.1065 3500 0.9114 -
1.1129 3520 0.7916 -
1.1192 3540 0.8902 -
1.1255 3560 0.8235 -
1.1318 3580 0.8662 -
1.1382 3600 0.8252 -
1.1445 3620 0.8636 -
1.1508 3640 0.8013 -
1.1571 3660 0.8126 -
1.1635 3680 0.8361 -
1.1698 3700 0.8975 -
1.1761 3720 0.8723 -
1.1824 3740 0.7598 -
1.1887 3760 0.8172 -
1.1951 3780 0.7955 -
1.2014 3800 0.8491 -
1.2077 3820 0.8096 -
1.2140 3840 0.8215 -
1.2204 3860 0.8388 -
1.2267 3880 0.8766 -
1.2330 3900 0.8822 -
1.2393 3920 0.7843 -
1.2457 3940 0.7955 -
1.2520 3960 0.7593 -
1.2583 3980 0.8728 -
1.2646 4000 0.7812 0.0966
1.2709 4020 0.7947 -
1.2773 4040 0.861 -
1.2836 4060 0.7238 -
1.2899 4080 0.8105 -
1.2962 4100 0.804 -
1.3026 4120 0.8112 -
1.3089 4140 0.8061 -
1.3152 4160 0.8149 -
1.3215 4180 0.7243 -
1.3279 4200 0.7487 -
1.3342 4220 0.789 -
1.3405 4240 0.7696 -
1.3468 4260 0.7236 -
1.3531 4280 0.7761 -
1.3595 4300 0.7864 -
1.3658 4320 0.8002 -
1.3721 4340 0.7939 -
1.3784 4360 0.7647 -
1.3848 4380 0.7741 -
1.3911 4400 0.7361 -
1.3974 4420 0.7732 -
1.4037 4440 0.79 -
1.4101 4460 0.7661 -
1.4164 4480 0.7779 -
1.4227 4500 0.7711 -
1.4290 4520 0.7952 -
1.4353 4540 0.7743 -
1.4417 4560 0.72 -
1.4480 4580 0.7801 -
1.4543 4600 0.7453 -
1.4606 4620 0.7509 -
1.4670 4640 0.7558 -
1.4733 4660 0.7718 -
1.4796 4680 0.6954 -
1.4859 4700 0.705 -
1.4923 4720 0.751 -
1.4986 4740 0.765 -
1.5049 4760 0.7983 -
1.5112 4780 0.7716 -
1.5175 4800 0.7747 -
1.5239 4820 0.7613 -
1.5302 4840 0.7962 -
1.5365 4860 0.7893 -
1.5428 4880 0.7291 -
1.5492 4900 0.6982 -
1.5555 4920 0.7057 -
1.5618 4940 0.7883 -
1.5681 4960 0.782 -
1.5745 4980 0.7625 -
1.5808 5000 0.7101 -
1.5871 5020 0.7394 -
1.5934 5040 0.6894 -
1.5997 5060 0.6992 -
1.6061 5080 0.7032 -
1.6124 5100 0.7659 -
1.6187 5120 0.7268 -
1.6250 5140 0.6928 -
1.6314 5160 0.7134 -
1.6377 5180 0.8233 -
1.6440 5200 0.7258 -
1.6503 5220 0.653 -
1.6567 5240 0.764 -
1.6630 5260 0.8153 -
1.6693 5280 0.6717 -
1.6756 5300 0.7592 -
1.6819 5320 0.7114 -
1.6883 5340 0.7035 -
1.6946 5360 0.702 -
1.7009 5380 0.735 -
1.7072 5400 0.7298 -
1.7136 5420 0.7082 -
1.7199 5440 0.693 -
1.7262 5460 0.7466 -
1.7325 5480 0.691 -
1.7389 5500 0.8491 -
1.7452 5520 0.7267 -
1.7515 5540 0.6938 -
1.7578 5560 0.7251 -
1.7641 5580 0.6835 -
1.7705 5600 0.7431 -
1.7768 5620 0.7031 -
1.7831 5640 0.6999 -
1.7894 5660 0.7097 -
1.7958 5680 0.7125 -
1.8021 5700 0.7399 -
1.8084 5720 0.677 -
1.8147 5740 0.7428 -
1.8211 5760 0.7495 -
1.8274 5780 0.7266 -
1.8337 5800 0.6984 -
1.8400 5820 0.7527 -
1.8463 5840 0.6564 -
1.8527 5860 0.7028 -
1.8590 5880 0.7015 -
1.8653 5900 0.7219 -
1.8716 5920 0.7569 -
1.8780 5940 0.6832 -
1.8843 5960 0.72 -
1.8906 5980 0.6878 -
1.8969 6000 0.6468 0.1070
1.9033 6020 0.6901 -
1.9096 6040 0.7066 -
1.9159 6060 0.6818 -
1.9222 6080 0.735 -
1.9285 6100 0.7364 -
1.9349 6120 0.7485 -
1.9412 6140 0.7123 -
1.9475 6160 0.7488 -
1.9538 6180 0.7161 -
1.9602 6200 0.6795 -
1.9665 6220 0.6925 -
1.9728 6240 0.8108 -
1.9791 6260 0.7295 -
1.9855 6280 0.7232 -
1.9918 6300 0.7575 -
1.9981 6320 0.7006 -
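
Most rows in this log carry only a training loss; the evaluation column is filled in every 2000 steps. A small parser such as the one below can pull out the evaluation checkpoints. It is a sketch that assumes whitespace-separated columns and a "-" for a missing evaluation value, and it uses a few rows copied from the table.

```python
def parse_log(lines):
    """Parse 'epoch step loss ndcg' rows; ndcg is None when logged as '-'."""
    rows = []
    for line in lines:
        epoch, step, loss, ndcg = line.split()
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "loss": float(loss),
            "ndcg@10": None if ndcg == "-" else float(ndcg),
        })
    return rows

sample = [
    "0.0063 20 787.2433 -",
    "0.6323 2000 1.2085 0.0834",
    "1.2646 4000 0.7812 0.0966",
    "1.8969 6000 0.6468 0.1070",
    "1.9981 6320 0.7006 -",
]
rows = parse_log(sample)
evals = [(r["step"], r["ndcg@10"]) for r in rows if r["ndcg@10"] is not None]
print(evals)  # [(2000, 0.0834), (4000, 0.0966), (6000, 0.107)]
```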

Framework Versions

  • Python: 3.11.14
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.2.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}

Model size: 0.4B params (Safetensors, tensor type F32)