Sparse Encoder

This is a Sparse Encoder model trained on the json dataset using the sentence-transformers library. It maps sentences and paragraphs to a 512-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

Model Details

Model Description

  • Model Type: Sparse Encoder
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 512 dimensions
  • Similarity Function: Colbert
  • Training Dataset:
    • json
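
The card does not spell out how the `Colbert` similarity is computed. As a minimal sketch, assuming it refers to ColBERT-style late interaction (often called MaxSim), the score sums, over query token vectors, the best dot product against any document token vector. The function name `maxsim_score` and the toy vectors are hypothetical and are not taken from this model's code.

```python
def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the best
    dot product against any document token vector, then sum those maxima."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy 3-dimensional token vectors (hypothetical values, not model output).
query = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]]
doc = [[0.5, 0.0, 0.0], [0.0, 1.0, 1.0], [2.0, 0.0, 0.0]]
score = maxsim_score(query, doc)
print(score)
```

When each text is represented by a single vector, this score reduces to a plain dot product between the two vectors.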

Model Sources

Full Model Architecture

SparseColbertEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'ModernBertForSparseColbert'})
  (1): SpladePooling({'pooling_strategy': 'mean', 'activation_function': 'log1p_relu', 'word_embedding_dimension': 512})
)
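
To make the `SpladePooling` settings concrete, here is a minimal sketch of a `log1p_relu` activation followed by `mean` pooling over tokens. It assumes the activation is applied to each token's logits before averaging; the card does not state the order of operations, and the helper names and toy logits are hypothetical.

```python
import math


def log1p_relu(x):
    """log(1 + max(x, 0)): non-positive logits map to exactly 0."""
    return math.log1p(max(x, 0.0))


def mean_pool(token_logits):
    """Apply the activation to each token's logits, then average over tokens."""
    n_tokens = len(token_logits)
    dims = len(token_logits[0])
    return [
        sum(log1p_relu(row[j]) for row in token_logits) / n_tokens
        for j in range(dims)
    ]


# Two tokens, four output dimensions (hypothetical logits).
logits = [
    [-1.0, 0.0, math.e - 1.0, 3.0],
    [-2.0, 0.0, math.e - 1.0, -3.0],
]
vec = mean_pool(logits)
print(vec)
```

Dimensions whose logits are never positive end up exactly zero, which is where the sparsity of the output vector comes from under this reading.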

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("sparse_encoder_model_id")
# Run inference
sentences = [
    "Ok, so if you want to step up your coffee game, you need to cut it out with the pre-ground beans and buy whole. And while you're at it, you might as well go with this burr grinder over a blade grinder. It's really not all that much more expensive.  So why a burr grinder? Better consistency. A blade grinder chops up the beans, but by nature can't really produce a consistent size for the grounds. And this will really mess with your coffee extraction. Maybe you're average coffee drinker wouldn't notice it, but then again, they're probably still buying pre-ground coffee. And you're not average, or you wouldn't be looking at this product,right?  Let's get one thing out of the way... this is by no stretch of the imagination a top end grinder. For that, you want to go with a conical, low speed grinder. And that's going to cost. A lot.  This grinder uses two flat serrated discs, a fixed distance apart, depending on how fine or course you want your grounds. Like I said, conical burrs are the preferred, but really disc grinders also do a fine job of crushing the beans. Basically, since the cones or discs are a fixed distance apart, the beans are crushed between them. The beans can't pass through to the collection bin until they are crushed to the size determined by the space between the cones/plates, and it's almost physically impossible for the beans to be crushed smaller (unlike a blade grinder, which can slice some beans to powder, and leave large chunks of others). I mean, that's just in case you were wondering how it all works.  Really where this unit suffers compared to the really high end models is the speed. This is a fast grinder. There's a potential that the speed can create enough friction to heat up the beans a little. Really, though, that's a small concern. The bigger issue is that the speed tends to create static, which causes a lot of mess from grounds clinging all over the machine.  You'll need to clean this thing. Often.  But, aside from that, this thing will give you a better brew. The ability to ground your beans as you need them will give you fresher coffee, and the consistent size of the grind will give you better extraction. And that's what counts, right?  Full disclosure: I'm on my second unit. The first one apparently had a defect in the screw that held on one of the discs. It failed and the unit was non-functional. I hopped on Cuisinart's web site, found this model, and submitted a repair request. They immediately shipped out a new unit, as well as a shipping label to return the defective one. So despite the inconvenience of a bad unit, I have to say that the customer service was flawless.",
    'UPDATE:  I\'m going to have to ding this a couple of stars due to the faulty electronics.  I put in the batteries (it takes 2) shortly before attending a Halloween party and at first all seemed well.  The "eyes" have two on modes:  Flashing and solid.  Both were working as expected (although I\'m not certain why anyone would opt for the flashing mode).  Because it\'s difficult to see when the lights are on, I never intended to leave them on the entire time.  So I switched them on when entering the party, then turned them off to socialize, only turning them on again if someone wanted a photo.  About half way into the party, the lights inexplicably came on into flashing mode.  I pressed the button and switched them to solid, and then pressed it again to switch them off.  About 2 minutes later, they came back on in flashing mode.  This time they wouldn\'t shut off.  I had to take out the batteries.  I had a spare set and tried putting those in, just in case my batteries were faulty.  Things seemed normal for about 15 minutes when the lights again came on into flashing mode.  Again it wouldn\'t switch off, and this time the batter compartment was slightly warm.  Taking this SECOND set of batteries out, I notice some dark heat discoloration.  It kind of makes me wonder what kind of a health hazard this thing might have been had I not caught that the switch circuit board was stressing the batteries.  Bottom line?  This product looks ok (and can look phenomenal with a bit of work), but don\'t trust the electronics in it.  They are dangerous.  ------------------------------  I\'ve read a couple of complaints about this product, so I wanted to address them:  1. Size.  Some reviews have said this runs small.  Taking note, I ordered the larger size, and I\'m glad that I did.  I think that had I gone with the smaller size (which is listed as more in the range of my hat size), I\'d have been very disappointed with the fit.  2. Distortion:  Many have complained about the packaging squishing the helmet.  It would seem that the manufacturer has been paying attention.  Mine arrived in box and had filler inside the helmet to try to retain the shape as much as possible.  There is *some* inevitable distortion from the box, but pretty minimal and easily corrected.  Overall, I have to say that I\'m pretty impressed with this product.  The plastic is flexible, but much sturdier than what I was expecting.  I do have to say, however, that my review and star rating is based on MANAGED expectations.  This is not a completely screen accurate product and it is not without its flaws.  But for what it is, and for the cost, I\'d say it\'s a reasonable costume piece either as is, or (as I intend) a base for further modification. Whether you intend to modify the helmet or not, one suggestion I have for everyone is to obtain some foam padding.  There are some pressure point areas that will likely become uncomfortable with extended wear.',
    'Okay, I like the Panasonic\'s Zs cameras. They\'re easy to get the hang of and I\'ve had no reliability issues. I started with a Panasonic Zs3. Later on I upgraded to a Zs7, then when I saw Amazon was selling a Zs9 which was the same as the Zs8 but with a stereo microphone and the price wasn\'t much over $100 (from Amazon Warehouse), I couldn\'t resist buying it for its 16x zoom. It\'s been a great camera. I love my Zs9 with it\'s long zoom despite it not having GPS (which I didn\'t care about) or HDMI out, and with a lower LCD screen resolution than the Zs7. I thought the screen clarity was fine and didn\'t miss the 460,000 dots on the Zs7.  Well, I couldn\'t resist this newer upgraded Zs. The Zs20 is my 4th Panasonic Lumix Zs camera (all of them purchased from Amazon), and I can say without reservation that this is the best of them all. I love the fast burst shooting made possible by the CMOS sensor.  The 16x zoom on the Zs8/9 is really great. I\'ve taken some fantastic close-up shots with it. Now I have a 20x zoom. Amazing. It\'s operates when shooting video too.  Canon\'s G series, by comparison, has a larger 1:1.7 sensor compared to the Zs20 1:2.33. But, the zoom is very limited on the Canon and the camera is too big to carry around in your pocket - and it cost twice as much (and the older unavailable G9 model is better than the later ones I hear.) Their competitor for this Panasonic is their 20x zoom model SX260SH with a similar 1:2.33 CMOS sensor. Professional reviewers are rating the Zs20 higher than the comparable Canon, which is slightly larger, heavier, and costs more. Everyone raves about Panasonic\'s Leica lenses, with good reason. The Zs20 is just an amazing camera for the price. It\'s even slightly smaller than the Zs 7,8/9/10 and easy to carry in your pocket in a thin stretch case. Like the Zs7 (which never cost less than about $250.00) it has the 460,000 dot bright LCD screen, HDMI out, and GPS. It also has more features than the Zs7 or 8/9/10.  Don\'t expect the price to get much lower. As I recall, the Zs10 (the real predecessor of the Zs20) price came down to about the Zs20 price today, or maybe at the very end, a few dollars less, but I wouldn\'t wait. Panasonic may refuse to lower the price any further. It\'s already so heavily discounted, and there is a limit to how low they\'ll go. If you want the absolute best deal, get a "used" one from Amazon Warehouse. It\'s hardly, if at all "used," and most likely brand new but returned for some reason and re-packaged. You have 30 days to send it back if there is anything wrong with it. You don\'t get the one year warranty however, so if that\'s important to you, but a new one. It\'s definitely worth the price. I already had an extra battery but you can buy a non-proprietary battery very inexpensively that lasts even longer than the battery that comes with the camera. I also had a wall charger (two - from my Zs7 and Zs9)), so I didn\'t have to buy that. I\'d say that\'s the only thing Panasonic let down on. It\'s worth getting a wall charger, but it should come with the camera. This is the first time it doesn\'t. Don\'t let it deter you.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[213.2979, 145.2629, 148.4550],
#         [145.3050, 219.8025, 145.6550],
#         [145.5939, 144.3196, 202.5328]])
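
For semantic search, the similarity matrix is usually turned into a ranking. The sketch below uses the scores printed above as plain Python lists and, for each text, orders the other two texts by score. The helper `rank_others` is a hypothetical name for illustration and is not part of the library.

```python
# Scores copied from the printed similarity matrix above.
similarities = [
    [213.2979, 145.2629, 148.4550],
    [145.3050, 219.8025, 145.6550],
    [145.5939, 144.3196, 202.5328],
]


def rank_others(scores, query_idx):
    """Return indices of the other texts, best match first."""
    others = [j for j in range(len(scores[query_idx])) if j != query_idx]
    return sorted(others, key=lambda j: scores[query_idx][j], reverse=True)


rankings = {i: rank_others(similarities, i) for i in range(len(similarities))}
print(rankings)
```

Each text scores highest against itself, so the diagonal is excluded before ranking.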

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 202,427 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
                anchor                  positive
    type        string                  string
    details     • min: 402 tokens       • min: 393 tokens
                • mean: 499.1 tokens    • mean: 499.22 tokens
                • max: 512 tokens       • max: 512 tokens
  • Samples:
    anchor: This knife, and theJ.A. Henckels Twin Four Star 3-Inch High Carbon Stainless-Steel Paring Knifeare the two best knives in the Henckels four star collection. There is something "just so" about them. They are just right, with all the various design parameters coming together to create a whole that is greater than the sum of the parts. This serrated utility knife works well in a great variety of applications. The five inch serrated blade is nicely thin (but still thick enough for good strength and rigidity) and shallow (i.e. not broad). I find it very useful for cutting pie or cake or brownies, as well as (of course) bread and tomatoes and many other vegetables. I've had this knife now for SIXTEEN YEARS, and it is still going strong, and still one of my favorites. However . . . you must SHARPEN this knife eventually. Like any other knife, it will go dull. NEVER HONE THIS KNIFE OR ANY OTHER SERRATED KNIFE! A sharpening steel is too large a diameter to be used on a serrated knife....
    positive: When I moved into my first (and current) house from my apartment, the previous owner had a Whirlpool (Ecodyne) WHER25 reverse osmosis system installed under the kitchen sink. I liked the water the system produced, but the flow control was misfunctioning, causing an annoying dripping sound that was almost constant. The installer (previous owner, not a plumber) had NOT made the common mistake of trimming out the flow control--which was the first thing I suspected. No, the problem, rather, was deformation of the thin rubber membranes (there are two) inside the head of the unit. I flipped them over (they are reversible) and this fixed the problem for a month or so, but it returned. I priced out new membranes/gaskets and flow control insert, with shipping, and decided that I should just start fresh with a whole new unit, since it was on a special sale locally and it would come with all new filters ($80 worth). I replaced just the head and all was well for a while. Then the tank stopp...

    anchor: The Good: Sawstop customer service is the best I have dealt with in years. When set up correctly it cuts sheet good like a dream. Only a panel saw would seem better. The adjustable stops are stout and easy to use. Great for repeat cuts. Sliding mechanism is very smooth The Bad: No postive stops - in my experience this borders on being a huge problem for two reasons. First, it is not easy to get the fence square to the blade if you want to be very accurate. On the best of days it takes me 5 minutes to get it close enough to make a 48" cut square. Without positive stops I have to square the sliding table fence every time it is bumped or removed. And, I remove it regularly as the sliding table fence sits close enough to the blade that almost all cuts over 48" using the regular saw fence demand the sliding table fence be removed or swung out of the way (if the cut is less than 48" the sliding table can be moved back with fence in place forming a little pocket to work within). The fence th...
    positive: The Bad (yes there are a lot of bad things even with 5 stars): One of the worst written non fiction books I have ever owned. I really don't care if one of the authors clients liked a sauce so well that she would eat it over kitty litter. I don't care to read 100s of testaments to how good the recipes are (they are pretty good). I just want to get on with the book. Prove the recipes are not good. Don't spend all those pages trying to convince me. It even backfired. I was sure they were going to be terrible are reading all the testaments. Get ready to cook. A lot. Get ready to do a lot of dishes. Have to plan ahead. Have to make lunches the night before often. The flax seed breakfast takes some work and time. Can't just whip it up. If you run out without having already prepared more you will find yourself without a breakfast. Terribly organized. It is not sequential. You will read something and then find out later you were not suppose to do it when you did unless you read the entire b...

    anchor: I've had a Samsung WB850 and I still have a Fuji F900EXR. Both are megazoom pocket cameras. I'm sold on pocket megazooms for reasons that I explained in my review of the WB850. I've put my big DSLRs away only for stuff where I need the features of a big DSLR and those times are becoming rarer. I changed from the WB850 for one reason. It is SLOW between shots. Super camera but it just took too much time between shots AND I wanted a camera that would shoot in raw format. The Fuji is fast between shots and it shoots in raw BUT it won't let you run the camera and charge the battery in the camera through the USB port. Therefore I always had to carry extra batteries. After owning the Fuji for a few months, I found that I really did not need raw pictures. I just never used the raw file, only the jpg file. If you don't know what the raw format is, then you most likely don't need raw. I saw the 9700 in a store and was super impressed with the zoom. There is a BIG difference between the 21X me...
    positive: The system works great! There are a few points that I would like to point out for installation. 1. Each controller will handle up to three doors or gates but you can add multiple controllers. I have 5 garage doors to control. I needed two controllers and three extra door sensors since each controller comes with one sensor. You must switch between controllers in the app to control each group of doors. The app does remember the last used controller. 2. The app will always default to Door #1 and you must swipe left or right to control Door #2 or #3. Therefore if there is one particular door that you use most, make that door #1. Each door can be labeled with a unique name and the name is what you will see in the app. I use the middle door mainly and I had to go back and rewire the middle door to the #1 terminal so that it would always show up as the default door and I did not need to swipe to select it. Name your controller the PERMANENT name that you want to call it initially. Once it...
  • Loss: model.SpladeMixedTopKLoss.SpladeColbertTopKLoss with these parameters:
    {
        "loss": "ColbertMultipleNegativesRankingLoss",
        "document_regularizer_weight": 0
    }
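
The loss class itself is not shown in this card. As a rough sketch, assuming `ColbertMultipleNegativesRankingLoss` follows the usual multiple-negatives ranking recipe, each anchor is scored against every positive in the batch and trained with cross-entropy so that its own positive wins. The function name and toy scores below are hypothetical, and any score scaling or sparsity regularizer is left out.

```python
import math


def multiple_negatives_ranking_loss(scores):
    """scores[i][j] is the similarity of anchor i to positive j.
    Row i is treated as a classification problem whose correct class is i."""
    losses = []
    for i, row in enumerate(scores):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in row))
        losses.append(log_sum_exp - row[i])
    return sum(losses) / len(losses)


# Hypothetical 2x2 batch: matching pairs on the diagonal score higher.
good = multiple_negatives_ranking_loss([[5.0, 1.0], [0.0, 4.0]])
# Uninformative scores give log(batch_size).
flat = multiple_negatives_ranking_loss([[1.0, 1.0], [1.0, 1.0]])
print(good, flat)
```

Under this reading, the other positives in a batch of 32 act as negatives for each anchor, so no explicit negative column is needed in the dataset.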
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • weight_decay: 0.01
  • num_train_epochs: 2
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.15
  • save_only_model: True
  • fp16: True
  • dataloader_num_workers: 8
  • gradient_checkpointing: True
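
The schedule implied by these settings can be worked out by hand. The sketch below derives the step counts from the dataset size and batch size reported in this card, then evaluates a linear-warmup, half-cosine-decay learning rate. It assumes a single device (with `gradient_accumulation_steps: 1` and `dataloader_drop_last: False` as listed under All Hyperparameters) and the usual warmup-then-cosine shape; `lr_at` is a hypothetical helper, not a library function.

```python
import math

NUM_SAMPLES = 202_427   # training samples (from the card)
BATCH_SIZE = 32         # per_device_train_batch_size
EPOCHS = 2              # num_train_epochs
BASE_LR = 5e-05         # learning_rate
WARMUP_RATIO = 0.15     # warmup_ratio

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step):
    """Linear warmup to BASE_LR, then half-cosine decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(steps_per_epoch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

The derived step count per epoch is consistent with the training logs, where step 20 corresponds to epoch 0.0032.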

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.15
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: True
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 8
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0032 20 28.6428
0.0063 40 26.2256
0.0095 60 26.0737
0.0126 80 22.2104
0.0158 100 17.9675
0.0190 120 14.3645
0.0221 140 9.8382
0.0253 160 6.1231
0.0285 180 4.5984
0.0316 200 3.6251
0.0348 220 3.3179
0.0379 240 2.8121
0.0411 260 2.3196
0.0443 280 2.1138
0.0474 300 2.1616
0.0506 320 2.3001
0.0537 340 1.7455
0.0569 360 1.7734
0.0601 380 1.7507
0.0632 400 1.8376
0.0664 420 1.6355
0.0696 440 1.6548
0.0727 460 1.7548
0.0759 480 1.7677
0.0790 500 1.7335
0.0822 520 1.6585
0.0854 540 1.7808
0.0885 560 1.5633
0.0917 580 1.5752
0.0948 600 1.6597
0.0980 620 1.4463
0.1012 640 1.6486
0.1043 660 1.8312
0.1075 680 1.6324
0.1107 700 1.5031
0.1138 720 1.5043
0.1170 740 1.7519
0.1201 760 1.5368
0.1233 780 1.5252
0.1265 800 1.6159
0.1296 820 1.7463
0.1328 840 1.8495
0.1359 860 1.7152
0.1391 880 1.6196
0.1423 900 1.5192
0.1454 920 1.7447
0.1486 940 1.6974
0.1518 960 1.5887
0.1549 980 1.4764
0.1581 1000 1.4227
0.1612 1020 1.3536
0.1644 1040 1.7506
0.1676 1060 1.5311
0.1707 1080 1.5044
0.1739 1100 1.3364
0.1770 1120 1.4623
0.1802 1140 1.4804
0.1834 1160 1.702
0.1865 1180 1.3781
0.1897 1200 1.3378
0.1929 1220 1.459
0.1960 1240 1.3585
0.1992 1260 1.3483
0.2023 1280 1.2617
0.2055 1300 1.3285
0.2087 1320 1.4407
0.2118 1340 1.2957
0.2150 1360 1.4965
0.2181 1380 1.2716
0.2213 1400 1.305
0.2245 1420 1.5987
0.2276 1440 1.9617
0.2308 1460 1.692
0.2340 1480 1.4688
0.2371 1500 1.2138
0.2403 1520 1.3798
0.2434 1540 1.2668
0.2466 1560 1.4434
0.2498 1580 1.4343
0.2529 1600 1.2343
0.2561 1620 1.3365
0.2592 1640 1.3023
0.2624 1660 1.4274
0.2656 1680 1.3786
0.2687 1700 1.4343
0.2719 1720 1.5181
0.2751 1740 1.1963
0.2782 1760 1.2356
0.2814 1780 1.23
0.2845 1800 1.3572
0.2877 1820 1.3385
0.2909 1840 1.3498
0.2940 1860 1.2505
0.2972 1880 1.3876
0.3003 1900 1.3779
0.3035 1920 1.2894
0.3067 1940 1.2486
0.3098 1960 1.2844
0.3130 1980 1.3135
0.3162 2000 1.1267
0.3193 2020 1.1558
0.3225 2040 1.3313
0.3256 2060 1.3522
0.3288 2080 1.2318
0.3320 2100 1.3701
0.3351 2120 1.1667
0.3383 2140 1.2692
0.3414 2160 1.2353
0.3446 2180 1.0708
0.3478 2200 1.2122
0.3509 2220 1.1419
0.3541 2240 1.2176
0.3573 2260 1.2348
0.3604 2280 1.234
0.3636 2300 1.2236
0.3667 2320 1.1314
0.3699 2340 1.2094
0.3731 2360 1.1324
0.3762 2380 1.1505
0.3794 2400 1.2998
0.3825 2420 1.1047
0.3857 2440 1.214
0.3889 2460 1.1673
0.3920 2480 1.0922
0.3952 2500 1.1501
0.3984 2520 1.1135
0.4015 2540 1.2597
0.4047 2560 1.1811
0.4078 2580 1.4975
0.4110 2600 1.2545
0.4142 2620 1.2265
0.4173 2640 1.2435
0.4205 2660 1.0913
0.4236 2680 1.1109
0.4268 2700 1.1235
0.4300 2720 1.2064
0.4331 2740 1.2203
0.4363 2760 1.1381
0.4395 2780 1.1552
0.4426 2800 1.246
0.4458 2820 1.1758
0.4489 2840 1.2303
0.4521 2860 1.1303
0.4553 2880 1.1296
0.4584 2900 1.1419
0.4616 2920 1.2288
0.4647 2940 1.1064
0.4679 2960 1.2217
0.4711 2980 1.1936
0.4742 3000 1.3667
0.4774 3020 1.373
0.4806 3040 1.1946
0.4837 3060 1.5584
0.4869 3080 1.2366
0.4900 3100 1.2799
0.4932 3120 1.286
0.4964 3140 1.1875
0.4995 3160 1.1452
0.5027 3180 1.2692
0.5058 3200 1.1087
0.5090 3220 1.379
0.5122 3240 1.0955
0.5153 3260 0.9732
0.5185 3280 1.3688
0.5217 3300 3.7253
0.5248 3320 9.729
0.5280 3340 8.5322
0.5311 3360 1.6707
0.5343 3380 2.5016
0.5375 3400 5.182
0.5406 3420 2.1036
0.5438 3440 1.5086
0.5469 3460 1.3835
0.5501 3480 1.3316
0.5533 3500 1.0839
0.5564 3520 1.1241
0.5596 3540 1.2075
0.5628 3560 1.326
0.5659 3580 1.2169
0.5691 3600 1.1474
0.5722 3620 1.228
0.5754 3640 1.0549
0.5786 3660 1.154
0.5817 3680 1.1328
0.5849 3700 1.1913
0.5880 3720 1.0713
0.5912 3740 1.1421
0.5944 3760 0.9968
0.5975 3780 1.0329
0.6007 3800 1.079
0.6039 3820 1.0308
0.6070 3840 1.1002
0.6102 3860 0.9787
0.6133 3880 1.0471
0.6165 3900 1.1687
0.6197 3920 2.0557
0.6228 3940 1.0667
0.6260 3960 1.1894
0.6291 3980 1.072
0.6323 4000 1.0059
0.6355 4020 0.9931
0.6386 4040 1.0642
0.6418 4060 1.074
0.6450 4080 1.9425
0.6481 4100 0.9978
0.6513 4120 1.087
0.6544 4140 1.0515
0.6576 4160 1.0739
0.6608 4180 1.1908
0.6639 4200 1.0785
0.6671 4220 0.9379
0.6702 4240 0.9539
0.6734 4260 1.0695
0.6766 4280 0.9849
0.6797 4300 1.2731
0.6829 4320 1.1422
0.6861 4340 1.1778
0.6892 4360 1.988
0.6924 4380 1.2742
0.6955 4400 1.1552
0.6987 4420 1.0634
0.7019 4440 1.1205
0.7050 4460 1.0362
0.7082 4480 0.9509
0.7113 4500 1.0206
0.7145 4520 1.1059
0.7177 4540 1.0915
0.7208 4560 1.3803
0.7240 4580 1.3414
0.7272 4600 1.785
0.7303 4620 0.93
0.7335 4640 1.0316
0.7366 4660 0.9974
0.7398 4680 1.7038
0.7430 4700 1.4334
0.7461 4720 6.8806
0.7493 4740 2.4809
0.7525 4760 1.0461
0.7556 4780 1.3042
0.7588 4800 1.8298
0.7619 4820 1.4291
0.7651 4840 1.3777
0.7683 4860 1.1557
0.7714 4880 1.181
0.7746 4900 1.0431
0.7777 4920 0.9924
0.7809 4940 1.2762
0.7841 4960 1.3096
0.7872 4980 1.2653
0.7904 5000 1.1159
0.7936 5020 1.3001
0.7967 5040 0.9852
0.7999 5060 1.2979
0.8030 5080 1.123
0.8062 5100 1.2087
0.8094 5120 0.9877
0.8125 5140 1.1369
0.8157 5160 1.5903
0.8188 5180 1.4377
0.8220 5200 1.0149
0.8252 5220 0.9692
0.8283 5240 1.0828
0.8315 5260 1.5313
0.8347 5280 0.9266
0.8378 5300 1.0082
0.8410 5320 1.0804
0.8441 5340 1.0393
0.8473 5360 1.0193
0.8505 5380 0.9763
0.8536 5400 1.7999
0.8568 5420 0.9753
0.8599 5440 0.8948
0.8631 5460 0.9001
0.8663 5480 1.2805
0.8694 5500 0.8856
0.8726 5520 0.9528
0.8758 5540 1.1261
0.8789 5560 1.0244
0.8821 5580 0.9389
0.8852 5600 1.1378
0.8884 5620 0.9005
0.8916 5640 1.0643
0.8947 5660 1.0409
0.8979 5680 1.1111
0.9010 5700 1.527
0.9042 5720 1.2022
0.9074 5740 1.134
0.9105 5760 1.1128
0.9137 5780 1.4697
0.9169 5800 1.1559
0.9200 5820 1.2828
0.9232 5840 1.2694
0.9263 5860 1.1258
0.9295 5880 1.1675
0.9327 5900 1.1709
0.9358 5920 1.5698
0.9390 5940 1.0853
0.9421 5960 1.4761
0.9453 5980 1.0478
0.9485 6000 0.9513
0.9516 6020 0.9381
0.9548 6040 1.0799
0.9580 6060 1.5161
0.9611 6080 1.0702
0.9643 6100 1.5374
0.9674 6120 1.524
0.9706 6140 1.0181
0.9738 6160 1.0289
0.9769 6180 1.0142
0.9801 6200 0.8989
0.9832 6220 0.9607
0.9864 6240 0.8816
0.9896 6260 0.9233
0.9927 6280 0.8896
0.9959 6300 1.0924
0.9991 6320 0.968
1.0022 6340 0.909
1.0054 6360 0.9127
1.0085 6380 0.9888
1.0117 6400 0.9214
1.0149 6420 1.0435
1.0180 6440 1.0115
1.0212 6460 0.9155
1.0243 6480 0.7896
1.0275 6500 0.8496
1.0307 6520 0.8769
1.0338 6540 0.8401
1.0370 6560 0.9762
1.0402 6580 0.8426
1.0433 6600 0.8695
1.0465 6620 0.8763
1.0496 6640 0.9235
1.0528 6660 0.881
1.0560 6680 0.9031
1.0591 6700 0.8607
1.0623 6720 0.8593
1.0654 6740 0.9486
1.0686 6760 0.9008
1.0718 6780 0.8607
1.0749 6800 0.9738
1.0781 6820 0.9142
1.0813 6840 0.9307
1.0844 6860 0.8854
1.0876 6880 0.8043
1.0907 6900 0.8476
1.0939 6920 0.811
1.0971 6940 0.8351
1.1002 6960 0.8359
1.1034 6980 0.859
1.1065 7000 0.9768
1.1097 7020 0.7727
1.1129 7040 0.8607
1.1160 7060 0.8446
1.1192 7080 1.0285
1.1224 7100 0.7571
1.1255 7120 0.7987
1.1287 7140 0.8789
1.1318 7160 0.8377
1.1350 7180 0.7203
1.1382 7200 0.8824
1.1413 7220 0.909
1.1445 7240 0.8797
1.1476 7260 0.7876
1.1508 7280 0.8024
1.1540 7300 0.8083
1.1571 7320 0.8453
1.1603 7340 0.844
1.1635 7360 0.84
1.1666 7380 0.8231
1.1698 7400 0.9652
1.1729 7420 0.8199
1.1761 7440 0.8569
1.1793 7460 0.8032
1.1824 7480 0.7358
1.1856 7500 0.8545
1.1887 7520 0.8115
1.1919 7540 0.8587
1.1951 7560 0.7829
1.1982 7580 0.8701
1.2014 7600 0.8066
1.2046 7620 0.8028
1.2077 7640 0.8269
1.2109 7660 0.8146
1.2140 7680 0.7742
1.2172 7700 0.8023
1.2204 7720 0.8261
1.2235 7740 0.8389
1.2267 7760 0.8576
1.2298 7780 0.764
1.2330 7800 0.9024
1.2362 7820 0.8104
1.2393 7840 0.769
1.2425 7860 0.7804
1.2457 7880 0.8267
1.2488 7900 0.7403
1.2520 7920 0.8371
1.2551 7940 0.965
1.2583 7960 0.8832
1.2615 7980 0.7371
1.2646 8000 0.8073
1.2678 8020 0.8241
1.2709 8040 0.7952
1.2741 8060 0.7955
1.2773 8080 0.8176
1.2804 8100 0.7168
1.2836 8120 0.7675
1.2868 8140 0.7554
1.2899 8160 0.8476
1.2931 8180 0.8156
1.2962 8200 0.7345
1.2994 8220 0.7445
1.3026 8240 0.8129
1.3057 8260 0.8321
1.3089 8280 0.777
1.3120 8300 0.7601
1.3152 8320 0.8386
1.3184 8340 0.8023
1.3215 8360 0.734
1.3247 8380 0.7604
1.3279 8400 0.7662
1.3310 8420 0.7875
1.3342 8440 0.7987
1.3373 8460 0.7414
1.3405 8480 0.801
1.3437 8500 0.7287
1.3468 8520 0.6786
1.3500 8540 0.7428
1.3531 8560 0.7375
1.3563 8580 0.689
1.3595 8600 0.8682
1.3626 8620 0.7152
1.3658 8640 0.8519
1.3690 8660 0.7737
1.3721 8680 0.7976
1.3753 8700 0.7806
1.3784 8720 0.8074
1.3816 8740 0.7799
1.3848 8760 0.7566
1.3879 8780 0.775
1.3911 8800 0.717
1.3942 8820 0.7135
1.3974 8840 0.8414
1.4006 8860 0.8132
1.4037 8880 0.712
1.4069 8900 0.7556
1.4101 8920 0.7766
1.4132 8940 0.8162
1.4164 8960 0.7816
1.4195 8980 0.7431
1.4227 9000 0.7273
1.4259 9020 0.7382
1.4290 9040 0.786
1.4322 9060 0.7608
1.4353 9080 0.7246
1.4385 9100 0.9673
1.4417 9120 0.7476
1.4448 9140 0.7798
1.4480 9160 0.7981
1.4512 9180 1.038
1.4543 9200 0.7107
1.4575 9220 0.7464
1.4606 9240 0.7481
1.4638 9260 0.734
1.4670 9280 0.8064
1.4701 9300 0.7194
1.4733 9320 0.7925
1.4764 9340 0.7638
1.4796 9360 1.0023
1.4828 9380 0.7646
1.4859 9400 0.6717
1.4891 9420 0.7554
1.4923 9440 0.7571
1.4954 9460 0.692
1.4986 9480 0.7567
1.5017 9500 0.7497
1.5049 9520 0.793
1.5081 9540 0.7369
1.5112 9560 0.7192
1.5144 9580 0.8147
1.5175 9600 1.0065
1.5207 9620 0.7092
1.5239 9640 0.7562
1.5270 9660 0.7591
1.5302 9680 0.7395
1.5334 9700 0.973
1.5365 9720 0.6733
1.5397 9740 0.7755
1.5428 9760 0.6654
1.5460 9780 0.7118
1.5492 9800 0.6827
1.5523 9820 0.9226
1.5555 9840 0.7468
1.5586 9860 0.7771
1.5618 9880 0.8062
1.5650 9900 0.7018
1.5681 9920 0.779
1.5713 9940 0.7385
1.5745 9960 0.7734
1.5776 9980 0.5872
1.5808 10000 0.7984
1.5839 10020 0.6556
1.5871 10040 0.763
1.5903 10060 0.6973
1.5934 10080 0.8943
1.5966 10100 0.6099
1.5997 10120 0.6872
1.6029 10140 0.6117
1.6061 10160 0.7191
1.6092 10180 0.6835
1.6124 10200 0.7652
1.6156 10220 0.6382
1.6187 10240 0.7626
1.6219 10260 0.6621
1.6250 10280 0.7596
1.6282 10300 0.6681
1.6314 10320 0.7242
1.6345 10340 0.8251
1.6377 10360 0.7695
1.6408 10380 0.6834
1.6440 10400 0.9807
1.6472 10420 0.664
1.6503 10440 0.6363
1.6535 10460 0.8276
1.6567 10480 0.7193
1.6598 10500 0.7666
1.6630 10520 0.7701
1.6661 10540 0.6138
1.6693 10560 0.766
1.6725 10580 0.7487
1.6756 10600 0.7803
1.6788 10620 0.7253
1.6819 10640 0.6903
1.6851 10660 0.7668
1.6883 10680 0.6539
1.6914 10700 0.7182
1.6946 10720 0.664
1.6978 10740 0.969
1.7009 10760 0.658
1.7041 10780 0.6847
1.7072 10800 0.6823
1.7104 10820 0.6856
1.7136 10840 0.9269
1.7167 10860 0.6424
1.7199 10880 0.7232
1.7230 10900 0.7651
1.7262 10920 0.7386
1.7294 10940 0.6512
1.7325 10960 0.6933
1.7357 10980 0.7798
1.7389 11000 1.0798
1.7420 11020 0.6725
1.7452 11040 0.6857
1.7483 11060 0.7417
1.7515 11080 0.6224
1.7547 11100 0.716
1.7578 11120 0.6733
1.7610 11140 0.6824
1.7641 11160 0.6968
1.7673 11180 0.7176
1.7705 11200 0.6751
1.7736 11220 0.7181
1.7768 11240 0.639
1.7800 11260 0.6679
1.7831 11280 0.8706
1.7863 11300 0.6419
1.7894 11320 0.6952
1.7926 11340 0.6709
1.7958 11360 0.6926
1.7989 11380 0.7631
1.8021 11400 0.7042
1.8052 11420 0.6538
1.8084 11440 0.894
1.8116 11460 0.6807
1.8147 11480 0.7875
1.8179 11500 0.6582
1.8211 11520 0.7407
1.8242 11540 0.7286
1.8274 11560 0.6443
1.8305 11580 0.7002
1.8337 11600 0.6918
1.8369 11620 0.7157
1.8400 11640 0.7565
1.8432 11660 0.663
1.8463 11680 0.6053
1.8495 11700 0.7206
1.8527 11720 0.6682
1.8558 11740 0.7064
1.8590 11760 0.73
1.8622 11780 0.7108
1.8653 11800 0.6975
1.8685 11820 0.7245
1.8716 11840 0.686
1.8748 11860 0.6269
1.8780 11880 0.6523
1.8811 11900 0.7276
1.8843 11920 0.695
1.8874 11940 0.678
1.8906 11960 0.6504
1.8938 11980 0.5766
1.8969 12000 0.6935
1.9001 12020 0.6321
1.9033 12040 0.6369
1.9064 12060 0.6187
1.9096 12080 0.7079
1.9127 12100 0.6413
1.9159 12120 0.639
1.9191 12140 0.716
1.9222 12160 0.6784
1.9254 12180 0.7079
1.9285 12200 0.6504
1.9317 12220 0.7201
1.9349 12240 0.7279
1.9380 12260 0.9232
1.9412 12280 0.6213
1.9444 12300 0.6959
1.9475 12320 0.7559
1.9507 12340 0.7514
1.9538 12360 0.6578
1.9570 12380 0.7104
1.9602 12400 0.6662
1.9633 12420 0.7136
1.9665 12440 0.6415
1.9696 12460 0.7226
1.9728 12480 0.7787
1.9760 12500 0.6803
1.9791 12520 0.6908
1.9823 12540 0.7203
1.9855 12560 0.6811
1.9886 12580 0.6963
1.9918 12600 0.714
1.9949 12620 0.7004
1.9981 12640 0.6596
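
The log contains a few short loss spikes, for example around steps 3300 to 3400. A small sketch for flagging them: it parses rows in the `Epoch Step Training Loss` layout used above and lists the steps whose loss exceeds a cutoff. The excerpt is copied from the table; the cutoff of 3.0 and the helper names are hypothetical choices for illustration.

```python
# Rows copied from the training log above (epoch, step, training loss).
LOG_EXCERPT = """\
0.5153 3260 0.9732
0.5185 3280 1.3688
0.5217 3300 3.7253
0.5248 3320 9.729
0.5280 3340 8.5322
0.5311 3360 1.6707
0.5343 3380 2.5016
0.5375 3400 5.182
0.5406 3420 2.1036
0.5438 3440 1.5086
"""

SPIKE_THRESHOLD = 3.0  # hypothetical cutoff, chosen by eye for this excerpt


def parse_rows(text):
    """Turn whitespace-separated log lines into (epoch, step, loss) tuples."""
    rows = []
    for line in text.splitlines():
        epoch, step, loss = line.split()
        rows.append((float(epoch), int(step), float(loss)))
    return rows


def find_spikes(rows, threshold):
    """Return the steps whose training loss is above the threshold."""
    return [step for _, step, loss in rows if loss > threshold]


rows = parse_rows(LOG_EXCERPT)
spike_steps = find_spikes(rows, SPIKE_THRESHOLD)
print(spike_steps)
```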

Framework Versions

  • Python: 3.11.14
  • Sentence Transformers: 5.1.1
  • Transformers: 4.57.1
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.11.0
  • Datasets: 4.2.0
  • Tokenizers: 0.22.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

FlopsLoss

@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}

Model size: 0.4B params (Safetensors, tensor type F32)