SentenceTransformer

This is a sentence-transformers model trained on the triplets dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • triplets

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NomicBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
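
Module (1) mean-pools the transformer's token embeddings into a single 768-dimensional sentence vector. Below is a minimal sketch of that pooling step, assuming the checkpoint loads through transformers' AutoModel (NomicBert ships custom modeling code, so trust_remote_code=True is assumed):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("lv12/esci-nomic-embed-text-v1_5_4")
model = AutoModel.from_pretrained("lv12/esci-nomic-embed-text-v1_5_4", trust_remote_code=True)

inputs = tokenizer(["search_query: karaoke machine"], return_tensors="pt")
with torch.no_grad():
    token_embeddings = model(**inputs)[0]  # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, masking out padding positions.
mask = inputs["attention_mask"].unsqueeze(-1).float()  # (batch, seq_len, 1)
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)  # torch.Size([1, 768])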

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (NomicBert's custom modeling code typically
# requires trust_remote_code=True)
model = SentenceTransformer("lv12/esci-nomic-embed-text-v1_5_4", trust_remote_code=True)
# Run inference
sentences = [
    'search_query: karoke set 2 microphone for adults',
    'search_document: Starion KS829-B Bluetooth Karaoke Machine l Pedestal Design w/Light Show l Two Karaoke Microphones, Starion, Black',
    'search_document: EARISE T26 Portable Karaoke Machine Bluetooth Speaker with Wireless Microphone, Rechargeable PA System with FM Radio, Audio Recording, Remote Control, Supports TF Card/USB, Perfect for Party, EARISE, ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
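
For retrieval-style usage, queries and documents keep the search_query: / search_document: prefixes seen above. A hedged sketch that ranks illustrative product titles against a query (the titles below are placeholders, not dataset entries):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("lv12/esci-nomic-embed-text-v1_5_4", trust_remote_code=True)

query = "search_query: wireless karaoke microphone"
documents = [
    "search_document: Bluetooth Karaoke Machine with Two Wireless Microphones",
    "search_document: USB Desk Lamp with Adjustable Brightness",
]

query_embedding = model.encode([query])
document_embeddings = model.encode(documents)

# model.similarity applies the model's similarity function (cosine here).
scores = model.similarity(query_embedding, document_embeddings)  # shape [1, 2]
best = int(scores.argmax())
print(documents[best], float(scores[0, best]))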

Evaluation

Metrics

Triplet

Metric Value
cosine_accuracy 0.7298
dot_accuracy 0.2832
manhattan_accuracy 0.7282
euclidean_accuracy 0.7299
max_accuracy 0.7299
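
Here cosine_accuracy is the fraction of evaluation triplets whose anchor is more similar to the positive than to the negative; the other rows swap in dot, Manhattan, or Euclidean similarity, and max_accuracy is the best of the four. A minimal sketch of the cosine variant over precomputed embeddings (array names are illustrative):

import numpy as np

def cosine_sim(a, b):
    # Row-wise cosine similarity between two (n, d) arrays.
    return (a * b).sum(-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

def triplet_cosine_accuracy(anchors, positives, negatives):
    # Fraction of triplets where the anchor is closer to the positive
    # than to the negative (the cosine_accuracy reported above).
    return float(np.mean(cosine_sim(anchors, positives) > cosine_sim(anchors, negatives)))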

Semantic Similarity

Metric Value
pearson_cosine 0.4148
spearman_cosine 0.3997
pearson_manhattan 0.3771
spearman_manhattan 0.3699
pearson_euclidean 0.3778
spearman_euclidean 0.3708
pearson_dot 0.3814
spearman_dot 0.3817
pearson_max 0.4148
spearman_max 0.3997
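
These rows are Pearson and Spearman correlations between the model's similarity scores and gold relevance labels, one pair per distance function. A small sketch of the cosine pair, assuming scipy and a placeholder gold_scores array:

from scipy.stats import pearsonr, spearmanr

def semantic_similarity_metrics(model_scores, gold_scores):
    # model_scores: cosine similarities predicted for sentence pairs;
    # gold_scores: human relevance labels (illustrative placeholder).
    return {
        "pearson_cosine": pearsonr(model_scores, gold_scores)[0],
        "spearman_cosine": spearmanr(model_scores, gold_scores)[0],
    }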

Information Retrieval

Metric Value
cosine_accuracy@10 0.967
cosine_precision@10 0.6951
cosine_recall@10 0.6217
cosine_ndcg@10 0.83
cosine_mrr@10 0.9111
cosine_map@10 0.7758
dot_accuracy@10 0.946
dot_precision@10 0.6369
dot_recall@10 0.5693
dot_ndcg@10 0.7669
dot_mrr@10 0.8754
dot_map@10 0.6962
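
A hedged sketch of reproducing these retrieval metrics with the library's InformationRetrievalEvaluator; the queries, corpus, and relevance mappings below are toy placeholders, not the actual evaluation split:

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("lv12/esci-nomic-embed-text-v1_5_4", trust_remote_code=True)

queries = {"q1": "search_query: karaoke set 2 microphones"}
corpus = {"d1": "search_document: Bluetooth Karaoke Machine with Two Microphones"}
relevant_docs = {"q1": {"d1"}}

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    accuracy_at_k=[10],
    precision_recall_at_k=[10],
    ndcg_at_k=[10],
    mrr_at_k=[10],
    map_at_k=[10],
)
print(evaluator(model))  # dict of accuracy/precision/recall/ndcg/mrr/map @10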

Training Details

Training Dataset

triplets

  • Dataset: triplets
  • Size: 1,600,000 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples (all three columns are strings):
    • anchor: min 7, mean 11.03, max 39 tokens
    • positive: min 10, mean 39.86, max 104 tokens
    • negative: min 9, mean 39.73, max 159 tokens
  • Samples:
    • anchor: search_query: udt hydraulic fluid
      positive: search_document: Triax Agra UTTO XL Synthetic Blend Tractor Transmission and Hydraulic Oil, 6,000 Hour Life, 50% Less wear, 36F Pour Point, Replaces All OEM Tractor Fluids (5 Gallon Pail), TRIAX,
      negative: search_document: Shell Rotella T5 Synthetic Blend 15W-40 Diesel Engine Oil (1-Gallon, Case of 3), Shell Rotella,
    • anchor: search_query: cheetah print iphone xs case
      positive: search_document: iPhone Xs Case, iPhone Xs Case,Doowear Leopard Cheetah Protective Cover Shell For Girls Women,Slim Fit Anti Scratch Shockproof Soft TPU Bumper Flexible Rubber Gel Silicone Case for iPhone Xs / X-1, Ebetterr, 1
      negative: search_document: iPhone Xs & iPhone X Case, J.west Luxury Sparkle Bling Translucent Leopard Print Soft Silicone Phone Case Cover for Girls Women Flex Slim Design Pattern Drop Protective Case for iPhone Xs/x 5.8 inch, J.west, Leopard
    • anchor: search_query: platform shoes
      positive: search_document: Teva Women's Flatform Universal Platform Sandal, Black, 5 M US, Teva, Black
      negative: search_document: Vans Women's Old Skool Platform Trainers, (Black/White Y28), 5 UK 38 EU, Vans, Black/White
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.8
    }
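
With TripletDistanceMetric.COSINE, the distance is d(x, y) = 1 - cos_sim(x, y), and each triplet is penalized until the positive is at least the margin (0.8) closer to the anchor than the negative. A minimal sketch of the objective:

import torch
import torch.nn.functional as F

def triplet_cosine_loss(anchor, positive, negative, margin=0.8):
    # d(x, y) = 1 - cosine_similarity(x, y), per TripletDistanceMetric.COSINE.
    d_pos = 1 - F.cosine_similarity(anchor, positive)
    d_neg = 1 - F.cosine_similarity(anchor, negative)
    # Hinge: zero loss once d_pos + margin <= d_neg for a triplet.
    return F.relu(d_pos - d_neg + margin).mean()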
    

Evaluation Dataset

triplets

  • Dataset: triplets
  • Size: 16,000 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples (all three columns are strings):
    • anchor: min 7, mean 11.02, max 29 tokens
    • positive: min 10, mean 38.78, max 87 tokens
    • negative: min 9, mean 38.81, max 91 tokens
  • Samples:
    • anchor: search_query: hogknobz
      positive: search_document: Black 2014-2015 HDsmallPARTS/LocEzy Saddlebag Mounting Hardware Knobs are replacement/compatible for Saddlebag Quick Release Pins on Harley Davidson Touring Motorcycles Theft Deterrent, LocEzy,
      negative: search_document: HANSWD Saddlebag Support Bars Brackets For SUZUKI YAMAHA KAWASAKI (Black), HANSWD, Black
    • anchor: search_query: tile sticker key finder
      positive: search_document: Tile Sticker (2020) 2-pack - Small, Adhesive Bluetooth Tracker, Item Locator and Finder for Remotes, Headphones, Gadgets and More, Tile,
      negative: search_document: Tile Pro Combo (2017) - 2 Pack (1 x Sport, 1 x Style) - Discontinued by Manufacturer, Tile, Graphite/Gold
    • anchor: search_query: adobe incense burner
      positive: search_document: AM Incense Burner Frankincense Resin - Luxury Globe Charcoal Bakhoor Burners for Office & Home Decor (Brown), AM, Brown
      negative: search_document: semli Large Incense Burner Backflow Incense Burner Holder Incense Stick Holder Home Office Decor, Semli,
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.8
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 16
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-07
  • num_train_epochs: 5
  • lr_scheduler_type: polynomial
  • lr_scheduler_kwargs: {'lr_end': 1e-08, 'power': 2.0}
  • warmup_ratio: 0.05
  • dataloader_drop_last: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 4
  • load_best_model_at_end: True
  • gradient_checkpointing: True
  • auto_find_batch_size: True
  • batch_sampler: no_duplicates
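
A hedged sketch of wiring these values into sentence-transformers v3 training arguments; output_dir is an illustrative placeholder, and the two strategy options are assumptions needed by load_best_model_at_end:

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/esci-nomic-embed-text",  # illustrative path
    per_device_train_batch_size=64,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,
    learning_rate=1e-7,
    num_train_epochs=5,
    lr_scheduler_type="polynomial",
    lr_scheduler_kwargs={"lr_end": 1e-8, "power": 2.0},
    warmup_ratio=0.05,
    dataloader_drop_last=True,
    dataloader_num_workers=4,
    dataloader_prefetch_factor=4,
    load_best_model_at_end=True,
    # load_best_model_at_end needs matching eval/save strategies (assumed here):
    evaluation_strategy="steps",
    save_strategy="steps",
    gradient_checkpointing=True,
    auto_find_batch_size=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)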

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • learning_rate: 1e-07
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: polynomial
  • lr_scheduler_kwargs: {'lr_end': 1e-08, 'power': 2.0}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: 4
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: True
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: True
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss triplets loss cosine_accuracy cosine_map@10 spearman_cosine
0.0008 10 0.7505 - - - -
0.0016 20 0.7499 - - - -
0.0024 30 0.7524 - - - -
0.0032 40 0.7486 - - - -
0.004 50 0.7493 - - - -
0.0048 60 0.7476 - - - -
0.0056 70 0.7483 - - - -
0.0064 80 0.7487 - - - -
0.0072 90 0.7496 - - - -
0.008 100 0.7515 0.7559 0.7263 0.7684 0.3941
0.0088 110 0.7523 - - - -
0.0096 120 0.7517 - - - -
0.0104 130 0.7534 - - - -
0.0112 140 0.746 - - - -
0.012 150 0.7528 - - - -
0.0128 160 0.7511 - - - -
0.0136 170 0.7491 - - - -
0.0144 180 0.752 - - - -
0.0152 190 0.7512 - - - -
0.016 200 0.7513 0.7557 0.7259 0.7688 0.3942
0.0168 210 0.7505 - - - -
0.0176 220 0.7481 - - - -
0.0184 230 0.7516 - - - -
0.0192 240 0.7504 - - - -
0.02 250 0.7498 - - - -
0.0208 260 0.7506 - - - -
0.0216 270 0.7486 - - - -
0.0224 280 0.7471 - - - -
0.0232 290 0.7511 - - - -
0.024 300 0.7506 0.7553 0.7258 0.7692 0.3943
0.0248 310 0.7485 - - - -
0.0256 320 0.7504 - - - -
0.0264 330 0.7456 - - - -
0.0272 340 0.7461 - - - -
0.028 350 0.7496 - - - -
0.0288 360 0.7518 - - - -
0.0296 370 0.7514 - - - -
0.0304 380 0.7479 - - - -
0.0312 390 0.7507 - - - -
0.032 400 0.7511 0.7547 0.7258 0.7695 0.3945
0.0328 410 0.7491 - - - -
0.0336 420 0.7487 - - - -
0.0344 430 0.7496 - - - -
0.0352 440 0.7464 - - - -
0.036 450 0.7518 - - - -
0.0368 460 0.7481 - - - -
0.0376 470 0.7493 - - - -
0.0384 480 0.753 - - - -
0.0392 490 0.7475 - - - -
0.04 500 0.7498 0.7540 0.7262 0.7700 0.3948
0.0408 510 0.7464 - - - -
0.0416 520 0.7506 - - - -
0.0424 530 0.747 - - - -
0.0432 540 0.7462 - - - -
0.044 550 0.75 - - - -
0.0448 560 0.7522 - - - -
0.0456 570 0.7452 - - - -
0.0464 580 0.7475 - - - -
0.0472 590 0.7507 - - - -
0.048 600 0.7494 0.7531 0.7269 0.7707 0.3951
0.0488 610 0.7525 - - - -
0.0496 620 0.7446 - - - -
0.0504 630 0.7457 - - - -
0.0512 640 0.7462 - - - -
0.052 650 0.7478 - - - -
0.0528 660 0.7459 - - - -
0.0536 670 0.7465 - - - -
0.0544 680 0.7495 - - - -
0.0552 690 0.7513 - - - -
0.056 700 0.7445 0.7520 0.7274 0.7705 0.3954
0.0568 710 0.7446 - - - -
0.0576 720 0.746 - - - -
0.0584 730 0.7452 - - - -
0.0592 740 0.7459 - - - -
0.06 750 0.7419 - - - -
0.0608 760 0.7462 - - - -
0.0616 770 0.7414 - - - -
0.0624 780 0.7444 - - - -
0.0632 790 0.7419 - - - -
0.064 800 0.7438 0.7508 0.7273 0.7712 0.3957
0.0648 810 0.7503 - - - -
0.0656 820 0.7402 - - - -
0.0664 830 0.7435 - - - -
0.0672 840 0.741 - - - -
0.068 850 0.7386 - - - -
0.0688 860 0.7416 - - - -
0.0696 870 0.7473 - - - -
0.0704 880 0.7438 - - - -
0.0712 890 0.7458 - - - -
0.072 900 0.7446 0.7494 0.7279 0.7718 0.3961
0.0728 910 0.7483 - - - -
0.0736 920 0.7458 - - - -
0.0744 930 0.7473 - - - -
0.0752 940 0.7431 - - - -
0.076 950 0.7428 - - - -
0.0768 960 0.7385 - - - -
0.0776 970 0.7438 - - - -
0.0784 980 0.7406 - - - -
0.0792 990 0.7426 - - - -
0.08 1000 0.7372 0.7478 0.7282 0.7725 0.3965
0.0808 1010 0.7396 - - - -
0.0816 1020 0.7398 - - - -
0.0824 1030 0.7376 - - - -
0.0832 1040 0.7417 - - - -
0.084 1050 0.7408 - - - -
0.0848 1060 0.7415 - - - -
0.0856 1070 0.7468 - - - -
0.0864 1080 0.7427 - - - -
0.0872 1090 0.7371 - - - -
0.088 1100 0.7375 0.7460 0.7279 0.7742 0.3970
0.0888 1110 0.7434 - - - -
0.0896 1120 0.7441 - - - -
0.0904 1130 0.7378 - - - -
0.0912 1140 0.735 - - - -
0.092 1150 0.739 - - - -
0.0928 1160 0.7408 - - - -
0.0936 1170 0.7346 - - - -
0.0944 1180 0.7389 - - - -
0.0952 1190 0.7367 - - - -
0.096 1200 0.7358 0.7440 0.729 0.7747 0.3975
0.0968 1210 0.7381 - - - -
0.0976 1220 0.7405 - - - -
0.0984 1230 0.7348 - - - -
0.0992 1240 0.737 - - - -
0.1 1250 0.7393 - - - -
0.1008 1260 0.7411 - - - -
0.1016 1270 0.7359 - - - -
0.1024 1280 0.7276 - - - -
0.1032 1290 0.7364 - - - -
0.104 1300 0.7333 0.7418 0.7293 0.7747 0.3979
0.1048 1310 0.7367 - - - -
0.1056 1320 0.7352 - - - -
0.1064 1330 0.7333 - - - -
0.1072 1340 0.737 - - - -
0.108 1350 0.7361 - - - -
0.1088 1360 0.7299 - - - -
0.1096 1370 0.7339 - - - -
0.1104 1380 0.7349 - - - -
0.1112 1390 0.7318 - - - -
0.112 1400 0.7336 0.7394 0.7292 0.7749 0.3983
0.1128 1410 0.7326 - - - -
0.1136 1420 0.7317 - - - -
0.1144 1430 0.7315 - - - -
0.1152 1440 0.7321 - - - -
0.116 1450 0.7284 - - - -
0.1168 1460 0.7308 - - - -
0.1176 1470 0.7287 - - - -
0.1184 1480 0.727 - - - -
0.1192 1490 0.7298 - - - -
0.12 1500 0.7306 0.7368 0.7301 0.7755 0.3988
0.1208 1510 0.7269 - - - -
0.1216 1520 0.7299 - - - -
0.1224 1530 0.7256 - - - -
0.1232 1540 0.721 - - - -
0.124 1550 0.7274 - - - -
0.1248 1560 0.7251 - - - -
0.1256 1570 0.7248 - - - -
0.1264 1580 0.7244 - - - -
0.1272 1590 0.7275 - - - -
0.128 1600 0.7264 0.7339 0.7298 0.7756 0.3991
0.1288 1610 0.7252 - - - -
0.1296 1620 0.7287 - - - -
0.1304 1630 0.7263 - - - -
0.1312 1640 0.7216 - - - -
0.132 1650 0.7231 - - - -
0.1328 1660 0.728 - - - -
0.1336 1670 0.7309 - - - -
0.1344 1680 0.7243 - - - -
0.1352 1690 0.7239 - - - -
0.136 1700 0.7219 0.7309 0.7302 0.7768 0.3994
0.1368 1710 0.7212 - - - -
0.1376 1720 0.7217 - - - -
0.1384 1730 0.7118 - - - -
0.1392 1740 0.7226 - - - -
0.14 1750 0.7185 - - - -
0.1408 1760 0.7228 - - - -
0.1416 1770 0.7257 - - - -
0.1424 1780 0.7177 - - - -
0.1432 1790 0.722 - - - -
0.144 1800 0.712 0.7276 0.7307 0.7763 0.3997
0.1448 1810 0.7193 - - - -
0.1456 1820 0.7138 - - - -
0.1464 1830 0.7171 - - - -
0.1472 1840 0.7191 - - - -
0.148 1850 0.7172 - - - -
0.1488 1860 0.7168 - - - -
0.1496 1870 0.7111 - - - -
0.1504 1880 0.7203 - - - -
0.1512 1890 0.7095 - - - -
0.152 1900 0.7064 0.7240 0.7301 0.7762 0.3998
0.1528 1910 0.7147 - - - -
0.1536 1920 0.7098 - - - -
0.1544 1930 0.7193 - - - -
0.1552 1940 0.7096 - - - -
0.156 1950 0.7107 - - - -
0.1568 1960 0.7146 - - - -
0.1576 1970 0.7106 - - - -
0.1584 1980 0.7079 - - - -
0.1592 1990 0.7097 - - - -
0.16 2000 0.71 0.7202 0.7298 0.7758 0.3997

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.0.1
  • Transformers: 4.38.2
  • PyTorch: 2.1.2+cu121
  • Accelerate: 0.27.2
  • Datasets: 2.19.1
  • Tokenizers: 0.15.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification}, 
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}