SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model fine-tuned from sentence-transformers/all-mpnet-base-v2 on the csv dataset (pairs of SNOMED CT expressions). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • csv

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
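
The three modules correspond to (0) the MPNet encoder with inputs truncated at 384 tokens, (1) attention-mask-aware mean pooling over the token embeddings, and (2) L2 normalization, so dot products of the outputs equal cosine similarities. Below is a minimal sketch of the equivalent computation using transformers directly (assuming the checkpoint loads with AutoModel/AutoTokenizer, as MPNet checkpoints do):

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "yyzheng00/all-mpnet-base-v2_snomed_expression"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

encoded = tokenizer(
    ["Neoplasm of uncertain behavior of lateral wall of nasopharynx (disorder)"],
    padding=True, truncation=True, max_length=384, return_tensors="pt",
)
with torch.no_grad():
    token_embeddings = encoder(**encoded).last_hidden_state  # (batch, seq_len, 768)

# (1) Pooling: mean over non-padding tokens only
mask = encoded["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: unit-length vectors, so dot product == cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])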

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("yyzheng00/all-mpnet-base-v2_snomed_expression")
# Run inference
sentences = [
    '|Neoplasm of anterior wall of nasopharynx (disorder)| + |Neoplasm of uncertain behavior of nasopharynx (disorder)| : { |Finding site (attribute)| = |Structure of anterior wall of nasopharynx (body structure)|, |Associated morphology (attribute)| = |Neoplasm of uncertain behavior (morphologic abnormality)| }',
    'Neoplasm of uncertain behavior of lateral wall of nasopharynx (disorder)',
    'Secondary angle-closure glaucoma - synechial (disorder)',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
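
Because the final Normalize module produces unit-length embeddings, cosine similarity reduces to a dot product, which makes the model convenient for semantic search. A hedged sketch of that use case with the library's util.semantic_search (the strings are reused from the example above; any query and corpus work):

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("yyzheng00/all-mpnet-base-v2_snomed_expression")

query_embedding = model.encode("Neoplasm of uncertain behavior of nasopharynx (disorder)")
corpus_embeddings = model.encode([
    "Neoplasm of uncertain behavior of lateral wall of nasopharynx (disorder)",
    "Secondary angle-closure glaucoma - synechial (disorder)",
])

# Rank corpus entries by cosine similarity to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
print(hits[0])  # [{'corpus_id': ..., 'score': ...}, ...] sorted by score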

Evaluation

Metrics

Semantic Similarity

  • Dataset: sts-dev
  • Evaluated with EmbeddingSimilarityEvaluator

Metric           Value
pearson_cosine   0.9049
spearman_cosine  0.8556
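
These are the Pearson and Spearman correlations between the cosine similarity of each embedding pair and its gold label on the sts-dev split. A minimal sketch of recomputing them with EmbeddingSimilarityEvaluator (the three pairs and scores below are illustrative placeholders, not the actual dev data):

from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator, SimilarityFunction

model = SentenceTransformer("yyzheng00/all-mpnet-base-v2_snomed_expression")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=[
        "Neoplasm of anterior wall of nasopharynx (disorder)",
        "Neoplasm of anterior wall of nasopharynx (disorder)",
        "Secondary angle-closure glaucoma - synechial (disorder)",
    ],
    sentences2=[
        "Neoplasm of uncertain behavior of nasopharynx (disorder)",
        "Neoplasm of uncertain behavior of lateral wall of nasopharynx (disorder)",
        "Neoplasm of uncertain behavior of nasopharynx (disorder)",
    ],
    scores=[1.0, 1.0, 0.0],  # gold similarity labels in [0, 1]
    main_similarity=SimilarityFunction.COSINE,
    name="sts-dev",
)
results = evaluator(model)
print(results)  # keys include 'sts-dev_pearson_cosine' and 'sts-dev_spearman_cosine'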

Training Details

Training Dataset

csv

  • Dataset: csv
  • Size: 360,886 training samples
  • Columns: text_a, text_b, and label
  • Approximate statistics based on the first 1000 samples:

             text_a               text_b              label
    type     string               string              int
    details  min: 28 tokens       min: 7 tokens       0: ~51.40%
             mean: 101.13 tokens  mean: 15.29 tokens  1: ~48.60%
             max: 357 tokens      max: 60 tokens

  • Samples (truncated; only the start of each text_a is shown):
    • Risk assessment (procedure) : { …
    • Chronic inflammatory disorder (disorder) + …
    • Imaging of head (procedure) + …
  • Loss: CoSENTLoss with these parameters (a training sketch follows below):
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
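
A hedged sketch of wiring the dataset and loss together (the CSV path is a placeholder; the actual training file is not distributed with this card):

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Placeholder path; expects the text_a, text_b, label columns described above
train_dataset = load_dataset("csv", data_files="train.csv", split="train")

# scale=20.0 with pairwise cosine similarity, matching the parameters above
loss = CoSENTLoss(model, scale=20.0)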
    

Evaluation Dataset

csv

  • Dataset: csv
  • Size: 360,886 evaluation samples
  • Columns: text_a, text_b, and label
  • Approximate statistics based on the first 1000 samples:

             text_a               text_b              label
    type     string               string              int
    details  min: 25 tokens       min: 7 tokens       0: ~51.30%
             mean: 101.18 tokens  mean: 15.21 tokens  1: ~48.70%
             max: 366 tokens      max: 52 tokens

  • Samples (truncated; only the start of each text_a is shown):
    • Disorder of fetal abdominal region (disorder) + …
    • Computed tomography of pelvis for brachytherapy planning (procedure) + …
    • Product containing only hydroxyzine in oral dose form (medicinal product form) : …
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine
0.0055 100 5.2922 3.9427 0.6159
0.0111 200 3.2766 2.8638 0.7437
0.0166 300 2.8445 2.4816 0.7833
0.0222 400 2.5209 2.2995 0.7974
0.0277 500 2.5298 2.1033 0.8072
0.0333 600 2.0427 2.1055 0.8114
0.0388 700 2.1367 2.0634 0.8121
0.0443 800 2.2486 1.7848 0.8210
0.0499 900 1.921 1.9666 0.8190
0.0554 1000 1.9962 1.9688 0.8180
0.0610 1100 1.5203 2.0695 0.8187
0.0665 1200 2.0616 1.7060 0.8223
0.0720 1300 2.0793 1.8158 0.8254
0.0776 1400 2.0766 1.8549 0.8213
0.0831 1500 1.5608 1.8045 0.8241
0.0887 1600 1.7671 1.9724 0.8196
0.0942 1700 2.1665 2.2623 0.8033
0.0998 1800 1.9596 1.8070 0.8224
0.1053 1900 1.5704 1.8142 0.8265
0.1108 2000 2.0749 2.0596 0.8205
0.1164 2100 1.9445 1.7458 0.8279
0.1219 2200 1.6043 2.0309 0.8242
0.1275 2300 1.5723 1.7440 0.8286
0.1330 2400 1.7905 1.5584 0.8319
0.1385 2500 2.0777 1.7437 0.8254
0.1441 2600 1.7563 1.6852 0.8322
0.1496 2700 1.6565 1.8196 0.8268
0.1552 2800 1.5064 1.6763 0.8302
0.1607 2900 1.9221 1.7317 0.8279
0.1663 3000 1.7803 1.8330 0.8225
0.1718 3100 1.3559 1.9419 0.8278
0.1773 3200 1.5309 1.5263 0.8345
0.1829 3300 1.6429 1.7952 0.8290
0.1884 3400 1.4676 1.8284 0.8270
0.1940 3500 1.5167 1.6084 0.8295
0.1995 3600 1.7605 1.6362 0.8334
0.2050 3700 1.6812 1.4205 0.8348
0.2106 3800 1.4537 1.6432 0.8341
0.2161 3900 1.6718 1.2594 0.8382
0.2217 4000 1.3892 1.4798 0.8351
0.2272 4100 1.7261 1.3948 0.8354
0.2328 4200 1.6611 1.4519 0.8368
0.2383 4300 1.3181 1.2844 0.8389
0.2438 4400 1.4356 1.3015 0.8392
0.2494 4500 1.4077 1.3217 0.8381
0.2549 4600 1.2534 1.5767 0.8340
0.2605 4700 1.6881 1.2737 0.8398
0.2660 4800 1.4572 1.2570 0.8408
0.2715 4900 1.2339 1.1919 0.8423
0.2771 5000 1.2871 1.3166 0.8398
0.2826 5100 1.3532 1.4045 0.8360
0.2882 5200 1.2731 1.4843 0.8384
0.2937 5300 1.3776 1.1347 0.8423
0.2993 5400 1.2179 1.5040 0.8373
0.3048 5500 1.41 1.2401 0.8418
0.3103 5600 1.3901 1.1494 0.8416
0.3159 5700 1.4007 1.2487 0.8414
0.3214 5800 1.3444 1.4062 0.8397
0.3270 5900 1.3671 1.3194 0.8410
0.3325 6000 1.2401 1.2642 0.8411
0.3380 6100 1.4102 1.3317 0.8392
0.3436 6200 1.1672 1.0846 0.8438
0.3491 6300 1.3595 1.2747 0.8387
0.3547 6400 1.0956 1.4071 0.8392
0.3602 6500 1.539 1.2683 0.8413
0.3658 6600 1.3078 1.2173 0.8430
0.3713 6700 1.3562 1.0733 0.8447
0.3768 6800 1.3009 1.3561 0.8408
0.3824 6900 1.4319 1.1958 0.8432
0.3879 7000 1.0702 1.1325 0.8437
0.3935 7100 1.2339 0.9852 0.8465
0.3990 7200 0.8772 1.2658 0.8419
0.4045 7300 1.3411 1.1585 0.8438
0.4101 7400 1.1518 1.1572 0.8439
0.4156 7500 1.0287 0.9960 0.8456
0.4212 7600 1.2913 1.1595 0.8437
0.4267 7700 1.1006 1.1575 0.8437
0.4323 7800 1.3463 1.0478 0.8459
0.4378 7900 1.0428 1.0495 0.8461
0.4433 8000 1.0657 1.0442 0.8465
0.4489 8100 1.1002 1.0223 0.8475
0.4544 8200 1.1596 1.0066 0.8474
0.4600 8300 1.3218 1.0403 0.8460
0.4655 8400 1.1482 1.1177 0.8457
0.4710 8500 1.0033 1.1743 0.8448
0.4766 8600 1.0772 1.1071 0.8464
0.4821 8700 0.775 1.2731 0.8438
0.4877 8800 0.8859 0.9293 0.8491
0.4932 8900 0.7837 1.0760 0.8462
0.4988 9000 0.7768 1.0135 0.8470
0.5043 9100 1.0103 0.9691 0.8477
0.5098 9200 1.0219 1.2059 0.8441
0.5154 9300 0.9093 1.0895 0.8461
0.5209 9400 1.0176 0.9229 0.8489
0.5265 9500 1.3811 0.9470 0.8483
0.5320 9600 0.8338 1.0048 0.8477
0.5375 9700 0.7105 1.0591 0.8464
0.5431 9800 1.0313 0.9789 0.8482
0.5486 9900 1.0308 0.8741 0.8499
0.5542 10000 0.7353 0.9419 0.8482
0.5597 10100 0.7683 1.0695 0.8473
0.5653 10200 1.1728 0.9705 0.8494
0.5708 10300 0.8578 0.9633 0.8493
0.5763 10400 1.0095 0.7799 0.8514
0.5819 10500 1.0157 1.0333 0.8485
0.5874 10600 0.8164 0.8596 0.8509
0.5930 10700 0.9278 0.8256 0.8516
0.5985 10800 0.5919 1.0104 0.8493
0.6040 10900 0.6931 0.9957 0.8492
0.6096 11000 1.1545 0.9758 0.8494
0.6151 11100 1.1061 1.0360 0.8493
0.6207 11200 0.7954 0.9362 0.8509
0.6262 11300 0.6365 0.9504 0.8511
0.6318 11400 0.992 0.8553 0.8521
0.6373 11500 0.6971 0.8763 0.8520
0.6428 11600 0.8162 0.9527 0.8504
0.6484 11700 0.8973 0.8722 0.8519
0.6539 11800 0.7652 0.9417 0.8510
0.6595 11900 0.7305 0.8955 0.8519
0.6650 12000 0.8555 0.9007 0.8510
0.6705 12100 0.7165 0.7924 0.8530
0.6761 12200 0.7939 0.8607 0.8516
0.6816 12300 0.9873 0.7780 0.8533
0.6872 12400 0.7197 0.9380 0.8508
0.6927 12500 1.076 0.8041 0.8531
0.6983 12600 0.6853 0.8800 0.8517
0.7038 12700 0.9403 0.8181 0.8527
0.7093 12800 0.8598 0.7641 0.8536
0.7149 12900 0.628 0.7479 0.8540
0.7204 13000 1.0517 0.7611 0.8536
0.7260 13100 0.5099 0.8426 0.8521
0.7315 13200 0.751 0.8133 0.8526
0.7370 13300 0.572 0.8344 0.8524
0.7426 13400 0.8213 0.7869 0.8528
0.7481 13500 0.6046 0.7810 0.8528
0.7537 13600 0.7211 0.7502 0.8537
0.7592 13700 0.7443 0.7398 0.8538
0.7648 13800 0.6644 0.8257 0.8529
0.7703 13900 0.8948 0.7271 0.8536
0.7758 14000 0.6886 0.7607 0.8531
0.7814 14100 0.8322 0.7143 0.8540
0.7869 14200 0.6965 0.7270 0.8540
0.7925 14300 0.6478 0.7368 0.8541
0.7980 14400 0.6877 0.7690 0.8532
0.8035 14500 0.6289 0.7316 0.8538
0.8091 14600 0.9058 0.6514 0.8548
0.8146 14700 0.5971 0.6980 0.8542
0.8202 14800 0.5774 0.7124 0.8539
0.8257 14900 0.6134 0.7480 0.8534
0.8313 15000 0.6962 0.6284 0.8551
0.8368 15100 0.5934 0.7099 0.8540
0.8423 15200 0.7791 0.6925 0.8542
0.8479 15300 0.5418 0.6774 0.8544
0.8534 15400 0.7526 0.6380 0.8552
0.8590 15500 0.694 0.6967 0.8543
0.8645 15600 0.5813 0.6864 0.8543
0.8700 15700 0.726 0.6325 0.8552
0.8756 15800 0.5094 0.6491 0.8549
0.8811 15900 0.5728 0.6549 0.8549
0.8867 16000 0.5272 0.6723 0.8548
0.8922 16100 0.6896 0.6786 0.8546
0.8978 16200 0.5666 0.6629 0.8550
0.9033 16300 0.7312 0.6801 0.8549
0.9088 16400 0.6451 0.6779 0.8549
0.9144 16500 0.6572 0.6374 0.8556
0.9199 16600 0.5052 0.6672 0.8551
0.9255 16700 0.5395 0.6686 0.8550
0.9310 16800 0.4715 0.6840 0.8547
0.9365 16900 0.7149 0.6576 0.8552
0.9421 17000 0.5066 0.6533 0.8553
0.9476 17100 0.6382 0.6509 0.8552
0.9532 17200 0.5585 0.6729 0.8550
0.9587 17300 0.5953 0.6505 0.8554
0.9643 17400 0.3545 0.6487 0.8555
0.9698 17500 0.8031 0.6451 0.8555
0.9753 17600 0.8531 0.6366 0.8557
0.9809 17700 0.7154 0.6365 0.8557
0.9864 17800 0.3339 0.6339 0.8557
0.9920 17900 0.5858 0.6410 0.8556
0.9975 18000 0.7509 0.6400 0.8556

Framework Versions

  • Python: 3.11.1
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.1.1+cu121
  • Accelerate: 1.2.0
  • Datasets: 2.18.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}