language: []
library_name: sentence-transformers
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - dataset_size:10K<n<100K
  - loss:CosineSimilarityLoss
base_model: pierreinalco/distilbert-base-uncased-sts
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
widget:
  - source_sentence: '[SYNTAX] Inversion is a common syntactic feature in questions.'
    sentences:
      - >-
        [SYNTAX] DNA transcription is a common biological mechanism regulating
        RNA synthesis.
      - >-
        [SYNTAX] Fermions and bosons are the two broad categories of subatomic
      - >-
        Extensive legislative debate is often required when amending existing
        public policies.
  - source_sentence: The examination of meaning in language is known as semantics.
    sentences:
      - 'Semantics is the study of meaning in language. '
      - >-
        [SYNTAX] Extreme weather events are becoming more frequent due to
        climate change.
      - >-
        Regular practice is essential to ensure the success of musical
  - source_sentence: Marine life thrives in ecosystems teeming with diverse species.
    sentences:
      - Climate change modifies the balance of ecosystems around the globe.
      - >-
        [SYNTAX] One key focus of archaeology is the exploration of ancient
      - >-
        By examining butcher marks, scientists can infer ancient dietary
  - source_sentence: Bicyclists rode swiftly in the park while a gentle breeze blew.
    sentences:
      - >-
        Urban parks offer residents vital green spaces for recreation and
      - >-
        Contour farming follows the natural shape of the land to improve water
      - >-
        Skipping breakfast can affect your energy levels and concentration
        throughout the day.
  - source_sentence: Fossil fuel reserves are finite and will eventually be depleted.
    sentences:
      - >-
        Trace fossils, like footprints and burrows, reveal the behavior of
        ancient organisms.
      - >-
        Electric trains are more environmentally friendly compared to
        diesel-powered ones.
      - >-
        A declining atmospheric pressure frequently indicates the imminent
        arrival of a storm.
pipeline_tag: sentence-similarity
model-index:
  - name: SentenceTransformer based on pierreinalco/distilbert-base-uncased-sts
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: custom dev
          type: custom-dev
        metrics:
          - type: pearson_cosine
            value: 0.9199550350229381
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8477353426901187
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.922270207368092
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.8455601721195604
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9225814550760436
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8455566196441302
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.9112758242260417
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8381909699446571
            name: Spearman Dot
          - type: pearson_max
            value: 0.9225814550760436
            name: Pearson Max
          - type: spearman_max
            value: 0.8477353426901187
            name: Spearman Max
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: custom test
          type: custom-test
        metrics:
          - type: pearson_cosine
            value: 0.9124658569127322
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.8453565105014698
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.9161256101176948
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.845382323611419
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.9165265409472989
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.8457233262812305
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.903021036040846
            name: Pearson Dot
          - type: spearman_dot
            value: 0.8319052098219432
            name: Spearman Dot
          - type: pearson_max
            value: 0.9165265409472989
            name: Pearson Max
          - type: spearman_max
            value: 0.8457233262812305
            name: Spearman Max

SentenceTransformer based on pierreinalco/distilbert-base-uncased-sts

This is a sentence-transformers model finetuned from pierreinalco/distilbert-base-uncased-sts. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

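Model Description

  • Model Type: Sentence Transformer
  • Base model: pierreinalco/distilbert-base-uncased-sts
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions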

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
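
In the Pooling module above, only pooling_mode_mean_tokens is enabled, so each sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padding positions are excluded through an attention mask. The function name mean_pool and the toy vectors are illustrative and are not part of the library.

```python
from typing import List, Sequence


def mean_pool(token_embeddings: Sequence[Sequence[float]],
              attention_mask: Sequence[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy example: three 2-dimensional token vectors; the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)
```

In the real model the token vectors have 768 dimensions, which is why the resulting sentence embedding is 768-dimensional.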


Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Fossil fuel reserves are finite and will eventually be depleted.',
    'Trace fossils, like footprints and burrows, reveal the behavior of ancient organisms.',
    'Electric trains are more environmentally friendly compared to diesel-powered ones.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
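
The [3, 3] result above is a pairwise score matrix: entry (i, j) compares sentence i with sentence j. This card does not state which similarity function the model is configured with, so the sketch below assumes cosine similarity and uses hypothetical 3-dimensional vectors in place of real 768-dimensional embeddings. It is an illustration of the computation, not the library's implementation.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Compare every embedding with every other embedding."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]


# Hypothetical toy vectors standing in for model.encode(...) output.
toy_embeddings = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [3.0, 3.0, 0.0]]
scores = similarity_matrix(toy_embeddings)
for row in scores:
    print([round(value, 4) for value in row])
```

The diagonal holds each sentence compared with itself, and the matrix is symmetric, so only the entries above the diagonal carry new information.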



Semantic Similarity

  • Dataset: custom-dev

Metric Value
pearson_cosine 0.92
spearman_cosine 0.8477
pearson_manhattan 0.9223
spearman_manhattan 0.8456
pearson_euclidean 0.9226
spearman_euclidean 0.8456
pearson_dot 0.9113
spearman_dot 0.8382
pearson_max 0.9226
spearman_max 0.8477
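
Each metric name pairs a correlation type (Pearson or Spearman) with the score used to compare the two embeddings (cosine, Manhattan, Euclidean, or dot product). The sketch below shows how the two correlations differ, using hypothetical predicted scores and binary gold labels. The helper names are illustrative, and this is not the evaluator's own code.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Linear correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation computed on ranks instead of raw values."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical similarity scores and gold labels for four sentence pairs.
predicted = [0.95, 0.10, 0.80, 0.30]
gold = [1, 0, 1, 0]
print(round(pearson(predicted, gold), 4), round(spearman(predicted, gold), 4))
```

Because the labels in this dataset are binary, many gold values are tied, which is why the tie-aware ranking matters for the Spearman figures.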

Semantic Similarity

  • Dataset: custom-test

Metric Value
pearson_cosine 0.9125
spearman_cosine 0.8454
pearson_manhattan 0.9161
spearman_manhattan 0.8454
pearson_euclidean 0.9165
spearman_euclidean 0.8457
pearson_dot 0.903
spearman_dot 0.8319
pearson_max 0.9165
spearman_max 0.8457
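
The pearson_max and spearman_max rows appear to be the largest value among the four similarity functions for each correlation type. The short check below uses only the rounded numbers reported in the two tables of this card and confirms that reading; the dictionary layout is illustrative.

```python
# Rounded values copied from the custom-dev and custom-test tables.
reported = {
    "custom-dev": {
        "pearson": {"cosine": 0.92, "manhattan": 0.9223, "euclidean": 0.9226, "dot": 0.9113},
        "spearman": {"cosine": 0.8477, "manhattan": 0.8456, "euclidean": 0.8456, "dot": 0.8382},
    },
    "custom-test": {
        "pearson": {"cosine": 0.9125, "manhattan": 0.9161, "euclidean": 0.9165, "dot": 0.903},
        "spearman": {"cosine": 0.8454, "manhattan": 0.8454, "euclidean": 0.8457, "dot": 0.8319},
    },
}

# For each split and correlation type, find the best similarity function.
best = {
    split: {
        kind: max(values.items(), key=lambda item: item[1])
        for kind, values in kinds.items()
    }
    for split, kinds in reported.items()
}

for split, kinds in best.items():
    for kind, (name, value) in kinds.items():
        print(f"{split} {kind}_max = {value} ({name})")
```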

Training Details

Training Dataset

Unnamed Dataset

  • Size: 19,352 training samples
  • Columns: s1, s2, and label
  • Approximate statistics based on the first 1000 samples:
    • s1 (string): min 10 tokens, mean 19.85 tokens, max 38 tokens
    • s2 (string): min 11 tokens, mean 20.47 tokens, max 34 tokens
    • label (int): 0: ~51.40%, 1: ~48.60%
  • Samples:
    • s1: Resources and funding are essential for the successful rollout of any new curriculum.
      s2: For any new curriculum to be successfully rolled out, it is essential to have resources and funding.
      label: 1
    • s1: Upgrading to LED lighting is a simple step toward improving energy efficiency in buildings.
      s2: Upgrading to new software is a simple step toward improving technology adoption in companies.
      label: 0
    • s1: Ethnicity and language often intersect in interesting and complex ways.
      s2: Ethnicity and culture often diverge in unexpected and straightforward ways.
      label: 0
  • Loss: CosineSimilarityLoss with these parameters:
        {
            "loss_fct": "torch.nn.modules.loss.MSELoss"
        }

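With loss_fct set to MSELoss, CosineSimilarityLoss penalizes the squared difference between the cosine similarity of the two sentence embeddings and the gold label. The following is a minimal pure-Python sketch of that objective under the assumption of mean reduction over the batch. The function names and the 2-dimensional toy embeddings are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cosine_similarity_mse(pairs: Sequence[Tuple[Vector, Vector]],
                          labels: Sequence[float]) -> float:
    """Mean squared error between cosine(u, v) and the gold label."""
    errors = [(cosine(u, v) - label) ** 2 for (u, v), label in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Toy batch: one pair of identical vectors and one pair of orthogonal vectors.
batch = [([1.0, 0.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
print(cosine_similarity_mse(batch, [1.0, 0.0]))  # labels agree with the geometry
print(cosine_similarity_mse(batch, [0.0, 1.0]))  # labels contradict the geometry
```

With the 0/1 labels used in this dataset, the objective pushes pairs labeled 1 toward a cosine similarity of 1 and pairs labeled 0 toward a cosine similarity of 0.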
Evaluation Dataset

Unnamed Dataset

  • Size: 2,419 evaluation samples
  • Columns: s1, s2, and label
  • Approximate statistics based on the first 1000 samples:
    • s1 (string): min 10 tokens, mean 19.91 tokens, max 39 tokens
    • s2 (string): min 11 tokens, mean 20.41 tokens, max 38 tokens
    • label (int): 0: ~52.90%, 1: ~47.10%
  • Samples:
    • s1: [SYNTAX] Consuming too much processed sugar can lead to insulin resistance and diabetes.
      s2: [SYNTAX] Drinking too much water can help maintain proper hydration and overall health.
      label: 1
    • s1: Neutral tones and minimalist designs are staples of gender-neutral fashion.
      s2: Colorful patterns and intricate designs are staples of traditional ceremonial attire.
      label: 0
    • s1: [SYNTAX] Policies focusing on sustainable agriculture practices are essential for ensuring food security in the face of climate change.
      s2: [SYNTAX] Ensuring food security amidst climate change requires critical policies that emphasize sustainable agricultural practices.
      label: 0
  • Loss: CosineSimilarityLoss with these parameters:
        {
            "loss_fct": "torch.nn.modules.loss.MSELoss"
        }

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • fp16: True
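
These settings can be related to the step counts in the training logs with a little arithmetic. The card does not state how many devices were used; the value of 4 below is an assumption inferred from the logged 3,030 total steps, and the rounding of the warmup length is likewise a simplification.

```python
import math

train_samples = 19_352        # training set size reported in this card
per_device_batch_size = 16    # per_device_train_batch_size
num_train_epochs = 10
warmup_ratio = 0.1
assumed_devices = 4           # ASSUMPTION: not stated in the card

# Each optimizer step consumes one batch per device.
effective_batch_size = per_device_batch_size * assumed_devices
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = round(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)
# Epoch value expected in the log after 100 steps.
print(round(100 / steps_per_epoch, 4))
```

Under that assumption the numbers line up with the logs: 303 steps per epoch, 3,030 steps in total, and an epoch value of about 0.33 at step 100.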

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
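
The combination of learning_rate: 5e-05, lr_scheduler_type: linear, and warmup_ratio: 0.1 describes a learning rate that ramps up linearly and then decays linearly to zero. The sketch below illustrates that shape, assuming 3,030 total steps (the final step in the training logs) and therefore about 303 warmup steps. The function name is hypothetical and this is not the scheduler's own code.

```python
def linear_schedule_lr(step: int,
                       base_lr: float = 5e-5,
                       warmup_steps: int = 303,
                       total_steps: int = 3030) -> float:
    """Learning rate at a given optimizer step for linear warmup then linear decay."""
    if step < warmup_steps:
        # Ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Decay from base_lr down to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


for step in (0, 150, 303, 1500, 3030):
    print(step, linear_schedule_lr(step))
```

The peak rate of 5e-05 is reached only at the end of warmup, so most of the ten epochs run at a lower, steadily decreasing rate.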

Training Logs

Epoch Step Training Loss loss custom-dev_spearman_cosine custom-test_spearman_cosine
0.3300 100 0.2137 0.0971 0.8252 -
0.6601 200 0.0722 0.0516 0.8445 -
0.9901 300 0.0503 0.0440 0.8480 -
1.3201 400 0.0353 0.0417 0.8479 -
1.6502 500 0.032 0.0388 0.8500 -
1.9802 600 0.0312 0.0375 0.8484 -
2.3102 700 0.0175 0.0380 0.8494 -
2.6403 800 0.016 0.0368 0.8486 -
2.9703 900 0.0158 0.0367 0.8486 -
3.3003 1000 0.0087 0.0394 0.8463 -
3.6304 1100 0.0086 0.0371 0.8463 -
3.9604 1200 0.0098 0.0368 0.8475 -
4.2904 1300 0.0055 0.0384 0.8496 -
4.6205 1400 0.0057 0.0379 0.8466 -
4.9505 1500 0.0057 0.0389 0.8473 -
5.2805 1600 0.0037 0.0391 0.8482 -
5.6106 1700 0.0042 0.0379 0.8477 -
5.9406 1800 0.0039 0.0380 0.8479 -
6.2706 1900 0.0026 0.0390 0.8477 -
6.6007 2000 0.0028 0.0390 0.8475 -
6.9307 2100 0.0031 0.0385 0.8473 -
7.2607 2200 0.0022 0.0393 0.8473 -
7.5908 2300 0.0021 0.0391 0.8470 -
7.9208 2400 0.002 0.0387 0.8482 -
8.2508 2500 0.0013 0.0389 0.8482 -
8.5809 2600 0.0014 0.0392 0.8484 -
8.9109 2700 0.0018 0.0390 0.8479 -
9.2409 2800 0.0015 0.0393 0.8480 -
9.5710 2900 0.0012 0.0393 0.8479 -
9.9010 3000 0.0013 0.0394 0.8477 -
10.0 3030 - - - 0.8454
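
The log can be scanned for the checkpoints that scored best on the dev set. The sketch below copies the rows of the table into tuples and assumes that the "loss" column is the evaluation loss. Since load_best_model_at_end is False in the hyperparameters, the published weights presumably correspond to the final step and not to either of these checkpoints.

```python
# (step, training_loss, evaluation_loss, custom-dev_spearman_cosine)
LOG_ROWS = [
    (100, 0.2137, 0.0971, 0.8252), (200, 0.0722, 0.0516, 0.8445),
    (300, 0.0503, 0.0440, 0.8480), (400, 0.0353, 0.0417, 0.8479),
    (500, 0.032, 0.0388, 0.8500), (600, 0.0312, 0.0375, 0.8484),
    (700, 0.0175, 0.0380, 0.8494), (800, 0.016, 0.0368, 0.8486),
    (900, 0.0158, 0.0367, 0.8486), (1000, 0.0087, 0.0394, 0.8463),
    (1100, 0.0086, 0.0371, 0.8463), (1200, 0.0098, 0.0368, 0.8475),
    (1300, 0.0055, 0.0384, 0.8496), (1400, 0.0057, 0.0379, 0.8466),
    (1500, 0.0057, 0.0389, 0.8473), (1600, 0.0037, 0.0391, 0.8482),
    (1700, 0.0042, 0.0379, 0.8477), (1800, 0.0039, 0.0380, 0.8479),
    (1900, 0.0026, 0.0390, 0.8477), (2000, 0.0028, 0.0390, 0.8475),
    (2100, 0.0031, 0.0385, 0.8473), (2200, 0.0022, 0.0393, 0.8473),
    (2300, 0.0021, 0.0391, 0.8470), (2400, 0.002, 0.0387, 0.8482),
    (2500, 0.0013, 0.0389, 0.8482), (2600, 0.0014, 0.0392, 0.8484),
    (2700, 0.0018, 0.0390, 0.8479), (2800, 0.0015, 0.0393, 0.8480),
    (2900, 0.0012, 0.0393, 0.8479), (3000, 0.0013, 0.0394, 0.8477),
]

# Highest dev Spearman (cosine) and lowest evaluation loss.
best_by_spearman = max(LOG_ROWS, key=lambda row: row[3])
best_by_eval_loss = min(LOG_ROWS, key=lambda row: row[2])
final_row = LOG_ROWS[-1]

print("best dev spearman:", best_by_spearman)
print("lowest evaluation loss:", best_by_eval_loss)
print("last logged step:", final_row)
```

The training loss keeps falling through step 3000 while the evaluation loss and dev Spearman stay roughly flat after the first three epochs, so the later epochs add little on the dev set.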

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.0.0
  • Transformers: 4.41.2
  • PyTorch: 2.3.0+cu121
  • Accelerate: 0.30.1
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1



