# SentenceTransformer based on distilbert/distilbert-base-uncased
This is a sentence-transformers model finetuned from distilbert/distilbert-base-uncased. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- Model Type: Sentence Transformer
- Base model: distilbert/distilbert-base-uncased
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
### Model Sources
- Documentation: [Sentence Transformers Documentation](https://www.sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
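For reference, an equivalent architecture can be assembled module by module with the `sentence_transformers.models` API. A minimal sketch, with the parameters taken from the listing above (this rebuilds the architecture only, not the finetuned weights):

```python
from sentence_transformers import SentenceTransformer, models

# Token encoder: DistilBERT, truncating inputs at 512 tokens.
word_embedding = models.Transformer(
    "distilbert/distilbert-base-uncased", max_seq_length=512
)

# Mean pooling over token embeddings produces the 768-dimensional sentence vector.
pooling = models.Pooling(
    word_embedding.get_word_embedding_dimension(),  # 768 for DistilBERT
    pooling_mode="mean",
)

model = SentenceTransformer(modules=[word_embedding, pooling])
```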
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("gkudirka/crash_encoder2-sts")
# Run inference
sentences = [
    'T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S',
    'T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS',
    'T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
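Beyond pairwise scores, the embeddings also support semantic search over a corpus of test-channel descriptions. A minimal sketch using `sentence_transformers.util.semantic_search`; the corpus reuses the sentences above, and the query string is hypothetical:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("gkudirka/crash_encoder2-sts")

# Corpus of channel descriptions to search over (from the example above).
corpus = [
    'T ENGINE TRANS TOP LAT 90 Deg Front 2025 U717 G-S',
    'T R F ACTIVE VENT SQUIB VOLT 90 Deg Front 2021 P702 VOLTS',
    'T ENGINE TRANS TOP LAT 30 Deg Front Angular Left 2020 P558 G-S',
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

# Rank the corpus against a (hypothetical) query string.
query_embedding = model.encode('T ENGINE TRANS TOP LAT 90 Deg Front', convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit['corpus_id']], hit['score'])
```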
## Evaluation
### Metrics
#### Semantic Similarity
- Dataset: `sts-dev`
- Evaluated with `EmbeddingSimilarityEvaluator`
Metric | Value |
---|---|
pearson_cosine | 0.4518 |
spearman_cosine | 0.4762 |
pearson_manhattan | 0.4253 |
spearman_manhattan | 0.4638 |
pearson_euclidean | 0.4262 |
spearman_euclidean | 0.4652 |
pearson_dot | 0.3898 |
spearman_dot | 0.374 |
pearson_max | 0.4518 |
spearman_max | 0.4762 |
#### Semantic Similarity
- Dataset: `sts-dev`
- Evaluated with `EmbeddingSimilarityEvaluator`
Metric | Value |
---|---|
pearson_cosine | 0.4412 |
spearman_cosine | 0.4671 |
pearson_manhattan | 0.4156 |
spearman_manhattan | 0.456 |
pearson_euclidean | 0.4167 |
spearman_euclidean | 0.4575 |
pearson_dot | 0.3753 |
spearman_dot | 0.3629 |
pearson_max | 0.4412 |
spearman_max | 0.4671 |
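These metrics can be recomputed with the same evaluator. A minimal sketch, using the three scored training pairs listed later in this card as stand-in data (a real run would use the full sts-dev split):

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("gkudirka/crash_encoder2-sts")

# Stand-in scored pairs; gold similarities are floats in [0, 1].
sentences1 = ["T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S"] * 3
sentences2 = [
    "T SCS R2 HY REF 059 R C PLR REF Y SM LAT 90 Deg / Left Side Decel-4g 2020 CX483 G-S",
    "T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S",
    "T SCS L1 HY REF 053 L B PLR REF Y SM LAT 90 Deg Front Bumper Override 2021 CX727 G-S",
]
scores = [0.2113, 0.4973, 0.5701]

dev_evaluator = EmbeddingSimilarityEvaluator(sentences1, sentences2, scores, name="sts-dev")
# Prints the Pearson/Spearman correlations per similarity measure.
print(dev_evaluator(model))
```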
## Training Details
### Training Dataset
#### Unnamed Dataset
- Size: 8,081,275 training samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:--|:--|:--|:--|
| type | string | string | float |
| details | min: 23 tokens, mean: 31.48 tokens, max: 40 tokens | min: 16 tokens, mean: 30.06 tokens, max: 55 tokens | min: 0.0, mean: 0.44, max: 1.0 |
- Samples:

| sentence1 | sentence2 | score |
|:--|:--|:--|
| T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T SCS R2 HY REF 059 R C PLR REF Y SM LAT 90 Deg / Left Side Decel-4g 2020 CX483 G-S | 0.21129386503072142 |
| T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S | 0.4972955033248179 |
| T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S | T SCS L1 HY REF 053 L B PLR REF Y SM LAT 90 Deg Front Bumper Override 2021 CX727 G-S | 0.5701051768787058 |
- Loss: `CoSENTLoss` with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "pairwise_cos_sim"
}
```
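A minimal sketch of constructing this loss and a matching dataset (one illustrative row taken from the samples above; `pairwise_cos_sim` is the default `similarity_fct` for `CoSENTLoss`, so passing `scale=20.0` alone reproduces the listed parameters):

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses

model = SentenceTransformer("distilbert/distilbert-base-uncased")

# Column names must match the (sentence1, sentence2, score) layout above.
train_dataset = Dataset.from_dict({
    "sentence1": ["T L F DUMMY PELVIS VERT Dynamic Seat Sled Test 2025 U718 G-S"],
    "sentence2": ["T R F DUMMY PELVIS VERT 75 Deg Oblique Right Side 10 in. Pole 2015 P552 G-S"],
    "score": [0.4972955033248179],
})

# scale=20.0 with pairwise cosine similarity, matching the parameters above.
loss = losses.CoSENTLoss(model, scale=20.0)
```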
### Evaluation Dataset
#### Unnamed Dataset
- Size: 1,726,581 evaluation samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|:--|:--|:--|:--|
| type | string | string | float |
| details | min: 22 tokens, mean: 25.0 tokens, max: 30 tokens | min: 16 tokens, mean: 31.04 tokens, max: 53 tokens | min: 0.0, mean: 0.44, max: 1.0 |
- Samples:

| sentence1 | sentence2 | score |
|:--|:--|:--|
| T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T L F DUMMY T12 LONG 27 Deg Crabbed Left Side NHTSA 214 MDB to vehicle 2015 P552 G-S | 0.6835618484879796 |
| T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T L F DUMMY R FEMUR LONG 90 Deg Front 2022 U553 G-S | 0.666531064739 |
| T R F ADAPTIVE TETHER VENT SQUIB VOLT 30 Deg Front Angular Right 20xx GENERIC VOLTS | T R F DUMMY NECK UPPER MZ LOAD 90 Deg Front 2019 P375ICA IN-LBS | 0.46391834212079874 |
- Loss: `CoSENTLoss` with these parameters:

```json
{
    "scale": 20.0,
    "similarity_fct": "pairwise_cos_sim"
}
```
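The "approximate statistics" blocks above are plain token-count summaries over the first 1000 rows. A minimal sketch of how such figures can be computed, reusing `model` and `train_dataset` from the training-setup sketch above (`model.tokenizer` exposes the underlying Hugging Face tokenizer):

```python
# Token-count summary for one column over the first 1000 samples.
lengths = [
    len(model.tokenizer(text)["input_ids"])
    for text in train_dataset["sentence1"][:1000]
]
print(f"min: {min(lengths)}, mean: {sum(lengths) / len(lengths):.2f}, max: {max(lengths)}")
```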
### Training Hyperparameters
#### Non-Default Hyperparameters
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `learning_rate`: 3e-05
- `num_train_epochs`: 4
- `warmup_ratio`: 0.1
- `fp16`: True
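With Sentence Transformers 3.0, these map directly onto the trainer API. A minimal sketch, assuming `model`, `train_dataset`, and `loss` from the sketches above, an `eval_dataset` built the same way, and a hypothetical output directory:

```python
from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

# The non-default hyperparameters above, expressed as training arguments.
args = SentenceTransformerTrainingArguments(
    output_dir="crash_encoder2-sts",  # hypothetical output path
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=3e-5,
    num_train_epochs=4,
    warmup_ratio=0.1,
    fp16=True,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```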
#### All Hyperparameters
- `overwrite_output_dir`: False
- `do_predict`: False
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 4
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: True
- `dataloader_num_workers`: 0
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: False
- `include_tokens_per_second`: False
- `neftune_noise_alpha`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
### Training Logs
Epoch | Step | Training Loss | Validation Loss | sts-dev_spearman_cosine |
---|---|---|---|---|
0.0317 | 1000 | 6.3069 | - | - |
0.0634 | 2000 | 6.1793 | - | - |
0.0950 | 3000 | 6.1607 | - | - |
0.1267 | 4000 | 6.1512 | - | - |
0.1584 | 5000 | 6.1456 | - | - |
0.1901 | 6000 | 6.1419 | - | - |
0.2218 | 7000 | 6.1398 | - | - |
0.2534 | 8000 | 6.1377 | - | - |
0.2851 | 9000 | 6.1352 | - | - |
0.3168 | 10000 | 6.1338 | - | - |
0.3485 | 11000 | 6.1332 | - | - |
0.3801 | 12000 | 6.1309 | - | - |
0.4118 | 13000 | 6.1315 | - | - |
0.4435 | 14000 | 6.1283 | - | - |
0.4752 | 15000 | 6.129 | - | - |
0.5069 | 16000 | 6.1271 | - | - |
0.5385 | 17000 | 6.1265 | - | - |
0.5702 | 18000 | 6.1238 | - | - |
0.6019 | 19000 | 6.1234 | - | - |
0.6336 | 20000 | 6.1225 | - | - |
0.6653 | 21000 | 6.1216 | - | - |
0.6969 | 22000 | 6.1196 | - | - |
0.7286 | 23000 | 6.1198 | - | - |
0.7603 | 24000 | 6.1178 | - | - |
0.7920 | 25000 | 6.117 | - | - |
0.8236 | 26000 | 6.1167 | - | - |
0.8553 | 27000 | 6.1165 | - | - |
0.8870 | 28000 | 6.1149 | - | - |
0.9187 | 29000 | 6.1146 | - | - |
0.9504 | 30000 | 6.113 | - | - |
0.9820 | 31000 | 6.1143 | - | - |
1.0 | 31567 | - | 6.1150 | 0.4829 |
1.0137 | 32000 | 6.1115 | - | - |
1.0454 | 33000 | 6.111 | - | - |
1.0771 | 34000 | 6.1091 | - | - |
1.1088 | 35000 | 6.1094 | - | - |
1.1404 | 36000 | 6.1078 | - | - |
1.1721 | 37000 | 6.1095 | - | - |
1.2038 | 38000 | 6.106 | - | - |
1.2355 | 39000 | 6.1071 | - | - |
1.2671 | 40000 | 6.1073 | - | - |
1.2988 | 41000 | 6.1064 | - | - |
1.3305 | 42000 | 6.1047 | - | - |
1.3622 | 43000 | 6.1054 | - | - |
1.3939 | 44000 | 6.1048 | - | - |
1.4255 | 45000 | 6.1053 | - | - |
1.4572 | 46000 | 6.1058 | - | - |
1.4889 | 47000 | 6.1037 | - | - |
1.5206 | 48000 | 6.1041 | - | - |
1.5523 | 49000 | 6.1023 | - | - |
1.5839 | 50000 | 6.1018 | - | - |
1.6156 | 51000 | 6.104 | - | - |
1.6473 | 52000 | 6.1004 | - | - |
1.6790 | 53000 | 6.1027 | - | - |
1.7106 | 54000 | 6.1017 | - | - |
1.7423 | 55000 | 6.1011 | - | - |
1.7740 | 56000 | 6.1002 | - | - |
1.8057 | 57000 | 6.0994 | - | - |
1.8374 | 58000 | 6.0985 | - | - |
1.8690 | 59000 | 6.0986 | - | - |
1.9007 | 60000 | 6.1006 | - | - |
1.9324 | 61000 | 6.0983 | - | - |
1.9641 | 62000 | 6.0983 | - | - |
1.9958 | 63000 | 6.0973 | - | - |
2.0 | 63134 | - | 6.1193 | 0.4828 |
2.0274 | 64000 | 6.0943 | - | - |
2.0591 | 65000 | 6.0941 | - | - |
2.0908 | 66000 | 6.0936 | - | - |
2.1225 | 67000 | 6.0909 | - | - |
2.1541 | 68000 | 6.0925 | - | - |
2.1858 | 69000 | 6.0932 | - | - |
2.2175 | 70000 | 6.0939 | - | - |
2.2492 | 71000 | 6.0919 | - | - |
2.2809 | 72000 | 6.0932 | - | - |
2.3125 | 73000 | 6.0916 | - | - |
2.3442 | 74000 | 6.0919 | - | - |
2.3759 | 75000 | 6.0919 | - | - |
2.4076 | 76000 | 6.0911 | - | - |
2.4393 | 77000 | 6.0924 | - | - |
2.4709 | 78000 | 6.0911 | - | - |
2.5026 | 79000 | 6.0922 | - | - |
2.5343 | 80000 | 6.0926 | - | - |
2.5660 | 81000 | 6.0911 | - | - |
2.5976 | 82000 | 6.0897 | - | - |
2.6293 | 83000 | 6.0922 | - | - |
2.6610 | 84000 | 6.0908 | - | - |
2.6927 | 85000 | 6.0884 | - | - |
2.7244 | 86000 | 6.0907 | - | - |
2.7560 | 87000 | 6.0904 | - | - |
2.7877 | 88000 | 6.0881 | - | - |
2.8194 | 89000 | 6.0902 | - | - |
2.8511 | 90000 | 6.088 | - | - |
2.8828 | 91000 | 6.0888 | - | - |
2.9144 | 92000 | 6.0884 | - | - |
2.9461 | 93000 | 6.0881 | - | - |
2.9778 | 94000 | 6.0896 | - | - |
3.0 | 94701 | - | 6.1225 | 0.4788 |
3.0095 | 95000 | 6.0857 | - | - |
3.0412 | 96000 | 6.0838 | - | - |
3.0728 | 97000 | 6.0843 | - | - |
3.1045 | 98000 | 6.0865 | - | - |
3.1362 | 99000 | 6.0827 | - | - |
3.1679 | 100000 | 6.0836 | - | - |
3.1995 | 101000 | 6.0837 | - | - |
3.2312 | 102000 | 6.0836 | - | - |
3.2629 | 103000 | 6.0837 | - | - |
3.2946 | 104000 | 6.084 | - | - |
3.3263 | 105000 | 6.0836 | - | - |
3.3579 | 106000 | 6.0808 | - | - |
3.3896 | 107000 | 6.0821 | - | - |
3.4213 | 108000 | 6.0817 | - | - |
3.4530 | 109000 | 6.082 | - | - |
3.4847 | 110000 | 6.083 | - | - |
3.5163 | 111000 | 6.0829 | - | - |
3.5480 | 112000 | 6.0832 | - | - |
3.5797 | 113000 | 6.0829 | - | - |
3.6114 | 114000 | 6.0837 | - | - |
3.6430 | 115000 | 6.082 | - | - |
3.6747 | 116000 | 6.0823 | - | - |
3.7064 | 117000 | 6.082 | - | - |
3.7381 | 118000 | 6.0833 | - | - |
3.7698 | 119000 | 6.0831 | - | - |
3.8014 | 120000 | 6.0814 | - | - |
3.8331 | 121000 | 6.0813 | - | - |
3.8648 | 122000 | 6.0797 | - | - |
3.8965 | 123000 | 6.0793 | - | - |
3.9282 | 124000 | 6.0818 | - | - |
3.9598 | 125000 | 6.0806 | - | - |
3.9915 | 126000 | 6.08 | - | - |
4.0 | 126268 | - | 6.1266 | 0.4671 |
### Framework Versions
- Python: 3.10.6
- Sentence Transformers: 3.0.0
- Transformers: 4.35.0
- PyTorch: 2.1.0a0+4136153
- Accelerate: 0.30.1
- Datasets: 2.14.1
- Tokenizers: 0.14.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
#### CoSENTLoss
```bibtex
@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}
```