SentenceTransformer
This is a sentence-transformers model. It maps sentences and paragraphs to a 512-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 512 dimensions
- Similarity Function: Cosine Similarity
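As a rough illustration of the similarity function listed above, cosine similarity between two embedding vectors can be sketched in plain Python. The function name `cosine_similarity` and the zero-vector convention are assumptions made for this sketch; the library's own implementation works on tensors and may differ in detail.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption for this sketch: treat a zero vector as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)

# Vectors pointing the same way score 1 regardless of length.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
```

Because the score depends only on direction, embeddings of different norms can still be compared directly.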
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 512, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
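The pooling configuration above enables only `pooling_mode_mean_tokens`, which averages the token embeddings of a sequence while ignoring padding positions. The following is a minimal plain-Python sketch of that idea, assuming token embeddings are given as lists of floats and the attention mask as a list of 0/1 values; the name `mean_pool` is hypothetical, and the real Pooling module operates on batched tensors.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (hypothetical helper)."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                totals[i] += value
    # Guard against an all-padding sequence to avoid dividing by zero.
    count = max(count, 1)
    return [total / count for total in totals]

# Two real tokens and one padding token: the padding vector is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))
```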
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/182500_bioformer_8L")
# Run inference
sentences = [
    'vägtrafikolyckor',
    'accidente vial',
    'trimeresurus andersoni',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 512]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Training Details
Training Dataset
Unnamed Dataset
- Size: 9,358,675 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 6 tokens | 3 tokens | 3 tokens |
mean | 12.84 tokens | 15.45 tokens | 14.75 tokens |
max | 23 tokens | 187 tokens | 91 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
(131)i-makroaggregerat albumin | macroagrégats d'albumine humaine marquée à l'iode 131 | 1-acylglycerophosphorylinositol |
(131)i-makroaggregerat albumin | albumin, radio-iodinated serum | allo-aromadendrane-10alpha,14-diol |
(131)i-makroaggregerat albumin | serum albumin, radio iodinated | acquired zygomatic hyperplasia |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
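To make the loss parameters above concrete, here is a minimal sketch of a triplet loss with Euclidean distance and a margin of 5, computed for a single (anchor, positive, negative) triplet. The helper names are hypothetical and the vectors are plain Python lists; the library version works on batches of tensors and averages over the batch.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) for one triplet."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)

# The loss is zero once the negative is at least `margin` farther away than the positive.
anchor = [0.0, 0.0]
print(triplet_loss(anchor, positive=[0.0, 1.0], negative=[6.0, 8.0]))
print(triplet_loss(anchor, positive=[3.0, 4.0], negative=[0.0, 1.0]))
```

The margin sets how much farther the negative must be than the positive before a triplet stops contributing to the gradient.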
Evaluation Dataset
Unnamed Dataset
- Size: 820,102 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 3 tokens | 3 tokens | 3 tokens |
mean | 10.54 tokens | 13.21 tokens | 14.98 tokens |
max | 20 tokens | 183 tokens | 322 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
15-ketosteryloleathydrolase | steroid esterase, lipoidal | glutamic acid-lysine-tyrosine terpolymer |
15-ketosteryloleathydrolase | hydrolase, cholesterol ester | unionicola parvipora |
15-ketosteryloleathydrolase | acylhydrolase, sterol ester | mayamaea fossalis var. fossalis |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- learning_rate: 2e-05
- num_train_epochs: 10
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
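The hyperparameters above imply a learning-rate schedule that can be sketched as follows. This assumes a single training device, no gradient accumulation, and the linear scheduler with warmup listed among the full hyperparameters; the function names are hypothetical, and the formula mirrors the usual linear warmup followed by linear decay rather than reproducing the trainer's code.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Optimizer steps per epoch, assuming one device and no dropped last batch."""
    return math.ceil(num_samples / batch_size)

def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

# 9,358,675 training samples at batch size 128, for 10 epochs.
per_epoch = steps_per_epoch(9_358_675, 128)
total = per_epoch * 10
for step in (0, 1000, per_epoch, total // 2, total):
    print(step, linear_schedule_lr(step, total))
```

Under these assumptions one epoch is 73,115 steps, which agrees with the training log, where step 1000 corresponds to epoch 0.0137.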
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss |
---|---|---|---|
0.0137 | 1000 | 2.6865 | - |
0.0274 | 2000 | 1.4053 | - |
0.0410 | 3000 | 0.9222 | - |
0.0547 | 4000 | 0.7162 | - |
0.0684 | 5000 | 0.6036 | - |
0.0821 | 6000 | 0.5245 | - |
0.0957 | 7000 | 0.4665 | - |
0.1094 | 8000 | 0.4215 | - |
0.1231 | 9000 | 0.3931 | - |
0.1368 | 10000 | 0.3661 | - |
0.1504 | 11000 | 0.348 | - |
0.1641 | 12000 | 0.3241 | - |
0.1778 | 13000 | 0.3108 | - |
0.1915 | 14000 | 0.2943 | - |
0.2052 | 15000 | 0.2817 | - |
0.2188 | 16000 | 0.2653 | - |
0.2325 | 17000 | 0.2562 | - |
0.2462 | 18000 | 0.2529 | - |
0.2599 | 19000 | 0.2438 | - |
0.2735 | 20000 | 0.2359 | - |
0.2872 | 21000 | 0.2237 | - |
0.3009 | 22000 | 0.2207 | - |
0.3146 | 23000 | 0.2143 | - |
0.3283 | 24000 | 0.2141 | - |
0.3419 | 25000 | 0.2024 | - |
0.3556 | 26000 | 0.196 | - |
0.3693 | 27000 | 0.1951 | - |
0.3830 | 28000 | 0.19 | - |
0.3966 | 29000 | 0.1864 | - |
0.4103 | 30000 | 0.1866 | - |
0.4240 | 31000 | 0.1797 | - |
0.4377 | 32000 | 0.1805 | - |
0.4513 | 33000 | 0.1681 | - |
0.4650 | 34000 | 0.1712 | - |
0.4787 | 35000 | 0.1698 | - |
0.4924 | 36000 | 0.1619 | - |
0.4992 | 36500 | - | 0.1407 |
0.5061 | 37000 | 0.1652 | - |
0.5197 | 38000 | 0.1622 | - |
0.5334 | 39000 | 0.1603 | - |
0.5471 | 40000 | 0.1518 | - |
0.5608 | 41000 | 0.1488 | - |
0.5744 | 42000 | 0.1531 | - |
0.5881 | 43000 | 0.1472 | - |
0.6018 | 44000 | 0.1454 | - |
0.6155 | 45000 | 0.1473 | - |
0.6291 | 46000 | 0.1411 | - |
0.6428 | 47000 | 0.1389 | - |
0.6565 | 48000 | 0.1375 | - |
0.6702 | 49000 | 0.1393 | - |
0.6839 | 50000 | 0.1366 | - |
0.6975 | 51000 | 0.134 | - |
0.7112 | 52000 | 0.1331 | - |
0.7249 | 53000 | 0.1323 | - |
0.7386 | 54000 | 0.1309 | - |
0.7522 | 55000 | 0.1254 | - |
0.7659 | 56000 | 0.1298 | - |
0.7796 | 57000 | 0.1244 | - |
0.7933 | 58000 | 0.1254 | - |
0.8069 | 59000 | 0.1205 | - |
0.8206 | 60000 | 0.1213 | - |
0.8343 | 61000 | 0.1226 | - |
0.8480 | 62000 | 0.1187 | - |
0.8617 | 63000 | 0.1158 | - |
0.8753 | 64000 | 0.1171 | - |
0.8890 | 65000 | 0.1137 | - |
0.9027 | 66000 | 0.1172 | - |
0.9164 | 67000 | 0.1169 | - |
0.9300 | 68000 | 0.1137 | - |
0.9437 | 69000 | 0.1145 | - |
0.9574 | 70000 | 0.1127 | - |
0.9711 | 71000 | 0.1126 | - |
0.9848 | 72000 | 0.1126 | - |
0.9984 | 73000 | 0.1078 | 0.0997 |
1.0121 | 74000 | 0.0999 | - |
1.0258 | 75000 | 0.1001 | - |
1.0395 | 76000 | 0.0962 | - |
1.0531 | 77000 | 0.0984 | - |
1.0668 | 78000 | 0.0982 | - |
1.0805 | 79000 | 0.098 | - |
1.0942 | 80000 | 0.0964 | - |
1.1078 | 81000 | 0.0964 | - |
1.1215 | 82000 | 0.0949 | - |
1.1352 | 83000 | 0.0929 | - |
1.1489 | 84000 | 0.0914 | - |
1.1626 | 85000 | 0.0918 | - |
1.1762 | 86000 | 0.0916 | - |
1.1899 | 87000 | 0.0891 | - |
1.2036 | 88000 | 0.0921 | - |
1.2173 | 89000 | 0.0925 | - |
1.2309 | 90000 | 0.091 | - |
1.2446 | 91000 | 0.0875 | - |
1.2583 | 92000 | 0.0898 | - |
1.2720 | 93000 | 0.0856 | - |
1.2856 | 94000 | 0.0866 | - |
1.2993 | 95000 | 0.0843 | - |
1.3130 | 96000 | 0.0848 | - |
1.3267 | 97000 | 0.0872 | - |
1.3404 | 98000 | 0.0853 | - |
1.3540 | 99000 | 0.0898 | - |
1.3677 | 100000 | 0.0831 | - |
1.3814 | 101000 | 0.0819 | - |
1.3951 | 102000 | 0.0842 | - |
1.4087 | 103000 | 0.083 | - |
1.4224 | 104000 | 0.0824 | - |
1.4361 | 105000 | 0.0802 | - |
1.4498 | 106000 | 0.0834 | - |
1.4634 | 107000 | 0.0833 | - |
1.4771 | 108000 | 0.0815 | - |
1.4908 | 109000 | 0.079 | - |
1.4976 | 109500 | - | 0.0820 |
1.5045 | 110000 | 0.0809 | - |
1.5182 | 111000 | 0.0784 | - |
1.5318 | 112000 | 0.0767 | - |
1.5455 | 113000 | 0.0782 | - |
1.5592 | 114000 | 0.0799 | - |
1.5729 | 115000 | 0.0787 | - |
1.5865 | 116000 | 0.0798 | - |
1.6002 | 117000 | 0.0821 | - |
1.6139 | 118000 | 0.0771 | - |
1.6276 | 119000 | 0.0758 | - |
1.6413 | 120000 | 0.0789 | - |
1.6549 | 121000 | 0.0777 | - |
1.6686 | 122000 | 0.0755 | - |
1.6823 | 123000 | 0.0774 | - |
1.6960 | 124000 | 0.0748 | - |
1.7096 | 125000 | 0.077 | - |
1.7233 | 126000 | 0.0755 | - |
1.7370 | 127000 | 0.0749 | - |
1.7507 | 128000 | 0.0718 | - |
1.7643 | 129000 | 0.0753 | - |
1.7780 | 130000 | 0.0728 | - |
1.7917 | 131000 | 0.0704 | - |
1.8054 | 132000 | 0.0719 | - |
1.8191 | 133000 | 0.0711 | - |
1.8327 | 134000 | 0.0713 | - |
1.8464 | 135000 | 0.0695 | - |
1.8601 | 136000 | 0.0716 | - |
1.8738 | 137000 | 0.0691 | - |
1.8874 | 138000 | 0.0692 | - |
1.9011 | 139000 | 0.0744 | - |
1.9148 | 140000 | 0.0726 | - |
1.9285 | 141000 | 0.0682 | - |
1.9421 | 142000 | 0.0695 | - |
1.9558 | 143000 | 0.0723 | - |
1.9695 | 144000 | 0.0711 | - |
1.9832 | 145000 | 0.0692 | - |
1.9969 | 146000 | 0.0694 | 0.0704 |
2.0105 | 147000 | 0.0572 | - |
2.0242 | 148000 | 0.0545 | - |
2.0379 | 149000 | 0.0549 | - |
2.0516 | 150000 | 0.0552 | - |
2.0652 | 151000 | 0.0551 | - |
2.0789 | 152000 | 0.0559 | - |
2.0926 | 153000 | 0.0582 | - |
2.1063 | 154000 | 0.0587 | - |
2.1199 | 155000 | 0.0529 | - |
2.1336 | 156000 | 0.059 | - |
2.1473 | 157000 | 0.0534 | - |
2.1610 | 158000 | 0.0547 | - |
2.1747 | 159000 | 0.0543 | - |
2.1883 | 160000 | 0.0558 | - |
2.2020 | 161000 | 0.0548 | - |
2.2157 | 162000 | 0.0534 | - |
2.2294 | 163000 | 0.0548 | - |
2.2430 | 164000 | 0.0546 | - |
2.2567 | 165000 | 0.053 | - |
2.2704 | 166000 | 0.0557 | - |
2.2841 | 167000 | 0.0541 | - |
2.2978 | 168000 | 0.0527 | - |
2.3114 | 169000 | 0.0542 | - |
2.3251 | 170000 | 0.0529 | - |
2.3388 | 171000 | 0.0554 | - |
2.3525 | 172000 | 0.054 | - |
2.3661 | 173000 | 0.0506 | - |
2.3798 | 174000 | 0.054 | - |
2.3935 | 175000 | 0.0525 | - |
2.4072 | 176000 | 0.0542 | - |
2.4208 | 177000 | 0.0546 | - |
2.4345 | 178000 | 0.0516 | - |
2.4482 | 179000 | 0.053 | - |
2.4619 | 180000 | 0.0542 | - |
2.4756 | 181000 | 0.0538 | - |
2.4892 | 182000 | 0.0536 | - |
2.4961 | 182500 | - | 0.0655 |
Framework Versions
- Python: 3.9.16
- Sentence Transformers: 3.1.1
- Transformers: 4.45.2
- PyTorch: 2.4.1+cu121
- Accelerate: 1.0.0
- Datasets: 3.0.1
- Tokenizers: 0.20.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}