metadata
language: []
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4068
- loss:MultipleNegativesRankingLoss
base_model: distilbert/distilroberta-base
datasets: []
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
widget:
- source_sentence: >-
Proficiency in C# scripting is essential for creating custom scripts and
extensions to enhance ABBYY FlexiCapture and ABBYY Vantage functionality.
sentences:
- Successfully presented financial reports to executives
- Worked on improving user interfaces using HTML and CSS
- Created extensions to optimize data capture processes
- source_sentence: >-
Knowledgeable in supporting Cyber Security Operations and investigation
requests.
sentences:
- Assisted in incident response for security breaches
- Coordinated communication strategies for corporate events
- Developed mobile applications for e-commerce
- source_sentence: >-
Bachelor’s degree in Human Resources, Business Administration, Finance or
related field
sentences:
- prepared monthly production reports for management meetings
- Bachelor of Science in Human Resources Management
- Completed a course in Marketing Strategy
- source_sentence: >-
A strong interest in photography or videography is necessary for this
role.
sentences:
- produced short promotional videos for social media platforms
- Conducted training sessions for new software implementations
- conducted market research on competitor strategies
- source_sentence: Ability to work both independently and as part of a collaborative team.
sentences:
- Worked in isolation and avoided team interactions
- Participated in team meetings and contributed to group problem-solving
- Authored clear documentation for complex data processes
pipeline_tag: sentence-similarity
model-index:
- name: SentenceTransformer based on distilbert/distilroberta-base
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts dev
type: sts-dev
metrics:
- type: pearson_cosine
value: 0.7992382726015851
name: Pearson Cosine
- type: spearman_cosine
value: 0.8047353015653143
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7959439027738936
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7940263609217374
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7957522013263527
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7941887779903888
name: Spearman Euclidean
- type: pearson_dot
value: 0.5317541949973523
name: Pearson Dot
- type: spearman_dot
value: 0.5390259111701268
name: Spearman Dot
- type: pearson_max
value: 0.7992382726015851
name: Pearson Max
- type: spearman_max
value: 0.8047353015653143
name: Spearman Max
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.7508747335014652
name: Pearson Cosine
- type: spearman_cosine
value: 0.7343818974365368
name: Spearman Cosine
- type: pearson_manhattan
value: 0.7429083946804279
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7262987823076023
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7419896002102524
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7250585009844766
name: Spearman Euclidean
- type: pearson_dot
value: 0.4701047985009806
name: Pearson Dot
- type: spearman_dot
value: 0.47577938055391156
name: Spearman Dot
- type: pearson_max
value: 0.7508747335014652
name: Pearson Max
- type: spearman_max
value: 0.7343818974365368
name: Spearman Max
SentenceTransformer based on distilbert/distilroberta-base
This is a sentence-transformers model finetuned from distilbert/distilroberta-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: distilbert/distilroberta-base
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
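The Pooling module above has only pooling_mode_mean_tokens enabled, so each sentence vector is the average of its token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name mean_pool and the toy 4-dimensional vectors are illustrative assumptions (the real model uses 768 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                summed[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [s / count for s in summed]


# Toy input: three 4-dimensional token vectors; the last one is padding.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```

Padding positions are excluded through the attention mask, which is why the third vector does not affect the result.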
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("trbeers/distilroberta-base-nli-v0.1")
# Run inference
sentences = [
    'Ability to work both independently and as part of a collaborative team.',
    'Participated in team meetings and contributed to group problem-solving',
    'Worked in isolation and avoided team interactions',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
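Since the model's similarity function is cosine similarity, a typical follow-up is to rank candidate sentences against one query sentence. The sketch below shows that ranking step in plain Python on hypothetical 3-dimensional vectors that stand in for real 768-dimensional embeddings; cosine_similarity and rank_candidates are illustrative helper names, not library functions.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query: Sequence[float], candidates: Sequence[Sequence[float]]) -> List[int]:
    """Return candidate indices ordered from most to least similar."""
    scores = [cosine_similarity(query, c) for c in candidates]
    return sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)


# Hypothetical stand-ins for embeddings of a requirement and three resume lines.
requirement = [1.0, 0.0, 1.0]
resume_lines = [
    [0.0, 1.0, 0.0],  # orthogonal to the requirement
    [2.0, 0.0, 2.0],  # same direction, different length
    [1.0, 1.0, 0.0],  # partially aligned
]
order = rank_candidates(requirement, resume_lines)
print(order)  # [1, 2, 0]
```

Cosine similarity ignores vector length, so the second candidate scores highest even though its norm differs from the query's.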
Evaluation
Metrics
Semantic Similarity
- Dataset: sts-dev
- Evaluated with EmbeddingSimilarityEvaluator
Metric | Value |
---|---|
pearson_cosine | 0.7992 |
spearman_cosine | 0.8047 |
pearson_manhattan | 0.7959 |
spearman_manhattan | 0.794 |
pearson_euclidean | 0.7958 |
spearman_euclidean | 0.7942 |
pearson_dot | 0.5318 |
spearman_dot | 0.539 |
pearson_max | 0.7992 |
spearman_max | 0.8047 |
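The pearson_* and spearman_* rows compare the model's similarity scores for sentence pairs with gold similarity labels. As a rough illustration of the difference between the two correlations, here is a pure-Python sketch; the evaluator itself presumably relies on library routines, and the gold and cosine scores below are made-up values, not data from this model.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: linear agreement between two score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: the ordering agrees perfectly, the spacing does not.
gold_scores = [0.0, 1.0, 2.0, 3.0]
cosine_scores = [0.10, 0.20, 0.90, 0.95]
print(spearman(gold_scores, cosine_scores))  # 1.0
print(pearson(gold_scores, cosine_scores) < 1.0)  # True
```

Spearman only rewards getting the ordering of pairs right, which is usually what matters for retrieval and ranking use cases.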
Semantic Similarity
- Dataset: sts-test
- Evaluated with EmbeddingSimilarityEvaluator
Metric | Value |
---|---|
pearson_cosine | 0.7509 |
spearman_cosine | 0.7344 |
pearson_manhattan | 0.7429 |
spearman_manhattan | 0.7263 |
pearson_euclidean | 0.742 |
spearman_euclidean | 0.7251 |
pearson_dot | 0.4701 |
spearman_dot | 0.4758 |
pearson_max | 0.7509 |
spearman_max | 0.7344 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 4,068 training samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 8 tokens | 7 tokens | 5 tokens |
mean | 16.67 tokens | 11.82 tokens | 9.13 tokens |
max | 37 tokens | 22 tokens | 15 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
Experience in managing meetings with program participants and tracking action items effectively. | Coordinated project meetings and followed up on team tasks | Assisted in developing marketing strategies |
Ability to replace faulty electrical components with precision. | Conducted detailed inspections of wiring and circuits | Handled plumbing repairs and maintenance tasks |
Knowledge of loss prevention, security, and safety protocols. | Implemented safety measures in warehouse operations | Worked as a sales associate |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
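To make the role of the scale and cos_sim parameters concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective: each anchor should score its own positive higher than every other positive and every negative in the batch, using cross-entropy over scaled cosine similarities. This is a simplified illustration under that assumption, not the library's code; mnr_loss and the 2-dimensional toy vectors are hypothetical.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors, positives, negatives, scale: float = 20.0) -> float:
    """Mean cross-entropy where the correct class for anchor i is positive i."""
    # Every positive and every negative in the batch is a candidate for each anchor.
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch: anchors aligned with their positives and opposed to their negatives.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned = mnr_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong text.
swapped = mnr_loss(anchors, positives[::-1], negatives)
print(aligned < swapped)  # True
```

The scale of 20.0 sharpens the softmax: cosine similarities lie in [-1, 1], so without scaling the loss would barely distinguish good from bad candidates.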
Evaluation Dataset
Unnamed Dataset
- Size: 1,018 evaluation samples
- Columns: anchor, positive, and negative
- Approximate statistics based on the first 1000 samples:
Statistic | anchor | positive | negative |
---|---|---|---|
type | string | string | string |
min | 6 tokens | 6 tokens | 5 tokens |
mean | 16.56 tokens | 11.77 tokens | 9.0 tokens |
max | 42 tokens | 20 tokens | 17 tokens |
- Samples:
anchor | positive | negative |
---|---|---|
The ability to complete a background investigation and drug screen is necessary for employment. | Conducted thorough background investigations for security personnel | Managed scheduling for office staff |
Ability to create compelling business cases to drive organizational change. | Developed comprehensive business cases that successfully led to strategic organizational changes | Managed project timelines and budgets for software development projects |
Proven understanding of ERP concepts and their applications in business. | Conducted workshops on business process improvement | Managed social media accounts |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- num_train_epochs: 1
- warmup_ratio: 0.1
- batch_sampler: no_duplicates
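The no_duplicates batch sampler matters for MultipleNegativesRankingLoss because other samples in the batch act as negatives, so a text that appears twice in one batch could be treated as a negative for its own match. The sketch below shows one greedy way to build such batches; it is an illustrative assumption about the idea, not the library's sampler, and no_duplicate_batches and the sample texts are hypothetical.

```python
from typing import Dict, Iterator, List


def no_duplicate_batches(samples: List[Dict[str, str]], batch_size: int) -> Iterator[List[int]]:
    """Yield batches of sample indices in which no text occurs twice."""
    remaining = list(range(len(samples)))
    while remaining:
        batch: List[int] = []
        seen = set()
        deferred: List[int] = []
        for idx in remaining:
            texts = set(samples[idx].values())
            if len(batch) < batch_size and not (texts & seen):
                batch.append(idx)
                seen |= texts
            else:
                # Conflicting or overflow samples wait for a later batch.
                deferred.append(idx)
        yield batch
        remaining = deferred


samples = [
    {"anchor": "A", "positive": "p1", "negative": "n1"},
    {"anchor": "A", "positive": "p2", "negative": "n2"},  # shares anchor with sample 0
    {"anchor": "B", "positive": "p3", "negative": "n3"},
    {"anchor": "C", "positive": "p1", "negative": "n4"},  # shares positive with sample 0
]
batches = list(no_duplicate_batches(samples, batch_size=2))
print(batches)  # [[0, 2], [1, 3]]
```

Samples 0 and 1 share an anchor, so they never land in the same batch.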
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 128
- per_device_eval_batch_size: 128
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
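The combination of learning_rate 5e-05, lr_scheduler_type linear, and warmup_ratio 0.1 describes a learning rate that ramps up linearly and then decays linearly to zero. The sketch below reproduces that shape in plain Python, assuming the 32 optimizer steps reported in the training logs; linear_warmup_lr is an illustrative helper, not a library function.

```python
import math


def linear_warmup_lr(step: int, total_steps: int, base_lr: float = 5e-05,
                     warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step for linear warmup and linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


total_steps = 32  # taken from the training logs
schedule = [linear_warmup_lr(s, total_steps) for s in range(total_steps + 1)]
print(math.ceil(total_steps * 0.1))  # 4 warmup steps
print(schedule[0], schedule[-1])  # 0.0 0.0
```

With only 32 steps in total, the warmup covers the first 4 steps and the peak learning rate is reached at step 4.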
Training Logs
Epoch | Step | loss | sts-dev_spearman_cosine | sts-test_spearman_cosine |
---|---|---|---|---|
0 | 0 | - | 0.6375 | - |
0.3125 | 10 | 2.0385 | 0.7770 | - |
0.625 | 20 | 1.5189 | 0.7980 | - |
0.9375 | 30 | 1.3685 | 0.8047 | - |
1.0 | 32 | - | - | 0.7344 |
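The Epoch and Step columns follow from the dataset size and batch size. A short arithmetic check, assuming a single device and gradient_accumulation_steps of 1 as listed in the hyperparameters:

```python
import math

train_samples = 4068  # training set size
batch_size = 128      # per_device_train_batch_size

# The last batch is partial, so the step count rounds up.
steps_per_epoch = math.ceil(train_samples / batch_size)
print(steps_per_epoch)  # 32


def epoch_at(step: int) -> float:
    """Fraction of the epoch completed after a given step."""
    return step / steps_per_epoch


print([epoch_at(s) for s in (10, 20, 30, 32)])  # [0.3125, 0.625, 0.9375, 1.0]
```

These values match the logged rows, with evaluation on sts-dev every 10 steps and on sts-test after the final step.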
Framework Versions
- Python: 3.10.11
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.1
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}