metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:897
- loss:TripletLoss
base_model: sentence-transformers/all-MiniLM-L6-v2
widget:
- source_sentence: >-
Well driller/borer and related mining worker operates, assembles and
monitors machines for cutting channels in a mine workface or for the
drilling and sinking of wells, extraction of ore, liquids and gases or for
a variety of other purposes.
sentences:
- >-
Prepare detailed drawings of architectural and structural features of
buildings or drawings and topographical relief maps used in civil
engineering projects, such as highways, bridges, and public works. Use
knowledge of building materials, engineering practices, and mathematics
to complete drawings.
- >-
Operate self-propelled mining machines that rip coal, metal and nonmetal
ores, rock, stone, or sand from the mine face and load it onto
conveyors, shuttle cars, or trucks in a continuous operation.
- >-
Conduct investigations related to suspected violations of federal,
state, or local laws to prevent or solve crimes.
- source_sentence: >-
Van driver drives a van to pick up and deliver non-mail documents and
parcels.
sentences:
- >-
Drive a light vehicle, such as a truck or van, with a capacity of less
than 26,001 pounds Gross Vehicle Weight (GVW), primarily to pick up
merchandise or packages from a distribution center and deliver. May load
and unload vehicle.
- >-
Plan, direct, or coordinate human resources activities and staff of an
organization.
- >-
Devise methods to improve oil and gas extraction and production and
determine the need for new or modified tool designs. Oversee drilling
and offer technical advice.
- source_sentence: >-
Library officer assists librarians by helping readers in the use of
library catalogues, databases, and indexes to locate books and other
materials. He/she also compiles records, sorts and shelves books or other
media, removes or repairs damaged books or other media, registers patrons
and checks materials in and out of the circulation process. He/she
replaces materials in shelving areas.
sentences:
- >-
Assist librarians by helping readers in the use of library catalogs,
databases, and indexes to locate books and other materials; and by
answering questions that require only brief consultation of standard
reference. Compile records; sort and shelve books or other media; remove
or repair damaged books or other media; register patrons; and check
materials in and out of the circulation process. Replace materials in
shelving area (stacks) or files. Includes bookmobile drivers who assist
with providing services in mobile libraries.
- >-
Perform engineering duties in planning and designing tools, engines,
machines, and other mechanically functioning equipment. Oversee
installation, operation, maintenance, and repair of equipment such as
centralized heat, gas, water, and steam systems.
- >-
Perform a variety of food preparation duties other than cooking, such as
preparing cold foods and shellfish, slicing meat, and brewing coffee or
tea.
- source_sentence: >-
Pre-press trades worker proofs, formats, sets and composes text and
graphics into a form suitable for use in various printing processes and
representation in other visual media.
sentences:
- >-
Directly supervise and coordinate activities of workers engaged in
landscaping or groundskeeping activities. Work may involve reviewing
contracts to ascertain service, machine, and workforce requirements;
answering inquiries from potential customers regarding methods,
material, and price ranges; and preparing estimates according to labor,
material, and machine costs.
- >-
Plan, direct, or coordinate transportation, storage, or distribution
activities in accordance with organizational policies and applicable
government laws or regulations. Includes logistics managers.
- >-
Engrave or etch metal, wood, rubber, or other materials. Includes such
workers as etcher-circuit processors, pantograph engravers, and silk
screen etchers.
- source_sentence: >-
Composer/Orchestrator writes musical compositions such as symphonies,
sonatas or operas. He/she translates compositions into standard musical
signs and symbols on scored music paper. He/she may write words to
accompany music. He/she adapts melodies to suit the type and style of
orchestras or bands and to produce various kinds of effects. He/she
determines instruments to be employed, writes musical scores to produce
the desired musical effect, rewrites music written for one instrument or
purpose into suitable forms for other instruments or purposes.
sentences:
- >-
Evaluate materials and develop machinery and processes to manufacture
materials for use in products that must meet specialized design and
performance specifications. Develop new uses for known materials.
Includes those engineers working with composite materials or
specializing in one type of material, such as graphite, metal and metal
alloys, ceramics and glass, plastics and polymers, and naturally
occurring materials. Includes metallurgists and metallurgical engineers,
ceramic engineers, and welding engineers.
- >-
Plan, direct, or coordinate the actual distribution or movement of a
product or service to the customer. Coordinate sales distribution by
establishing sales territories, quotas, and goals and establish training
programs for sales representatives. Analyze sales statistics gathered by
staff to determine sales potential and inventory requirements and
monitor the preferences of customers.
- >-
Conduct, direct, plan, and lead instrumental or vocal performances by
musical artists or groups, such as orchestras, bands, choirs, and glee
clubs; or create original works of music.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy
model-index:
- name: SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
results:
- task:
type: triplet
name: Triplet
dataset:
name: job description eval
type: job-description-eval
metrics:
- type: cosine_accuracy
value: 0.7288888692855835
name: Cosine Accuracy
SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
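The Pooling and Normalize modules listed above can be illustrated with a small NumPy sketch. It is not the library's code: it assumes token embeddings and an attention mask are already available as arrays, and the function name `mean_pool_and_normalize` and the toy shapes are made up for illustration (the real model uses 384 dimensions).

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Mask-aware mean over tokens, then L2-normalize each sentence vector."""
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq, 1)
    summed = (token_embeddings * mask).sum(axis=1)                    # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # (batch, 1)
    pooled = summed / counts                                          # mean over real tokens
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)                       # unit-length rows

# Toy input: 2 sentences, 3 token positions, 4 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 3, 4))
mask = np.array([[1, 1, 1], [1, 1, 0]])  # second sentence ends with one padding token
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)
print(np.linalg.norm(sentence_vectors, axis=1))
```

Because every output row has unit length, the dot product of two sentence vectors equals their cosine similarity.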
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ANGKJ1995/all-MiniLM-L6-v2-job-description")
# Run inference
sentences = [
'Composer/Orchestrator writes musical compositions such as symphonies, sonatas or operas. He/she translates compositions into standard musical signs and symbols on scored music paper. He/she may write words to accompany music. He/she adapts melodies to suit the type and style of orchestras or bands and to produce various kinds of effects. He/she determines instruments to be employed, writes musical scores to produce the desired musical effect, rewrites music written for one instrument or purpose into suitable forms for other instruments or purposes.',
'Conduct, direct, plan, and lead instrumental or vocal performances by musical artists or groups, such as orchestras, bands, choirs, and glee clubs; or create original works of music.',
'Evaluate materials and develop machinery and processes to manufacture materials for use in products that must meet specialized design and performance specifications. Develop new uses for known materials. Includes those engineers working with composite materials or specializing in one type of material, such as graphite, metal and metal alloys, ceramics and glass, plastics and polymers, and naturally occurring materials. Includes metallurgists and metallurgical engineers, ceramic engineers, and welding engineers.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
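For semantic search, the embeddings can be used to rank candidate descriptions against a query description. The following is a minimal sketch on toy vectors so that it runs without downloading the model; `rank_by_cosine` is a hypothetical helper, not part of Sentence Transformers.

```python
import numpy as np

def rank_by_cosine(query_vec, candidate_vecs):
    """Return candidate indices sorted from most to least similar, with their scores."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    scores = c @ q                 # cosine similarity of each candidate to the query
    order = np.argsort(-scores)    # descending similarity
    return order, scores[order]

# Toy 2-dimensional vectors standing in for 384-dimensional embeddings.
query = np.array([1.0, 0.0])
candidates = np.array([[0.0, 1.0], [2.0, 0.1], [-1.0, 0.0]])
order, scores = rank_by_cosine(query, candidates)
print(order, scores)
```

With the real model, the same helper could be called as `rank_by_cosine(embeddings[0], embeddings[1:])` to rank the two candidate descriptions against the first sentence.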
Evaluation
Metrics
Triplet
- Dataset: job-description-eval
- Evaluated with TripletEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.7289 |
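As a rough illustration of the metric, cosine accuracy can be read as the fraction of triplets in which the anchor is more similar to the positive than to the negative. The sketch below shows that idea on toy vectors; it is not the TripletEvaluator implementation, and the function names are illustrative. The reported value is consistent with 164 of the 225 evaluation triplets being ranked correctly (164 / 225 ≈ 0.7289).

```python
import numpy as np

def cosine_sim_rows(a, b):
    """Row-wise cosine similarity between two arrays of shape (n, dim)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return (a * b).sum(axis=1)

def triplet_cosine_accuracy(anchors, positives, negatives):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    pos = cosine_sim_rows(anchors, positives)
    neg = cosine_sim_rows(anchors, negatives)
    return float(np.mean(pos > neg))

# Four toy triplets; the last one is deliberately ranked the wrong way round.
anchors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]])
positives = np.array([[1.0, 0.1], [0.1, 1.0], [1.0, 0.9], [0.0, 1.0]])
negatives = np.array([[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0], [1.0, 0.2]])
accuracy = triplet_cosine_accuracy(anchors, positives, negatives)
print(accuracy)  # 0.75
```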
Training Details
Training Dataset
Unnamed Dataset
- Size: 897 training samples
- Columns: SSOC_DESCRIPTION, ONET_DESCRIPTION, and shuffled_ONET_DESCRIPTION
- Approximate statistics based on the first 897 samples:

Statistic | SSOC_DESCRIPTION | ONET_DESCRIPTION | shuffled_ONET_DESCRIPTION |
---|---|---|---|
type | string | string | string |
min | 14 tokens | 9 tokens | 7 tokens |
mean | 66.05 tokens | 44.67 tokens | 44.52 tokens |
max | 166 tokens | 161 tokens | 161 tokens |
- Samples:

SSOC_DESCRIPTION | ONET_DESCRIPTION | shuffled_ONET_DESCRIPTION |
---|---|---|
Consumer audio/video equipment/radar broadcasting/transmitting equipment fitter/mechanic fits, adjusts, installs and repairs radio, television, transmitters, receivers and radar equipment in factory, workshop or place of use. He/she specialises in television transmitters/receivers, radar equipment, radio transmitters/receivers and two way radio communications equipment. He/she examines drawings and wiring diagrams, and diagnoses faults with aid of testing equipment. | Repair, test, adjust, or install electronic equipment, such as industrial controls, transmitters, and antennas. | Conduct programs of compensation and benefits and job analysis for employer. May specialize in specific areas, such as position classification and pension programs. |
Window cleaner washes and polishes windows and other glass fittings. He/she uses cleaning tools such as sponges and detergents to clean and polish windows, mirrors and other glass surfaces of buildings, both on the interior and exterior. He/she uses specific ladders to clean taller buildings with safety belts for support. | Keep buildings in clean and orderly condition. Perform heavy cleaning duties, such as cleaning floors, shampooing rugs, washing walls and glass, and removing rubbish. Duties may include tending furnace and boiler, performing routine maintenance activities, notifying management of need for repairs, and cleaning snow or debris from sidewalk. | Service automobiles, buses, trucks, boats, and other automotive or marine vehicles with fuel, lubricants, and accessories. Collect payment for services and supplies. May lubricate vehicle, change motor oil, refill antifreeze, or replace lights or other accessories, such as windshield wiper blades or fan belts. May repair or replace tires. |
Instrumentalist plays one or more musical instruments as a soloist, accompanist or member of an orchestra, band or other musical group. He/she studies and rehearses scores, tunes instruments to the proper pitch, plays music by manipulating keys, bows, valves, strings or percussion devices, depending on the type of instrument being played. He/she may improvise or transpose music or compose or arrange music. In an orchestra, he/she is usually designated according to the instrument played such as violinist, drummer or pianist. | Play one or more musical instruments or sing. May perform on stage, for broadcasting, or for sound or video recording. | Drive a light vehicle, such as a truck or van, with a capacity of less than 26,001 pounds Gross Vehicle Weight (GVW), primarily to pick up merchandise or packages from a distribution center and deliver. May load and unload vehicle. |
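The card does not say how the shuffled_ONET_DESCRIPTION column was produced. Judging only by its name, one plausible reading is that each row's negative is the ONET_DESCRIPTION of a different, randomly chosen row. The sketch below illustrates that assumption; the helper `add_shuffled_negatives`, its seed, and the shortened example texts are hypothetical and are not the author's preprocessing code.

```python
import random

def add_shuffled_negatives(rows, seed=42):
    """Give each row the ONET_DESCRIPTION of another row as its negative (assumed scheme)."""
    if len(rows) < 2:
        raise ValueError("need at least two rows to pick a negative from another row")
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    out = [dict(r) for r in rows]
    # Walk the shuffled order as a cycle, so no row ever receives its own positive.
    for pos, idx in enumerate(order):
        donor = order[(pos + 1) % len(order)]
        out[idx]["shuffled_ONET_DESCRIPTION"] = rows[donor]["ONET_DESCRIPTION"]
    return out

rows = [
    {"SSOC_DESCRIPTION": "Window cleaner washes and polishes windows.",
     "ONET_DESCRIPTION": "Keep buildings in clean and orderly condition."},
    {"SSOC_DESCRIPTION": "Instrumentalist plays one or more musical instruments.",
     "ONET_DESCRIPTION": "Play one or more musical instruments or sing."},
    {"SSOC_DESCRIPTION": "Van driver drives a van to pick up and deliver parcels.",
     "ONET_DESCRIPTION": "Drive a light vehicle, such as a truck or van."},
]
triplets = add_shuffled_negatives(rows)
for t in triplets:
    print(t["ONET_DESCRIPTION"], "|", t["shuffled_ONET_DESCRIPTION"])
```

If two rows share the same ONET_DESCRIPTION text, this scheme can still produce a negative identical to the positive, so real data would need an extra check.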
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
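With the Euclidean metric and a margin of 5, the triplet objective can be sketched as max(d(a, p) - d(a, n) + 5, 0) averaged over the batch. The NumPy version below is an illustration, not the library's implementation. One consequence worth noting: if the loss is computed on the unit-length output of the Normalize() module, the distance between any two embeddings is at most 2, so the hinge term is at least 3 and never reaches zero. The validation losses in the training log (roughly 4.3 to 4.6) are compatible with that reading, but this is an inference rather than something the card states.

```python
import numpy as np

def triplet_loss_euclidean(anchors, positives, negatives, margin=5.0):
    """Mean of max(d(a, p) - d(a, n) + margin, 0) with Euclidean distances."""
    d_pos = np.linalg.norm(anchors - positives, axis=1)
    d_neg = np.linalg.norm(anchors - negatives, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))

# Best case for unit vectors: positive equals the anchor, negative is opposite.
anchor = np.array([[1.0, 0.0]])
positive = np.array([[1.0, 0.0]])
negative = np.array([[-1.0, 0.0]])
best_case_loss = triplet_loss_euclidean(anchor, positive, negative)
print(best_case_loss)  # 3.0
```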
Evaluation Dataset
Unnamed Dataset
- Size: 225 evaluation samples
- Columns: SSOC_DESCRIPTION, ONET_DESCRIPTION, and shuffled_ONET_DESCRIPTION
- Approximate statistics based on the first 225 samples:

Statistic | SSOC_DESCRIPTION | ONET_DESCRIPTION | shuffled_ONET_DESCRIPTION |
---|---|---|---|
type | string | string | string |
min | 16 tokens | 7 tokens | 9 tokens |
mean | 64.88 tokens | 43.49 tokens | 44.06 tokens |
max | 130 tokens | 161 tokens | 161 tokens |
- Samples:

SSOC_DESCRIPTION | ONET_DESCRIPTION | shuffled_ONET_DESCRIPTION |
---|---|---|
Salesperson (door-to-door) describes, demonstrates and sells goods and services and solicits business for establishments by approaching or visiting potential customers, usually residents in private homes, by going from door to door. He/she gives details of what establishment can supply and quotes prices and terms. | Contact new or existing customers to determine their solar equipment needs, suggest systems or equipment, or estimate costs. | Recruit, screen, interview, or place individuals within an organization. May perform other activities in multiple human resources areas. |
Secretary performs a variety of administrative tasks to help keep an organisation running smoothly. He/she answers telephone calls, drafts and sends e-mails, maintains diaries, arranges appointments, takes messages, files documents, organises and services meetings, and manages databases. | Perform secretarial duties using specific knowledge of medical terminology and hospital, clinic, or laboratory procedures. Duties may include scheduling appointments, billing patients, and compiling and recording medical charts, reports, and correspondence. | Set up, operate, or tend forging machines to taper, shape, or form metal or plastic parts. |
Purchasing agent buys machinery, equipment, raw materials, services and other supplies for use by the enterprise. He/she ascertains the requirements of the enterprise and studies market information on varieties and qualities available. He/she interviews vendors to ascertain their ability to meet the organisation’s specific requirements for design, performance, price and delivery. He/she may approve bills for payment. | Purchase machinery, equipment, tools, parts, supplies, or services necessary for the operation of an establishment. Purchase raw or semifinished materials for manufacturing. May negotiate contracts. | Evaluate and treat musculoskeletal injuries or illnesses. Provide preventive, therapeutic, emergency, and rehabilitative care. |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 1e-05
- num_train_epochs: 16
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
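The non-default values above could be passed to the trainer configuration roughly as follows. This is a sketch, not the author's training script: the output path is hypothetical, and `save_strategy="epoch"` is an assumption added because `load_best_model_at_end` requires the save and evaluation strategies to match.

```python
# Values copied from the Non-Default Hyperparameters list.
NON_DEFAULT_ARGS = {
    "eval_strategy": "epoch",
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "learning_rate": 1e-05,
    "num_train_epochs": 16,
    "warmup_ratio": 0.1,
    "fp16": True,  # mixed precision; expects a CUDA device
    "load_best_model_at_end": True,
}

def build_training_args(output_dir="output/job-description-model"):
    """Build training arguments from the values above (output_dir is a hypothetical path)."""
    # Imported lazily so the dictionary can be inspected without the library installed.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(
        output_dir=output_dir,
        save_strategy="epoch",  # assumption: not listed in the card
        **NON_DEFAULT_ARGS,
    )
```

The resulting object would typically be handed to `SentenceTransformerTrainer` together with the model, the triplet datasets, and the loss.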
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 16
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Validation Loss | job-description-eval_cosine_accuracy |
---|---|---|---|
-1 | -1 | - | 0.1867 |
1.0 | 57 | 4.5738 | 0.4844 |
2.0 | 114 | 4.3775 | 0.7022 |
3.0 | 171 | 4.2681 | 0.7289 |
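The Step column can be cross-checked against the dataset size and batch size reported earlier. The arithmetic below assumes a single device, gradient_accumulation_steps of 1, and dataloader_drop_last set to False, as listed in the hyperparameters; the variable names are illustrative.

```python
import math

train_samples = 897            # training set size from the card
batch_size = 16                # per_device_train_batch_size
gradient_accumulation = 1      # gradient_accumulation_steps

# With drop_last disabled, the final partial batch still counts as a step.
batches_per_epoch = math.ceil(train_samples / batch_size)
steps_per_epoch = batches_per_epoch // gradient_accumulation

steps_at_epoch = {epoch: epoch * steps_per_epoch for epoch in (1, 2, 3)}
print(steps_per_epoch)   # 57
print(steps_at_epoch)    # {1: 57, 2: 114, 3: 171}
```

These values match the logged steps of 57, 114, and 171 for the first three epochs.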
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.4.1
- Transformers: 4.48.3
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.3.2
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}