---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1077240
- loss:MultipleNegativesRankingLoss
base_model: Qwen/Qwen2.5-0.5B-Instruct
widget:
- source_sentence: Who is the father of philosophy?
  sentences:
  - 'Charles Sanders Peirce Charles Sanders Peirce (/pɜːrs/[9] "purse"; 10September 1839 – 19April 1914) was an American philosopher, logician, mathematician, and scientist who is sometimes known as "the father of pragmatism". He was educated as a chemist and employed as a scientist for 30 years. Today he is appreciated largely for his contributions to logic, mathematics, philosophy, scientific methodology, and semiotics, and for his founding of pragmatism.'
  - 'Georg Wilhelm Friedrich Hegel According to Hegel, "Heraclitus is the one who first declared the nature of the infinite and first grasped nature as in itself infinite, that is, its essence as process. The origin of philosophy is to be dated from Heraclitus. His is the persistent Idea that is the same in all philosophers up to the present day, as it was the Idea of Plato and Aristotle". For Hegel, Heraclitus''s great achievements were to have understood the nature of the infinite, which for Hegel includes understanding the inherent contradictoriness and negativity of reality; and to have grasped that reality is becoming or process and that "being" and "nothingness" are mere empty abstractions. According to Hegel, Heraclitus''s "obscurity" comes from his being a true (in Hegel''s terms "speculative") philosopher who grasped the ultimate philosophical truth and therefore expressed himself in a way that goes beyond the abstract and limited nature of common sense and is difficult to grasp by those who operate within common sense. Hegel asserted that in Heraclitus he had an antecedent for his logic: "[...] there is no proposition of Heraclitus which I have not adopted in my logic".'
  - 'History of nuclear weapons The notion of using a fission weapon to ignite a process of nuclear fusion can be dated back to 1942. At the first major theoretical conference on the development of an atomic bomb hosted by J. Robert Oppenheimer at the University of California, Berkeley, participant Edward Teller directed the majority of the discussion towards Enrico Fermi''s idea of a "Super" bomb that would use the same reactions that powered the Sun itself.'
- source_sentence: When was Father's Day first celebrated in America?
  sentences:
  - 'Father''s Day (United States) Father''s Day was founded in Spokane, Washington at the YMCA in 1910 by Sonora Smart Dodd, who was born in Arkansas.[4] Its first celebration was in the Spokane YMCA on June 19, 1910.[4][5] Her father, the Civil War veteran William Jackson Smart, was a single parent who raised his six children there.[4] After hearing a sermon about Jarvis'' Mother''s Day at Central Methodist Episcopal Church in 1909, she told her pastor that fathers should have a similar holiday honoring them.[4][6] Although she initially suggested June 5, her father''s birthday, the pastors did not have enough time to prepare their sermons, and the celebration was deferred to the third Sunday of June.[7][8]'
  - 'Father''s Day In [[Peru]], Father''s Day is celebrated on the third Sunday of June and is not a public holiday. People usually give a present to their fathers and spend time with him mostly during a family meal.'
  - 'Sacramento River The Sacramento and its wide natural floodplain were once abundant in fish and other aquatic creatures, notably one of the southernmost large runs of chinook salmon in North America. For about 12,000 years, humans have depended on the vast natural resources of the watershed, which had one of the densest Native American populations in California. The river has provided a route for trade and travel since ancient times. Hundreds of tribes sharing regional customs and traditions inhabited the Sacramento Valley, first coming into contact with European explorers in the late 1700s. The Spanish explorer Gabriel Moraga named the river Rio de los Sacramentos in 1808, later shortened and anglicized into Sacramento.'
- source_sentence: What is the population of Austria in 2018?
  sentences:
  - 'Utah State Capitol The Utah State Capitol is the house of government for the U.S. state of Utah. The building houses the chambers and offices of the Utah State Legislature, the offices of the Governor, Lieutenant Governor, Attorney General, the State Auditor and their staffs. The capitol is the main building of the Utah State Capitol Complex, which is located on Capitol Hill, overlooking downtown Salt Lake City.'
  - 'Same-sex marriage in Austria A September 2018 poll for "Österreich" found that 74% of Austrians supported same-sex marriage and 26% were against.'
  - 'Demographics of Austria Population 8,793,370 (July 2018 est.) country comparison to the world: 96th'
- source_sentence: What language family is Malay?
  sentences:
  - 'Malay language Malay is a member of the Austronesian family of languages, which includes languages from Southeast Asia and the Pacific Ocean, with a smaller number in continental Asia. Malagasy, a geographic outlier spoken in Madagascar in the Indian Ocean, is also a member of this language family. Although each language of the family is mutually unintelligible, their similarities are rather striking. Many roots have come virtually unchanged from their common ancestor, Proto-Austronesian language. There are many cognates found in the languages'' words for kinship, health, body parts and common animals. Numbers, especially, show remarkable similarities.'
  - 'Filipinos of Malay descent In the Philippines, there is misconception and often mixing between the two definitions. Filipinos consider Malays as being the natives of the Philippines, Indonesia, Malaysia and Brunei. Consequently, Filipinos consider themselves Malay when in reality, they are referring to the Malay Race. Filipinos in Singapore also prefer to be considered Malay, but their desire to be labeled as part of the ethnic group was rejected by the Singaporean government. Paradoxically, a minor percentage of Filipinos prefer the Spanish influence and may associate themselves with being Hispanic, and have made no realistic attempts to promote and/or revive the Malay language in the Philippines.'
  - 'Preferred provider organization In health insurance in the United States, a preferred provider organization (PPO), sometimes referred to as a participating provider organization or preferred provider option, is a managed care organization of medical doctors, hospitals, and other health care providers who have agreed with an insurer or a third-party administrator to provide health care at reduced rates to the insurer''s or administrator''s clients.'
- source_sentence: When was ABC formed?
  sentences:
  - 'American Broadcasting Company ABC launched as a radio network on October 12, 1943, serving as the successor to the NBC Blue Network, which had been purchased by Edward J. Noble. It extended its operations to television in 1948, following in the footsteps of established broadcast networks CBS and NBC. In the mid-1950s, ABC merged with United Paramount Theatres, a chain of movie theaters that formerly operated as a subsidiary of Paramount Pictures. Leonard Goldenson, who had been the head of UPT, made the new television network profitable by helping develop and greenlight many successful series. In the 1980s, after purchasing an 80% interest in cable sports channel ESPN, the network''s corporate parent, American Broadcasting Companies, Inc., merged with Capital Cities Communications, owner of several print publications, and television and radio stations. In 1996, most of Capital Cities/ABC''s assets were purchased by The Walt Disney Company.'
  - 'Roman concrete Roman concrete, also called opus caementicium, was a material used in construction during the late Roman Republic until the fading of the Roman Empire. Roman concrete was based on a hydraulic-setting cement. Recently, it has been found that it materially differs in several ways from modern concrete which is based on Portland cement. Roman concrete is durable due to its incorporation of volcanic ash, which prevents cracks from spreading. By the middle of the 1st century, the material was used frequently, often brick-faced, although variations in aggregate allowed different arrangements of materials. Further innovative developments in the material, called the Concrete Revolution, contributed to structurally complicated forms, such as the Pantheon dome, the world''s largest and oldest unreinforced concrete dome.[1]'
  - 'Americans Battling Communism Americans Battling Communism, Inc. (ABC) was an anti-communist organization created following an October 1947 speech by Pennsylvania Judge Blair Gunther that called for an "ABC movement" to educate America about communism. Chartered in November 1947 by Harry Alan Sherman, a local lawyer active in various anti-communist organizations, the group took part in such activities as blacklisting by disclosing the names of people suspected of being communists. Its members included local judges and lawyers active in the McCarthy-era prosecution of communists.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev 896
      type: sts-dev-896
    metrics:
    - type: pearson_cosine
      value: 0.7512795462804751
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7602862030369626
      name: Spearman Cosine
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev 768
      type: sts-dev-768
    metrics:
    - type: pearson_cosine
      value: 0.7504358517848402
      name: Pearson Cosine
    - type: spearman_cosine
      value: 0.7590404004512833
      name: Spearman Cosine
---

# SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct). It maps sentences and paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct)
- **Maximum Sequence Length:** 1024 tokens
- **Output Dimensionality:** 896 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: Qwen2Model
  (1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("AlexWortega/qwen3k")
# Run inference
sentences = [
    'When was ABC formed?',
    "American Broadcasting Company\nABC launched as a radio network on October 12, 1943, serving as the successor to the NBC Blue Network, which had been purchased by Edward J. Noble. It extended its operations to television in 1948, following in the footsteps of established broadcast networks CBS and NBC. In the mid-1950s, ABC merged with United Paramount Theatres, a chain of movie theaters that formerly operated as a subsidiary of Paramount Pictures. Leonard Goldenson, who had been the head of UPT, made the new television network profitable by helping develop and greenlight many successful series. In the 1980s, after purchasing an 80% interest in cable sports channel ESPN, the network's corporate parent, American Broadcasting Companies, Inc., merged with Capital Cities Communications, owner of several print publications, and television and radio stations. In 1996, most of Capital Cities/ABC's assets were purchased by The Walt Disney Company.",
    'Americans Battling Communism\nAmericans Battling Communism, Inc. (ABC) was an anti-communist organization created following an October 1947 speech by Pennsylvania Judge Blair Gunther that called for an "ABC movement" to educate America about communism. Chartered in November 1947 by Harry Alan Sherman, a local lawyer active in various anti-communist organizations, the group took part in such activities as blacklisting by disclosing the names of people suspected of being communists. Its members included local judges and lawyers active in the McCarthy-era prosecution of communists.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 896]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Semantic Similarity

* Datasets: `sts-dev-896` and `sts-dev-768`
* Evaluated with [EmbeddingSimilarityEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric              | sts-dev-896 | sts-dev-768 |
|:--------------------|:------------|:------------|
| pearson_cosine      | 0.7513      | 0.7504      |
| **spearman_cosine** | **0.7603**  | **0.7590**  |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 1,077,240 training samples
* Columns: `query`, `response`, and `negative`
* Approximate statistics based on the first 1000 samples:

|         | query  | response | negative |
|:--------|:-------|:---------|:---------|
| type    | string | string   | string   |
| details |        |          |          |

* Samples:

| query | response | negative |
|:------|:---------|:---------|
| Was there a year 0? | Year zero<br>Year zero does not exist in the anno Domini system usually used to number years in the Gregorian calendar and in its predecessor, the Julian calendar. In this system, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars. | 504<br>Year 504 (DIV) was a leap year starting on Thursday (link will display the full calendar) of the Julian calendar. At the time, it was known as the Year of the Consulship of Nicomachus without colleague (or, less frequently, year 1257 "Ab urbe condita"). The denomination 504 for this year has been used since the early medieval period, when the Anno Domini calendar era became the prevalent method in Europe for naming years. |
| When is the dialectical method used? | Dialectic<br>Dialectic or dialectics (Greek: διαλεκτική, dialektikḗ; related to dialogue), also known as the dialectical method, is at base a discourse between two or more people holding different points of view about a subject but wishing to establish the truth through reasoned arguments. Dialectic resembles debate, but the concept excludes subjective elements such as emotional appeal and the modern pejorative sense of rhetoric.[1][2] Dialectic may be contrasted with the didactic method, wherein one side of the conversation teaches the other. Dialectic is alternatively known as minor logic, as opposed to major logic or critique. | Derek Bentley case<br>Another factor in the posthumous defence was that a "confession" recorded by Bentley, which was claimed by the prosecution to be a "verbatim record of dictated monologue", was shown by forensic linguistics methods to have been largely edited by policemen. Linguist Malcolm Coulthard showed that certain patterns, such as the frequency of the word "then" and the grammatical use of "then" after the grammatical subject ("I then" rather than "then I"), were not consistent with Bentley's use of language (his idiolect), as evidenced in court testimony. These patterns fit better the recorded testimony of the policemen involved. This is one of the earliest uses of forensic linguistics on record. |
| What do Grasshoppers eat? | Grasshopper<br>Grasshoppers are plant-eaters, with a few species at times becoming serious pests of cereals, vegetables and pasture, especially when they swarm in their millions as locusts and destroy crops over wide areas. They protect themselves from predators by camouflage; when detected, many species attempt to startle the predator with a brilliantly-coloured wing-flash while jumping and (if adult) launching themselves into the air, usually flying for only a short distance. Other species such as the rainbow grasshopper have warning coloration which deters predators. Grasshoppers are affected by parasites and various diseases, and many predatory creatures feed on both nymphs and adults. The eggs are the subject of attack by parasitoids and predators. | Groundhog<br>Very often the dens of groundhogs provide homes for other animals including skunks, red foxes, and cottontail rabbits. The fox and skunk feed upon field mice, grasshoppers, beetles and other creatures that destroy farm crops. In aiding these animals, the groundhog indirectly helps the farmer. In addition to providing homes for itself and other animals, the groundhog aids in soil improvement by bringing subsoil to the surface. The groundhog is also a valuable game animal and is considered a difficult sport when hunted in a fair manner. In some parts of Appalachia, they are eaten. |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 12
- `per_device_eval_batch_size`: 12
- `gradient_accumulation_steps`: 4
- `num_train_epochs`: 1
- `warmup_ratio`: 0.3
- `bf16`: True
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 12
- `per_device_eval_batch_size`: 12
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 4
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.3
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: True
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

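To make the batch-size and warmup settings above more concrete, here is a rough back-of-the-envelope sketch of the schedule they imply. It assumes training on a single device (the card does not state the device count) and ignores any samples the `no_duplicates` batch sampler may skip, so the numbers are approximate. The function name `schedule_summary` is purely illustrative.

```python
import math

# Values taken from the card.
NUM_SAMPLES = 1_077_240
PER_DEVICE_BATCH = 12
GRAD_ACCUM = 4
WARMUP_RATIO = 0.3
NUM_DEVICES = 1  # assumption: the card does not state how many devices were used


def schedule_summary(num_samples, per_device_batch, grad_accum, warmup_ratio, num_devices=1):
    """Return (effective batch size, optimizer steps per epoch, warmup steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)
    return effective_batch, steps_per_epoch, warmup_steps


effective_batch, steps_per_epoch, warmup_steps = schedule_summary(
    NUM_SAMPLES, PER_DEVICE_BATCH, GRAD_ACCUM, WARMUP_RATIO, NUM_DEVICES
)

# Fraction of the single epoch completed after 3000 optimizer steps.
epoch_at_3000 = 3000 / (NUM_SAMPLES / effective_batch)

print(f"effective batch size: {effective_batch}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
print(f"warmup steps: {warmup_steps}")
print(f"epoch fraction at step 3000: {epoch_at_3000:.4f}")
```

Under the single-device assumption the epoch fraction at step 3000 comes out at about 0.1337, which matches the value logged for that step in this card, and the linear warmup would still be in progress at that point.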
### Training Logs

| Epoch  | Step | Training Loss | sts-dev-896_spearman_cosine | sts-dev-768_spearman_cosine |
|:------:|:----:|:-------------:|:---------------------------:|:---------------------------:|
| 0.0004 | 10   | 2.2049        | -                           | -                           |
| 0.0009 | 20   | 2.3168        | -                           | -                           |
| 0.0013 | 30   | 2.3544        | -                           | -                           |
| 0.0018 | 40   | 2.2519        | -                           | -                           |
| 0.0022 | 50   | 2.1809        | -                           | -                           |
| 0.0027 | 60   | 2.1572        | -                           | -                           |
| 0.0031 | 70   | 2.1855        | -                           | -                           |
| 0.0036 | 80   | 2.5887        | -                           | -                           |
| 0.0040 | 90   | 2.883         | -                           | -                           |
| 0.0045 | 100  | 2.8557        | -                           | -                           |
| 0.0049 | 110  | 2.9356        | -                           | -                           |
| 0.0053 | 120  | 2.8833        | -                           | -                           |
| 0.0058 | 130  | 2.8394        | -                           | -                           |
| 0.0062 | 140  | 2.923         | -                           | -                           |
| 0.0067 | 150  | 2.8191        | -                           | -                           |
| 0.0071 | 160  | 2.8658        | -                           | -                           |
| 0.0076 | 170  | 2.8252        | -                           | -                           |
| 0.0080 | 180  | 2.8312        | -                           | -                           |
| 0.0085 | 190  | 2.7761        | -                           | -                           |
| 0.0089 | 200  | 2.7193        | -                           | -                           |
| 0.0094 | 210  | 2.724         | -                           | -                           |
| 0.0098 | 220  | 2.7484        | -                           | -                           |
| 0.0102 | 230  | 2.7262        | -                           | -                           |
| 0.0107 | 240  | 2.6964        | -                           | -                           |
| 0.0111 | 250  | 2.6676        | -                           | -                           |
| 0.0116 | 260  | 2.6715        | -                           | -                           |
| 0.0120 | 270  | 2.6145        | -                           | -                           |
| 0.0125 | 280  | 2.6191        | -                           | -                           |
| 0.0129 | 290  | 1.9812        | -                           | -                           |
| 0.0134 | 300  | 1.6413        | -                           | -                           |
| 0.0138 | 310  | 1.6126        | -                           | -                           |
| 0.0143 | 320  | 1.3599        | -                           | -                           |
| 0.0147 | 330  | 1.2996        | -                           | -                           |
| 0.0151 | 340  | 1.2654        | -                           | -                           |
| 0.0156 | 350  | 1.9409        | -                           | -                           |
| 0.0160 | 360  | 2.1287        | -                           | -                           |
| 0.0165 | 370  | 1.8442        | -                           | -                           |
| 0.0169 | 380  | 1.6837        | -                           | -                           |
| 0.0174 | 390  | 1.5489        | -                           | -                           |
| 0.0178 | 400  | 1.4382        | -                           | -                           |
| 0.0183 | 410  | 1.4848        | -                           | -                           |
| 0.0187 | 420  | 1.3481        | -                           | -                           |
| 0.0192 | 430  | 1.3467        | -                           | -                           |
| 0.0196 | 440  | 1.3977        | -                           | -                           |
| 0.0201 | 450  | 1.26          | -                           | -                           |
| 0.0205 | 460  | 1.2412        | -                           | -                           |
| 0.0209 | 470  | 1.316         | -                           | -                           |
| 0.0214 | 480  | 1.3501        | -                           | -                           |
| 0.0218 | 490  | 1.2246        | -                           | -                           |
| 0.0223 | 500  | 1.2271        | -                           | -                           |
| 0.0227 | 510  | 1.1871        | -                           | -                           |
| 0.0232 | 520  | 1.1685        | -                           | -                           |
| 0.0236 | 530  | 1.1624        | -                           | -                           |
| 0.0241 | 540  | 1.1911        | -                           | -                           |
| 0.0245 | 550  | 1.1978        | -                           | -                           |
| 0.0250 | 560  | 1.1228        | -                           | -                           |
| 0.0254 | 570  | 1.1091        | -                           | -                           |
| 0.0258 | 580  | 1.1433        | -                           | -                           |
| 0.0263 | 590  | 1.0638        | -                           | -                           |
| 0.0267 | 600  | 1.0515        | -                           | -                           |
| 0.0272 | 610  | 1.175         | -                           | -                           |
| 0.0276 | 620  | 1.0943        | -                           | -                           |
| 0.0281 | 630  | 1.1226        | -                           | -                           |
| 0.0285 | 640  | 0.9871        | -                           | -                           |
| 0.0290 | 650  | 1.0171        | -                           | -                           |
| 0.0294 | 660  | 1.0169        | -                           | -                           |
| 0.0299 | 670  | 0.9643        | -                           | -                           |
| 0.0303 | 680  | 0.9563        | -                           | -                           |
| 0.0307 | 690  | 0.9841        | -                           | -                           |
| 0.0312 | 700  | 1.0349        | -                           | -                           |
| 0.0316 | 710  | 0.8958        | -                           | -                           |
| 0.0321 | 720  | 0.9225        | -                           | -                           |
| 0.0325 | 730  | 0.842         | -                           | -                           |
| 0.0330 | 740  | 0.9104        | -                           | -                           |
| 0.0334 | 750  | 0.8927        | -                           | -                           |
| 0.0339 | 760  | 0.8508        | -                           | -                           |
| 0.0343 | 770  | 0.8835        | -                           | -                           |
| 0.0348 | 780  | 0.9531        | -                           | -                           |
| 0.0352 | 790  | 0.926         | -                           | -                           |
| 0.0356 | 800  | 0.8718        | -                           | -                           |
| 0.0361 | 810  | 0.8261        | -                           | -                           |
| 0.0365 | 820  | 0.8169        | -                           | -                           |
| 0.0370 | 830  | 0.8525        | -                           | -                           |
| 0.0374 | 840  | 0.8504        | -                           | -                           |
| 0.0379 | 850  | 0.7625        | -                           | -                           |
| 0.0383 | 860  | 0.8259        | -                           | -                           |
| 0.0388 | 870  | 0.7558        | -                           | -                           |
| 0.0392 | 880  | 0.7898        | -                           | -                           |
| 0.0397 | 890  | 0.7694        | -                           | -                           |
| 0.0401 | 900  | 0.7429        | -                           | -                           |
| 0.0405 | 910  | 0.6666        | -                           | -                           |
| 0.0410 | 920  | 0.7407        | -                           | -                           |
| 0.0414 | 930  | 0.6665        | -                           | -                           |
| 0.0419 | 940  | 0.7597        | -                           | -                           |
| 0.0423 | 950  | 0.7035        | -                           | -                           |
| 0.0428 | 960  | 0.7166        | -                           | -                           |
| 0.0432 | 970  | 0.6889        | -                           | -                           |
| 0.0437 | 980  | 0.7541        | -                           | -                           |
| 0.0441 | 990  | 0.7175        | -                           | -                           |
| 0.0446 | 1000 | 0.7389        | 0.6420                      | 0.6403                      |
| 0.0450 | 1010 | 0.7142        | -                           | -                           |
| 0.0454 | 1020 | 0.7301        | -                           | -                           |
| 0.0459 | 1030 | 0.7299        | -                           | -                           |
| 0.0463 | 1040 | 0.6759        | -                           | -                           |
| 0.0468 | 1050 | 0.7036        | -                           | -                           |
| 0.0472 | 1060 | 0.6286        | -                           | -                           |
| 0.0477 | 1070 | 0.595         | -                           | -                           |
| 0.0481 | 1080 | 0.6099        | -                           | -                           |
| 0.0486 | 1090 | 0.6377        | -                           | -                           |
| 0.0490 | 1100 | 0.6309        | -                           | -                           |
| 0.0495 | 1110 | 0.6306        | -                           | -                           |
| 0.0499 | 1120 | 0.557         | -                           | -                           |
| 0.0504 | 1130 | 0.5898        | -                           | -                           |
| 0.0508 | 1140 | 0.5896        | -                           | -                           |
| 0.0512 | 1150 | 0.6399        | -                           | -                           |
| 0.0517 | 1160 | 0.5923        | -                           | -                           |
| 0.0521 | 1170 | 0.5787        | -                           | -                           |
| 0.0526 | 1180 | 0.591         | -                           | -                           |
| 0.0530 | 1190 | 0.5714        | -                           | -                           |
| 0.0535 | 1200 | 0.6047        | -                           | -                           |
| 0.0539 | 1210 | 0.5904        | -                           | -                           |
| 0.0544 | 1220 | 0.543         | -                           | -                           |
| 0.0548 | 1230 | 0.6033        | -                           | -                           |
| 0.0553 | 1240 | 0.5445        | -                           | -                           |
| 0.0557 | 1250 | 0.5217        | -                           | -                           |
| 0.0561 | 1260 | 0.5835        | -                           | -                           |
| 0.0566 | 1270 | 0.5353        | -                           | -                           |
| 0.0570 | 1280 | 0.5887        | -                           | -                           |
| 0.0575 | 1290 | 0.5967        | -                           | -                           |
| 0.0579 | 1300 | 0.5036        | -                           | -                           |
| 0.0584 | 1310 | 0.5915        | -                           | -                           |
| 0.0588 | 1320 | 0.5719        | -                           | -                           |
| 0.0593 | 1330 | 0.5238        | -                           | -                           |
| 0.0597 | 1340 | 0.5647        | -                           | -                           |
| 0.0602 | 1350 | 0.538         | -                           | -                           |
| 0.0606 | 1360 | 0.5457        | -                           | -                           |
| 0.0610 | 1370 | 0.5169        | -                           | -                           |
| 0.0615 | 1380 | 0.4967        | -                           | -                           |
| 0.0619 | 1390 | 0.4864        | -                           | -                           |
| 0.0624 | 1400 | 0.5133        | -                           | -                           |
| 0.0628 | 1410 | 0.5587        | -                           | -                           |
| 0.0633 | 1420 | 0.4691        | -                           | -                           |
| 0.0637 | 1430 | 0.5186        | -                           | -                           |
| 0.0642 | 1440 | 0.4907        | -                           | -                           |
| 0.0646 | 1450 | 0.5281        | -                           | -                           |
| 0.0651 | 1460 | 0.4741        | -                           | -                           |
| 0.0655 | 1470 | 0.4452        | -                           | -                           |
| 0.0659 | 1480 | 0.4771        | -                           | -                           |
| 0.0664 | 1490 | 0.4289        | -                           | -                           |
| 0.0668 | 1500 | 0.4551        | -                           | -                           |
| 0.0673 | 1510 | 0.4558        | -                           | -                           |
| 0.0677 | 1520 | 0.5159        | -                           | -                           |
| 0.0682 | 1530 | 0.4296        | -                           | -                           |
| 0.0686 | 1540 | 0.4548        | -                           | -                           |
| 0.0691 | 1550 | 0.4439        | -                           | -                           |
| 0.0695 | 1560 | 0.4295        | -                           | -                           |
| 0.0700 | 1570 | 0.4466        | -                           | -                           |
| 0.0704 | 1580 | 0.4717        | -                           | -                           |
| 0.0708 | 1590 | 0.492         | -                           | -                           |
| 0.0713 | 1600 | 0.4566        | -                           | -                           |
| 0.0717 | 1610 | 0.4451        | -                           | -                           |
| 0.0722 | 1620 | 0.4715        | -                           | -                           |
| 0.0726 | 1630 | 0.4573        | -                           | -                           |
| 0.0731 | 1640 | 0.3972        | -                           | -                           |
| 0.0735 | 1650 | 0.5212        | -                           | -                           |
| 0.0740 | 1660 | 0.4381        | -                           | -                           |
| 0.0744 | 1670 | 0.4552        | -                           | -                           |
| 0.0749 | 1680 | 0.4767        | -                           | -                           |
| 0.0753 | 1690 | 0.4398        | -                           | -                           |
| 0.0757 | 1700 | 0.4801        | -                           | -                           |
| 0.0762 | 1710 | 0.3751        | -                           | -                           |
| 0.0766 | 1720 | 0.4407        | -                           | -                           |
| 0.0771 | 1730 | 0.4305        | -                           | -                           |
| 0.0775 | 1740 | 0.3938        | -                           | -                           |
| 0.0780 | 1750 | 0.4748        | -                           | -                           |
| 0.0784 | 1760 | 0.428         | -                           | -                           |
| 0.0789 | 1770 | 0.404         | -                           | -                           |
| 0.0793 | 1780 | 0.4261        | -                           | -                           |
| 0.0798 | 1790 | 0.359         | -                           | -                           |
| 0.0802 | 1800 | 0.4422        | -                           | -                           |
| 0.0807 | 1810 | 0.4748        | -                           | -                           |
| 0.0811 | 1820 | 0.4352        | -                           | -                           |
| 0.0815 | 1830 | 0.4032        | -                           | -                           |
| 0.0820 | 1840 | 0.4124        | -                           | -                           |
| 0.0824 | 1850 | 0.4486        | -                           | -                           |
| 0.0829 | 1860 | 0.429         | -                           | -                           |
| 0.0833 | 1870 | 0.4189        | -                           | -                           |
| 0.0838 | 1880 | 0.3658        | -                           | -                           |
| 0.0842 | 1890 | 0.4297        | -                           | -                           |
| 0.0847 | 1900 | 0.4215        | -                           | -                           |
| 0.0851 | 1910 | 0.3726        | -                           | -                           |
| 0.0856 | 1920 | 0.3736        | -                           | -                           |
| 0.0860 | 1930 | 0.4287        | -                           | -                           |
| 0.0864 | 1940 | 0.4402        | -                           | -                           |
| 0.0869 | 1950 | 0.4353        | -                           | -                           |
| 0.0873 | 1960 | 0.3622        | -                           | -                           |
| 0.0878 | 1970 | 0.3557        | -                           | -                           |
| 0.0882 | 1980 | 0.4107        | -                           | -                           |
| 0.0887 | 1990 | 0.3982        | -                           | -                           |
| 0.0891 | 2000 | 0.453         | 0.7292                      | 0.7261                      |
| 0.0896 | 2010 | 0.3971        | -                           | -                           |
| 0.0900 | 2020 | 0.4374        | -                           | -                           |
| 0.0905 | 2030 | 0.4322        | -                           | -                           |
| 0.0909 | 2040 | 0.3945        | -                           | -                           |
| 0.0913 | 2050 | 0.356         | -                           | -                           |
| 0.0918 | 2060 | 0.4182        | -                           | -                           |
| 0.0922 | 2070 | 0.3694        | -                           | -                           |
| 0.0927 | 2080 | 0.3989        | -                           | -                           |
| 0.0931 | 2090 | 0.4237        | -                           | -                           |
| 0.0936 | 2100 | 0.3961        | -                           | -                           |
| 0.0940 | 2110 | 0.4264        | -                           | -                           |
| 0.0945 | 2120 | 0.3609        | -                           | -                           |
| 0.0949 | 2130 | 0.4154        | -                           | -                           |
| 0.0954 | 2140 | 0.3661        | -                           | -                           |
| 0.0958 | 2150 | 0.3328        | -                           | -                           |
| 0.0962 | 2160 | 0.3456        | -                           | -                           |
| 0.0967 | 2170 | 0.3478        | -                           | -                           |
| 0.0971 | 2180 | 0.3339        | -                           | -                           |
| 0.0976 | 2190 | 0.3833        | -                           | -                           |
| 0.0980 | 2200 | 0.3238        | -                           | -                           |
| 0.0985 | 2210 | 0.3871        | -                           | -                           |
| 0.0989 | 2220 | 0.4009        | -                           | -                           |
| 0.0994 | 2230 | 0.4115        | -                           | -                           |
| 0.0998 | 2240 | 0.4024        | -                           | -                           |
| 0.1003 | 2250 | 0.35          | -                           | -                           |
| 0.1007 | 2260 | 0.3649        | -                           | -                           |
| 0.1011 | 2270 | 0.3615        | -                           | -                           |
| 0.1016 | 2280 | 0.3898        | -                           | -                           |
| 0.1020 | 2290 | 0.3866        | -                           | -                           |
| 0.1025 | 2300 | 0.3904        | -                           | -                           |
| 0.1029 | 2310 | 0.3321        | -                           | -                           |
| 0.1034 | 2320 | 0.3803        | -                           | -                           |
| 0.1038 | 2330 | 0.3831        | -                           | -                           |
| 0.1043 | 2340 | 0.403         | -                           | -                           |
| 0.1047 | 2350 | 0.3803        | -                           | -                           |
| 0.1052 | 2360 | 0.3463        | -                           | -                           |
| 0.1056 | 2370 | 0.3987        | -                           | -                           |
| 0.1060 | 2380 | 0.3731        | -                           | -                           |
| 0.1065 | 2390 | 0.353         | -                           | -                           |
| 0.1069 | 2400 | 0.3166        | -                           | -                           |
| 0.1074 | 2410 | 0.3895        | -                           | -                           |
| 0.1078 | 2420 | 0.4025        | -                           | -                           |
| 0.1083 | 2430 | 0.3798        | -                           | -                           |
| 0.1087 | 2440 | 0.2991        | -                           | -                           |
| 0.1092 | 2450 | 0.3094        | -                           | -                           |
| 0.1096 | 2460 | 0.3669        | -                           | -                           |
| 0.1101 | 2470 | 0.3412        | -                           | -                           |
| 0.1105 | 2480 | 0.3697        | -                           | -                           |
| 0.1110 | 2490 | 0.369         | -                           | -                           |
| 0.1114 | 2500 | 0.3393        | -                           | -                           |
| 0.1118 | 2510 | 0.4232        | -                           | -                           |
| 0.1123 | 2520 | 0.3445        | -                           | -                           |
| 0.1127 | 2530 | 0.4165        | -                           | -                           |
| 0.1132 | 2540 | 0.3721        | -                           | -                           |
| 0.1136 | 2550 | 0.3476        | -                           | -                           |
| 0.1141 | 2560 | 0.2847        | -                           | -                           |
| 0.1145 | 2570 | 0.3609        | -                           | -                           |
| 0.1150 | 2580 | 0.3017        | -                           | -                           |
| 0.1154 | 2590 | 0.374         | -                           | -                           |
| 0.1159 | 2600 | 0.3365        | -                           | -                           |
| 0.1163 | 2610 | 0.393         | -                           | -                           |
| 0.1167 | 2620 | 0.3623        | -                           | -                           |
| 0.1172 | 2630 | 0.3538        | -                           | -                           |
| 0.1176 | 2640 | 0.3206        | -                           | -                           |
| 0.1181 | 2650 | 0.3962        | -                           | -                           |
| 0.1185 | 2660 | 0.3087        | -                           | -                           |
| 0.1190 | 2670 | 0.3482        | -                           | -                           |
| 0.1194 | 2680 | 0.3616        | -                           | -                           |
| 0.1199 | 2690 | 0.3955        | -                           | -                           |
| 0.1203 | 2700 | 0.3915        | -                           | -                           |
| 0.1208 | 2710 | 0.3782        | -                           | -                           |
| 0.1212 | 2720 | 0.3576        | -                           | -                           |
| 0.1216 | 2730 | 0.3544        | -                           | -                           |
| 0.1221 | 2740 | 0.3572        | -                           | -                           |
| 0.1225 | 2750 | 0.3107        | -                           | -                           |
| 0.1230 | 2760 | 0.3579        | -                           | -                           |
| 0.1234 | 2770 | 0.3571        | -                           | -                           |
| 0.1239 | 2780 | 0.3694        | -                           | -                           |
| 0.1243 | 2790 | 0.3674        | -                           | -                           |
| 0.1248 | 2800 | 0.3373        | -                           | -                           |
| 0.1252 | 2810 | 0.3362        | -                           | -                           |
| 0.1257 | 2820 | 0.3225        | -                           | -                           |
| 0.1261 | 2830 | 0.3609        | -                           | -                           |
| 0.1265 | 2840 | 0.3681        | -                           | -                           |
| 0.1270 | 2850 | 0.4059        | -                           | -                           |
| 0.1274 | 2860 | 0.3047        | -                           | -                           |
| 0.1279 | 2870 | 0.3446        | -                           | -                           |
| 0.1283 | 2880 | 0.3507        | -                           | -                           |
| 0.1288 | 2890 | 0.3124        | -                           | -                           |
| 0.1292 | 2900 | 0.3712        | -                           | -                           |
| 0.1297 | 2910 | 0.3394        | -                           | -                           |
| 0.1301 | 2920 | 0.3869        | -                           | -                           |
| 0.1306 | 2930 | 0.3449        | -                           | -                           |
| 0.1310 | 2940 | 0.3752        | -                           | -                           |
| 0.1314 | 2950 | 0.3341        | -                           | -                           |
| 0.1319 | 2960 | 0.3329        | -                           | -                           |
| 0.1323 | 2970 | 0.36          | -                           | -                           |
| 0.1328 | 2980 | 0.3788        | -                           | -                           |
| 0.1332 | 2990 | 0.3834        | -                           | -                           |
| 0.1337 | 3000 | 0.3426        | 0.7603                      | 0.7590                      |

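The `spearman_cosine` columns in the log above report a rank correlation between cosine similarities of embedding pairs and gold similarity scores. The following is a minimal pure-Python sketch of that idea, not the code of `EmbeddingSimilarityEvaluator` itself. The toy two-dimensional "embeddings" and gold scores are hypothetical and only illustrate the computation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_cosine(pairs, gold_scores):
    """Spearman correlation between predicted cosine similarities and gold scores."""
    predicted = [cosine(u, v) for u, v in pairs]
    return pearson(average_ranks(predicted), average_ranks(gold_scores))


# Hypothetical toy data: four embedding pairs and their gold similarity scores.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),   # nearly parallel
    ([1.0, 0.0], [1.0, 1.0]),   # 45 degrees apart
    ([1.0, 0.0], [0.0, 1.0]),   # orthogonal
    ([1.0, 0.0], [-1.0, 0.2]),  # nearly opposite
]
gold_scores = [5.0, 3.0, 1.0, 0.0]

score = spearman_cosine(pairs, gold_scores)
print(f"spearman_cosine on toy data: {score:.4f}")
```

Because Spearman correlation only compares rankings, a model can score well here even when its raw cosine values are not calibrated to the gold scale.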
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.0
- Transformers: 4.46.2
- PyTorch: 2.1.0+cu118
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
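
For readers unfamiliar with MultipleNegativesRankingLoss (cited above and configured in this card with `scale` 20.0 and `cos_sim`), the sketch below shows the general idea in plain Python. For each query, every response and every hard negative in the batch is a candidate, and the loss is the cross-entropy of picking the query's own response from the scaled cosine similarities. This is an illustrative re-implementation under those assumptions, not the library's code. The name `mnrl_loss` and the toy vectors are hypothetical.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnrl_loss(queries, positives, negatives, scale=20.0):
    """Mean cross-entropy over the batch.

    Candidates are all positives followed by all hard negatives; for query i
    the correct candidate is positives[i], every other candidate is a negative.
    """
    candidates = positives + negatives
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, cand) for cand in candidates]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(queries)


# Hypothetical toy batch of two (query, response, negative) triplets.
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.1], [0.1, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

loss_aligned = mnrl_loss(queries, positives, negatives)
# Swapping the positives pairs each query with the wrong response.
loss_swapped = mnrl_loss(queries, positives[::-1], negatives)

print(f"aligned batch loss: {loss_aligned:.6f}")
print(f"swapped batch loss: {loss_swapped:.6f}")
```

Since every other response in the batch acts as a negative, repeated texts inside a batch could be penalized as false negatives, which may be why the card sets `batch_sampler` to `no_duplicates`.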