metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:2859594
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: Qwen/Qwen2.5-0.5B-Instruct
widget:
- source_sentence: How old is Garry Marshall?
sentences:
- >-
Garry Marshall
On the morning of July 19, 2016, Marshall died at a hospital in Burbank,
California at the age of 81 due to complications of pneumonia after
suffering a stroke.[20][21]
- >-
Gregg Marshall
Michael Gregg Marshall (born February 27, 1963) is an American college
basketball coach who currently leads the Shockers team at Wichita State
University. Marshall has coached his teams to appearances in the NCAA
Men's Division I Basketball Tournament in twelve of his eighteen years
as a head coach. He is the most successful head coach in Wichita State
University history (261 wins), and is also the most successful head
coach in Winthrop University history (194 wins).
- >-
Guillotine
For a period of time after its invention, the guillotine was called a
louisette. However, it was later named after Guillotin who had proposed
that a less painful method of execution should be found in place of the
breaking wheel, though he opposed the death penalty and bemoaned the
association of the guillotine with his name.
- source_sentence: Are there cherry trees in Cherry Springs State Park?
sentences:
- >-
Cherry Springs State Park
Awards and press recognition have come to Cherry Springs and its staff.
Thom Bemus, who initiated and coordinates the Stars-n-Parks program, was
named DCNR's 2002 Volunteer of the Year.[66] In 2007 the park's Dark Sky
Programming and staff received the Environmental Education Excellence in
Programming award from the Pennsylvania Recreation and Parks
Society.[67] Operations manager Chip Harrison and his wife Maxine, who
directs the Dark Sky Fund, received a 2008 award from the Pennsylvania
Outdoor Lighting Council for "steadfast adherence and active promotion
of the principles of responsible outdoor lighting at Cherry Springs
State Park".[68] The DCNR has named Cherry Springs one of "25 Must-See
Pennsylvania State Parks", specifically for having the "darkest night
skies on the east coast".[69] Cherry Springs State Park was featured in
the national press in 2003 when USA Today named it one of "10 Great Places
to get some stars in your eyes",[70] in 2006 when National Geographic
Adventure featured it in "Pennsylvania: The Wild, Wild East",[71] and in
The New York Times in 2007.[53] All these were before it was named an
International Dark Sky Park by the International Dark-Sky Association in
2008.[38]
- >-
Cantonese
Although Cantonese shares a lot of vocabulary with Mandarin, the two
varieties are mutually unintelligible because of differences in
pronunciation, grammar and lexicon. Sentence structure, in particular
the placement of verbs, sometimes differs between the two varieties. A
notable difference between Cantonese and Mandarin is how the spoken word
is written; both can be recorded verbatim, but very few Cantonese
speakers are knowledgeable in the full Cantonese written vocabulary, so
a non-verbatim formalized written form is adopted, which is more akin to
the Mandarin written form.[4][5] This results in the situation in which
a Cantonese and a Mandarin text may look similar but are pronounced
differently.
- >-
Cherry Springs State Park
Cherry Springs State Park is an 82-acre (33 ha)[a] Pennsylvania state
park in Potter County, Pennsylvania, United States. The park was created
from land within the Susquehannock State Forest, and is on Pennsylvania
Route 44 in West Branch Township. Cherry Springs, named for a large
stand of Black Cherry trees in the park, is atop the dissected Allegheny
Plateau at an elevation of 2,300 feet (701 m). It is popular with
astronomers and stargazers for having "some of the darkest night skies
on the east coast" of the United States, and was chosen by the
Pennsylvania Department of Conservation and Natural Resources (DCNR) and
its Bureau of Parks as one of "25 Must-See Pennsylvania State Parks".[4]
- source_sentence: How many regions are in Belgium?
sentences:
- >-
Pine City, Minnesota
Pine City is a city in Pine County, Minnesota, in East Central
Minnesota. Pine City is the county seat of, and the largest city in,
Pine County.[7] A portion of the city is located on the Mille Lacs
Indian Reservation. Founded as a railway town, it quickly became a
logging community and the surrounding lakes made it a resort town.
Today, it is an arts town and commuter town to jobs in the
Minneapolis–Saint Paul metropolitan area.[8] It is also a green city.[9]
The population was 3,127 at the 2010 census.
- >-
Provinces of Belgium
The country of Belgium is divided into three regions. Two of these
regions, the Flemish Region or Flanders, and Walloon Region, or
Wallonia, are each subdivided into five provinces. The third region, the
Brussels-Capital Region, is not divided into provinces, as it was
originally only a small part of a province itself.
- >-
United Belgian States
The United Belgian States was a confederal republic of eight provinces
which had their own governments, were sovereign and independent, and
were governed directly by the Sovereign Congress (; ), the confederal
government. The Sovereign Congress was seated in Brussels and consisted
of representatives of each of the eight provinces. The provinces of the
republic were divided into 11 smaller separate territories, each with
their own regional identities:In 1789, a church-inspired popular revolt
broke out in reaction to the emperor's centralizing and anticlerical
policies. Two factions appeared: the "Statists" who opposed the reforms,
and the "Vonckists" named for Jan Frans Vonck who initially supported
the reforms but then joined the opposition, due to the clumsy way in
which the reforms were carried out.
- source_sentence: Are there black holes near the galactic nucleus?
sentences:
- >-
Supermassive black hole
In September 2014, data from different X-ray telescopes has shown that
the extremely small, dense, ultracompact dwarf galaxy M60-UCD1 hosts a
20 million solar mass black hole at its center, accounting for more than
10% of the total mass of the galaxy. The discovery is quite surprising,
since the black hole is five times more massive than the Milky Way's
black hole despite the galaxy being less than five-thousandths the mass
of the Milky Way.
- >-
Aquarela do Brasil
"Aquarela do Brasil" (Portuguese:[akwaˈɾɛlɐ du bɾaˈziw], Watercolor of
Brazil), written by Ary Barroso in 1939 and known in the
English-speaking world simply as "Brazil", is one of the most famous
Brazilian songs.
- >-
Supermassive black hole
The difficulty in forming a supermassive black hole resides in the need
for enough matter to be in a small enough volume. This matter needs to
have very little angular momentum in order for this to happen. Normally,
the process of accretion involves transporting a large initial endowment
of angular momentum outwards, and this appears to be the limiting factor
in black hole growth. This is a major component of the theory of
accretion disks. Gas accretion is the most efficient and also the most
conspicuous way in which black holes grow. The majority of the mass
growth of supermassive black holes is thought to occur through episodes
of rapid gas accretion, which are observable as active galactic nuclei
or quasars. Observations reveal that quasars were much more frequent
when the Universe was younger, indicating that supermassive black holes
formed and grew early. A major constraining factor for theories of
supermassive black hole formation is the observation of distant luminous
quasars, which indicate that supermassive black holes of billions of
solar masses had already formed when the Universe was less than one
billion years old. This suggests that supermassive black holes arose
very early in the Universe, inside the first massive galaxies.
- source_sentence: When did the July Monarchy end?
sentences:
- >-
July Monarchy
Despite the return of the House of Bourbon to power, France was much
changed from the era of the ancien régime. The egalitarianism and
liberalism of the revolutionaries remained an important force and the
autocracy and hierarchy of the earlier era could not be fully restored.
Economic changes, which had been underway long before the revolution,
had progressed further during the years of turmoil and were firmly
entrenched by 1815. These changes had seen power shift from the noble
landowners to the urban merchants. The administrative reforms of
Napoleon, such as the Napoleonic Code and efficient bureaucracy, also
remained in place. These changes produced a unified central government
that was fiscally sound and had much control over all areas of French
life, a sharp difference from the complicated mix of feudal and
absolutist traditions and institutions of pre-Revolutionary Bourbons.
- >-
Wachovia
Wachovia Corporation began on June 16, 1879 in Winston-Salem, North
Carolina as the Wachovia National Bank. The bank was co-founded by James
Alexander Gray and William Lemly.[9] In 1911, the bank merged with
Wachovia Loan and Trust Company, "the largest trust company between
Baltimore and New Orleans",[10] which had been founded on June 15, 1893.
Wachovia grew to become one of the largest banks in the Southeast partly
on the strength of its accounts from the R.J. Reynolds Tobacco Company,
which was also headquartered in Winston-Salem.[11] On December 12, 1986,
Wachovia purchased First Atlanta. Founded as Atlanta National Bank on
September 14, 1865, and later renamed to First National Bank of Atlanta,
this institution was the oldest national bank in Atlanta. This purchase
made Wachovia one of the few companies with dual headquarters: one in
Winston-Salem and one in Atlanta. In 1991, Wachovia entered the South
Carolina market by acquiring South Carolina National Corporation,[12]
founded as the Bank of Charleston in 1834. In 1998, Wachovia acquired
two Virginia-based banks, Jefferson National Bank and Central Fidelity
Bank. In 1997, Wachovia acquired both 1st United Bancorp and American
Bankshares Inc, giving its first entry into Florida. In 2000, Wachovia
made its final purchase, which was Republic Security Bank.
- >-
July Monarchy
The July Monarchy (French: Monarchie de Juillet) was a liberal
constitutional monarchy in France under Louis Philippe I, starting with
the July Revolution of 1830 and ending with the Revolution of 1848. It
marks the end of the Bourbon Restoration (1814–1830). It began with the
overthrow of the conservative government of Charles X, the last king of
the House of Bourbon.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
model-index:
- name: SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts dev 896
type: sts-dev-896
metrics:
- type: pearson_cosine
value: 0.45729692013517886
name: Pearson Cosine
- type: spearman_cosine
value: 0.49645340246652353
name: Spearman Cosine
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts dev 768
type: sts-dev-768
metrics:
- type: pearson_cosine
value: 0.4455125981991164
name: Pearson Cosine
- type: spearman_cosine
value: 0.4896539219726307
name: Spearman Cosine
SentenceTransformer based on Qwen/Qwen2.5-0.5B-Instruct
This is a sentence-transformers model fine-tuned from Qwen/Qwen2.5-0.5B-Instruct. It maps sentences and paragraphs to an 896-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Qwen/Qwen2.5-0.5B-Instruct
- Maximum Sequence Length: 1024 tokens
- Output Dimensionality: 896 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: Qwen2Model
(1): Pooling({'word_embedding_dimension': 896, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
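The architecture above enables mean-token pooling (pooling_mode_mean_tokens: True), and the card lists cosine similarity as the similarity function. The following is a minimal, dependency-free sketch of those two steps, assuming toy 4-dimensional token vectors in place of the model's 896-dimensional hidden states. The helper names mean_pool and cosine_similarity are illustrative and are not part of the Sentence Transformers API.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_vectors[0])
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Three toy token vectors; the last one is padding and is masked out.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = mean_pool(tokens, mask)
print(sentence_vec)  # [2.0, 0.0, 1.0, 0.0]

# A vector pointing in the same direction scores close to 1.
print(cosine_similarity(sentence_vec, [4.0, 0.0, 2.0, 0.0]))
```

Masking matters because padded positions would otherwise pull the average toward whatever the padding tokens encode.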
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("AlexWortega/qwen1k")
# Run inference
sentences = [
'When did the July Monarchy end?',
'July Monarchy\nThe July Monarchy (French: Monarchie de Juillet) was a liberal constitutional monarchy in France under Louis Philippe I, starting with the July Revolution of 1830 and ending with the Revolution of 1848. It marks the end of the Bourbon Restoration (1814–1830). It began with the overthrow of the conservative government of Charles X, the last king of the House of Bourbon.',
'July Monarchy\nDespite the return of the House of Bourbon to power, France was much changed from the era of the ancien régime. The egalitarianism and liberalism of the revolutionaries remained an important force and the autocracy and hierarchy of the earlier era could not be fully restored. Economic changes, which had been underway long before the revolution, had progressed further during the years of turmoil and were firmly entrenched by 1815. These changes had seen power shift from the noble landowners to the urban merchants. The administrative reforms of Napoleon, such as the Napoleonic Code and efficient bureaucracy, also remained in place. These changes produced a unified central government that was fiscally sound and had much control over all areas of French life, a sharp difference from the complicated mix of feudal and absolutist traditions and institutions of pre-Revolutionary Bourbons.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 896]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
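Because the model was trained with MatryoshkaLoss on dimensions 896 and 768, an embedding can presumably be shortened to its first 768 components and still be compared with cosine similarity. The sketch below shows that truncate-then-renormalize step on a toy 8-component vector. The helper truncate_and_normalize is hypothetical; recent Sentence Transformers releases also accept a truncate_dim argument when loading a model, so check the documentation for your installed version before relying on either approach.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if dim > len(embedding):
        raise ValueError("dim exceeds embedding size")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("truncated embedding has zero length")
    return [x / norm for x in head]

# Toy stand-in: 8 components cut to 6 (the model itself would go 896 -> 768).
full = [3.0, 4.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0]
short = truncate_and_normalize(full, 6)
print(len(short), short)
```

Renormalizing after truncation keeps dot products and cosine similarities interchangeable on the shortened vectors.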
Evaluation
Metrics
Semantic Similarity
- Datasets: sts-dev-896 and sts-dev-768
- Evaluated with EmbeddingSimilarityEvaluator
Metric | sts-dev-896 | sts-dev-768 |
---|---|---|
pearson_cosine | 0.4573 | 0.4455 |
spearman_cosine | 0.4965 | 0.4897 |
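Both metrics compare the model's cosine scores with gold similarity labels: Pearson measures linear correlation between the values, and Spearman measures correlation between their ranks. The sketch below computes both on made-up numbers. It assumes there are no tied values (a full implementation averages tied ranks) and it is not the code used by EmbeddingSimilarityEvaluator.

```python
import math

def pearson(xs, ys):
    """Linear correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """Rank positions (1 = smallest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Pearson correlation computed on ranks instead of raw values."""
    return pearson(ranks(xs), ranks(ys))

gold = [0.1, 0.4, 0.5, 0.9]        # made-up gold similarity labels
predicted = [0.2, 0.3, 0.8, 0.7]   # made-up cosine scores
print(pearson(predicted, gold), spearman(predicted, gold))
```

Spearman only depends on ordering, which is why it is often the headline number for similarity models whose raw scores are not calibrated.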
Training Details
Training Dataset
Unnamed Dataset
- Size: 2,859,594 training samples
- Columns: query, response, and negative
- Approximate statistics based on the first 1000 samples:
Statistic | query | response | negative |
---|---|---|---|
type | string | string | string |
min | 4 tokens | 23 tokens | 4 tokens |
mean | 8.76 tokens | 141.88 tokens | 134.02 tokens |
max | 26 tokens | 532 tokens | 472 tokens |
- Samples:
  - Sample 1
    - query: Was there a year 0?
    - response: Year zero
      Year zero does not exist in the anno Domini system usually used to number years in the Gregorian calendar and in its predecessor, the Julian calendar. In this system, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars.
    - negative: 504
      Year 504 (DIV) was a leap year starting on Thursday (link will display the full calendar) of the Julian calendar. At the time, it was known as the Year of the Consulship of Nicomachus without colleague (or, less frequently, year 1257 "Ab urbe condita"). The denomination 504 for this year has been used since the early medieval period, when the Anno Domini calendar era became the prevalent method in Europe for naming years.
  - Sample 2
    - query: When is the dialectical method used?
    - response: Dialectic
      Dialectic or dialectics (Greek: διαλεκτική, dialektikḗ; related to dialogue), also known as the dialectical method, is at base a discourse between two or more people holding different points of view about a subject but wishing to establish the truth through reasoned arguments. Dialectic resembles debate, but the concept excludes subjective elements such as emotional appeal and the modern pejorative sense of rhetoric.[1][2] Dialectic may be contrasted with the didactic method, wherein one side of the conversation teaches the other. Dialectic is alternatively known as minor logic, as opposed to major logic or critique.
    - negative: Derek Bentley case
      Another factor in the posthumous defence was that a "confession" recorded by Bentley, which was claimed by the prosecution to be a "verbatim record of dictated monologue", was shown by forensic linguistics methods to have been largely edited by policemen. Linguist Malcolm Coulthard showed that certain patterns, such as the frequency of the word "then" and the grammatical use of "then" after the grammatical subject ("I then" rather than "then I"), were not consistent with Bentley's use of language (his idiolect), as evidenced in court testimony. These patterns fit better the recorded testimony of the policemen involved. This is one of the earliest uses of forensic linguistics on record.
  - Sample 3
    - query: What do Grasshoppers eat?
    - response: Grasshopper
      Grasshoppers are plant-eaters, with a few species at times becoming serious pests of cereals, vegetables and pasture, especially when they swarm in their millions as locusts and destroy crops over wide areas. They protect themselves from predators by camouflage; when detected, many species attempt to startle the predator with a brilliantly-coloured wing-flash while jumping and (if adult) launching themselves into the air, usually flying for only a short distance. Other species such as the rainbow grasshopper have warning coloration which deters predators. Grasshoppers are affected by parasites and various diseases, and many predatory creatures feed on both nymphs and adults. The eggs are the subject of attack by parasitoids and predators.
    - negative: Groundhog
      Very often the dens of groundhogs provide homes for other animals including skunks, red foxes, and cottontail rabbits. The fox and skunk feed upon field mice, grasshoppers, beetles and other creatures that destroy farm crops. In aiding these animals, the groundhog indirectly helps the farmer. In addition to providing homes for itself and other animals, the groundhog aids in soil improvement by bringing subsoil to the surface. The groundhog is also a valuable game animal and is considered a difficult sport when hunted in a fair manner. In some parts of Appalachia, they are eaten.
- Loss: MatryoshkaLoss with these parameters:
  { "loss": "MultipleNegativesRankingLoss", "matryoshka_dims": [ 896, 768 ], "matryoshka_weights": [ 1, 1 ], "n_dims_per_step": -1 }
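The loss configuration above wraps MultipleNegativesRankingLoss in MatryoshkaLoss with dimensions 896 and 768 and equal weights. The sketch below illustrates the idea on toy 4-dimensional vectors: an in-batch softmax cross-entropy over scaled cosine similarities, evaluated once per truncation size and summed with the given weights. It is a simplified pure-Python illustration and not the library's implementation; the scale of 20.0 is an assumption based on the library default, and the names ranking_loss and matryoshka_loss are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def ranking_loss(queries, candidates, scale=20.0):
    """In-batch softmax cross-entropy; candidates[i] is the positive for queries[i]."""
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cosine(query, cand) for cand in candidates]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_norm = peak + math.log(sum(math.exp(value - peak) for value in logits))
        total += log_norm - logits[i]
    return total / len(queries)

def matryoshka_loss(queries, candidates, dims, weights):
    """Weighted sum of the ranking loss on the first `dim` components, per dim."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        truncated_q = [vec[:dim] for vec in queries]
        truncated_c = [vec[:dim] for vec in candidates]
        loss += weight * ranking_loss(truncated_q, truncated_c)
    return loss

# Toy 4-d vectors; dims [4, 3] stand in for the card's [896, 768].
queries = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]]
positives = [[1.0, 0.1, 0.0, 1.0], [0.1, 1.0, 0.0, 1.0]]
hard_negatives = [[0.0, 0.0, 1.0, 0.5], [0.0, 0.2, 1.0, 0.0]]
candidates = positives + hard_negatives  # positives first, then explicit negatives

total = matryoshka_loss(queries, candidates, dims=[4, 3], weights=[1, 1])
print(total)
```

Every other query's positive and every explicit negative acts as a distractor for a given query, which is why the no_duplicates batch sampler is commonly paired with this loss.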
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 12
- per_device_eval_batch_size: 12
- gradient_accumulation_steps: 4
- num_train_epochs: 1
- warmup_ratio: 0.3
- bf16: True
- batch_sampler: no_duplicates
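These settings determine how many samples feed each optimizer step and how long the warmup lasts. The arithmetic below is a sketch that assumes a single training device, which the card does not state; that assumption is consistent with the training log, where step 1000 is reported at epoch 0.0168. The no_duplicates batch sampler may skip some samples, so the real step count could differ slightly, and only the warmup phase of the linear schedule is modeled.

```python
import math

num_samples = 2_859_594   # training samples listed in this card
per_device_batch = 12     # per_device_train_batch_size
grad_accum = 4            # gradient_accumulation_steps
num_devices = 1           # assumption: device count is not stated in the card
warmup_ratio = 0.3        # warmup_ratio
peak_lr = 5e-05           # learning_rate

samples_per_step = per_device_batch * grad_accum * num_devices   # 48
steps_per_epoch = math.ceil(num_samples / samples_per_step)      # 59575
warmup_steps = math.ceil(steps_per_epoch * warmup_ratio)         # 17873

def epoch_at(step):
    """Fraction of the single epoch completed after `step` optimizer steps."""
    return step / steps_per_epoch

def linear_warmup_lr(step):
    """Learning rate during linear warmup, capped at the peak value."""
    return peak_lr * min(1.0, step / warmup_steps)

print(samples_per_step, steps_per_epoch, warmup_steps)
print(round(epoch_at(1000), 4))   # 0.0168, matching the last logged row
print(linear_warmup_lr(1000))     # still far below the peak learning rate
```

Under these assumptions the logged run stops well inside the warmup phase, so the reported scores reflect a model trained at a small fraction of the peak learning rate.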
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 12
- per_device_eval_batch_size: 12
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.3
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | sts-dev-896_spearman_cosine | sts-dev-768_spearman_cosine |
---|---|---|---|---|
0.0002 | 10 | 4.4351 | - | - |
0.0003 | 20 | 4.6508 | - | - |
0.0005 | 30 | 4.7455 | - | - |
0.0007 | 40 | 4.5427 | - | - |
0.0008 | 50 | 4.3982 | - | - |
0.0010 | 60 | 4.3755 | - | - |
0.0012 | 70 | 4.4105 | - | - |
0.0013 | 80 | 5.2227 | - | - |
0.0015 | 90 | 5.8062 | - | - |
0.0017 | 100 | 5.7645 | - | - |
0.0018 | 110 | 5.9261 | - | - |
0.0020 | 120 | 5.8301 | - | - |
0.0022 | 130 | 5.7602 | - | - |
0.0023 | 140 | 5.9392 | - | - |
0.0025 | 150 | 5.7523 | - | - |
0.0027 | 160 | 5.8585 | - | - |
0.0029 | 170 | 5.7916 | - | - |
0.0030 | 180 | 5.8157 | - | - |
0.0032 | 190 | 5.7102 | - | - |
0.0034 | 200 | 5.5844 | - | - |
0.0035 | 210 | 5.5463 | - | - |
0.0037 | 220 | 5.5823 | - | - |
0.0039 | 230 | 5.5514 | - | - |
0.0040 | 240 | 5.5646 | - | - |
0.0042 | 250 | 5.5783 | - | - |
0.0044 | 260 | 5.5344 | - | - |
0.0045 | 270 | 5.523 | - | - |
0.0047 | 280 | 5.4969 | - | - |
0.0049 | 290 | 5.5407 | - | - |
0.0050 | 300 | 5.6171 | - | - |
0.0052 | 310 | 5.5581 | - | - |
0.0054 | 320 | 5.8903 | - | - |
0.0055 | 330 | 5.8675 | - | - |
0.0057 | 340 | 5.745 | - | - |
0.0059 | 350 | 5.6041 | - | - |
0.0060 | 360 | 5.5476 | - | - |
0.0062 | 370 | 5.3964 | - | - |
0.0064 | 380 | 5.3564 | - | - |
0.0065 | 390 | 5.3054 | - | - |
0.0067 | 400 | 5.2779 | - | - |
0.0069 | 410 | 5.206 | - | - |
0.0070 | 420 | 5.2168 | - | - |
0.0072 | 430 | 5.1645 | - | - |
0.0074 | 440 | 5.1797 | - | - |
0.0076 | 450 | 5.2526 | - | - |
0.0077 | 460 | 5.1768 | - | - |
0.0079 | 470 | 5.3519 | - | - |
0.0081 | 480 | 5.2982 | - | - |
0.0082 | 490 | 5.3229 | - | - |
0.0084 | 500 | 5.3758 | - | - |
0.0086 | 510 | 5.2478 | - | - |
0.0087 | 520 | 5.1799 | - | - |
0.0089 | 530 | 5.1088 | - | - |
0.0091 | 540 | 4.977 | - | - |
0.0092 | 550 | 4.9108 | - | - |
0.0094 | 560 | 4.811 | - | - |
0.0096 | 570 | 4.7203 | - | - |
0.0097 | 580 | 4.6499 | - | - |
0.0099 | 590 | 4.4548 | - | - |
0.0101 | 600 | 4.2891 | - | - |
0.0102 | 610 | 4.1881 | - | - |
0.0104 | 620 | 4.6 | - | - |
0.0106 | 630 | 4.5365 | - | - |
0.0107 | 640 | 4.3086 | - | - |
0.0109 | 650 | 4.0452 | - | - |
0.0111 | 660 | 3.9041 | - | - |
0.0112 | 670 | 4.3938 | - | - |
0.0114 | 680 | 4.3198 | - | - |
0.0116 | 690 | 4.1294 | - | - |
0.0117 | 700 | 4.077 | - | - |
0.0119 | 710 | 3.9174 | - | - |
0.0121 | 720 | 4.1629 | - | - |
0.0123 | 730 | 3.9611 | - | - |
0.0124 | 740 | 3.7768 | - | - |
0.0126 | 750 | 3.5842 | - | - |
0.0128 | 760 | 3.1196 | - | - |
0.0129 | 770 | 3.6288 | - | - |
0.0131 | 780 | 3.273 | - | - |
0.0133 | 790 | 2.7889 | - | - |
0.0134 | 800 | 2.5096 | - | - |
0.0136 | 810 | 1.8878 | - | - |
0.0138 | 820 | 2.3423 | - | - |
0.0139 | 830 | 1.7687 | - | - |
0.0141 | 840 | 2.0781 | - | - |
0.0143 | 850 | 2.4598 | - | - |
0.0144 | 860 | 1.7667 | - | - |
0.0146 | 870 | 2.6247 | - | - |
0.0148 | 880 | 1.916 | - | - |
0.0149 | 890 | 2.0817 | - | - |
0.0151 | 900 | 2.3679 | - | - |
0.0153 | 910 | 1.418 | - | - |
0.0154 | 920 | 2.7353 | - | - |
0.0156 | 930 | 1.992 | - | - |
0.0158 | 940 | 1.4564 | - | - |
0.0159 | 950 | 1.4154 | - | - |
0.0161 | 960 | 0.9499 | - | - |
0.0163 | 970 | 1.6304 | - | - |
0.0164 | 980 | 0.9264 | - | - |
0.0166 | 990 | 1.3278 | - | - |
0.0168 | 1000 | 1.686 | 0.4965 | 0.4897 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.3.0
- Transformers: 4.46.2
- PyTorch: 2.1.0+cu118
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}