metadata
base_model: microsoft/deberta-v3-small
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:32500
- loss:GISTEmbedLoss
widget:
- source_sentence: A picture of a white gas range with figurines above.
sentences:
- A nerdy woman brushing her teeth with a friend nearby.
- a white stove turned off with a digital clock
- >-
The plasma membrane also contains other molecules, primarily other
lipids and proteins. The green molecules in Figure above , for example,
are the lipid cholesterol. Molecules of cholesterol help the plasma
membrane keep its shape. Many of the proteins in the plasma membrane
assist other substances in crossing the membrane.
- source_sentence: who makes the kentucky derby garland of roses
sentences:
- >-
Accrington strengthened their position in the play-off places with a
hard-fought win over struggling Dagenham.
- >-
tidal energy can be used to produce electricity. Ocean thermal is energy
derived from waves and also from tidal waves.
Ocean thermal energy can be used to produce electricity.
- >-
Kentucky Derby Trophy The Kroger Company has been the official florist
of the Kentucky Derby since 1987. After taking over the duties from the
Kingsley Walker florist, Kroger began constructing the prestigious
garland in one of its local stores for the public to view on Derby Eve.
The preservation of the garland and crowds of spectators watching its
construction are a testament to the prestige and mystique of the Garland
of Roses.
- source_sentence: what is the difference between a general sense and a special sense?
sentences:
- >-
Ian Curtis ( of Touching from a distance) Ian Kevin Curtis was an
English musician and singer-songwriter. He is best known as the lead
singer and lyricist of the post-punk band Joy Division. Joy Division
released its debut album, Unknown Pleasures, in 1979 and recorded its
follow-up, Closer, in 1980. Curtis, who suffered from epilepsy and
depression, committed suicide on 18 May 1980, on the eve of Joy
Division's first North American tour, resulting in the band's
dissolution and the subsequent formation of New Order. Curtis was known
for his baritone voice, dance style, and songwriting filled with imagery
of desolation, emptiness and alienation. In 1995, Curtis's widow Deborah
published Touching from a Distance: Ian Curtis and Joy Division, a
biography of the singer. His life and death Ian Kevin Curtis was an
English musician and singer-songwriter. He is best known as the lead
singer and lyricist of the post-punk band Joy Division. Joy Division
released its debut album, Unknown Pleasures, in 1979 and recorded its
follow-up, Closer, in 1980. Curtis, who suffered from epilepsy and
depression, committed suicide on 18 May 1980, on the eve of Joy
Division's first North American tour, resulting in the band's
dissolution and the subsequent formation of New Order. Curtis was known
for his baritone voice, dance style, and songwriting filled with imagery
of desolation, emptiness and alienation. In 1995, Curtis's widow Deborah
published Touching from a Distance: Ian Curtis and Joy Division, a
biography of the singer. His life and death have been dramatised in the
films 24 Hour Party People (2002) and Control (2007). ...more
- >-
The human body has two basic types of senses, called special senses and
general senses. Special senses have specialized sense organs that gather
sensory information and change it into nerve impulses. ... General
senses, in contrast, are all associated with the sense of touch. They
lack special sense organs.
- >-
Captain Hook Barrie states in the novel that "Hook was not his true
name. To reveal who he really was would even at this date set the
country in a blaze", and relates that Peter Pan began their rivalry by
feeding the pirate's hand to the crocodile. He is said to be
"Blackbeard's bo'sun" and "the only man of whom Barbecue was afraid".[5]
(In Robert Louis Stevenson's Treasure Island, one of the names Long John
Silver goes by is Barbecue.)[6]
- source_sentence: >-
Retzius was born in Stockholm , son of the anatomist Anders Jahan Retzius
( and grandson of the naturalist and chemist Anders Retzius ) .
sentences:
- >-
Retzius was born in Stockholm , the son of anatomist Anders Jahan
Retzius ( and grandson of the naturalist and chemist Anders Retzius ) .
- >-
As of 14 March , over 156,000 cases of COVID-19 have been reported in
around 140 countries and territories ; more than 5,800 people have died
from the disease and around 75,000 have recovered .
- A person sitting on a stool on the street.
- source_sentence: who was the first person who made the violin
sentences:
- >-
Alice in Chains Alice in Chains is an American rock band from Seattle,
Washington, formed in 1987 by guitarist and vocalist Jerry Cantrell and
drummer Sean Kinney,[1] who recruited bassist Mike Starr[1] and lead
vocalist Layne Staley.[1][2][3] Starr was replaced by Mike Inez in
1993.[4] After Staley's death in 2002, William DuVall joined in 2006 as
co-lead vocalist and rhythm guitarist. The band took its name from
Staley's previous group, the glam metal band Alice N' Chains.[5][2]
- as distance from an object decreases , that object will appear larger
- >-
Violin The first makers of violins probably borrowed from various
developments of the Byzantine lira. These included the rebec;[13] the
Arabic rebab; the vielle (also known as the fidel or viuola); and the
lira da braccio[11][14] The violin in its present form emerged in early
16th-century northern Italy. The earliest pictures of violins, albeit
with three strings, are seen in northern Italy around 1530, at around
the same time as the words "violino" and "vyollon" are seen in Italian
and French documents. One of the earliest explicit descriptions of the
instrument, including its tuning, is from the Epitome musical by Jambe
de Fer, published in Lyon in 1556.[15] By this time, the violin had
already begun to spread throughout Europe.
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-small
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.1561600438268545
name: Pearson Cosine
- type: spearman_cosine
value: 0.22356441354815124
name: Spearman Cosine
- type: pearson_manhattan
value: 0.2216924674035587
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.24997065610359018
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.1908690981304929
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.22363767136304896
name: Spearman Euclidean
- type: pearson_dot
value: 0.15588248423807516
name: Pearson Dot
- type: spearman_dot
value: 0.22337189362164545
name: Spearman Dot
- type: pearson_max
value: 0.2216924674035587
name: Pearson Max
- type: spearman_max
value: 0.24997065610359018
name: Spearman Max
- task:
type: binary-classification
name: Binary Classification
dataset:
name: allNLI dev
type: allNLI-dev
metrics:
- type: cosine_accuracy
value: 0.666015625
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.9797871112823486
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.504258943781942
name: Cosine F1
- type: cosine_f1_threshold
value: 0.8929213285446167
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.357487922705314
name: Cosine Precision
- type: cosine_recall
value: 0.8554913294797688
name: Cosine Recall
- type: cosine_ap
value: 0.4008449937025217
name: Cosine Ap
- type: dot_accuracy
value: 0.666015625
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 752.6634521484375
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.504258943781942
name: Dot F1
- type: dot_f1_threshold
value: 685.9220581054688
name: Dot F1 Threshold
- type: dot_precision
value: 0.357487922705314
name: Dot Precision
- type: dot_recall
value: 0.8554913294797688
name: Dot Recall
- type: dot_ap
value: 0.40071344979441287
name: Dot Ap
- type: manhattan_accuracy
value: 0.66796875
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 144.52613830566406
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.5075987841945289
name: Manhattan F1
- type: manhattan_f1_threshold
value: 267.046875
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.3443298969072165
name: Manhattan Precision
- type: manhattan_recall
value: 0.9653179190751445
name: Manhattan Recall
- type: manhattan_ap
value: 0.4008700157620745
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.666015625
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 5.572628974914551
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.504258943781942
name: Euclidean F1
- type: euclidean_f1_threshold
value: 12.826179504394531
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.357487922705314
name: Euclidean Precision
- type: euclidean_recall
value: 0.8554913294797688
name: Euclidean Recall
- type: euclidean_ap
value: 0.40083962142052487
name: Euclidean Ap
- type: max_accuracy
value: 0.66796875
name: Max Accuracy
- type: max_accuracy_threshold
value: 752.6634521484375
name: Max Accuracy Threshold
- type: max_f1
value: 0.5075987841945289
name: Max F1
- type: max_f1_threshold
value: 685.9220581054688
name: Max F1 Threshold
- type: max_precision
value: 0.357487922705314
name: Max Precision
- type: max_recall
value: 0.9653179190751445
name: Max Recall
- type: max_ap
value: 0.4008700157620745
name: Max Ap
- task:
type: binary-classification
name: Binary Classification
dataset:
name: Qnli dev
type: Qnli-dev
metrics:
- type: cosine_accuracy
value: 0.591796875
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.9479926824569702
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.6291834002677376
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7761930823326111
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.4598825831702544
name: Cosine Precision
- type: cosine_recall
value: 0.9957627118644068
name: Cosine Recall
- type: cosine_ap
value: 0.5658036772817674
name: Cosine Ap
- type: dot_accuracy
value: 0.59375
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 724.091064453125
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.6291834002677376
name: Dot F1
- type: dot_f1_threshold
value: 596.2498779296875
name: Dot F1 Threshold
- type: dot_precision
value: 0.4598825831702544
name: Dot Precision
- type: dot_recall
value: 0.9957627118644068
name: Dot Recall
- type: dot_ap
value: 0.5657459555147606
name: Dot Ap
- type: manhattan_accuracy
value: 0.6171875
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 202.07958984375
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.6291834002677376
name: Manhattan F1
- type: manhattan_f1_threshold
value: 307.9236145019531
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.4598825831702544
name: Manhattan Precision
- type: manhattan_recall
value: 0.9957627118644068
name: Manhattan Recall
- type: manhattan_ap
value: 0.5891966424964378
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.591796875
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 8.938886642456055
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.6291834002677376
name: Euclidean F1
- type: euclidean_f1_threshold
value: 18.542938232421875
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.4598825831702544
name: Euclidean Precision
- type: euclidean_recall
value: 0.9957627118644068
name: Euclidean Recall
- type: euclidean_ap
value: 0.5658036772817674
name: Euclidean Ap
- type: max_accuracy
value: 0.6171875
name: Max Accuracy
- type: max_accuracy_threshold
value: 724.091064453125
name: Max Accuracy Threshold
- type: max_f1
value: 0.6291834002677376
name: Max F1
- type: max_f1_threshold
value: 596.2498779296875
name: Max F1 Threshold
- type: max_precision
value: 0.4598825831702544
name: Max Precision
- type: max_recall
value: 0.9957627118644068
name: Max Recall
- type: max_ap
value: 0.5891966424964378
name: Max Ap
SentenceTransformer based on microsoft/deberta-v3-small
This is a sentence-transformers model fine-tuned from microsoft/deberta-v3-small. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: microsoft/deberta-v3-small
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
  (1): AdvancedWeightedPooling(
    (linear_cls_pj): Linear(in_features=768, out_features=768, bias=True)
    (linear_cls_Qpj): Linear(in_features=768, out_features=768, bias=True)
    (linear_mean_pj): Linear(in_features=768, out_features=768, bias=True)
    (linear_attnOut): Linear(in_features=768, out_features=768, bias=True)
    (mha): MultiheadAttention(
      (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
    )
    (layernorm_output): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_weightedPooing): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_pjCls): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_pjMean): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_attnOut): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-toytest-step1-checkpoints-tmp")
# Run inference
sentences = [
'who was the first person who made the violin',
'Violin The first makers of violins probably borrowed from various developments of the Byzantine lira. These included the rebec;[13] the Arabic rebab; the vielle (also known as the fidel or viuola); and the lira da braccio[11][14] The violin in its present form emerged in early 16th-century northern Italy. The earliest pictures of violins, albeit with three strings, are seen in northern Italy around 1530, at around the same time as the words "violino" and "vyollon" are seen in Italian and French documents. One of the earliest explicit descriptions of the instrument, including its tuning, is from the Epitome musical by Jambe de Fer, published in Lyon in 1556.[15] By this time, the violin had already begun to spread throughout Europe.',
"Alice in Chains Alice in Chains is an American rock band from Seattle, Washington, formed in 1987 by guitarist and vocalist Jerry Cantrell and drummer Sean Kinney,[1] who recruited bassist Mike Starr[1] and lead vocalist Layne Staley.[1][2][3] Starr was replaced by Mike Inez in 1993.[4] After Staley's death in 2002, William DuVall joined in 2006 as co-lead vocalist and rhythm guitarist. The band took its name from Staley's previous group, the glam metal band Alice N' Chains.[5][2]",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
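For intuition about what the similarity matrix holds, here is a minimal pure-Python sketch of pairwise cosine similarity, the similarity function listed for this model. The vectors are toy 3-dimensional values rather than real embeddings, and `cosine_similarity_matrix` is a hypothetical helper name, not part of the library.

```python
import math

def cosine_similarity_matrix(vectors):
    """Return the matrix of pairwise cosine similarities for a list of vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [
            sum(a * b for a, b in zip(u, v)) / (nu * nv)
            for v, nv in zip(vectors, norms)
        ]
        for u, nu in zip(vectors, norms)
    ]

# Toy vectors standing in for embeddings (illustrative values only).
toy_embeddings = [
    [1.0, 0.0, 0.0],  # plays the role of the query
    [0.8, 0.6, 0.0],  # points in a similar direction
    [0.0, 0.0, 1.0],  # orthogonal to the query
]
sims = cosine_similarity_matrix(toy_embeddings)

# Rank the other items by their similarity to the first one.
ranking = sorted(range(1, len(sims)), key=lambda j: sims[0][j], reverse=True)
print(ranking)
```

With real embeddings, the same ranking step over one row of the similarity matrix is a simple form of semantic search.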
Evaluation
Metrics
Semantic Similarity
- Dataset: sts-test
- Evaluated with EmbeddingSimilarityEvaluator
Metric | Value |
---|---|
pearson_cosine | 0.1562 |
spearman_cosine | 0.2236 |
pearson_manhattan | 0.2217 |
spearman_manhattan | 0.25 |
pearson_euclidean | 0.1909 |
spearman_euclidean | 0.2236 |
pearson_dot | 0.1559 |
spearman_dot | 0.2234 |
pearson_max | 0.2217 |
spearman_max | 0.25 |
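The Pearson and Spearman rows compare gold similarity scores with the scores the model produces. As a rough illustration of the difference between the two, the sketch below implements both in plain Python. The data is made up, and the helper names (`pearson`, `average_ranks`, `spearman`) are hypothetical; this is not the evaluator's own code.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: strength of the linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Made-up gold scores and predicted similarities (illustrative only).
gold = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.10, 0.20, 0.30, 0.40, 0.95]

print(pearson(gold, predicted), spearman(gold, predicted))
```

In this toy data the predictions preserve the gold ordering exactly, so the Spearman value is 1 while the Pearson value is lower because the relationship is not linear.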
Binary Classification
- Dataset: allNLI-dev
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.666 |
cosine_accuracy_threshold | 0.9798 |
cosine_f1 | 0.5043 |
cosine_f1_threshold | 0.8929 |
cosine_precision | 0.3575 |
cosine_recall | 0.8555 |
cosine_ap | 0.4008 |
dot_accuracy | 0.666 |
dot_accuracy_threshold | 752.6635 |
dot_f1 | 0.5043 |
dot_f1_threshold | 685.9221 |
dot_precision | 0.3575 |
dot_recall | 0.8555 |
dot_ap | 0.4007 |
manhattan_accuracy | 0.668 |
manhattan_accuracy_threshold | 144.5261 |
manhattan_f1 | 0.5076 |
manhattan_f1_threshold | 267.0469 |
manhattan_precision | 0.3443 |
manhattan_recall | 0.9653 |
manhattan_ap | 0.4009 |
euclidean_accuracy | 0.666 |
euclidean_accuracy_threshold | 5.5726 |
euclidean_f1 | 0.5043 |
euclidean_f1_threshold | 12.8262 |
euclidean_precision | 0.3575 |
euclidean_recall | 0.8555 |
euclidean_ap | 0.4008 |
max_accuracy | 0.668 |
max_accuracy_threshold | 752.6635 |
max_f1 | 0.5076 |
max_f1_threshold | 685.9221 |
max_precision | 0.3575 |
max_recall | 0.9653 |
max_ap | 0.4009 |
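Each `*_f1_threshold` row reports the score cutoff that gave the best F1 on the evaluation pairs. The sketch below shows one simplified way such a cutoff can be found by sweeping over candidate thresholds. The scores and labels are made up, the helper names are hypothetical, and the evaluator's own search may differ in detail. The last lines check that the reported cosine precision and recall are consistent with the reported cosine F1.

```python
def precision_recall_f1(scores, labels, threshold):
    """Classify a pair as positive when its score is at or above the threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def best_f1_threshold(scores, labels):
    """Try every observed score as a cutoff and keep the one with the highest F1."""
    best_f1, best_threshold = 0.0, None
    for threshold in sorted(set(scores)):
        f1 = precision_recall_f1(scores, labels, threshold)[2]
        if f1 > best_f1:
            best_f1, best_threshold = f1, threshold
    return best_f1, best_threshold

# Made-up similarity scores and binary labels (illustrative only).
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40]
labels = [1, 0, 1, 1, 0, 0]
best_f1, best_threshold = best_f1_threshold(scores, labels)
print(best_f1, best_threshold)

# Consistency check on the reported allNLI-dev cosine metrics:
# F1 is the harmonic mean of precision (0.3575) and recall (0.8555).
reported_f1 = 2 * 0.3575 * 0.8555 / (0.3575 + 0.8555)
print(round(reported_f1, 4))
```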
Binary Classification
- Dataset: Qnli-dev
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 0.5918 |
cosine_accuracy_threshold | 0.948 |
cosine_f1 | 0.6292 |
cosine_f1_threshold | 0.7762 |
cosine_precision | 0.4599 |
cosine_recall | 0.9958 |
cosine_ap | 0.5658 |
dot_accuracy | 0.5938 |
dot_accuracy_threshold | 724.0911 |
dot_f1 | 0.6292 |
dot_f1_threshold | 596.2499 |
dot_precision | 0.4599 |
dot_recall | 0.9958 |
dot_ap | 0.5657 |
manhattan_accuracy | 0.6172 |
manhattan_accuracy_threshold | 202.0796 |
manhattan_f1 | 0.6292 |
manhattan_f1_threshold | 307.9236 |
manhattan_precision | 0.4599 |
manhattan_recall | 0.9958 |
manhattan_ap | 0.5892 |
euclidean_accuracy | 0.5918 |
euclidean_accuracy_threshold | 8.9389 |
euclidean_f1 | 0.6292 |
euclidean_f1_threshold | 18.5429 |
euclidean_precision | 0.4599 |
euclidean_recall | 0.9958 |
euclidean_ap | 0.5658 |
max_accuracy | 0.6172 |
max_accuracy_threshold | 724.0911 |
max_f1 | 0.6292 |
max_f1_threshold | 596.2499 |
max_precision | 0.4599 |
max_recall | 0.9958 |
max_ap | 0.5892 |
Training Details
Training Dataset
Unnamed Dataset
- Size: 32,500 training samples
- Columns: sentence1 and sentence2
- Approximate statistics based on the first 1000 samples:
| | sentence1 | sentence2 |
|---|---|---|
| type | string | string |
| details | min: 4 tokens; mean: 29.3 tokens; max: 343 tokens | min: 2 tokens; mean: 57.53 tokens; max: 512 tokens |
- Samples:
sentence1 | sentence2 |
---|---|
A Slippery Dick is what type of creature? | The Slippery Dick (Juvenile) - Whats That Fish! Description Also known as Sand-reef Wrasses and Slippery Dick Wrasse. Found singly or in pairs or in groups constantly circling around reefs, sea grass beds and sandy areas. Colours highly variable especially between juvenile to adult. They feed on hard shell invertebrates. Length - 18cm Depth - 2-12m Widespread Western Atlantic & Caribbean Most reef fish seen by divers during the day are grazers, that cruise around just above the surface of the coral or snoop into crevices looking for algae, worms and small crustaceans. Wrasses have small protruding teeth and graze the bottom taking in a variety of snails, worms, crabs, shrimps and eggs. Any hard coats or thick shells are then ground down by their pharyngeal jaws and the delicacies inside digested. From juvenile to adult wrasses dramatically alter their colour and body shapes. Wrasses are always on the go during the day, but are the first to go to bed and the last to rise. Small wrasses dive below the sand to sleep and larger wrasses wedge themselves in crevasses. Related creatures Heads up! Many creatures change during their life. Juvenile fish become adults and some change shape or their colour. Some species change sex and others just get older. The following creature(s) are known relatives of the Slippery Dick (Juvenile). Click the image(s) to explore further or hover over to get a better view! Slippery Dick |
e. in solids the atoms are closely locked in position and can only vibrate, in liquids the atoms and molecules are more loosely connected and can collide with and move past one another, while in gases the atoms or molecules are free to move independently, colliding frequently. | Within a substance, atoms that collide frequently and move independently of one another are most likely in a gas |
In December 2015 , the film was ranked # 192 on IMDb . | As of December 2015 , it is the # 192 highest rated film on IMDb. |
- Loss: GISTEmbedLoss with these parameters:
  {'guide': SentenceTransformer(
    (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (2): Normalize()
  ), 'temperature': 0.025}
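To give a feel for what a guide model and a temperature of 0.025 do in this kind of loss, here is a deliberately simplified sketch of a guided in-batch contrastive loss. It assumes that an in-batch candidate is dropped as a likely false negative whenever the guide scores it above the true pair, and it only looks at anchor-to-positive similarities. The function name and toy numbers are hypothetical, and this is not the library's implementation of GISTEmbedLoss.

```python
import math

def guided_contrastive_loss(student_sim, guide_sim, temperature=0.025):
    """Cross-entropy over in-batch candidates, skipping guide-flagged false negatives.

    student_sim[i][j]: similarity of anchor i and positive j under the trained model.
    guide_sim[i][j]:   the same similarity under the guide model.
    """
    losses = []
    for i, row in enumerate(student_sim):
        kept = []
        for j, sim in enumerate(row):
            # Assumption: drop candidate j if the guide rates it above the true pair.
            if j != i and guide_sim[i][j] > guide_sim[i][i]:
                continue
            kept.append((j, sim / temperature))
        # Numerically stable log-sum-exp over the remaining logits.
        peak = max(logit for _, logit in kept)
        log_norm = peak + math.log(sum(math.exp(logit - peak) for _, logit in kept))
        target = next(logit for j, logit in kept if j == i)
        losses.append(log_norm - target)
    return sum(losses) / len(losses)

# Toy similarities for a batch of two pairs (illustrative values only).
student_sim = [[0.9, 0.8], [0.2, 0.7]]
guide_flags_false_negative = [[0.80, 0.85], [0.10, 0.90]]
guide_sees_no_conflict = [[0.80, 0.30], [0.10, 0.90]]

loss_masked = guided_contrastive_loss(student_sim, guide_flags_false_negative)
loss_unmasked = guided_contrastive_loss(student_sim, guide_sees_no_conflict)
print(loss_masked, loss_unmasked)
```

Dividing by a small temperature such as 0.025 sharpens the softmax, so even a modest similarity gap between the true pair and a competing candidate changes the loss noticeably.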
Evaluation Dataset
Unnamed Dataset
- Size: 1,664 evaluation samples
- Columns: sentence1 and sentence2
- Approximate statistics based on the first 1000 samples:
| | sentence1 | sentence2 |
|---|---|---|
| type | string | string |
| details | min: 4 tokens; mean: 28.74 tokens; max: 330 tokens | min: 2 tokens; mean: 56.55 tokens; max: 512 tokens |
- Samples:
sentence1 | sentence2 |
---|---|
What component of an organism, made up of many cells, in turn makes up an organ? | |
Diffusion Diffusion is a process where atoms or molecules move from areas of high concentration to areas of low concentration. | Diffusion is the process in which a substance naturally moves from an area of higher to lower concentration. |
In the 1966 movie The Good, The Bad And The Ugly, Clint Eastwood played the Good" and Lee van Cleef played "the Bad", but who played "the Ugly"? | View All Photos (10) Movie Info In the last and the best installment of his so-called "Dollars" trilogy of Sergio Leone-directed "spaghetti westerns," Clint Eastwood reprised the role of a taciturn, enigmatic loner. Here he searches for a cache of stolen gold against rivals the Bad (Lee Van Cleef), a ruthless bounty hunter, and the Ugly (Eli Wallach), a Mexican bandit. Though dubbed "the Good," Eastwood's character is not much better than his opponents -- he is just smarter and shoots faster. The film's title reveals its ironic attitude toward the canonized heroes of the classical western. "The real West was the world of violence, fear, and brutal instincts," claimed Leone. "In pursuit of profit there is no such thing as good and evil, generosity or deviousness; everything depends on chance, and not the best wins but the luckiest." Immensely entertaining and beautifully shot in Techniscope by Tonino Delli Colli, the movie is a virtually definitive "spaghetti western," rivaled only by Leone's own Once Upon a Time in the West (1968). The main musical theme by Ennio Morricone hit #1 on the British pop charts. Originally released in Italy at 177 minutes, the movie was later cut for its international release. ~ Yuri German, Rovi Rating: |
- Loss: GISTEmbedLoss with these parameters:
  {'guide': SentenceTransformer(
    (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
    (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
    (2): Normalize()
  ), 'temperature': 0.025}
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 256
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 3.3333333333333337e-06}
- warmup_ratio: 0.33
- save_safetensors: False
- fp16: True
- push_to_hub: True
- hub_model_id: bobox/DeBERTa3-s-CustomPoolin-toytest-step1-checkpoints-tmp
- hub_strategy: all_checkpoints
- batch_sampler: no_duplicates
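The scheduler settings describe a linear warmup followed by a half-cosine decay that stops at a floor instead of zero. The sketch below reproduces that shape in plain Python as an illustration, not as the trainer's code. It uses the card's `warmup_ratio`, `num_cycles`, `min_lr`, and the `learning_rate` of 5e-05. The total of 3048 steps is an assumption (3 epochs at roughly 1016 steps each, inferred from the training log, where step 127 corresponds to epoch 0.125), and `lr_at_step` is a hypothetical helper name.

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-05, warmup_ratio=0.33,
               num_cycles=0.5, min_lr=3.3333333333333337e-06):
    """Learning rate for linear warmup followed by cosine decay to a minimum."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, from 0 to 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    # Rescale so the curve bottoms out at min_lr rather than 0.
    min_rate = min_lr / base_lr
    return base_lr * (cosine * (1.0 - min_rate) + min_rate)

TOTAL_STEPS = 3048  # assumption: 3 epochs x about 1016 steps per epoch
for step in (0, 150, 1006, 2000, 3048):
    print(step, lr_at_step(step, TOTAL_STEPS))
```

Under these assumptions the warmup covers about the first 1006 steps, so the 150 steps shown in the training log all fall within the early part of the warmup phase.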
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 256
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 3.3333333333333337e-06}
- warmup_ratio: 0.33
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: False
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: bobox/DeBERTa3-s-CustomPoolin-toytest-step1-checkpoints-tmp
- hub_strategy: all_checkpoints
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | Validation Loss | sts-test_spearman_cosine | allNLI-dev_max_ap | Qnli-dev_max_ap |
---|---|---|---|---|---|---|
0.0010 | 1 | 4.9603 | - | - | - | - |
0.0020 | 2 | 28.2529 | - | - | - | - |
0.0030 | 3 | 27.6365 | - | - | - | - |
0.0039 | 4 | 6.1387 | - | - | - | - |
0.0049 | 5 | 5.5753 | - | - | - | - |
0.0059 | 6 | 5.6951 | - | - | - | - |
0.0069 | 7 | 6.3533 | - | - | - | - |
0.0079 | 8 | 27.3848 | - | - | - | - |
0.0089 | 9 | 3.8501 | - | - | - | - |
0.0098 | 10 | 27.911 | - | - | - | - |
0.0108 | 11 | 4.9042 | - | - | - | - |
0.0118 | 12 | 6.8003 | - | - | - | - |
0.0128 | 13 | 5.7317 | - | - | - | - |
0.0138 | 14 | 20.261 | - | - | - | - |
0.0148 | 15 | 27.9051 | - | - | - | - |
0.0157 | 16 | 5.5959 | - | - | - | - |
0.0167 | 17 | 5.8052 | - | - | - | - |
0.0177 | 18 | 4.5088 | - | - | - | - |
0.0187 | 19 | 7.3472 | - | - | - | - |
0.0197 | 20 | 5.8668 | - | - | - | - |
0.0207 | 21 | 6.4083 | - | - | - | - |
0.0217 | 22 | 6.011 | - | - | - | - |
0.0226 | 23 | 5.2394 | - | - | - | - |
0.0236 | 24 | 4.2966 | - | - | - | - |
0.0246 | 25 | 26.605 | - | - | - | - |
0.0256 | 26 | 6.2067 | - | - | - | - |
0.0266 | 27 | 6.0346 | - | - | - | - |
0.0276 | 28 | 5.4676 | - | - | - | - |
0.0285 | 29 | 6.4292 | - | - | - | - |
0.0295 | 30 | 26.6452 | - | - | - | - |
0.0305 | 31 | 18.8401 | - | - | - | - |
0.0315 | 32 | 7.4531 | - | - | - | - |
0.0325 | 33 | 4.8286 | - | - | - | - |
0.0335 | 34 | 5.0078 | - | - | - | - |
0.0344 | 35 | 5.4115 | - | - | - | - |
0.0354 | 36 | 5.4196 | - | - | - | - |
0.0364 | 37 | 4.5023 | - | - | - | - |
0.0374 | 38 | 5.376 | - | - | - | - |
0.0384 | 39 | 5.2303 | - | - | - | - |
0.0394 | 40 | 5.6694 | - | - | - | - |
0.0404 | 41 | 4.7825 | - | - | - | - |
0.0413 | 42 | 4.6507 | - | - | - | - |
0.0423 | 43 | 24.2072 | - | - | - | - |
0.0433 | 44 | 4.9285 | - | - | - | - |
0.0443 | 45 | 6.326 | - | - | - | - |
0.0453 | 46 | 4.5724 | - | - | - | - |
0.0463 | 47 | 4.754 | - | - | - | - |
0.0472 | 48 | 5.5443 | - | - | - | - |
0.0482 | 49 | 4.5764 | - | - | - | - |
0.0492 | 50 | 5.1434 | - | - | - | - |
0.0502 | 51 | 22.6991 | - | - | - | - |
0.0512 | 52 | 5.4277 | - | - | - | - |
0.0522 | 53 | 5.0178 | - | - | - | - |
0.0531 | 54 | 4.8779 | - | - | - | - |
0.0541 | 55 | 4.2884 | - | - | - | - |
0.0551 | 56 | 16.0994 | - | - | - | - |
0.0561 | 57 | 21.31 | - | - | - | - |
0.0571 | 58 | 4.9721 | - | - | - | - |
0.0581 | 59 | 5.143 | - | - | - | - |
0.0591 | 60 | 3.5933 | - | - | - | - |
0.0600 | 61 | 5.2559 | - | - | - | - |
0.0610 | 62 | 4.0757 | - | - | - | - |
0.0620 | 63 | 3.6612 | - | - | - | - |
0.0630 | 64 | 4.7505 | - | - | - | - |
0.0640 | 65 | 4.1979 | - | - | - | - |
0.0650 | 66 | 3.9982 | - | - | - | - |
0.0659 | 67 | 4.7065 | - | - | - | - |
0.0669 | 68 | 5.3413 | - | - | - | - |
0.0679 | 69 | 3.6964 | - | - | - | - |
0.0689 | 70 | 17.8774 | - | - | - | - |
0.0699 | 71 | 4.8154 | - | - | - | - |
0.0709 | 72 | 4.8356 | - | - | - | - |
0.0719 | 73 | 4.568 | - | - | - | - |
0.0728 | 74 | 4.0898 | - | - | - | - |
0.0738 | 75 | 3.4502 | - | - | - | - |
0.0748 | 76 | 3.7733 | - | - | - | - |
0.0758 | 77 | 4.5204 | - | - | - | - |
0.0768 | 78 | 4.2526 | - | - | - | - |
0.0778 | 79 | 4.4398 | - | - | - | - |
0.0787 | 80 | 4.0988 | - | - | - | - |
0.0797 | 81 | 3.9704 | - | - | - | - |
0.0807 | 82 | 4.3343 | - | - | - | - |
0.0817 | 83 | 4.2587 | - | - | - | - |
0.0827 | 84 | 15.0149 | - | - | - | - |
0.0837 | 85 | 14.6599 | - | - | - | - |
0.0846 | 86 | 4.0623 | - | - | - | - |
0.0856 | 87 | 3.7597 | - | - | - | - |
0.0866 | 88 | 4.3433 | - | - | - | - |
0.0876 | 89 | 4.0287 | - | - | - | - |
0.0886 | 90 | 4.6257 | - | - | - | - |
0.0896 | 91 | 13.4689 | - | - | - | - |
0.0906 | 92 | 4.6583 | - | - | - | - |
0.0915 | 93 | 4.2682 | - | - | - | - |
0.0925 | 94 | 4.468 | - | - | - | - |
0.0935 | 95 | 3.4333 | - | - | - | - |
0.0945 | 96 | 12.7654 | - | - | - | - |
0.0955 | 97 | 3.5577 | - | - | - | - |
0.0965 | 98 | 12.5875 | - | - | - | - |
0.0974 | 99 | 4.2206 | - | - | - | - |
0.0984 | 100 | 3.5981 | - | - | - | - |
0.0994 | 101 | 3.5575 | - | - | - | - |
0.1004 | 102 | 4.0271 | - | - | - | - |
0.1014 | 103 | 4.0803 | - | - | - | - |
0.1024 | 104 | 4.0886 | - | - | - | - |
0.1033 | 105 | 4.176 | - | - | - | - |
0.1043 | 106 | 4.6653 | - | - | - | - |
0.1053 | 107 | 4.3076 | - | - | - | - |
0.1063 | 108 | 8.7282 | - | - | - | - |
0.1073 | 109 | 3.4192 | - | - | - | - |
0.1083 | 110 | 10.6027 | - | - | - | - |
0.1093 | 111 | 4.0959 | - | - | - | - |
0.1102 | 112 | 4.2785 | - | - | - | - |
0.1112 | 113 | 3.9945 | - | - | - | - |
0.1122 | 114 | 10.0652 | - | - | - | - |
0.1132 | 115 | 3.8621 | - | - | - | - |
0.1142 | 116 | 4.3975 | - | - | - | - |
0.1152 | 117 | 9.7899 | - | - | - | - |
0.1161 | 118 | 4.3812 | - | - | - | - |
0.1171 | 119 | 3.8715 | - | - | - | - |
0.1181 | 120 | 3.8327 | - | - | - | - |
0.1191 | 121 | 3.5103 | - | - | - | - |
0.1201 | 122 | 9.3158 | - | - | - | - |
0.1211 | 123 | 3.7201 | - | - | - | - |
0.1220 | 124 | 3.4311 | - | - | - | - |
0.1230 | 125 | 3.7946 | - | - | - | - |
0.1240 | 126 | 4.0456 | - | - | - | - |
0.1250 | 127 | 3.482 | - | - | - | - |
0.1260 | 128 | 3.1901 | - | - | - | - |
0.1270 | 129 | 3.414 | - | - | - | - |
0.1280 | 130 | 3.4967 | - | - | - | - |
0.1289 | 131 | 3.6594 | - | - | - | - |
0.1299 | 132 | 8.066 | - | - | - | - |
0.1309 | 133 | 3.7872 | - | - | - | - |
0.1319 | 134 | 4.0023 | - | - | - | - |
0.1329 | 135 | 3.7728 | - | - | - | - |
0.1339 | 136 | 3.1893 | - | - | - | - |
0.1348 | 137 | 3.3635 | - | - | - | - |
0.1358 | 138 | 4.0195 | - | - | - | - |
0.1368 | 139 | 4.1097 | - | - | - | - |
0.1378 | 140 | 3.7903 | - | - | - | - |
0.1388 | 141 | 3.5748 | - | - | - | - |
0.1398 | 142 | 3.8104 | - | - | - | - |
0.1407 | 143 | 8.0411 | - | - | - | - |
0.1417 | 144 | 3.4819 | - | - | - | - |
0.1427 | 145 | 3.452 | - | - | - | - |
0.1437 | 146 | 3.5861 | - | - | - | - |
0.1447 | 147 | 3.4324 | - | - | - | - |
0.1457 | 148 | 3.521 | - | - | - | - |
0.1467 | 149 | 3.8868 | - | - | - | - |
0.1476 | 150 | 8.1191 | - | - | - | - |
0.1486 | 151 | 3.6447 | - | - | - | - |
0.1496 | 152 | 2.9436 | - | - | - | - |
0.1506 | 153 | 8.1535 | 2.2032 | 0.2236 | 0.4009 | 0.5892 |
0.1516 | 154 | 3.9619 | - | - | - | - |
0.1526 | 155 | 3.1301 | - | - | - | - |
0.1535 | 156 | 3.0478 | - | - | - | - |
0.1545 | 157 | 3.2986 | - | - | - | - |
0.1555 | 158 | 3.2847 | - | - | - | - |
0.1565 | 159 | 3.6599 | - | - | - | - |
0.1575 | 160 | 3.2238 | - | - | - | - |
0.1585 | 161 | 2.8897 | - | - | - | - |
0.1594 | 162 | 3.9443 | - | - | - | - |
0.1604 | 163 | 3.3733 | - | - | - | - |
0.1614 | 164 | 3.7444 | - | - | - | - |
0.1624 | 165 | 3.4813 | - | - | - | - |
0.1634 | 166 | 2.6865 | - | - | - | - |
0.1644 | 167 | 2.7587 | - | - | - | - |
0.1654 | 168 | 3.3628 | - | - | - | - |
0.1663 | 169 | 3.0035 | - | - | - | - |
0.1673 | 170 | 10.1591 | - | - | - | - |
0.1683 | 171 | 3.5366 | - | - | - | - |
0.1693 | 172 | 8.4047 | - | - | - | - |
0.1703 | 173 | 3.8643 | - | - | - | - |
0.1713 | 174 | 3.3529 | - | - | - | - |
0.1722 | 175 | 3.7143 | - | - | - | - |
0.1732 | 176 | 3.3323 | - | - | - | - |
0.1742 | 177 | 3.1206 | - | - | - | - |
0.1752 | 178 | 3.1348 | - | - | - | - |
0.1762 | 179 | 7.6011 | - | - | - | - |
0.1772 | 180 | 3.7025 | - | - | - | - |
0.1781 | 181 | 10.5662 | - | - | - | - |
0.1791 | 182 | 8.966 | - | - | - | - |
0.1801 | 183 | 9.426 | - | - | - | - |
0.1811 | 184 | 3.0025 | - | - | - | - |
0.1821 | 185 | 7.0984 | - | - | - | - |
0.1831 | 186 | 7.3808 | - | - | - | - |
0.1841 | 187 | 2.8657 | - | - | - | - |
0.1850 | 188 | 6.5636 | - | - | - | - |
0.1860 | 189 | 3.4702 | - | - | - | - |
0.1870 | 190 | 5.9302 | - | - | - | - |
0.1880 | 191 | 3.2406 | - | - | - | - |
0.1890 | 192 | 3.4459 | - | - | - | - |
0.1900 | 193 | 5.269 | - | - | - | - |
0.1909 | 194 | 4.8605 | - | - | - | - |
0.1919 | 195 | 2.9891 | - | - | - | - |
0.1929 | 196 | 3.6681 | - | - | - | - |
0.1939 | 197 | 3.1589 | - | - | - | - |
0.1949 | 198 | 3.1835 | - | - | - | - |
0.1959 | 199 | 3.7561 | - | - | - | - |
0.1969 | 200 | 4.0891 | - | - | - | - |
0.1978 | 201 | 3.563 | - | - | - | - |
0.1988 | 202 | 3.7433 | - | - | - | - |
0.1998 | 203 | 3.3813 | - | - | - | - |
0.2008 | 204 | 5.2311 | - | - | - | - |
0.2018 | 205 | 3.3494 | - | - | - | - |
0.2028 | 206 | 3.3533 | - | - | - | - |
0.2037 | 207 | 3.688 | - | - | - | - |
0.2047 | 208 | 3.5342 | - | - | - | - |
0.2057 | 209 | 4.9381 | - | - | - | - |
0.2067 | 210 | 3.1839 | - | - | - | - |
0.2077 | 211 | 3.0465 | - | - | - | - |
0.2087 | 212 | 3.1232 | - | - | - | - |
0.2096 | 213 | 4.6297 | - | - | - | - |
0.2106 | 214 | 2.9834 | - | - | - | - |
0.2116 | 215 | 4.2231 | - | - | - | - |
0.2126 | 216 | 3.1458 | - | - | - | - |
0.2136 | 217 | 3.2525 | - | - | - | - |
0.2146 | 218 | 3.5971 | - | - | - | - |
0.2156 | 219 | 3.5616 | - | - | - | - |
0.2165 | 220 | 3.2378 | - | - | - | - |
0.2175 | 221 | 2.9075 | - | - | - | - |
0.2185 | 222 | 3.0391 | - | - | - | - |
0.2195 | 223 | 3.5573 | - | - | - | - |
0.2205 | 224 | 3.2092 | - | - | - | - |
0.2215 | 225 | 3.2646 | - | - | - | - |
0.2224 | 226 | 3.0886 | - | - | - | - |
0.2234 | 227 | 3.5241 | - | - | - | - |
0.2244 | 228 | 3.0111 | - | - | - | - |
0.2254 | 229 | 3.707 | - | - | - | - |
0.2264 | 230 | 5.3822 | - | - | - | - |
0.2274 | 231 | 3.2646 | - | - | - | - |
0.2283 | 232 | 2.7021 | - | - | - | - |
0.2293 | 233 | 3.5131 | - | - | - | - |
0.2303 | 234 | 3.103 | - | - | - | - |
0.2313 | 235 | 2.9535 | - | - | - | - |
0.2323 | 236 | 2.9631 | - | - | - | - |
0.2333 | 237 | 2.8068 | - | - | - | - |
0.2343 | 238 | 3.4251 | - | - | - | - |
0.2352 | 239 | 2.8495 | - | - | - | - |
0.2362 | 240 | 2.9972 | - | - | - | - |
0.2372 | 241 | 3.3509 | - | - | - | - |
0.2382 | 242 | 2.9234 | - | - | - | - |
0.2392 | 243 | 2.4086 | - | - | - | - |
0.2402 | 244 | 3.1282 | - | - | - | - |
0.2411 | 245 | 2.3352 | - | - | - | - |
0.2421 | 246 | 2.4706 | - | - | - | - |
0.2431 | 247 | 3.5449 | - | - | - | - |
0.2441 | 248 | 2.8963 | - | - | - | - |
0.2451 | 249 | 2.773 | - | - | - | - |
0.2461 | 250 | 2.355 | - | - | - | - |
0.2470 | 251 | 2.656 | - | - | - | - |
0.2480 | 252 | 2.6221 | - | - | - | - |
0.2490 | 253 | 8.6739 | - | - | - | - |
0.2500 | 254 | 10.8242 | - | - | - | - |
0.2510 | 255 | 2.3408 | - | - | - | - |
0.2520 | 256 | 2.1221 | - | - | - | - |
0.2530 | 257 | 3.295 | - | - | - | - |
0.2539 | 258 | 2.5896 | - | - | - | - |
0.2549 | 259 | 2.1215 | - | - | - | - |
0.2559 | 260 | 9.4851 | - | - | - | - |
0.2569 | 261 | 2.1982 | - | - | - | - |
0.2579 | 262 | 3.0568 | - | - | - | - |
0.2589 | 263 | 2.6269 | - | - | - | - |
0.2598 | 264 | 2.4792 | - | - | - | - |
0.2608 | 265 | 1.9445 | - | - | - | - |
0.2618 | 266 | 2.4061 | - | - | - | - |
0.2628 | 267 | 8.3116 | - | - | - | - |
0.2638 | 268 | 8.0804 | - | - | - | - |
0.2648 | 269 | 2.1674 | - | - | - | - |
0.2657 | 270 | 7.1975 | - | - | - | - |
0.2667 | 271 | 5.9104 | - | - | - | - |
0.2677 | 272 | 2.498 | - | - | - | - |
0.2687 | 273 | 2.5249 | - | - | - | - |
0.2697 | 274 | 2.7152 | - | - | - | - |
0.2707 | 275 | 2.7904 | - | - | - | - |
0.2717 | 276 | 2.7745 | - | - | - | - |
0.2726 | 277 | 2.9741 | - | - | - | - |
0.2736 | 278 | 1.8215 | - | - | - | - |
0.2746 | 279 | 4.6844 | - | - | - | - |
0.2756 | 280 | 2.8613 | - | - | - | - |
0.2766 | 281 | 2.7147 | - | - | - | - |
0.2776 | 282 | 2.814 | - | - | - | - |
0.2785 | 283 | 2.3569 | - | - | - | - |
0.2795 | 284 | 2.672 | - | - | - | - |
0.2805 | 285 | 3.2052 | - | - | - | - |
0.2815 | 286 | 2.8056 | - | - | - | - |
0.2825 | 287 | 2.6268 | - | - | - | - |
0.2835 | 288 | 2.5641 | - | - | - | - |
0.2844 | 289 | 2.4475 | - | - | - | - |
0.2854 | 290 | 2.7377 | - | - | - | - |
0.2864 | 291 | 2.3831 | - | - | - | - |
0.2874 | 292 | 8.8069 | - | - | - | - |
0.2884 | 293 | 2.186 | - | - | - | - |
0.2894 | 294 | 2.3389 | - | - | - | - |
0.2904 | 295 | 1.9744 | - | - | - | - |
0.2913 | 296 | 2.4491 | - | - | - | - |
0.2923 | 297 | 2.5668 | - | - | - | - |
0.2933 | 298 | 2.1939 | - | - | - | - |
0.2943 | 299 | 2.2832 | - | - | - | - |
0.2953 | 300 | 2.7508 | - | - | - | - |
0.2963 | 301 | 2.5206 | - | - | - | - |
0.2972 | 302 | 2.3522 | - | - | - | - |
0.2982 | 303 | 2.7186 | - | - | - | - |
0.2992 | 304 | 2.1369 | - | - | - | - |
0.3002 | 305 | 9.7972 | - | - | - | - |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.1
- Transformers: 4.44.2
- PyTorch: 2.5.0+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.2
- Tokenizers: 0.19.1
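To reproduce results, it can help to check a local environment against the versions listed above. The following is a minimal sketch, not part of the original training setup: the helper name `compare_versions` and the PyPI distribution names in `EXPECTED` (for example `torch` for PyTorch) are assumptions, and the Python interpreter version is not checked.

```python
from importlib import metadata

# Versions copied from the "Framework Versions" list in this card.
# The keys are assumed PyPI distribution names, not taken from the card.
EXPECTED = {
    "sentence-transformers": "3.2.1",
    "transformers": "4.44.2",
    "torch": "2.5.0+cu121",
    "accelerate": "0.34.2",
    "datasets": "3.0.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in compare_versions().items():
        print(f"{name}: card lists {want}, found {have or 'not installed'}")
```

An exact string match is deliberately strict: a CPU-only PyTorch build, for instance, would be reported as a mismatch against `2.5.0+cu121` even when the release number is the same.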
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
GISTEmbedLoss
@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}