---
base_model: microsoft/deberta-v3-small
datasets:
- tals/vitaminc
language:
- en
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:225247
- loss:CachedGISTEmbedLoss
widget:
- source_sentence: what is exfo toolbox
sentences:
- Eye dilation from eye drops used for examination of the eye usually lasts from
4 to 24 hours, depending upon the strength of the drop and upon the individual
patient.
- Garden Grove is a city in northern Orange County in the U.S. state of California,
34 miles (55 km) south of Los Angeles. The population was 170,883 at the 2010
United States Census. State Route 22, also known as the Garden Grove Freeway,
passes through the city in an east-west direction.
- EXFO ToolBox Office is a product that offers you a collection of viewers and analyzers.
It enables you to manage and analyze results acquired from fiber optic test modules
and instruments.
- source_sentence: More than 273 people have died from the 2019-20 coronavirus outside
mainland China .
sentences:
- 'More than 3,700 people have died : around 3,100 in mainland China and around
550 in all other countries combined .'
- 'More than 3,200 people have died : almost 3,000 in mainland China and around
275 in other countries .'
- more than 4,900 deaths have been attributed to COVID-19 .
- source_sentence: Ultrasound, a diagnostic technology, uses high-frequency vibrations
transmitted into any tissue in contact with the transducer.
sentences:
- What diagnostic technology uses high-frequency vibrations transmitted into any
tissue in contact with the transducer?
- The abnormal cells cannot carry oxygen properly and can get stuck where?
- What type of organism is a bacteria?
- source_sentence: When you add moles of gas to a baloon by blowing it up, the volume
increases.
sentences:
- What shape is the lens of the eye?
- What happens to the volume of a balloon when you add moles of gas to it by blowing
up?
- Most turtle bodies are covered by a special bony or cartilaginous shell developed
from their what?
- source_sentence: What was the name of eleven rulers of the 19th and 20th Egyptian
dynasties?
sentences:
- 'Airlines Yugoslavia 1968 - 1968 Renamed ^ Comments : Aviogenex was formed on
21May1968 as Genex Airlines. Restarted under current name on 30Apr1969 & liquidated
in Feb2015 ^ Genealogy : Genex Airlines >Aviogenex 1968 - 1986 Renamed ^ Comments
: Adria Airways was formed on 14Mar1961 & operations started on 30Jun1961 as Adria
Airways, renamed to Inex in 1968 and back to Adria again in 1986. National airline
of Slovenia ^ Genealogy : Adria Airways >Inex Adria Airways >Adria Airways JAT
(Jugoslovenski Aerotransport) 1947 - 2003 Renamed ^ Comments : Air Serbia was
founded as Aeroput on 17Jun1927, renamed to JAT on 01Apr1947. Started ops on 15Apr1947,
Renamed again on 08Aug2003 to JAT Airways & reformed as Air Serbia on 26Oct2013
^ Genealogy : Aeroput >JAT (Jugoslovenski Aerotransport) >JAT Airways >Air Serbia
Jugoslovenski Aerotransport'
- List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn Timeline
of Art History | The Metropolitan Museum of Art The Metropolitan Museum of Art
List of Rulers of Ancient Egypt and Nubia See works of art 30.8.234 52.127.4 Our
knowledge of the succession of Egyptian kings is based on kinglists kept by the
ancient Egyptians themselves. The most famous are the Palermo Stone, which covers
the period from the earliest dynasties to the middle of Dynasty 5; the Abydos
Kinglist, which Seti I had carved on his temple at Abydos; and the Turin Canon,
a papyrus that covers the period from the earliest dynasties to the reign of Ramesses
II. All are incomplete or fragmentary. We also rely on the History of Egypt written
by Manetho in the third century B.C. A priest in the temple at Heliopolis, Manetho
had access to many original sources and it was he who divided the kings into the
thirty dynasties we use today. It is to this structure of dynasties and listed
kings that we now attempt to link an absolute chronology of dates in terms of
our own calendrical system. The process is made difficult by the fragmentary condition
of the kinglists and by differences in the calendrical years used at various times.
Some astronomical observations from the ancient Egyptians have survived, allowing
us to calculate absolute dates within a margin of error. Synchronisms with the
other civilizations of the ancient world are also of limited use.
- 'What is the "Jack Sprat" nursery rhyme? | Reference.com What is the "Jack Sprat"
nursery rhyme? A: Quick Answer "Jack Sprat" is a traditional English nursery rhyme
whose main verse says, "Jack Sprat could eat no fat. His wife could eat no lean.
And so between them both, you see, they licked the platter clean." Though it was
likely sung by children long before, "Jack Sprat" was first published around 1765
in the compilation "Mother Goose''s Melody." Full Answer According to Rhymes.org,
a U.K. website devoted to nursery rhyme lyrics and origins, the "Jack Sprat" nursery
rhyme has its origins in British history. In one interpretation, Jack Sprat was
King Charles I, who ruled England in the early part of the 17th century, and his
wife was Queen Henrietta Maria. Parliament refused to finance the king''s war
with Spain, which made him lean. However, the queen fattened the coffers by levying
an illegal war tax. In an alternative version, the "Jack Sprat" nursery rhyme
is linked to King Richard and his brother John of the Robin Hood legend. Jack
Sprat was King John, the usurper who tried to take over the crown when King Richard
went off to fight in the Crusades in the 12th century. When King Richard was captured,
John had to raise a ransom to rescue him, leaving the country lean. The wife was
Joan, daughter of the Earl of Gloucester, the greedy wife of King John. However,
after King Richard died and John became king, he had his marriage with Joan annulled.'
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-small
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.7673854808079448
name: Pearson Cosine
- type: spearman_cosine
value: 0.7776198286738142
name: Spearman Cosine
- type: pearson_manhattan
value: 0.782368447545155
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.7720687033298573
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.7882638792170585
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.7775073687564514
name: Spearman Euclidean
- type: pearson_dot
value: 0.7669147371310585
name: Pearson Dot
- type: spearman_dot
value: 0.7762894632049069
name: Spearman Dot
- type: pearson_max
value: 0.7882638792170585
name: Pearson Max
- type: spearman_max
value: 0.7776198286738142
name: Spearman Max
- task:
type: binary-classification
name: Binary Classification
dataset:
name: allNLI dev
type: allNLI-dev
metrics:
- type: cosine_accuracy
value: 0.708984375
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.8714957237243652
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.5913043478260869
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7768557071685791
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.4738675958188153
name: Cosine Precision
- type: cosine_recall
value: 0.7861271676300579
name: Cosine Recall
- type: cosine_ap
value: 0.5644305887001508
name: Cosine Ap
- type: dot_accuracy
value: 0.7109375
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 674.426025390625
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.5913043478260869
name: Dot F1
- type: dot_f1_threshold
value: 603.435302734375
name: Dot F1 Threshold
- type: dot_precision
value: 0.4738675958188153
name: Dot Precision
- type: dot_recall
value: 0.7861271676300579
name: Dot Recall
- type: dot_ap
value: 0.5664868031504724
name: Dot Ap
- type: manhattan_accuracy
value: 0.7109375
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 294.4728088378906
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.5935483870967742
name: Manhattan F1
- type: manhattan_f1_threshold
value: 401.1482849121094
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.4726027397260274
name: Manhattan Precision
- type: manhattan_recall
value: 0.7976878612716763
name: Manhattan Recall
- type: manhattan_ap
value: 0.5642688421649988
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.7109375
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 14.565500259399414
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.5913043478260869
name: Euclidean F1
- type: euclidean_f1_threshold
value: 18.60409164428711
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.4738675958188153
name: Euclidean Precision
- type: euclidean_recall
value: 0.7861271676300579
name: Euclidean Recall
- type: euclidean_ap
value: 0.5645557227019772
name: Euclidean Ap
- type: max_accuracy
value: 0.7109375
name: Max Accuracy
- type: max_accuracy_threshold
value: 674.426025390625
name: Max Accuracy Threshold
- type: max_f1
value: 0.5935483870967742
name: Max F1
- type: max_f1_threshold
value: 603.435302734375
name: Max F1 Threshold
- type: max_precision
value: 0.4738675958188153
name: Max Precision
- type: max_recall
value: 0.7976878612716763
name: Max Recall
- type: max_ap
value: 0.5664868031504724
name: Max Ap
- task:
type: binary-classification
name: Binary Classification
dataset:
name: Qnli dev
type: Qnli-dev
metrics:
- type: cosine_accuracy
value: 0.6796875
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.7726649045944214
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.6925675675675677
name: Cosine F1
- type: cosine_f1_threshold
value: 0.7317887544631958
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.5758426966292135
name: Cosine Precision
- type: cosine_recall
value: 0.8686440677966102
name: Cosine Recall
- type: cosine_ap
value: 0.7302564198016936
name: Cosine Ap
- type: dot_accuracy
value: 0.67578125
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 598.0419921875
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.6912751677852348
name: Dot F1
- type: dot_f1_threshold
value: 565.4718017578125
name: Dot F1 Threshold
- type: dot_precision
value: 0.5722222222222222
name: Dot Precision
- type: dot_recall
value: 0.8728813559322034
name: Dot Recall
- type: dot_ap
value: 0.7300462025003271
name: Dot Ap
- type: manhattan_accuracy
value: 0.6796875
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 404.8309020996094
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.6933333333333332
name: Manhattan F1
- type: manhattan_f1_threshold
value: 444.99224853515625
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.5714285714285714
name: Manhattan Precision
- type: manhattan_recall
value: 0.8813559322033898
name: Manhattan Recall
- type: manhattan_ap
value: 0.7369214156436785
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.6796875
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 18.790739059448242
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.6934306569343065
name: Euclidean F1
- type: euclidean_f1_threshold
value: 19.35132598876953
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.6089743589743589
name: Euclidean Precision
- type: euclidean_recall
value: 0.8050847457627118
name: Euclidean Recall
- type: euclidean_ap
value: 0.7307381840067684
name: Euclidean Ap
- type: max_accuracy
value: 0.6796875
name: Max Accuracy
- type: max_accuracy_threshold
value: 598.0419921875
name: Max Accuracy Threshold
- type: max_f1
value: 0.6934306569343065
name: Max F1
- type: max_f1_threshold
value: 565.4718017578125
name: Max F1 Threshold
- type: max_precision
value: 0.6089743589743589
name: Max Precision
- type: max_recall
value: 0.8813559322033898
name: Max Recall
- type: max_ap
value: 0.7369214156436785
name: Max Ap
---
# SentenceTransformer based on microsoft/deberta-v3-small
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) <!-- at revision a36c739020e01763fe789b4b85e2df55d6180012 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
(1): AdvancedWeightedPooling(
(linear_cls): Linear(in_features=768, out_features=768, bias=True)
(linear_mean): Linear(in_features=768, out_features=768, bias=True)
(mha): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
)
(layernorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm_cls): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm_mean): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
)
```
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")
# Run inference
sentences = [
'What was the name of eleven rulers of the 19th and 20th Egyptian dynasties?',
'List of Rulers of Ancient Egypt and Nubia | Lists of Rulers | Heilbrunn Timeline of Art History | The Metropolitan Museum of Art The Metropolitan Museum of Art List of Rulers of Ancient Egypt and Nubia See works of art 30.8.234 52.127.4 Our knowledge of the succession of Egyptian kings is based on kinglists kept by the ancient Egyptians themselves. The most famous are the Palermo Stone, which covers the period from the earliest dynasties to the middle of Dynasty 5; the Abydos Kinglist, which Seti I had carved on his temple at Abydos; and the Turin Canon, a papyrus that covers the period from the earliest dynasties to the reign of Ramesses II. All are incomplete or fragmentary. We also rely on the History of Egypt written by Manetho in the third century B.C. A priest in the temple at Heliopolis, Manetho had access to many original sources and it was he who divided the kings into the thirty dynasties we use today. It is to this structure of dynasties and listed kings that we now attempt to link an absolute chronology of dates in terms of our own calendrical system. The process is made difficult by the fragmentary condition of the kinglists and by differences in the calendrical years used at various times. Some astronomical observations from the ancient Egyptians have survived, allowing us to calculate absolute dates within a margin of error. Synchronisms with the other civilizations of the ancient world are also of limited use.',
'What is the "Jack Sprat" nursery rhyme? | Reference.com What is the "Jack Sprat" nursery rhyme? A: Quick Answer "Jack Sprat" is a traditional English nursery rhyme whose main verse says, "Jack Sprat could eat no fat. His wife could eat no lean. And so between them both, you see, they licked the platter clean." Though it was likely sung by children long before, "Jack Sprat" was first published around 1765 in the compilation "Mother Goose\'s Melody." Full Answer According to Rhymes.org, a U.K. website devoted to nursery rhyme lyrics and origins, the "Jack Sprat" nursery rhyme has its origins in British history. In one interpretation, Jack Sprat was King Charles I, who ruled England in the early part of the 17th century, and his wife was Queen Henrietta Maria. Parliament refused to finance the king\'s war with Spain, which made him lean. However, the queen fattened the coffers by levying an illegal war tax. In an alternative version, the "Jack Sprat" nursery rhyme is linked to King Richard and his brother John of the Robin Hood legend. Jack Sprat was King John, the usurper who tried to take over the crown when King Richard went off to fight in the Crusades in the 12th century. When King Richard was captured, John had to raise a ransom to rescue him, leaving the country lean. The wife was Joan, daughter of the Earl of Gloucester, the greedy wife of King John. However, after King Richard died and John became king, he had his marriage with Joan annulled.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
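For retrieval-style use, the snippet below sketches semantic search with [`util.semantic_search`](https://sbert.net/docs/package_reference/util.html). The corpus here is illustrative, not drawn from the training data.
```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")

# Illustrative corpus; replace with your own documents.
corpus = [
    "EXFO ToolBox Office offers a collection of viewers and analyzers for fiber optic test results.",
    "Garden Grove is a city in northern Orange County, California.",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("what is exfo toolbox", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(f"{hit['score']:.4f}  {corpus[hit['corpus_id']]}")
```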
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | Value |
|:--------------------|:-----------|
| pearson_cosine | 0.7674 |
| **spearman_cosine** | **0.7776** |
| pearson_manhattan | 0.7824 |
| spearman_manhattan | 0.7721 |
| pearson_euclidean | 0.7883 |
| spearman_euclidean | 0.7775 |
| pearson_dot | 0.7669 |
| spearman_dot | 0.7763 |
| pearson_max | 0.7883 |
| spearman_max | 0.7776 |
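The figures above can be reproduced with a short script. This is a minimal sketch that assumes the `sts-test` split corresponds to the test split of `sentence-transformers/stsb`, which the card does not state explicitly.
```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")

# Assumption: "sts-test" matches the STS-benchmark test split.
stsb = load_dataset("sentence-transformers/stsb", split="test")
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    name="sts-test",
)
print(evaluator(model))  # Pearson/Spearman for cosine, dot, Euclidean, Manhattan
```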
#### Binary Classification
* Dataset: `allNLI-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
| Metric | Value |
|:-----------------------------|:-----------|
| cosine_accuracy | 0.709 |
| cosine_accuracy_threshold | 0.8715 |
| cosine_f1 | 0.5913 |
| cosine_f1_threshold | 0.7769 |
| cosine_precision | 0.4739 |
| cosine_recall | 0.7861 |
| cosine_ap | 0.5644 |
| dot_accuracy | 0.7109 |
| dot_accuracy_threshold | 674.426 |
| dot_f1 | 0.5913 |
| dot_f1_threshold | 603.4353 |
| dot_precision | 0.4739 |
| dot_recall | 0.7861 |
| dot_ap | 0.5665 |
| manhattan_accuracy | 0.7109 |
| manhattan_accuracy_threshold | 294.4728 |
| manhattan_f1 | 0.5935 |
| manhattan_f1_threshold | 401.1483 |
| manhattan_precision | 0.4726 |
| manhattan_recall | 0.7977 |
| manhattan_ap | 0.5643 |
| euclidean_accuracy | 0.7109 |
| euclidean_accuracy_threshold | 14.5655 |
| euclidean_f1 | 0.5913 |
| euclidean_f1_threshold | 18.6041 |
| euclidean_precision | 0.4739 |
| euclidean_recall | 0.7861 |
| euclidean_ap | 0.5646 |
| max_accuracy | 0.7109 |
| max_accuracy_threshold | 674.426 |
| max_f1 | 0.5935 |
| max_f1_threshold | 603.4353 |
| max_precision | 0.4739 |
| max_recall | 0.7977 |
| **max_ap** | **0.5665** |
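A sketch of running `BinaryClassificationEvaluator` yourself follows; the exact allNLI-dev pairs behind these numbers are not published with the card, so the pairs below are placeholders.
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-v3-step1")

# Placeholder pairs: the actual allNLI-dev split is not shipped with this card.
sentences1 = ["A man is playing a guitar.", "A child rides a horse."]
sentences2 = ["Someone is playing an instrument.", "A dog sleeps on a couch."]
labels = [1, 0]  # 1 = similar/entailed pair, 0 = dissimilar pair

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="allNLI-dev")
print(evaluator(model))  # accuracy, F1, precision, recall, AP + thresholds
```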
#### Binary Classification
* Dataset: `Qnli-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
| Metric | Value |
|:-----------------------------|:-----------|
| cosine_accuracy | 0.6797 |
| cosine_accuracy_threshold | 0.7727 |
| cosine_f1 | 0.6926 |
| cosine_f1_threshold | 0.7318 |
| cosine_precision | 0.5758 |
| cosine_recall | 0.8686 |
| cosine_ap | 0.7303 |
| dot_accuracy | 0.6758 |
| dot_accuracy_threshold | 598.042 |
| dot_f1 | 0.6913 |
| dot_f1_threshold | 565.4718 |
| dot_precision | 0.5722 |
| dot_recall | 0.8729 |
| dot_ap | 0.73 |
| manhattan_accuracy | 0.6797 |
| manhattan_accuracy_threshold | 404.8309 |
| manhattan_f1 | 0.6933 |
| manhattan_f1_threshold | 444.9922 |
| manhattan_precision | 0.5714 |
| manhattan_recall | 0.8814 |
| manhattan_ap | 0.7369 |
| euclidean_accuracy | 0.6797 |
| euclidean_accuracy_threshold | 18.7907 |
| euclidean_f1 | 0.6934 |
| euclidean_f1_threshold | 19.3513 |
| euclidean_precision | 0.609 |
| euclidean_recall | 0.8051 |
| euclidean_ap | 0.7307 |
| max_accuracy | 0.6797 |
| max_accuracy_threshold | 598.042 |
| max_f1 | 0.6934 |
| max_f1_threshold | 565.4718 |
| max_precision | 0.609 |
| max_recall | 0.8814 |
| **max_ap** | **0.7369** |
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Evaluation Dataset
#### vitaminc-pairs
* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
* Size: 128 evaluation samples
* Columns: <code>claim</code> and <code>evidence</code>
* Approximate statistics based on the first 128 samples:
| | claim | evidence |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 9 tokens</li><li>mean: 21.42 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 35.55 tokens</li><li>max: 79 tokens</li></ul> |
* Samples:
| claim | evidence |
|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Dragon Con had over 5000 guests .</code> | <code>Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .</code> |
| <code>COVID-19 has reached more than 185 countries .</code> | <code>As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .</code> |
| <code>In March , Italy had 3.6x times more cases of coronavirus than China .</code> | <code>As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .</code> |
* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
```json
{'guide': SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
), 'temperature': 0.025}
```
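As a minimal sketch of how this loss is wired up: the guide checkpoint is not named in the card (only its architecture is printed above), so the guide below is a placeholder with a matching layout, and the custom `AdvancedWeightedPooling` head is omitted.
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CachedGISTEmbedLoss

# Base model; the card's custom AdvancedWeightedPooling head is omitted here.
model = SentenceTransformer("microsoft/deberta-v3-small")

# Placeholder guide: any frozen 768-dim BERT encoder with CLS pooling and
# normalization matches the architecture printed above (assumption).
guide = SentenceTransformer("BAAI/bge-base-en-v1.5")

loss = CachedGISTEmbedLoss(model, guide=guide, temperature=0.025)
```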
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 100
- `per_device_eval_batch_size`: 256
- `gradient_accumulation_steps`: 2
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- `warmup_ratio`: 0.33
- `save_safetensors`: False
- `fp16`: True
- `push_to_hub`: True
- `hub_model_id`: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `batch_sampler`: no_duplicates
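These values map directly onto `SentenceTransformerTrainingArguments`; a sketch (the `output_dir` is a placeholder):
```python
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=100,
    per_device_eval_batch_size=256,
    gradient_accumulation_steps=2,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"num_cycles": 0.5, "min_lr": 1.6666666666666667e-05},
    warmup_ratio=0.33,
    save_safetensors=False,
    fp16=True,
    push_to_hub=True,  # requires a Hugging Face login
    hub_model_id="bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp",
    hub_strategy="all_checkpoints",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```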
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 100
- `per_device_eval_batch_size`: 256
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 2
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1.6666666666666667e-05}
- `warmup_ratio`: 0.33
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: bobox/DeBERTa3-s-CustomPoolin-v3-step1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
</details>
### Training Logs
<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | vitaminc-pairs loss | negation-triplets loss | scitail-pairs-pos loss | scitail-pairs-qa loss | xsum-pairs loss | sciq pairs loss | qasc pairs loss | openbookqa pairs loss | msmarco pairs loss | nq pairs loss | trivia pairs loss | gooaq pairs loss | paws-pos loss | global dataset loss | sts-test_spearman_cosine | allNLI-dev_max_ap | Qnli-dev_max_ap |
|:------:|:----:|:-------------:|:-------------------:|:----------------------:|:----------------------:|:---------------------:|:---------------:|:---------------:|:---------------:|:---------------------:|:------------------:|:-------------:|:-----------------:|:----------------:|:-------------:|:-------------------:|:------------------------:|:-----------------:|:---------------:|
| 0.0168 | 8 | 10.2928 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0336 | 16 | 9.2166 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0504 | 24 | 9.4858 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0672 | 32 | 10.6143 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0840 | 40 | 8.7553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1008 | 48 | 10.9939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1176 | 56 | 7.6039 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1345 | 64 | 5.9498 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1513 | 72 | 7.3051 | 3.2988 | 3.9604 | 1.9818 | 2.1997 | 6.0515 | 0.6095 | 6.3199 | 4.8391 | 6.4886 | 6.6406 | 6.4894 | 6.1527 | 2.0082 | 4.9577 | 0.3066 | 0.3444 | 0.5627 |
| 0.1681 | 80 | 8.3034 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1849 | 88 | 7.6669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2017 | 96 | 6.6415 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2185 | 104 | 5.7797 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2353 | 112 | 5.8361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2521 | 120 | 5.3339 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2689 | 128 | 5.5908 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2857 | 136 | 5.3209 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3025 | 144 | 5.5359 | 3.3310 | 3.8580 | 1.4769 | 1.6994 | 5.4819 | 0.5385 | 5.2021 | 4.4410 | 5.3419 | 5.5506 | 5.6972 | 5.3376 | 1.4170 | 3.9169 | 0.2954 | 0.3795 | 0.6317 |
| 0.3193 | 152 | 5.4713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3361 | 160 | 4.9368 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3529 | 168 | 4.6594 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3697 | 176 | 4.8392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3866 | 184 | 4.414 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4034 | 192 | 4.891 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4202 | 200 | 4.4553 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4370 | 208 | 3.9729 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4538 | 216 | 3.7705 | 3.2468 | 3.6435 | 0.7890 | 0.7356 | 3.9327 | 0.4082 | 3.7175 | 3.5404 | 3.5351 | 4.0506 | 3.9953 | 3.6074 | 0.4195 | 2.4726 | 0.3791 | 0.4133 | 0.6779 |
| 0.4706 | 224 | 3.8409 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4874 | 232 | 3.7894 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5042 | 240 | 3.3523 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5210 | 248 | 3.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5378 | 256 | 3.3203 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5546 | 264 | 2.8457 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5714 | 272 | 2.4181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5882 | 280 | 3.4589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6050 | 288 | 2.8203 | 3.1119 | 3.1485 | 0.4531 | 0.2652 | 2.6895 | 0.2656 | 2.5542 | 2.7523 | 2.6600 | 3.1773 | 3.2099 | 2.7316 | 0.2006 | 1.6342 | 0.5257 | 0.4717 | 0.7078 |
| 0.6218 | 296 | 2.4697 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6387 | 304 | 2.4654 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6555 | 312 | 2.4236 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6723 | 320 | 2.2879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6891 | 328 | 2.2145 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7059 | 336 | 1.8464 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7227 | 344 | 2.0086 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7395 | 352 | 2.0635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7563 | 360 | 1.8584 | 3.3202 | 2.5793 | 0.3434 | 0.1618 | 1.6759 | 0.1834 | 1.6454 | 2.1257 | 2.1938 | 2.5316 | 2.4558 | 2.0596 | 0.0984 | 1.2206 | 0.6610 | 0.5199 | 0.7119 |
| 0.7731 | 368 | 2.0286 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7899 | 376 | 1.9389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8067 | 384 | 1.7453 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8235 | 392 | 1.6629 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8403 | 400 | 1.2724 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8571 | 408 | 1.7824 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8739 | 416 | 1.5826 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8908 | 424 | 1.1971 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9076 | 432 | 1.5228 | 3.3624 | 2.1952 | 0.3006 | 0.1223 | 1.1091 | 0.1582 | 1.2383 | 1.8664 | 1.7434 | 2.3959 | 2.0697 | 1.7563 | 0.0766 | 1.0193 | 0.7292 | 0.5194 | 0.7126 |
| 0.9244 | 440 | 1.3323 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9412 | 448 | 1.5124 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9580 | 456 | 1.5565 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9748 | 464 | 1.3672 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9916 | 472 | 1.0382 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0084 | 480 | 1.0626 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0252 | 488 | 1.3539 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0420 | 496 | 1.1723 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0588 | 504 | 1.4235 | 3.4031 | 1.9759 | 0.2554 | 0.0814 | 0.9034 | 0.1378 | 1.1603 | 1.7589 | 1.5608 | 2.1230 | 1.7719 | 1.6633 | 0.0720 | 0.9380 | 0.7523 | 0.5297 | 0.7129 |
| 1.0756 | 512 | 1.2283 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0924 | 520 | 1.2455 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1092 | 528 | 1.4265 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1261 | 536 | 1.296 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1429 | 544 | 0.8763 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1597 | 552 | 1.5678 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1765 | 560 | 1.2548 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1933 | 568 | 1.3731 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2101 | 576 | 1.3023 | 3.3815 | 1.8740 | 0.2373 | 0.0769 | 0.7711 | 0.1237 | 0.9432 | 1.6871 | 1.5070 | 1.9947 | 1.6041 | 1.5579 | 0.0721 | 0.8661 | 0.7642 | 0.5412 | 0.7159 |
| 1.2269 | 584 | 0.8135 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2437 | 592 | 1.0259 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2605 | 600 | 1.1896 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2773 | 608 | 1.0532 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2941 | 616 | 1.3221 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3109 | 624 | 1.3136 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3277 | 632 | 1.2238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3445 | 640 | 1.2407 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3613 | 648 | 1.2245 | 3.4717 | 1.7962 | 0.2242 | 0.0488 | 0.7472 | 0.1108 | 0.9272 | 1.6692 | 1.3845 | 1.9117 | 1.3410 | 1.4387 | 0.0701 | 0.8505 | 0.7680 | 0.5471 | 0.7227 |
| 1.3782 | 656 | 1.0428 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3950 | 664 | 1.1391 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4118 | 672 | 1.2632 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4286 | 680 | 0.9403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4454 | 688 | 0.7571 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4622 | 696 | 0.9436 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4790 | 704 | 1.1239 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4958 | 712 | 0.9499 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.5126 | 720 | 1.0945 | 3.6495 | 1.6693 | 0.2157 | 0.0492 | 0.6830 | 0.1049 | 0.9140 | 1.5967 | 1.4397 | 1.7394 | 1.3303 | 1.4334 | 0.0603 | 0.8185 | 0.7815 | 0.5606 | 0.7098 |
| 1.5294 | 728 | 1.1161 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.5462 | 736 | 1.0056 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.5630 | 744 | 1.1743 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.5798 | 752 | 0.9153 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.5966 | 760 | 1.1589 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.6134 | 768 | 0.9187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.6303 | 776 | 0.6937 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.6471 | 784 | 0.9704 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.6639 | 792 | 0.7343 | 3.5442 | 1.6493 | 0.2208 | 0.0249 | 0.6152 | 0.0969 | 0.7111 | 1.5369 | 1.4058 | 1.7066 | 1.2784 | 1.3419 | 0.0585 | 0.7827 | 0.7749 | 0.5627 | 0.7284 |
| 1.6807 | 800 | 1.2878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.6975 | 808 | 0.9898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7143 | 816 | 0.7613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7311 | 824 | 0.9612 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7479 | 832 | 1.1524 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7647 | 840 | 0.827 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7815 | 848 | 1.1898 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.7983 | 856 | 1.0117 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.8151 | 864 | 0.7019 | 3.4544 | 1.6149 | 0.2035 | 0.0181 | 0.5525 | 0.0999 | 0.6641 | 1.5456 | 1.3911 | 1.7188 | 1.2547 | 1.3517 | 0.0562 | 0.7473 | 0.7684 | 0.5697 | 0.7329 |
| 1.8319 | 872 | 0.8352 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.8487 | 880 | 0.7836 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.8655 | 888 | 1.0187 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.8824 | 896 | 0.74 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.8992 | 904 | 0.7263 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.9160 | 912 | 0.8073 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.9328 | 920 | 0.8185 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.9496 | 928 | 1.0992 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.9664 | 936 | 0.9973 | 3.5110 | 1.5776 | 0.2035 | 0.0250 | 0.5881 | 0.0934 | 0.6719 | 1.5059 | 1.2970 | 1.6186 | 1.1815 | 1.2714 | 0.0564 | 0.7213 | 0.7799 | 0.5544 | 0.7341 |
| 1.9832 | 944 | 0.6662 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0 | 952 | 0.533 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0168 | 960 | 0.7712 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0336 | 968 | 0.6879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0504 | 976 | 0.7975 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0672 | 984 | 0.873 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.0840 | 992 | 0.7995 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.1008 | 1000 | 1.0119 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.1176 | 1008 | 0.6317 | 3.6778 | 1.5845 | 0.2102 | 0.0228 | 0.5851 | 0.0977 | 0.6411 | 1.4752 | 1.2992 | 1.6314 | 1.1260 | 1.2683 | 0.0556 | 0.7329 | 0.7693 | 0.5614 | 0.7274 |
| 2.1345 | 1016 | 0.72 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.1513 | 1024 | 0.9418 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.1681 | 1032 | 0.7848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.1849 | 1040 | 0.6965 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.2017 | 1048 | 1.0447 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.2185 | 1056 | 0.6361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.2353 | 1064 | 0.6837 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.2521 | 1072 | 0.5713 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.2689 | 1080 | 0.8193 | 3.6399 | 1.5565 | 0.2069 | 0.0213 | 0.5440 | 0.0904 | 0.6057 | 1.4815 | 1.2856 | 1.6441 | 1.1469 | 1.2540 | 0.0543 | 0.7216 | 0.7765 | 0.5599 | 0.7322 |
| 2.2857 | 1088 | 0.9754 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3025 | 1096 | 0.8932 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3193 | 1104 | 0.8716 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3361 | 1112 | 0.8787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3529 | 1120 | 0.9529 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3697 | 1128 | 0.775 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.3866 | 1136 | 0.6178 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.4034 | 1144 | 0.8384 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.4202 | 1152 | 0.9425 | 3.5672 | 1.5244 | 0.2111 | 0.0162 | 0.5593 | 0.0893 | 0.5759 | 1.4933 | 1.2703 | 1.5815 | 1.1202 | 1.2132 | 0.0531 | 0.7058 | 0.7730 | 0.5635 | 0.7350 |
| 2.4370 | 1160 | 0.4551 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.4538 | 1168 | 0.6392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.4706 | 1176 | 0.8341 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.4874 | 1184 | 0.7392 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.5042 | 1192 | 0.7646 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.5210 | 1200 | 0.8613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.5378 | 1208 | 0.7585 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.5546 | 1216 | 1.0611 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.5714 | 1224 | 0.6506 | 3.6439 | 1.5040 | 0.2125 | 0.0162 | 0.5282 | 0.0863 | 0.5858 | 1.5073 | 1.2444 | 1.5493 | 1.1014 | 1.2073 | 0.0532 | 0.7022 | 0.7774 | 0.5647 | 0.7328 |
| 2.5882 | 1232 | 0.8525 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6050 | 1240 | 0.6304 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6218 | 1248 | 0.6354 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6387 | 1256 | 0.6583 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6555 | 1264 | 0.5964 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6723 | 1272 | 0.818 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.6891 | 1280 | 0.8635 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.7059 | 1288 | 0.6389 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.7227 | 1296 | 0.6819 | 3.6131 | 1.5104 | 0.2084 | 0.0148 | 0.5229 | 0.0854 | 0.5588 | 1.4963 | 1.2766 | 1.5679 | 1.0982 | 1.2203 | 0.0529 | 0.7059 | 0.7762 | 0.5659 | 0.7355 |
| 2.7395 | 1304 | 0.7878 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.7563 | 1312 | 0.7638 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.7731 | 1320 | 0.8885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.7899 | 1328 | 0.8184 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.8067 | 1336 | 0.7472 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.8235 | 1344 | 0.7012 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.8403 | 1352 | 0.4622 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.8571 | 1360 | 0.846 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.8739 | 1368 | 0.8308 | 3.6224 | 1.5088 | 0.2084 | 0.0148 | 0.5118 | 0.0858 | 0.5523 | 1.4941 | 1.2756 | 1.5808 | 1.0925 | 1.2114 | 0.0521 | 0.7022 | 0.7765 | 0.5662 | 0.7366 |
| 2.8908 | 1376 | 0.5334 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9076 | 1384 | 0.7893 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9244 | 1392 | 0.6897 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9412 | 1400 | 0.7803 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9580 | 1408 | 0.841 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9748 | 1416 | 0.787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 2.9916 | 1424 | 0.5861 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 3.0 | 1428 | - | 3.6139 | 1.5071 | 0.2084 | 0.0150 | 0.5124 | 0.0862 | 0.5532 | 1.4924 | 1.2700 | 1.5806 | 1.0905 | 1.2081 | 0.0519 | 0.6997 | 0.7776 | 0.5665 | 0.7369 |
</details>
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.2.0
- Transformers: 4.44.2
- PyTorch: 2.4.1+cu121
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.19.1
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->