---
base_model: microsoft/deberta-v3-small
datasets:
- tals/vitaminc
language:
- en
library_name: sentence-transformers
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- dot_accuracy
- dot_accuracy_threshold
- dot_f1
- dot_f1_threshold
- dot_precision
- dot_recall
- dot_ap
- manhattan_accuracy
- manhattan_accuracy_threshold
- manhattan_f1
- manhattan_f1_threshold
- manhattan_precision
- manhattan_recall
- manhattan_ap
- euclidean_accuracy
- euclidean_accuracy_threshold
- euclidean_f1
- euclidean_f1_threshold
- euclidean_precision
- euclidean_recall
- euclidean_ap
- max_accuracy
- max_accuracy_threshold
- max_f1
- max_f1_threshold
- max_precision
- max_recall
- max_ap
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:225247
- loss:CachedGISTEmbedLoss
widget:
- source_sentence: how long to grill boneless skinless chicken breasts in oven
sentences:
- "[ syll. a-ka-hi, ak-ahi ] The baby boy name Akahi is also used as a girl name.\
    \ Its pronunciation is AA K AA HHiy. Akahi's origin, as well as its use,\
\ is in the Hawaiian language. The name's meaning is never before. Akahi is infrequently\
\ used as a baby name for boys."
- October consists of 31 days. November has 30 days. When you add both together
they have 61 days.
- Heat a grill or grill pan. When the grill is hot, place the chicken on the grill
and cook for about 4 minutes per side, or until cooked through. You can also bake
the thawed chicken in a 375 degree F oven for 15 minutes, or until cooked through.
- source_sentence: More than 273 people have died from the 2019-20 coronavirus outside
mainland China .
sentences:
- 'More than 3,700 people have died : around 3,100 in mainland China and around
550 in all other countries combined .'
- 'More than 3,200 people have died : almost 3,000 in mainland China and around
275 in other countries .'
- more than 4,900 deaths have been attributed to COVID-19 .
- source_sentence: Most red algae species live in oceans.
sentences:
- Where do most red algae species live?
- Which layer of the earth is molten?
- As a diver descends, the increase in pressure causes the body’s air pockets in
the ears and lungs to do what?
- source_sentence: Binary compounds of carbon with less electronegative elements are
called carbides.
sentences:
- What are four children born at one birth called?
- Binary compounds of carbon with less electronegative elements are called what?
- The water cycle involves movement of water between air and what?
- source_sentence: What is the basic monetary unit of Iceland?
sentences:
- 'Ao dai - Vietnamese traditional dress - YouTube Ao dai - Vietnamese traditional
dress Want to watch this again later? Sign in to add this video to a playlist.
Need to report the video? Sign in to report inappropriate content. Rating is available
when the video has been rented. This feature is not available right now. Please
try again later. Uploaded on Jul 8, 2009 Simple, yet charming, graceful and elegant,
áo dài was designed to praise the slender beauty of Vietnamese women. The dress
is a genius combination of ancient and modern. It shows every curve on the girl''s
body, creating sexiness for the wearer, yet it still preserves the traditional
feminine grace of Vietnamese women with its charming flowing flaps. The simplicity
of áo dài makes it convenient and practical, something that other Asian traditional
clothes lack. The waist-length slits of the flaps allow every movement of the
legs: walking, running, riding a bicycle, climbing a tree, doing high kicks. The
looseness of the pants allows comfortability. As a girl walks in áo dài, the movements
of the flaps make it seem like she''s not walking but floating in the air. This
breath-taking beautiful image of a Vietnamese girl walking in áo dài has been
an inspiration for generations of Vietnamese poets, novelists, artists and has
left a deep impression for every foreigner who has visited the country. Category'
- 'Icelandic monetary unit - definition of Icelandic monetary unit by The Free Dictionary
Icelandic monetary unit - definition of Icelandic monetary unit by The Free Dictionary
http://www.thefreedictionary.com/Icelandic+monetary+unit Related to Icelandic
monetary unit: Icelandic Old Krona ThesaurusAntonymsRelated WordsSynonymsLegend:
monetary unit - a unit of money Icelandic krona , krona - the basic unit of money
in Iceland eyrir - 100 aurar equal 1 krona in Iceland Want to thank TFD for its
existence? Tell a friend about us , add a link to this page, or visit the webmaster''s
page for free fun content . Link to this page: Copyright © 2003-2017 Farlex, Inc
Disclaimer All content on this website, including dictionary, thesaurus, literature,
geography, and other reference data is for informational purposes only. This information
should not be considered complete, up to date, and is not intended to be used
in place of a visit, consultation, or advice of a legal, medical, or any other
professional.'
- 'Food-Info.net : E-numbers : E140: Chlorophyll CI 75810, Natural Green 3, Chlorophyll
A, Magnesium chlorophyll Origin: Natural green colour, present in all plants and
algae. Commercially extracted from nettles, grass and alfalfa. Function & characteristics:'
model-index:
- name: SentenceTransformer based on microsoft/deberta-v3-small
results:
- task:
type: semantic-similarity
name: Semantic Similarity
dataset:
name: sts test
type: sts-test
metrics:
- type: pearson_cosine
value: 0.3977846210139704
name: Pearson Cosine
- type: spearman_cosine
value: 0.44299644096637864
name: Spearman Cosine
- type: pearson_manhattan
value: 0.43174431600737306
name: Pearson Manhattan
- type: spearman_manhattan
value: 0.4553695033739603
name: Spearman Manhattan
- type: pearson_euclidean
value: 0.42060129087924125
name: Pearson Euclidean
- type: spearman_euclidean
value: 0.44300328790921845
name: Spearman Euclidean
- type: pearson_dot
value: 0.3974381713503513
name: Pearson Dot
- type: spearman_dot
value: 0.4426330607320026
name: Spearman Dot
- type: pearson_max
value: 0.43174431600737306
name: Pearson Max
- type: spearman_max
value: 0.4553695033739603
name: Spearman Max
- task:
type: binary-classification
name: Binary Classification
dataset:
name: allNLI dev
type: allNLI-dev
metrics:
- type: cosine_accuracy
value: 0.66796875
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.9727417230606079
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.5338983050847458
name: Cosine F1
- type: cosine_f1_threshold
value: 0.8509687781333923
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.4214046822742475
name: Cosine Precision
- type: cosine_recall
value: 0.7283236994219653
name: Cosine Recall
- type: cosine_ap
value: 0.4443750308487611
name: Cosine Ap
- type: dot_accuracy
value: 0.66796875
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 747.4664916992188
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.5347368421052632
name: Dot F1
- type: dot_f1_threshold
value: 652.6121826171875
name: Dot F1 Threshold
- type: dot_precision
value: 0.4205298013245033
name: Dot Precision
- type: dot_recall
value: 0.7341040462427746
name: Dot Recall
- type: dot_ap
value: 0.4447331164315086
name: Dot Ap
- type: manhattan_accuracy
value: 0.673828125
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 185.35494995117188
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.5340909090909091
name: Manhattan F1
- type: manhattan_f1_threshold
value: 316.48419189453125
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.3971830985915493
name: Manhattan Precision
- type: manhattan_recall
value: 0.815028901734104
name: Manhattan Recall
- type: manhattan_ap
value: 0.45330636568192945
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.66796875
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 6.472302436828613
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.5338983050847458
name: Euclidean F1
- type: euclidean_f1_threshold
value: 15.134000778198242
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.4214046822742475
name: Euclidean Precision
- type: euclidean_recall
value: 0.7283236994219653
name: Euclidean Recall
- type: euclidean_ap
value: 0.44436910603457025
name: Euclidean Ap
- type: max_accuracy
value: 0.673828125
name: Max Accuracy
- type: max_accuracy_threshold
value: 747.4664916992188
name: Max Accuracy Threshold
- type: max_f1
value: 0.5347368421052632
name: Max F1
- type: max_f1_threshold
value: 652.6121826171875
name: Max F1 Threshold
- type: max_precision
value: 0.4214046822742475
name: Max Precision
- type: max_recall
value: 0.815028901734104
name: Max Recall
- type: max_ap
value: 0.45330636568192945
name: Max Ap
- task:
type: binary-classification
name: Binary Classification
dataset:
name: Qnli dev
type: Qnli-dev
metrics:
- type: cosine_accuracy
value: 0.66015625
name: Cosine Accuracy
- type: cosine_accuracy_threshold
value: 0.8744948506355286
name: Cosine Accuracy Threshold
- type: cosine_f1
value: 0.6646433990895295
name: Cosine F1
- type: cosine_f1_threshold
value: 0.753309965133667
name: Cosine F1 Threshold
- type: cosine_precision
value: 0.5177304964539007
name: Cosine Precision
- type: cosine_recall
value: 0.9279661016949152
name: Cosine Recall
- type: cosine_ap
value: 0.6610633478265061
name: Cosine Ap
- type: dot_accuracy
value: 0.66015625
name: Dot Accuracy
- type: dot_accuracy_threshold
value: 670.719970703125
name: Dot Accuracy Threshold
- type: dot_f1
value: 0.6646433990895295
name: Dot F1
- type: dot_f1_threshold
value: 578.874755859375
name: Dot F1 Threshold
- type: dot_precision
value: 0.5177304964539007
name: Dot Precision
- type: dot_recall
value: 0.9279661016949152
name: Dot Recall
- type: dot_ap
value: 0.6607472505349153
name: Dot Ap
- type: manhattan_accuracy
value: 0.666015625
name: Manhattan Accuracy
- type: manhattan_accuracy_threshold
value: 281.9825134277344
name: Manhattan Accuracy Threshold
- type: manhattan_f1
value: 0.6678899082568808
name: Manhattan F1
- type: manhattan_f1_threshold
value: 328.83447265625
name: Manhattan F1 Threshold
- type: manhattan_precision
value: 0.5889967637540453
name: Manhattan Precision
- type: manhattan_recall
value: 0.7711864406779662
name: Manhattan Recall
- type: manhattan_ap
value: 0.6664006509577655
name: Manhattan Ap
- type: euclidean_accuracy
value: 0.66015625
name: Euclidean Accuracy
- type: euclidean_accuracy_threshold
value: 13.881525039672852
name: Euclidean Accuracy Threshold
- type: euclidean_f1
value: 0.6646433990895295
name: Euclidean F1
- type: euclidean_f1_threshold
value: 19.471359252929688
name: Euclidean F1 Threshold
- type: euclidean_precision
value: 0.5177304964539007
name: Euclidean Precision
- type: euclidean_recall
value: 0.9279661016949152
name: Euclidean Recall
- type: euclidean_ap
value: 0.6611053426809266
name: Euclidean Ap
- type: max_accuracy
value: 0.666015625
name: Max Accuracy
- type: max_accuracy_threshold
value: 670.719970703125
name: Max Accuracy Threshold
- type: max_f1
value: 0.6678899082568808
name: Max F1
- type: max_f1_threshold
value: 578.874755859375
name: Max F1 Threshold
- type: max_precision
value: 0.5889967637540453
name: Max Precision
- type: max_recall
value: 0.9279661016949152
name: Max Recall
- type: max_ap
value: 0.6664006509577655
name: Max Ap
---
# SentenceTransformer based on microsoft/deberta-v3-small
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [microsoft/deberta-v3-small](https://huggingface.co/microsoft/deberta-v3-small) <!-- at revision a36c739020e01763fe789b4b85e2df55d6180012 -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
- **Language:** en
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model
(1): AdvancedWeightedPooling(
(linear_cls): Linear(in_features=768, out_features=768, bias=True)
(mha): MultiheadAttention(
(out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
)
(layernorm): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
(layernorm2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
)
)
```
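The source of the custom `AdvancedWeightedPooling` module is not included in this card. Purely as an illustrative sketch of what a pooling layer with the submodules printed above (a CLS projection, multi-head attention, and two layer norms) might look like; the forward logic below is an assumption, not the actual implementation:
```python
import torch
from torch import nn

class AdvancedWeightedPoolingSketch(nn.Module):
    """Illustrative sketch only; the real AdvancedWeightedPooling may differ."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        self.linear_cls = nn.Linear(dim, dim)  # projects the [CLS] token
        self.mha = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.layernorm = nn.LayerNorm(dim)
        self.layernorm2 = nn.LayerNorm(dim)

    def forward(self, token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Use the projected [CLS] embedding as the attention query over all tokens.
        cls_query = self.linear_cls(token_embeddings[:, :1, :])  # (B, 1, D)
        pooled, _ = self.mha(
            cls_query, token_embeddings, token_embeddings,
            key_padding_mask=~attention_mask.bool(),  # True marks padding to ignore
        )
        # Residual connection to the raw [CLS] token, then two layer norms.
        pooled = self.layernorm(pooled.squeeze(1) + token_embeddings[:, 0, :])
        return self.layernorm2(pooled)  # (B, D) sentence embedding
```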
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp")
# Run inference
sentences = [
'What is the basic monetary unit of Iceland?',
"Icelandic monetary unit - definition of Icelandic monetary unit by The Free Dictionary Icelandic monetary unit - definition of Icelandic monetary unit by The Free Dictionary http://www.thefreedictionary.com/Icelandic+monetary+unit Related to Icelandic monetary unit: Icelandic Old Krona ThesaurusAntonymsRelated WordsSynonymsLegend: monetary unit - a unit of money Icelandic krona , krona - the basic unit of money in Iceland eyrir - 100 aurar equal 1 krona in Iceland Want to thank TFD for its existence? Tell a friend about us , add a link to this page, or visit the webmaster's page for free fun content . Link to this page: Copyright © 2003-2017 Farlex, Inc Disclaimer All content on this website, including dictionary, thesaurus, literature, geography, and other reference data is for informational purposes only. This information should not be considered complete, up to date, and is not intended to be used in place of a visit, consultation, or advice of a legal, medical, or any other professional.",
'Food-Info.net : E-numbers : E140: Chlorophyll CI 75810, Natural Green 3, Chlorophyll A, Magnesium chlorophyll Origin: Natural green colour, present in all plants and algae. Commercially extracted from nettles, grass and alfalfa. Function & characteristics:',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
```
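For retrieval-style use, queries and candidate passages can be encoded separately and ranked with the model's similarity function. A minimal sketch (the passage texts here are illustrative):
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp")

query = "What is the basic monetary unit of Iceland?"
passages = [
    "The krona is the basic unit of money in Iceland.",
    "Chlorophyll is a natural green colour present in all plants and algae.",
]

query_embedding = model.encode(query)
passage_embeddings = model.encode(passages)

# Rank passages by cosine similarity to the query; scores has shape [1, 2].
scores = model.similarity(query_embedding, passage_embeddings)
best = scores.argmax().item()
print(passages[best], float(scores[0, best]))
```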
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Semantic Similarity
* Dataset: `sts-test`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
| Metric | Value |
|:--------------------|:----------|
| pearson_cosine | 0.3978 |
| **spearman_cosine** | **0.443** |
| pearson_manhattan | 0.4317 |
| spearman_manhattan | 0.4554 |
| pearson_euclidean | 0.4206 |
| spearman_euclidean | 0.443 |
| pearson_dot | 0.3974 |
| spearman_dot | 0.4426 |
| pearson_max | 0.4317 |
| spearman_max | 0.4554 |
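A minimal sketch of re-running this evaluator. The exact STS split behind `sts-test` is not named in this card; `mteb/stsbenchmark-sts` (test split, gold scores rescaled to [0, 1]) is an assumption:
```python
from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SimilarityFunction
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp")
sts = load_dataset("mteb/stsbenchmark-sts", split="test")  # assumed split

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=sts["sentence1"],
    sentences2=sts["sentence2"],
    scores=[s / 5.0 for s in sts["score"]],  # normalize gold scores to [0, 1]
    main_similarity=SimilarityFunction.COSINE,
    name="sts-test",
)
print(evaluator(model))  # Pearson/Spearman metrics per similarity function
```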
#### Binary Classification
* Dataset: `allNLI-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
| Metric | Value |
|:-----------------------------|:-----------|
| cosine_accuracy | 0.668 |
| cosine_accuracy_threshold | 0.9727 |
| cosine_f1 | 0.5339 |
| cosine_f1_threshold | 0.851 |
| cosine_precision | 0.4214 |
| cosine_recall | 0.7283 |
| cosine_ap | 0.4444 |
| dot_accuracy | 0.668 |
| dot_accuracy_threshold | 747.4665 |
| dot_f1 | 0.5347 |
| dot_f1_threshold | 652.6122 |
| dot_precision | 0.4205 |
| dot_recall | 0.7341 |
| dot_ap | 0.4447 |
| manhattan_accuracy | 0.6738 |
| manhattan_accuracy_threshold | 185.3549 |
| manhattan_f1 | 0.5341 |
| manhattan_f1_threshold | 316.4842 |
| manhattan_precision | 0.3972 |
| manhattan_recall | 0.815 |
| manhattan_ap | 0.4533 |
| euclidean_accuracy | 0.668 |
| euclidean_accuracy_threshold | 6.4723 |
| euclidean_f1 | 0.5339 |
| euclidean_f1_threshold | 15.134 |
| euclidean_precision | 0.4214 |
| euclidean_recall | 0.7283 |
| euclidean_ap | 0.4444 |
| max_accuracy | 0.6738 |
| max_accuracy_threshold | 747.4665 |
| max_f1 | 0.5347 |
| max_f1_threshold | 652.6122 |
| max_precision | 0.4214 |
| max_recall | 0.815 |
| **max_ap** | **0.4533** |
#### Binary Classification
* Dataset: `Qnli-dev`
* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
| Metric | Value |
|:-----------------------------|:-----------|
| cosine_accuracy | 0.6602 |
| cosine_accuracy_threshold | 0.8745 |
| cosine_f1 | 0.6646 |
| cosine_f1_threshold | 0.7533 |
| cosine_precision | 0.5177 |
| cosine_recall | 0.928 |
| cosine_ap | 0.6611 |
| dot_accuracy | 0.6602 |
| dot_accuracy_threshold | 670.72 |
| dot_f1 | 0.6646 |
| dot_f1_threshold | 578.8748 |
| dot_precision | 0.5177 |
| dot_recall | 0.928 |
| dot_ap | 0.6607 |
| manhattan_accuracy | 0.666 |
| manhattan_accuracy_threshold | 281.9825 |
| manhattan_f1 | 0.6679 |
| manhattan_f1_threshold | 328.8345 |
| manhattan_precision | 0.589 |
| manhattan_recall | 0.7712 |
| manhattan_ap | 0.6664 |
| euclidean_accuracy | 0.6602 |
| euclidean_accuracy_threshold | 13.8815 |
| euclidean_f1 | 0.6646 |
| euclidean_f1_threshold | 19.4714 |
| euclidean_precision | 0.5177 |
| euclidean_recall | 0.928 |
| euclidean_ap | 0.6611 |
| max_accuracy | 0.666 |
| max_accuracy_threshold | 670.72 |
| max_f1 | 0.6679 |
| max_f1_threshold | 578.8748 |
| max_precision | 0.589 |
| max_recall | 0.928 |
| **max_ap** | **0.6664** |
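Both binary-classification tables come from `BinaryClassificationEvaluator`, which sweeps decision thresholds per similarity function. A minimal sketch with hypothetical labeled pairs (the actual allNLI/QNLI dev pairs are not included in this card):
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp")

# Hypothetical dev pairs; label 1 marks a positive (entailed/duplicate) pair.
sentences1 = ["A man is playing guitar.", "The sky is green."]
sentences2 = ["Someone plays a musical instrument.", "The sky is blue."]
labels = [1, 0]

evaluator = BinaryClassificationEvaluator(
    sentences1=sentences1,
    sentences2=sentences2,
    labels=labels,
    name="Qnli-dev",
)
print(evaluator(model))  # accuracy, F1, precision, recall, and AP at best thresholds
```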
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Evaluation Dataset
#### vitaminc-pairs
* Dataset: [vitaminc-pairs](https://huggingface.co/datasets/tals/vitaminc) at [be6febb](https://huggingface.co/datasets/tals/vitaminc/tree/be6febb761b0b2807687e61e0b5282e459df2fa0)
* Size: 128 evaluation samples
* Columns: <code>claim</code> and <code>evidence</code>
* Approximate statistics based on the first 128 samples:
| | claim | evidence |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
| type | string | string |
| details | <ul><li>min: 9 tokens</li><li>mean: 21.42 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>min: 11 tokens</li><li>mean: 35.55 tokens</li><li>max: 79 tokens</li></ul> |
* Samples:
| claim | evidence |
|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| <code>Dragon Con had over 5000 guests .</code> | <code>Among the more than 6000 guests and musical performers at the 2009 convention were such notables as Patrick Stewart , William Shatner , Leonard Nimoy , Terry Gilliam , Bruce Boxleitner , James Marsters , and Mary McDonnell .</code> |
| <code>COVID-19 has reached more than 185 countries .</code> | <code>As of , more than cases of COVID-19 have been reported in more than 190 countries and 200 territories , resulting in more than deaths .</code> |
| <code>In March , Italy had 3.6x times more cases of coronavirus than China .</code> | <code>As of 12 March , among nations with at least one million citizens , Italy has the world 's highest per capita rate of positive coronavirus cases at 206.1 cases per million people ( 3.6x times the rate of China ) and is the country with the second-highest number of positive cases as well as of deaths in the world , after China .</code> |
* Loss: [<code>CachedGISTEmbedLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cachedgistembedloss) with these parameters:
```json
{'guide': SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
), 'temperature': 0.025}
```
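A minimal sketch of constructing this loss. Only the guide's architecture (a CLS-pooled, normalized BERT) is printed above; `BAAI/bge-base-en-v1.5` is one checkpoint matching that description and is an assumption here:
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CachedGISTEmbedLoss

model = SentenceTransformer("microsoft/deberta-v3-small")
guide = SentenceTransformer("BAAI/bge-base-en-v1.5")  # assumed guide checkpoint

# The guide model filters likely false negatives out of the in-batch negatives;
# gradient caching lets the effective batch size exceed GPU memory limits.
loss = CachedGISTEmbedLoss(model=model, guide=guide, temperature=0.025)
```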
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 42
- `per_device_eval_batch_size`: 128
- `gradient_accumulation_steps`: 2
- `learning_rate`: 3e-05
- `weight_decay`: 0.001
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1e-05}
- `warmup_ratio`: 0.25
- `save_safetensors`: False
- `fp16`: True
- `push_to_hub`: True
- `hub_model_id`: bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `batch_sampler`: no_duplicates
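These values map directly onto `SentenceTransformerTrainingArguments`; a minimal sketch reproducing them (the `output_dir` is a placeholder):
```python
from sentence_transformers.training_args import (
    BatchSamplers,
    SentenceTransformerTrainingArguments,
)

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=42,
    per_device_eval_batch_size=128,
    gradient_accumulation_steps=2,
    learning_rate=3e-5,
    weight_decay=0.001,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"num_cycles": 0.5, "min_lr": 1e-5},
    warmup_ratio=0.25,
    save_safetensors=False,
    fp16=True,
    push_to_hub=True,
    hub_model_id="bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp",
    hub_strategy="all_checkpoints",
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```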
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 42
- `per_device_eval_batch_size`: 128
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 2
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 3e-05
- `weight_decay`: 0.001
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: cosine_with_min_lr
- `lr_scheduler_kwargs`: {'num_cycles': 0.5, 'min_lr': 1e-05}
- `warmup_ratio`: 0.25
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: bobox/DeBERTa3-s-CustomPooling-test1-checkpoints-tmp
- `hub_strategy`: all_checkpoints
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
</details>
### Training Logs
<details><summary>Click to expand</summary>
| Epoch | Step | Training Loss | vitaminc-pairs loss | negation-triplets loss | scitail-pairs-pos loss | scitail-pairs-qa loss | xsum-pairs loss | sciq pairs loss | qasc pairs loss | openbookqa pairs loss | msmarco pairs loss | nq pairs loss | trivia pairs loss | gooaq pairs loss | paws-pos loss | global dataset loss | sts-test_spearman_cosine | allNLI-dev_max_ap | Qnli-dev_max_ap |
|:------:|:----:|:-------------:|:-------------------:|:----------------------:|:----------------------:|:---------------------:|:---------------:|:---------------:|:---------------:|:---------------------:|:------------------:|:-------------:|:-----------------:|:----------------:|:-------------:|:-------------------:|:------------------------:|:-----------------:|:---------------:|
| 0.0009 | 1 | 5.8564 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0018 | 2 | 7.1716 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0027 | 3 | 5.9095 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0035 | 4 | 5.0841 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0044 | 5 | 4.0184 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0053 | 6 | 6.2191 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0062 | 7 | 5.6124 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0071 | 8 | 3.9544 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0080 | 9 | 4.7149 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0088 | 10 | 4.9616 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0097 | 11 | 5.2794 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0106 | 12 | 8.8704 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0115 | 13 | 6.0707 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0124 | 14 | 5.4071 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0133 | 15 | 6.9104 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0142 | 16 | 6.0276 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0150 | 17 | 6.737 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0159 | 18 | 6.5354 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0168 | 19 | 5.206 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0177 | 20 | 5.2469 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0186 | 21 | 5.3771 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0195 | 22 | 4.979 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0204 | 23 | 4.7909 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0212 | 24 | 4.9086 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0221 | 25 | 4.8826 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0230 | 26 | 8.2266 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0239 | 27 | 8.3024 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0248 | 28 | 5.8745 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0257 | 29 | 4.7298 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0265 | 30 | 5.4614 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0274 | 31 | 5.8594 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0283 | 32 | 5.2401 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0292 | 33 | 5.1579 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0301 | 34 | 5.2181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0310 | 35 | 4.6328 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0319 | 36 | 2.121 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0327 | 37 | 5.9026 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0336 | 38 | 7.3796 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0345 | 39 | 5.5361 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0354 | 40 | 4.0243 | 2.9018 | 5.6903 | 2.1136 | 2.8052 | 6.5831 | 0.8882 | 4.1148 | 5.0966 | 10.3911 | 10.9032 | 7.1904 | 8.1935 | 1.3943 | 5.6716 | 0.1879 | 0.3385 | 0.5781 |
| 0.0363 | 41 | 4.9072 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0372 | 42 | 3.4439 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0381 | 43 | 4.9787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0389 | 44 | 5.8318 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0398 | 45 | 5.3226 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0407 | 46 | 5.1181 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0416 | 47 | 4.7834 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0425 | 48 | 6.6303 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0434 | 49 | 5.8171 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0442 | 50 | 5.1962 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0451 | 51 | 5.2096 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0460 | 52 | 5.0943 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0469 | 53 | 4.9038 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0478 | 54 | 4.6479 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0487 | 55 | 5.5098 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0496 | 56 | 4.6979 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0504 | 57 | 3.1969 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0513 | 58 | 4.4127 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0522 | 59 | 3.7746 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0531 | 60 | 4.5378 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0540 | 61 | 5.0209 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0549 | 62 | 6.5936 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0558 | 63 | 4.2315 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0566 | 64 | 6.4269 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0575 | 65 | 4.2644 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0584 | 66 | 5.1388 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0593 | 67 | 5.1852 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0602 | 68 | 4.8057 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0611 | 69 | 3.1725 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0619 | 70 | 3.3322 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0628 | 71 | 5.139 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0637 | 72 | 4.307 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0646 | 73 | 5.0133 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0655 | 74 | 4.0507 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0664 | 75 | 3.3895 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0673 | 76 | 5.6736 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0681 | 77 | 4.2572 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0690 | 78 | 3.0796 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0699 | 79 | 5.0199 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0708 | 80 | 4.1414 | 2.7794 | 4.8890 | 1.8997 | 2.6761 | 6.2096 | 0.7622 | 3.3129 | 4.5498 | 7.2056 | 7.6809 | 6.3792 | 6.6567 | 1.3848 | 5.0030 | 0.2480 | 0.3513 | 0.5898 |
| 0.0717 | 81 | 5.8604 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0726 | 82 | 4.3003 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0735 | 83 | 4.4568 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0743 | 84 | 4.2747 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0752 | 85 | 5.52 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0761 | 86 | 2.7767 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0770 | 87 | 4.397 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0779 | 88 | 5.4449 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0788 | 89 | 4.2706 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0796 | 90 | 6.4759 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0805 | 91 | 4.1951 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0814 | 92 | 4.6328 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0823 | 93 | 4.1278 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0832 | 94 | 4.1787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0841 | 95 | 5.2156 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0850 | 96 | 3.1403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0858 | 97 | 4.0273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0867 | 98 | 3.0624 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0876 | 99 | 4.6786 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0885 | 100 | 4.1505 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0894 | 101 | 2.9529 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0903 | 102 | 4.7048 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0912 | 103 | 4.7388 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0920 | 104 | 3.7879 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0929 | 105 | 4.0311 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0938 | 106 | 4.1314 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0947 | 107 | 4.9411 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0956 | 108 | 4.1118 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0965 | 109 | 3.6971 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0973 | 110 | 5.605 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0982 | 111 | 3.4563 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0991 | 112 | 3.7422 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1 | 113 | 3.8055 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1009 | 114 | 5.2369 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1018 | 115 | 5.6518 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1027 | 116 | 3.2906 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1035 | 117 | 3.4996 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1044 | 118 | 3.6283 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1053 | 119 | 4.1487 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1062 | 120 | 4.3996 | 2.7279 | 4.3946 | 1.4130 | 2.1150 | 6.0486 | 0.7172 | 2.9669 | 4.4180 | 6.3022 | 6.8412 | 6.2013 | 6.0982 | 0.9474 | 4.3852 | 0.3149 | 0.3693 | 0.5975 |
| 0.1071 | 121 | 3.5291 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1080 | 122 | 3.8232 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1088 | 123 | 4.6035 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1097 | 124 | 3.7607 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1106 | 125 | 3.8461 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1115 | 126 | 3.3413 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1124 | 127 | 4.2777 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1133 | 128 | 4.3597 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1142 | 129 | 3.9046 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1150 | 130 | 4.0527 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1159 | 131 | 5.0883 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1168 | 132 | 3.8308 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1177 | 133 | 3.572 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1186 | 134 | 3.4299 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1195 | 135 | 4.1541 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1204 | 136 | 3.584 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1212 | 137 | 5.0977 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1221 | 138 | 4.6769 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1230 | 139 | 3.8396 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1239 | 140 | 3.2875 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1248 | 141 | 4.1946 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1257 | 142 | 4.9602 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1265 | 143 | 4.1531 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1274 | 144 | 3.8351 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1283 | 145 | 3.112 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1292 | 146 | 2.3145 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1301 | 147 | 4.0989 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1310 | 148 | 3.2173 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1319 | 149 | 2.7913 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1327 | 150 | 3.7627 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1336 | 151 | 3.3669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1345 | 152 | 2.6775 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1354 | 153 | 3.2804 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1363 | 154 | 3.0676 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1372 | 155 | 3.1559 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1381 | 156 | 2.6638 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1389 | 157 | 2.8045 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1398 | 158 | 4.0568 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1407 | 159 | 2.7554 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1416 | 160 | 3.7407 | 2.7439 | 4.6364 | 1.0089 | 1.1229 | 5.4870 | 0.6284 | 2.5933 | 4.3943 | 5.6565 | 5.9870 | 5.6944 | 5.3857 | 0.3622 | 3.4011 | 0.3141 | 0.3898 | 0.6417 |
| 0.1425 | 161 | 3.4324 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1434 | 162 | 3.6658 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1442 | 163 | 3.96 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1451 | 164 | 2.3167 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1460 | 165 | 3.6345 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1469 | 166 | 2.462 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1478 | 167 | 1.4742 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1487 | 168 | 4.7312 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1496 | 169 | 2.6785 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1504 | 170 | 3.449 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1513 | 171 | 2.437 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1522 | 172 | 4.2431 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1531 | 173 | 4.4848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1540 | 174 | 2.5575 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1549 | 175 | 2.3798 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1558 | 176 | 4.4939 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1566 | 177 | 4.1285 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1575 | 178 | 3.0096 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1584 | 179 | 4.4431 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1593 | 180 | 3.1172 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1602 | 181 | 2.3576 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1611 | 182 | 3.7849 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1619 | 183 | 3.679 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1628 | 184 | 3.1949 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1637 | 185 | 3.2422 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1646 | 186 | 2.9905 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1655 | 187 | 2.2697 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1664 | 188 | 1.7685 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1673 | 189 | 2.0971 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1681 | 190 | 3.4689 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1690 | 191 | 1.6614 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1699 | 192 | 1.9574 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1708 | 193 | 1.9313 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1717 | 194 | 2.2316 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1726 | 195 | 1.9854 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1735 | 196 | 2.8428 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1743 | 197 | 2.6916 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1752 | 198 | 3.5193 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1761 | 199 | 3.1681 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1770 | 200 | 2.7377 | 2.7042 | 4.8735 | 0.6428 | 0.6248 | 4.3639 | 0.4776 | 1.8950 | 3.3982 | 4.1048 | 4.7591 | 4.4568 | 4.1613 | 0.1802 | 2.4959 | 0.3521 | 0.4227 | 0.6702 |
| 0.1779 | 201 | 1.6408 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1788 | 202 | 2.3864 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1796 | 203 | 2.0848 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1805 | 204 | 2.9074 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1814 | 205 | 2.542 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1823 | 206 | 1.7312 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1832 | 207 | 1.6768 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1841 | 208 | 2.531 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1850 | 209 | 2.9222 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1858 | 210 | 2.4152 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1867 | 211 | 1.4345 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1876 | 212 | 1.5864 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1885 | 213 | 1.272 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1894 | 214 | 1.7011 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1903 | 215 | 3.0076 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1912 | 216 | 2.468 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1920 | 217 | 2.0796 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1929 | 218 | 2.9735 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1938 | 219 | 2.5506 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1947 | 220 | 1.7307 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1956 | 221 | 1.4519 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1965 | 222 | 1.7292 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1973 | 223 | 1.4664 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1982 | 224 | 1.6201 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1991 | 225 | 2.3483 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2 | 226 | 2.1311 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2009 | 227 | 2.3272 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2018 | 228 | 2.6164 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2027 | 229 | 1.6261 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2035 | 230 | 2.5293 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2044 | 231 | 1.2885 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2053 | 232 | 2.0039 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2062 | 233 | 3.0003 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2071 | 234 | 2.0491 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2080 | 235 | 2.0178 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2088 | 236 | 1.8532 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2097 | 237 | 2.3614 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2106 | 238 | 1.1889 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2115 | 239 | 1.4833 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2124 | 240 | 2.8687 | 2.7215 | 4.1544 | 0.4166 | 0.3876 | 3.3157 | 0.3711 | 1.4818 | 2.6939 | 3.2454 | 3.9798 | 3.5949 | 3.2266 | 0.1275 | 1.8867 | 0.4430 | 0.4533 | 0.6664 |
</details>
### Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.2.0
- Transformers: 4.45.1
- PyTorch: 2.4.0
- Accelerate: 0.34.2
- Datasets: 3.0.1
- Tokenizers: 0.20.0
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->