Training in progress, step 1220, checkpoint
base_model: microsoft/deberta-v3-small
library_name: sentence-transformers
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
  - cosine_accuracy
  - cosine_accuracy_threshold
  - cosine_f1
  - cosine_f1_threshold
  - cosine_precision
  - cosine_recall
  - cosine_ap
  - dot_accuracy
  - dot_accuracy_threshold
  - dot_f1
  - dot_f1_threshold
  - dot_precision
  - dot_recall
  - dot_ap
  - manhattan_accuracy
  - manhattan_accuracy_threshold
  - manhattan_f1
  - manhattan_f1_threshold
  - manhattan_precision
  - manhattan_recall
  - manhattan_ap
  - euclidean_accuracy
  - euclidean_accuracy_threshold
  - euclidean_f1
  - euclidean_f1_threshold
  - euclidean_precision
  - euclidean_recall
  - euclidean_ap
  - max_accuracy
  - max_accuracy_threshold
  - max_f1
  - max_f1_threshold
  - max_precision
  - max_recall
  - max_ap
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:32500
  - loss:GISTEmbedLoss
widget:
  - source_sentence: phase changes do not change
    sentences:
      - >-
        The major Atlantic slave trading nations, ordered by trade volume, were
        the Portuguese, the British, the Spanish, the French, the Dutch, and the
        Danish. Several had established outposts on the African coast where they
        purchased slaves from local African leaders.
      - >-
        phase changes do not change mass. Particles have mass, but mass is
        energy. 
         phase changes do not change  energy
      - >-
        According to the U.S. Census Bureau , the county is a total area of ,
        which has land and ( 0.2 % ) is water .
  - source_sentence: what jobs can you get with a bachelor degree in anthropology?
    sentences:
      - >-
        To determine the atomic weight of an element, you should add up protons
        and neutrons.
      - >-
        ['Paleontologist*', 'Archaeologist*', 'University Professor*', 'Market
        Research Analyst*', 'Primatologist.', 'Forensic Scientist*', 'Medical
        Anthropologist.', 'Museum Technician.']
      - >-
        The wingspan flies , the moth comes depending on the location from July
        to August .
  - source_sentence: Identify different forms of energy (e.g., light, sound, heat).
    sentences:
      - >-
        `` Irreplaceable '' '' remained on the chart for thirty weeks , and was
        certified double-platinum by the Recording Industry Association of
        America ( RIAA ) , denoting sales of two million downloads , and had
        sold over 3,139,000 paid digital downloads in the US as of October 2012
        , according to Nielsen SoundScan . ''
      - >-
        On Rotten Tomatoes , the film has a rating of 63 % , based on 87 reviews
        , with an average rating of 5.9/10 .
      - Heat, light, and sound are all different forms of energy.
  - source_sentence: what is so small it can only be seen with an electron microscope?
    sentences:
      - >-
        Viruses are so small that they can be seen only with an electron
        microscope.. Where most viruses are DNA, HIV is an RNA virus. 
         HIV is so small it can only be seen with an electron microscope
      - >-
        The development of modern lasers has opened many doors to both research
        and applications. A laser beam was used to measure the distance from the
        Earth to the moon. Lasers are important components of CD players. As the
        image above illustrates, lasers can provide precise focusing of beams to
        selectively destroy cancer cells in patients. The ability of a laser to
        focus precisely is due to high-quality crystals that help give rise to
        the laser beam. A variety of techniques are used to manufacture pure
        crystals for use in lasers.
      - >-
        Discussion for (a) This value is the net work done on the package. The
        person actually does more work than this, because friction opposes the
        motion. Friction does negative work and removes some of the energy the
        person expends and converts it to thermal energy. The net work equals
        the sum of the work done by each individual force. Strategy and Concept
        for (b) The forces acting on the package are gravity, the normal force,
        the force of friction, and the applied force. The normal force and force
        of gravity are each perpendicular to the displacement, and therefore do
        no work. Solution for (b) The applied force does work.
  - source_sentence: what aspects of your environment may relate to the epidemic of obesity
    sentences:
      - >-
        Jan Kromkamp ( born August 17 , 1980 in Makkinga , Netherlands ) is a
        Dutch footballer .
      - >-
        When chemicals in solution react, the proper way of writing the chemical
        formulas of the dissolved ionic compounds is in terms of the dissociated
        ions, not the complete ionic formula. A complete ionic equation is a
        chemical equation in which the dissolved ionic compounds are written as
        separated ions. Solubility rules are very useful in determining which
        ionic compounds are dissolved and which are not. For example, when
        NaCl(aq) reacts with AgNO3(aq) in a double-replacement reaction to
        precipitate AgCl(s) and form NaNO3(aq), the complete ionic equation
        includes NaCl, AgNO3, and NaNO3 written as separated ions:.
      - >-
        Genetic changes in human populations occur too slowly to be responsible
        for the obesity epidemic. Nevertheless, the variation in how people
        respond to the environment that promotes physical inactivity and intake
        of high-calorie foods suggests that genes do play a role in the
        development of obesity.
model-index:
  - name: SentenceTransformer based on microsoft/deberta-v3-small
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts test
          type: sts-test
        metrics:
          - type: pearson_cosine
            value: 0.6261543137722272
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.6380566379755814
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.64545238834986
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.638675279413056
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.6433597379437007
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.6380140347747367
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.6259993945400585
            name: Pearson Dot
          - type: spearman_dot
            value: 0.6378358352518761
            name: Spearman Dot
          - type: pearson_max
            value: 0.64545238834986
            name: Pearson Max
          - type: spearman_max
            value: 0.638675279413056
            name: Spearman Max
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: allNLI dev
          type: allNLI-dev
        metrics:
          - type: cosine_accuracy
            value: 0.6796875
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.9540110230445862
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 0.5454545454545455
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.8630210161209106
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 0.44244604316546765
            name: Cosine Precision
          - type: cosine_recall
            value: 0.7109826589595376
            name: Cosine Recall
          - type: cosine_ap
            value: 0.4867883737362577
            name: Cosine Ap
          - type: dot_accuracy
            value: 0.6796875
            name: Dot Accuracy
          - type: dot_accuracy_threshold
            value: 733.4241943359375
            name: Dot Accuracy Threshold
          - type: dot_f1
            value: 0.5454545454545455
            name: Dot F1
          - type: dot_f1_threshold
            value: 663.4332885742188
            name: Dot F1 Threshold
          - type: dot_precision
            value: 0.44244604316546765
            name: Dot Precision
          - type: dot_recall
            value: 0.7109826589595376
            name: Dot Recall
          - type: dot_ap
            value: 0.4866316260057906
            name: Dot Ap
          - type: manhattan_accuracy
            value: 0.677734375
            name: Manhattan Accuracy
          - type: manhattan_accuracy_threshold
            value: 181.60472106933594
            name: Manhattan Accuracy Threshold
          - type: manhattan_f1
            value: 0.550537634408602
            name: Manhattan F1
          - type: manhattan_f1_threshold
            value: 319.623046875
            name: Manhattan F1 Threshold
          - type: manhattan_precision
            value: 0.4383561643835616
            name: Manhattan Precision
          - type: manhattan_recall
            value: 0.7398843930635838
            name: Manhattan Recall
          - type: manhattan_ap
            value: 0.4882823795400136
            name: Manhattan Ap
          - type: euclidean_accuracy
            value: 0.6796875
            name: Euclidean Accuracy
          - type: euclidean_accuracy_threshold
            value: 8.408852577209473
            name: Euclidean Accuracy Threshold
          - type: euclidean_f1
            value: 0.5454545454545455
            name: Euclidean F1
          - type: euclidean_f1_threshold
            value: 14.51208782196045
            name: Euclidean F1 Threshold
          - type: euclidean_precision
            value: 0.44244604316546765
            name: Euclidean Precision
          - type: euclidean_recall
            value: 0.7109826589595376
            name: Euclidean Recall
          - type: euclidean_ap
            value: 0.48671939529879626
            name: Euclidean Ap
          - type: max_accuracy
            value: 0.6796875
            name: Max Accuracy
          - type: max_accuracy_threshold
            value: 733.4241943359375
            name: Max Accuracy Threshold
          - type: max_f1
            value: 0.550537634408602
            name: Max F1
          - type: max_f1_threshold
            value: 663.4332885742188
            name: Max F1 Threshold
          - type: max_precision
            value: 0.44244604316546765
            name: Max Precision
          - type: max_recall
            value: 0.7398843930635838
            name: Max Recall
          - type: max_ap
            value: 0.4882823795400136
            name: Max Ap
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: Qnli dev
          type: Qnli-dev
        metrics:
          - type: cosine_accuracy
            value: 0.666015625
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.8842042684555054
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 0.6793103448275861
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.8030812740325928
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 0.5726744186046512
            name: Cosine Precision
          - type: cosine_recall
            value: 0.8347457627118644
            name: Cosine Recall
          - type: cosine_ap
            value: 0.7052019976296955
            name: Cosine Ap
          - type: dot_accuracy
            value: 0.66796875
            name: Dot Accuracy
          - type: dot_accuracy_threshold
            value: 679.375244140625
            name: Dot Accuracy Threshold
          - type: dot_f1
            value: 0.6794425087108014
            name: Dot F1
          - type: dot_f1_threshold
            value: 618.5501708984375
            name: Dot F1 Threshold
          - type: dot_precision
            value: 0.5769230769230769
            name: Dot Precision
          - type: dot_recall
            value: 0.826271186440678
            name: Dot Recall
          - type: dot_ap
            value: 0.7054354866961744
            name: Dot Ap
          - type: manhattan_accuracy
            value: 0.669921875
            name: Manhattan Accuracy
          - type: manhattan_accuracy_threshold
            value: 288.81317138671875
            name: Manhattan Accuracy Threshold
          - type: manhattan_f1
            value: 0.6791862284820032
            name: Manhattan F1
          - type: manhattan_f1_threshold
            value: 405.0858459472656
            name: Manhattan F1 Threshold
          - type: manhattan_precision
            value: 0.5384615384615384
            name: Manhattan Precision
          - type: manhattan_recall
            value: 0.9194915254237288
            name: Manhattan Recall
          - type: manhattan_ap
            value: 0.7053588266129154
            name: Manhattan Ap
          - type: euclidean_accuracy
            value: 0.666015625
            name: Euclidean Accuracy
          - type: euclidean_accuracy_threshold
            value: 13.342350006103516
            name: Euclidean Accuracy Threshold
          - type: euclidean_f1
            value: 0.6793103448275861
            name: Euclidean F1
          - type: euclidean_f1_threshold
            value: 17.398447036743164
            name: Euclidean F1 Threshold
          - type: euclidean_precision
            value: 0.5726744186046512
            name: Euclidean Precision
          - type: euclidean_recall
            value: 0.8347457627118644
            name: Euclidean Recall
          - type: euclidean_ap
            value: 0.7052194558904903
            name: Euclidean Ap
          - type: max_accuracy
            value: 0.669921875
            name: Max Accuracy
          - type: max_accuracy_threshold
            value: 679.375244140625
            name: Max Accuracy Threshold
          - type: max_f1
            value: 0.6794425087108014
            name: Max F1
          - type: max_f1_threshold
            value: 618.5501708984375
            name: Max F1 Threshold
          - type: max_precision
            value: 0.5769230769230769
            name: Max Precision
          - type: max_recall
            value: 0.9194915254237288
            name: Max Recall
          - type: max_ap
            value: 0.7054354866961744
            name: Max Ap

SentenceTransformer based on microsoft/deberta-v3-small

This is a sentence-transformers model fine-tuned from microsoft/deberta-v3-small. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: microsoft/deberta-v3-small
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): AdvancedWeightedPooling(
    (alpha_dropout_layer): Dropout(p=0.05, inplace=False)
    (gate_dropout_layer): Dropout(p=0.0, inplace=False)
    (linear_cls_Qpj): Linear(in_features=768, out_features=768, bias=True)
    (linear_attnOut): Linear(in_features=768, out_features=768, bias=True)
    (mha): MultiheadAttention(
      (out_proj): NonDynamicallyQuantizableLinear(in_features=768, out_features=768, bias=True)
    )
    (layernorm_output): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_weightedPooing): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
    (layernorm_attnOut): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa3-s-CustomPoolin-toytest4-step1-checkpoints-tmp")
# Run inference
sentences = [
    'what aspects of your environment may relate to the epidemic of obesity',
    'Genetic changes in human populations occur too slowly to be responsible for the obesity epidemic. Nevertheless, the variation in how people respond to the environment that promotes physical inactivity and intake of high-calorie foods suggests that genes do play a role in the development of obesity.',
    'When chemicals in solution react, the proper way of writing the chemical formulas of the dissolved ionic compounds is in terms of the dissociated ions, not the complete ionic formula. A complete ionic equation is a chemical equation in which the dissolved ionic compounds are written as separated ions. Solubility rules are very useful in determining which ionic compounds are dissolved and which are not. For example, when NaCl(aq) reacts with AgNO3(aq) in a double-replacement reaction to precipitate AgCl(s) and form NaNO3(aq), the complete ionic equation includes NaCl, AgNO3, and NaNO3 written as separated ions:.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
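
The card lists cosine similarity as the model's similarity function, so the 3×3 matrix above holds pairwise cosine similarities. As a rough illustration of what that matrix contains, the sketch below computes the same quantity in plain Python on toy 2-dimensional vectors (the vectors and the helper names `cosine_similarity_matrix` and `rank_candidates` are illustrative assumptions, not part of the library):

```python
import math

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between rows of a list of vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(embeddings, norms)]
        for u, nu in zip(embeddings, norms)
    ]

def rank_candidates(sim_row, query_index):
    """Indices of all other rows, most similar to the query first."""
    others = [i for i in range(len(sim_row)) if i != query_index]
    return sorted(others, key=lambda i: sim_row[i], reverse=True)

# Toy stand-ins for model.encode(...) output (real embeddings have 768 dims).
toy_embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]
sims = cosine_similarity_matrix(toy_embeddings)
ranking = rank_candidates(sims[0], query_index=0)
print(ranking)
```

With real embeddings, row 0 of the matrix would score the obesity question against the two passages, and the ranking would order them by relevance.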

Evaluation

Metrics

Semantic Similarity (sts-test)

Metric Value
pearson_cosine 0.6262
spearman_cosine 0.6381
pearson_manhattan 0.6455
spearman_manhattan 0.6387
pearson_euclidean 0.6434
spearman_euclidean 0.638
pearson_dot 0.626
spearman_dot 0.6378
pearson_max 0.6455
spearman_max 0.6387
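
Each row above correlates the model's similarity scores with gold similarity labels: Pearson measures linear agreement, Spearman measures agreement of rankings. The following is a minimal, dependency-free sketch of both statistics on made-up numbers; the function names and data are illustrative assumptions, and the library's evaluator is not reproduced here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted similarities vs. gold scores (monotone, not linear).
predicted = [1.0, 2.0, 3.0, 4.0]
gold = [1.0, 4.0, 9.0, 16.0]
print(pearson(predicted, gold), spearman(predicted, gold))
```

Because Spearman depends only on ordering, it is insensitive to the scale of the score, which is one reason the cosine and dot-product rows can be so close.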

Binary Classification (allNLI-dev)

Metric Value
cosine_accuracy 0.6797
cosine_accuracy_threshold 0.954
cosine_f1 0.5455
cosine_f1_threshold 0.863
cosine_precision 0.4424
cosine_recall 0.711
cosine_ap 0.4868
dot_accuracy 0.6797
dot_accuracy_threshold 733.4242
dot_f1 0.5455
dot_f1_threshold 663.4333
dot_precision 0.4424
dot_recall 0.711
dot_ap 0.4866
manhattan_accuracy 0.6777
manhattan_accuracy_threshold 181.6047
manhattan_f1 0.5505
manhattan_f1_threshold 319.623
manhattan_precision 0.4384
manhattan_recall 0.7399
manhattan_ap 0.4883
euclidean_accuracy 0.6797
euclidean_accuracy_threshold 8.4089
euclidean_f1 0.5455
euclidean_f1_threshold 14.5121
euclidean_precision 0.4424
euclidean_recall 0.711
euclidean_ap 0.4867
max_accuracy 0.6797
max_accuracy_threshold 733.4242
max_f1 0.5505
max_f1_threshold 663.4333
max_precision 0.4424
max_recall 0.7399
max_ap 0.4883

Binary Classification (Qnli-dev)

Metric Value
cosine_accuracy 0.666
cosine_accuracy_threshold 0.8842
cosine_f1 0.6793
cosine_f1_threshold 0.8031
cosine_precision 0.5727
cosine_recall 0.8347
cosine_ap 0.7052
dot_accuracy 0.668
dot_accuracy_threshold 679.3752
dot_f1 0.6794
dot_f1_threshold 618.5502
dot_precision 0.5769
dot_recall 0.8263
dot_ap 0.7054
manhattan_accuracy 0.6699
manhattan_accuracy_threshold 288.8132
manhattan_f1 0.6792
manhattan_f1_threshold 405.0858
manhattan_precision 0.5385
manhattan_recall 0.9195
manhattan_ap 0.7054
euclidean_accuracy 0.666
euclidean_accuracy_threshold 13.3424
euclidean_f1 0.6793
euclidean_f1_threshold 17.3984
euclidean_precision 0.5727
euclidean_recall 0.8347
euclidean_ap 0.7052
max_accuracy 0.6699
max_accuracy_threshold 679.3752
max_f1 0.6794
max_f1_threshold 618.5502
max_precision 0.5769
max_recall 0.9195
max_ap 0.7054
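
The binary classification tables report separate thresholds for accuracy and F1 because each metric is maximized at its own cut-off on the similarity score. The sketch below shows one simple way such thresholds can be found, by sweeping the midpoints between adjacent sorted scores. It is a simplified illustration on made-up data, not the library's exact evaluator, and it assumes higher scores mean "more similar" (for Manhattan and Euclidean distances the ordering would be reversed).

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def best_thresholds(scores, labels):
    """Return ((best_accuracy, threshold), (best_f1, threshold))."""
    pairs = sorted(zip(scores, labels), reverse=True)
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    best_acc, best_f1 = (-1.0, None), (-1.0, None)
    true_pos = 0
    for i in range(len(pairs) - 1):
        true_pos += pairs[i][1]
        predicted_pos = i + 1                  # everything ranked at or above i
        false_pos = predicted_pos - true_pos
        threshold = (pairs[i][0] + pairs[i + 1][0]) / 2
        accuracy = (true_pos + total_neg - false_pos) / len(pairs)
        precision = true_pos / predicted_pos
        recall = true_pos / total_pos if total_pos else 0.0
        f1 = f1_from(precision, recall)
        if accuracy > best_acc[0]:
            best_acc = (accuracy, threshold)
        if f1 > best_f1[0]:
            best_f1 = (f1, threshold)
    return best_acc, best_f1

# Hypothetical similarity scores and binary labels.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
acc_result, f1_result = best_thresholds(toy_scores, toy_labels)
print(acc_result, f1_result)

# Consistency check against the allNLI-dev cosine precision and recall in this card.
reported_f1 = f1_from(0.44244604316546765, 0.7109826589595376)
print(reported_f1)
```

In the toy data the F1-optimal threshold is lower than the accuracy-optimal one, the same pattern as the cosine thresholds reported above (0.863 vs. 0.954 on allNLI-dev).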

Training Details

Training Dataset

Unnamed Dataset

  • Size: 32,500 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (type: string): min 4 tokens, mean 29.39 tokens, max 323 tokens
    • sentence2 (type: string): min 2 tokens, mean 54.42 tokens, max 423 tokens
  • Samples:
    sentence1 sentence2
    In which London road is Harrod’s department store? Harrods, Brompton Road, London
    e. in solids the atoms are closely locked in position and can only vibrate, in liquids the atoms and molecules are more loosely connected and can collide with and move past one another, while in gases the atoms or molecules are free to move independently, colliding frequently. Within a substance, atoms that collide frequently and move independently of one another are most likely in a gas
    Joe Cole was unable to join West Bromwich Albion . On 16th October Joe Cole took a long hard look at himself realising that he would never get the opportunity to join West Bromwich Albion and joined Coventry City instead.
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.025}
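
GISTEmbedLoss is a contrastive loss in which a guide model (here a BERT-based encoder with CLS pooling) filters the in-batch negatives: candidates that the guide rates as more similar to the anchor than the true positive are treated as likely false negatives and excluded. The sketch below illustrates that idea in plain Python, using only anchor-to-positive similarities. It is a simplified approximation under those assumptions, not the library's implementation; the function name and toy matrices are hypothetical, and only the temperature of 0.025 comes from this card.

```python
import math

def gist_style_loss(student_sim, guide_sim, temperature=0.025):
    """Mean cross-entropy over anchors with guide-filtered in-batch negatives.

    student_sim[i][j] and guide_sim[i][j] are cosine similarities between
    anchor i and positive j, from the trained model and the guide model.
    """
    n = len(student_sim)
    total = 0.0
    for i in range(n):
        guide_positive = guide_sim[i][i]
        positive_logit = student_sim[i][i] / temperature
        logits = [positive_logit]
        for j in range(n):
            if j == i:
                continue
            if guide_sim[i][j] > guide_positive:
                continue  # guide suggests a false negative: drop it
            logits.append(student_sim[i][j] / temperature)
        # Numerically stable log-sum-exp of the kept logits.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - positive_logit
    return total / n

# Toy batch of two pairs; the student wrongly prefers positive 1 for anchor 0.
student = [[0.2, 0.9], [0.0, 0.9]]
guide_flags_false_negative = [[0.5, 0.9], [0.1, 0.8]]
guide_keeps_all = [[1.0, 0.0], [0.0, 1.0]]

loss_filtered = gist_style_loss(student, guide_flags_false_negative)
loss_unfiltered = gist_style_loss(student, guide_keeps_all)
print(loss_filtered, loss_unfiltered)
```

The low temperature sharpens the softmax, so a negative that survives filtering and outscores the positive is penalized heavily, while a filtered one contributes nothing.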
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 256
  • lr_scheduler_type: cosine_with_min_lr
  • lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 3.3333333333333337e-06}
  • warmup_ratio: 0.33
  • save_safetensors: False
  • fp16: True
  • push_to_hub: True
  • hub_model_id: bobox/DeBERTa3-s-CustomPoolin-toytest4-step1-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • batch_sampler: no_duplicates
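
To make the scheduler settings concrete, the sketch below evaluates a cosine-with-minimum learning rate schedule with linear warmup. It rests on several assumptions: the formula is a reading of how a `cosine_with_min_lr` schedule with `num_cycles` and `min_lr` typically behaves, not code copied from the training library; 1,016 steps per epoch is derived from 32,500 samples at batch size 32 on a single device (consistent with the training log, where step 127 is epoch 0.125); and the base learning rate of 5e-05 and 3 epochs come from the full hyperparameter list. The function name `lr_at_step` is hypothetical.

```python
import math

def lr_at_step(step, base_lr=5e-05, min_lr=3.3333333333333337e-06,
               total_steps=3048, warmup_ratio=0.33, num_cycles=0.5):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress))
    min_ratio = min_lr / base_lr
    # Rescale the cosine factor so it decays to min_lr instead of 0.
    return base_lr * (cosine * (1.0 - min_ratio) + min_ratio)

steps_per_epoch = math.ceil(32500 / 32)        # 1016
total_steps = steps_per_epoch * 3              # 3048
warmup_steps = math.ceil(total_steps * 0.33)   # 1006

for step in (0, warmup_steps // 2, warmup_steps, total_steps):
    print(step, lr_at_step(step, total_steps=total_steps))
```

Under these assumptions warmup covers roughly the first epoch, after which the rate decays along half a cosine cycle toward `min_lr`.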

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 256
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine_with_min_lr
  • lr_scheduler_kwargs: {'num_cycles': 0.5, 'min_lr': 3.3333333333333337e-06}
  • warmup_ratio: 0.33
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: bobox/DeBERTa3-s-CustomPoolin-toytest4-step1-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss sts-test_spearman_cosine allNLI-dev_max_ap Qnli-dev_max_ap
0.0010 1 6.0688 - - -
0.0020 2 7.5576 - - -
0.0030 3 4.6849 - - -
0.0039 4 5.4503 - - -
0.0049 5 5.6057 - - -
0.0059 6 6.3049 - - -
0.0069 7 6.8336 - - -
0.0079 8 5.0777 - - -
0.0089 9 4.8358 - - -
0.0098 10 4.641 - - -
0.0108 11 4.828 - - -
0.0118 12 5.2269 - - -
0.0128 13 5.6772 - - -
0.0138 14 5.1422 - - -
0.0148 15 6.2469 - - -
0.0157 16 4.6802 - - -
0.0167 17 4.5492 - - -
0.0177 18 4.8062 - - -
0.0187 19 7.5141 - - -
0.0197 20 5.5202 - - -
0.0207 21 6.5025 - - -
0.0217 22 7.318 - - -
0.0226 23 4.6458 - - -
0.0236 24 4.6191 - - -
0.0246 25 4.3159 - - -
0.0256 26 6.3677 - - -
0.0266 27 5.6052 - - -
0.0276 28 4.196 - - -
0.0285 29 4.4802 - - -
0.0295 30 4.9193 - - -
0.0305 31 4.0996 - - -
0.0315 32 5.6307 - - -
0.0325 33 4.5745 - - -
0.0335 34 4.4514 - - -
0.0344 35 4.0617 - - -
0.0354 36 5.0298 - - -
0.0364 37 3.9815 - - -
0.0374 38 4.0871 - - -
0.0384 39 4.2378 - - -
0.0394 40 3.8226 - - -
0.0404 41 4.3519 - - -
0.0413 42 3.6345 - - -
0.0423 43 5.0829 - - -
0.0433 44 4.6701 - - -
0.0443 45 4.1371 - - -
0.0453 46 4.2418 - - -
0.0463 47 4.4766 - - -
0.0472 48 4.4797 - - -
0.0482 49 3.8471 - - -
0.0492 50 4.3194 - - -
0.0502 51 3.9426 - - -
0.0512 52 3.5333 - - -
0.0522 53 4.2426 - - -
0.0531 54 3.9816 - - -
0.0541 55 3.663 - - -
0.0551 56 3.9057 - - -
0.0561 57 4.0345 - - -
0.0571 58 3.5233 - - -
0.0581 59 3.7999 - - -
0.0591 60 3.1885 - - -
0.0600 61 3.6013 - - -
0.0610 62 3.392 - - -
0.0620 63 3.3814 - - -
0.0630 64 4.0428 - - -
0.0640 65 3.7825 - - -
0.0650 66 3.4181 - - -
0.0659 67 3.7793 - - -
0.0669 68 3.8344 - - -
0.0679 69 3.2165 - - -
0.0689 70 3.3811 - - -
0.0699 71 3.5984 - - -
0.0709 72 3.8583 - - -
0.0719 73 3.296 - - -
0.0728 74 2.7661 - - -
0.0738 75 2.9805 - - -
0.0748 76 2.566 - - -
0.0758 77 3.258 - - -
0.0768 78 3.3804 - - -
0.0778 79 2.8828 - - -
0.0787 80 3.1077 - - -
0.0797 81 2.9441 - - -
0.0807 82 2.9465 - - -
0.0817 83 2.7088 - - -
0.0827 84 2.9215 - - -
0.0837 85 3.4698 - - -
0.0846 86 2.2414 - - -
0.0856 87 3.1601 - - -
0.0866 88 2.7714 - - -
0.0876 89 3.0311 - - -
0.0886 90 3.0336 - - -
0.0896 91 1.9358 - - -
0.0906 92 2.6031 - - -
0.0915 93 2.7515 - - -
0.0925 94 2.8496 - - -
0.0935 95 1.8015 - - -
0.0945 96 2.8138 - - -
0.0955 97 2.0597 - - -
0.0965 98 2.1053 - - -
0.0974 99 2.6785 - - -
0.0984 100 2.588 - - -
0.0994 101 2.0099 - - -
0.1004 102 2.7947 - - -
0.1014 103 2.3274 - - -
0.1024 104 2.2545 - - -
0.1033 105 2.4575 - - -
0.1043 106 2.4413 - - -
0.1053 107 2.3185 - - -
0.1063 108 2.1577 - - -
0.1073 109 2.1278 - - -
0.1083 110 2.0967 - - -
0.1093 111 2.6142 - - -
0.1102 112 1.8553 - - -
0.1112 113 2.1523 - - -
0.1122 114 2.1726 - - -
0.1132 115 1.8564 - - -
0.1142 116 1.8413 - - -
0.1152 117 2.0441 - - -
0.1161 118 2.2159 - - -
0.1171 119 2.6779 - - -
0.1181 120 2.2976 - - -
0.1191 121 1.9407 - - -
0.1201 122 1.9019 - - -
0.1211 123 2.2149 - - -
0.1220 124 1.6823 - - -
0.1230 125 1.8402 - - -
0.1240 126 1.6914 - - -
0.125 127 2.1626 - - -
0.1260 128 1.6414 - - -
0.1270 129 2.2043 - - -
0.1280 130 1.9987 - - -
0.1289 131 1.8868 - - -
0.1299 132 1.8262 - - -
0.1309 133 2.0404 - - -
0.1319 134 1.9134 - - -
0.1329 135 2.3725 - - -
0.1339 136 1.4127 - - -
0.1348 137 1.6876 - - -
0.1358 138 1.8376 - - -
0.1368 139 1.6992 - - -
0.1378 140 1.5032 - - -
0.1388 141 2.0334 - - -
0.1398 142 2.3581 - - -
0.1407 143 1.4236 - - -
0.1417 144 2.202 - - -
0.1427 145 1.7654 - - -
0.1437 146 1.5748 - - -
0.1447 147 1.7996 - - -
0.1457 148 1.7517 - - -
0.1467 149 1.8933 - - -
0.1476 150 1.2836 - - -
0.1486 151 1.7145 - - -
0.1496 152 1.6499 - - -
0.1506 153 1.8273 0.4057 0.4389 0.6725
0.1516 154 2.2859 - - -
0.1526 155 1.0833 - - -
0.1535 156 1.6829 - - -
0.1545 157 2.1464 - - -
0.1555 158 1.745 - - -
0.1565 159 1.7319 - - -
0.1575 160 1.6968 - - -
0.1585 161 1.7401 - - -
0.1594 162 1.729 - - -
0.1604 163 2.0782 - - -
0.1614 164 2.6545 - - -
0.1624 165 1.4045 - - -
0.1634 166 1.2937 - - -
0.1644 167 1.1171 - - -
0.1654 168 1.3537 - - -
0.1663 169 1.7028 - - -
0.1673 170 1.4143 - - -
0.1683 171 1.8648 - - -
0.1693 172 1.6768 - - -
0.1703 173 1.9528 - - -
0.1713 174 1.1718 - - -
0.1722 175 1.8176 - - -
0.1732 176 0.8439 - - -
0.1742 177 1.5092 - - -
0.1752 178 1.1947 - - -
0.1762 179 1.6395 - - -
0.1772 180 1.4394 - - -
0.1781 181 1.7548 - - -
0.1791 182 1.1181 - - -
0.1801 183 1.0271 - - -
0.1811 184 2.3108 - - -
0.1821 185 2.1242 - - -
0.1831 186 1.9822 - - -
0.1841 187 2.3605 - - -
0.1850 188 1.5251 - - -
0.1860 189 1.2351 - - -
0.1870 190 1.5859 - - -
0.1880 191 1.8056 - - -
0.1890 192 1.349 - - -
0.1900 193 0.893 - - -
0.1909 194 1.5122 - - -
0.1919 195 1.3875 - - -
0.1929 196 1.29 - - -
0.1939 197 2.2931 - - -
0.1949 198 1.2663 - - -
0.1959 199 1.9712 - - -
0.1969 200 2.3307 - - -
0.1978 201 1.6544 - - -
0.1988 202 1.638 - - -
0.1998 203 1.3412 - - -
0.2008 204 1.4454 - - -
0.2018 205 1.5437 - - -
0.2028 206 1.4921 - - -
0.2037 207 1.4298 - - -
0.2047 208 1.6174 - - -
0.2057 209 1.4137 - - -
0.2067 210 1.5652 - - -
0.2077 211 1.1631 - - -
0.2087 212 1.2351 - - -
0.2096 213 1.7537 - - -
0.2106 214 1.3186 - - -
0.2116 215 1.2258 - - -
0.2126 216 0.7695 - - -
0.2136 217 1.2775 - - -
0.2146 218 1.6795 - - -
0.2156 219 1.2862 - - -
0.2165 220 1.1723 - - -
0.2175 221 1.3322 - - -
0.2185 222 1.7564 - - -
0.2195 223 1.1071 - - -
0.2205 224 1.2011 - - -
0.2215 225 1.2303 - - -
0.2224 226 1.212 - - -
0.2234 227 1.0117 - - -
0.2244 228 1.1907 - - -
0.2254 229 2.1293 - - -
0.2264 230 1.3063 - - -
0.2274 231 1.2841 - - -
0.2283 232 1.3778 - - -
0.2293 233 1.2242 - - -
0.2303 234 0.9227 - - -
0.2313 235 1.2221 - - -
0.2323 236 2.1041 - - -
0.2333 237 1.3341 - - -
0.2343 238 1.0876 - - -
0.2352 239 1.3328 - - -
0.2362 240 1.2958 - - -
0.2372 241 1.1522 - - -
0.2382 242 1.7942 - - -
0.2392 243 1.1325 - - -
0.2402 244 1.6466 - - -
0.2411 245 1.4608 - - -
0.2421 246 0.6375 - - -
0.2431 247 2.0177 - - -
0.2441 248 1.2069 - - -
0.2451 249 0.7639 - - -
0.2461 250 1.3465 - - -
0.2470 251 1.064 - - -
0.2480 252 1.3757 - - -
0.2490 253 1.612 - - -
0.25 254 0.7917 - - -
0.2510 255 1.5515 - - -
0.2520 256 0.799 - - -
0.2530 257 0.9882 - - -
0.2539 258 1.1814 - - -
0.2549 259 0.6394 - - -
0.2559 260 1.4756 - - -
0.2569 261 0.5338 - - -
0.2579 262 0.9779 - - -
0.2589 263 1.5307 - - -
0.2598 264 1.1213 - - -
0.2608 265 0.9482 - - -
0.2618 266 0.9599 - - -
0.2628 267 1.4455 - - -
0.2638 268 1.6496 - - -
0.2648 269 0.7402 - - -
0.2657 270 0.7835 - - -
0.2667 271 0.7821 - - -
0.2677 272 1.5422 - - -
0.2687 273 1.0995 - - -
0.2697 274 1.378 - - -
0.2707 275 1.3562 - - -
0.2717 276 0.7376 - - -
0.2726 277 1.1678 - - -
0.2736 278 1.2989 - - -
0.2746 279 1.9559 - - -
0.2756 280 1.1237 - - -
0.2766 281 0.952 - - -
0.2776 282 1.6629 - - -
0.2785 283 1.871 - - -
0.2795 284 1.5946 - - -
0.2805 285 1.4456 - - -
0.2815 286 1.4085 - - -
0.2825 287 1.1394 - - -
0.2835 288 1.0315 - - -
0.2844 289 1.488 - - -
0.2854 290 1.4006 - - -
0.2864 291 0.9237 - - -
0.2874 292 1.163 - - -
0.2884 293 1.7037 - - -
0.2894 294 0.8715 - - -
0.2904 295 1.2101 - - -
0.2913 296 1.1179 - - -
0.2923 297 1.3986 - - -
0.2933 298 1.7068 - - -
0.2943 299 0.8695 - - -
0.2953 300 1.3778 - - -
0.2963 301 1.2834 - - -
0.2972 302 0.8123 - - -
0.2982 303 1.6521 - - -
0.2992 304 1.1064 - - -
0.3002 305 0.9578 - - -
0.3012 306 0.9254 0.4888 0.4789 0.7040
0.3022 307 0.7541 - - -
0.3031 308 0.7324 - - -
0.3041 309 0.5974 - - -
0.3051 310 1.1481 - - -
0.3061 311 1.6179 - - -
0.3071 312 1.4641 - - -
0.3081 313 1.7185 - - -
0.3091 314 0.9328 - - -
0.3100 315 0.742 - - -
0.3110 316 1.4173 - - -
0.3120 317 0.7267 - - -
0.3130 318 0.9494 - - -
0.3140 319 1.5111 - - -
0.3150 320 1.6949 - - -
0.3159 321 1.7562 - - -
0.3169 322 1.2532 - - -
0.3179 323 1.1086 - - -
0.3189 324 0.7377 - - -
0.3199 325 1.085 - - -
0.3209 326 0.7767 - - -
0.3219 327 1.4441 - - -
0.3228 328 0.8146 - - -
0.3238 329 0.7403 - - -
0.3248 330 0.8476 - - -
0.3258 331 0.7323 - - -
0.3268 332 1.2241 - - -
0.3278 333 1.5065 - - -
0.3287 334 0.5259 - - -
0.3297 335 1.3103 - - -
0.3307 336 0.8655 - - -
0.3317 337 0.7575 - - -
0.3327 338 1.968 - - -
0.3337 339 1.317 - - -
0.3346 340 1.1972 - - -
0.3356 341 1.6323 - - -
0.3366 342 1.0469 - - -
0.3376 343 1.3349 - - -
0.3386 344 0.9544 - - -
0.3396 345 1.1894 - - -
0.3406 346 0.7717 - - -
0.3415 347 1.2563 - - -
0.3425 348 1.2437 - - -
0.3435 349 0.7806 - - -
0.3445 350 0.8303 - - -
0.3455 351 1.0926 - - -
0.3465 352 0.6654 - - -
0.3474 353 1.1087 - - -
0.3484 354 1.1525 - - -
0.3494 355 1.1127 - - -
0.3504 356 1.4267 - - -
0.3514 357 0.6148 - - -
0.3524 358 1.0123 - - -
0.3533 359 1.9682 - - -
0.3543 360 0.8487 - - -
0.3553 361 1.0412 - - -
0.3563 362 1.0902 - - -
0.3573 363 0.9606 - - -
0.3583 364 0.9206 - - -
0.3593 365 1.4727 - - -
0.3602 366 0.9379 - - -
0.3612 367 0.8387 - - -
0.3622 368 0.9692 - - -
0.3632 369 1.6298 - - -
0.3642 370 1.0882 - - -
0.3652 371 1.1558 - - -
0.3661 372 0.9546 - - -
0.3671 373 1.0124 - - -
0.3681 374 1.3916 - - -
0.3691 375 0.527 - - -
0.3701 376 0.6387 - - -
0.3711 377 1.1445 - - -
0.3720 378 1.3309 - - -
0.3730 379 1.5888 - - -
0.3740 380 1.4422 - - -
0.375 381 1.7044 - - -
0.3760 382 0.7913 - - -
0.3770 383 1.3241 - - -
0.3780 384 0.6473 - - -
0.3789 385 1.221 - - -
0.3799 386 0.7773 - - -
0.3809 387 1.054 - - -
0.3819 388 0.9862 - - -
0.3829 389 0.9684 - - -
0.3839 390 1.3244 - - -
0.3848 391 1.1787 - - -
0.3858 392 1.4698 - - -
0.3868 393 1.0961 - - -
0.3878 394 1.1364 - - -
0.3888 395 0.9368 - - -
0.3898 396 1.1731 - - -
0.3907 397 0.8686 - - -
0.3917 398 0.7481 - - -
0.3927 399 0.7261 - - -
0.3937 400 1.2062 - - -
0.3947 401 0.7462 - - -
0.3957 402 1.0318 - - -
0.3967 403 1.105 - - -
0.3976 404 1.009 - - -
0.3986 405 0.5941 - - -
0.3996 406 1.7972 - - -
0.4006 407 1.0544 - - -
0.4016 408 1.3912 - - -
0.4026 409 0.8305 - - -
0.4035 410 0.8688 - - -
0.4045 411 1.0069 - - -
0.4055 412 1.3141 - - -
0.4065 413 1.1042 - - -
0.4075 414 1.1011 - - -
0.4085 415 1.1192 - - -
0.4094 416 1.5957 - - -
0.4104 417 1.164 - - -
0.4114 418 0.6425 - - -
0.4124 419 0.6068 - - -
0.4134 420 0.9275 - - -
0.4144 421 0.8836 - - -
0.4154 422 1.2115 - - -
0.4163 423 0.8367 - - -
0.4173 424 1.0595 - - -
0.4183 425 0.826 - - -
0.4193 426 0.707 - - -
0.4203 427 0.6235 - - -
0.4213 428 0.7719 - - -
0.4222 429 1.0862 - - -
0.4232 430 0.9311 - - -
0.4242 431 1.2339 - - -
0.4252 432 0.9891 - - -
0.4262 433 1.8443 - - -
0.4272 434 1.1799 - - -
0.4281 435 0.759 - - -
0.4291 436 1.1002 - - -
0.4301 437 0.9141 - - -
0.4311 438 0.5467 - - -
0.4321 439 0.7476 - - -
0.4331 440 1.14 - - -
0.4341 441 1.1504 - - -
0.4350 442 1.26 - - -
0.4360 443 1.0311 - - -
0.4370 444 1.0646 - - -
0.4380 445 0.8687 - - -
0.4390 446 0.6839 - - -
0.4400 447 1.1376 - - -
0.4409 448 0.9759 - - -
0.4419 449 0.7971 - - -
0.4429 450 0.9708 - - -
0.4439 451 0.8217 - - -
0.4449 452 1.3728 - - -
0.4459 453 0.9119 - - -
0.4469 454 1.012 - - -
0.4478 455 1.3738 - - -
0.4488 456 0.8219 - - -
0.4498 457 1.2558 - - -
0.4508 458 0.6247 - - -
0.4518 459 0.7295 0.5410 0.4920 0.6879
0.4528 460 0.8154 - - -
0.4537 461 1.1392 - - -
0.4547 462 0.8618 - - -
0.4557 463 0.9669 - - -
0.4567 464 0.8804 - - -
0.4577 465 0.8479 - - -
0.4587 466 0.6296 - - -
0.4596 467 0.8449 - - -
0.4606 468 0.9772 - - -
0.4616 469 0.6424 - - -
0.4626 470 0.9169 - - -
0.4636 471 0.7599 - - -
0.4646 472 0.8943 - - -
0.4656 473 0.9475 - - -
0.4665 474 1.4518 - - -
0.4675 475 1.274 - - -
0.4685 476 0.7306 - - -
0.4695 477 0.9238 - - -
0.4705 478 0.6593 - - -
0.4715 479 1.0183 - - -
0.4724 480 1.2577 - - -
0.4734 481 0.8738 - - -
0.4744 482 1.1416 - - -
0.4754 483 0.7135 - - -
0.4764 484 1.2587 - - -
0.4774 485 0.8823 - - -
0.4783 486 0.8423 - - -
0.4793 487 0.7704 - - -
0.4803 488 0.7049 - - -
0.4813 489 1.1893 - - -
0.4823 490 1.3985 - - -
0.4833 491 1.3567 - - -
0.4843 492 1.2573 - - -
0.4852 493 0.7671 - - -
0.4862 494 0.5425 - - -
0.4872 495 0.9372 - - -
0.4882 496 0.799 - - -
0.4892 497 0.9548 - - -
0.4902 498 1.0855 - - -
0.4911 499 1.0465 - - -
0.4921 500 1.1004 - - -
0.4931 501 0.6392 - - -
0.4941 502 0.7102 - - -
0.4951 503 1.3242 - - -
0.4961 504 0.6861 - - -
0.4970 505 0.9291 - - -
0.4980 506 0.8592 - - -
0.4990 507 0.9462 - - -
0.5 508 1.0167 - - -
0.5010 509 1.0118 - - -
0.5020 510 0.6741 - - -
0.5030 511 1.4578 - - -
0.5039 512 1.2959 - - -
0.5049 513 0.8533 - - -
0.5059 514 0.6685 - - -
0.5069 515 1.1556 - - -
0.5079 516 0.8177 - - -
0.5089 517 0.6296 - - -
0.5098 518 0.8407 - - -
0.5108 519 0.6987 - - -
0.5118 520 0.9888 - - -
0.5128 521 0.8938 - - -
0.5138 522 0.582 - - -
0.5148 523 0.6596 - - -
0.5157 524 0.6029 - - -
0.5167 525 0.9806 - - -
0.5177 526 0.9463 - - -
0.5187 527 0.7088 - - -
0.5197 528 0.7525 - - -
0.5207 529 0.7625 - - -
0.5217 530 0.8271 - - -
0.5226 531 0.6129 - - -
0.5236 532 1.1563 - - -
0.5246 533 0.8131 - - -
0.5256 534 0.5363 - - -
0.5266 535 0.8819 - - -
0.5276 536 0.9772 - - -
0.5285 537 1.2102 - - -
0.5295 538 1.1234 - - -
0.5305 539 1.1857 - - -
0.5315 540 0.7873 - - -
0.5325 541 0.5034 - - -
0.5335 542 1.3305 - - -
0.5344 543 1.1727 - - -
0.5354 544 1.2825 - - -
0.5364 545 1.0446 - - -
0.5374 546 0.9838 - - -
0.5384 547 1.2194 - - -
0.5394 548 0.7709 - - -
0.5404 549 0.748 - - -
0.5413 550 1.0948 - - -
0.5423 551 0.915 - - -
0.5433 552 1.537 - - -
0.5443 553 0.3239 - - -
0.5453 554 0.9592 - - -
0.5463 555 0.7737 - - -
0.5472 556 0.613 - - -
0.5482 557 1.3646 - - -
0.5492 558 0.6659 - - -
0.5502 559 0.5207 - - -
0.5512 560 0.9467 - - -
0.5522 561 0.5692 - - -
0.5531 562 1.5855 - - -
0.5541 563 0.8855 - - -
0.5551 564 1.1829 - - -
0.5561 565 0.978 - - -
0.5571 566 1.1818 - - -
0.5581 567 0.701 - - -
0.5591 568 1.0226 - - -
0.5600 569 0.5937 - - -
0.5610 570 0.8095 - - -
0.5620 571 1.174 - - -
0.5630 572 0.96 - - -
0.5640 573 0.8339 - - -
0.5650 574 0.717 - - -
0.5659 575 0.5938 - - -
0.5669 576 0.6501 - - -
0.5679 577 0.7003 - - -
0.5689 578 0.5525 - - -
0.5699 579 0.7003 - - -
0.5709 580 1.059 - - -
0.5719 581 0.8625 - - -
0.5728 582 0.5862 - - -
0.5738 583 0.9162 - - -
0.5748 584 0.926 - - -
0.5758 585 1.2729 - - -
0.5768 586 0.8935 - - -
0.5778 587 0.541 - - -
0.5787 588 1.1455 - - -
0.5797 589 0.7306 - - -
0.5807 590 0.9088 - - -
0.5817 591 0.9166 - - -
0.5827 592 0.8679 - - -
0.5837 593 0.9329 - - -
0.5846 594 1.1201 - - -
0.5856 595 0.6418 - - -
0.5866 596 1.145 - - -
0.5876 597 1.4041 - - -
0.5886 598 0.6954 - - -
0.5896 599 0.4567 - - -
0.5906 600 1.1305 - - -
0.5915 601 0.8077 - - -
0.5925 602 0.6143 - - -
0.5935 603 1.3139 - - -
0.5945 604 0.7694 - - -
0.5955 605 0.9622 - - -
0.5965 606 0.91 - - -
0.5974 607 1.3125 - - -
0.5984 608 1.0153 - - -
0.5994 609 0.8468 - - -
0.6004 610 1.1026 - - -
0.6014 611 0.8291 - - -
0.6024 612 0.7235 0.5680 0.5175 0.6900
0.6033 613 0.9613 - - -
0.6043 614 0.7124 - - -
0.6053 615 1.0719 - - -
0.6063 616 0.7233 - - -
0.6073 617 1.6863 - - -
0.6083 618 0.8665 - - -
0.6093 619 1.6432 - - -
0.6102 620 0.771 - - -
0.6112 621 0.6755 - - -
0.6122 622 0.6809 - - -
0.6132 623 0.5626 - - -
0.6142 624 0.6287 - - -
0.6152 625 0.5478 - - -
0.6161 626 0.9155 - - -
0.6171 627 1.8947 - - -
0.6181 628 1.1943 - - -
0.6191 629 1.2465 - - -
0.6201 630 0.4045 - - -
0.6211 631 0.5688 - - -
0.6220 632 0.8682 - - -
0.6230 633 0.901 - - -
0.6240 634 0.8978 - - -
0.625 635 0.6592 - - -
0.6260 636 0.6828 - - -
0.6270 637 0.3995 - - -
0.6280 638 0.8283 - - -
0.6289 639 0.5938 - - -
0.6299 640 1.256 - - -
0.6309 641 0.7801 - - -
0.6319 642 0.8975 - - -
0.6329 643 1.255 - - -
0.6339 644 1.1252 - - -
0.6348 645 1.2394 - - -
0.6358 646 0.9579 - - -
0.6368 647 0.8664 - - -
0.6378 648 0.4087 - - -
0.6388 649 0.4084 - - -
0.6398 650 0.6214 - - -
0.6407 651 0.7237 - - -
0.6417 652 0.6294 - - -
0.6427 653 0.8264 - - -
0.6437 654 0.8258 - - -
0.6447 655 0.6865 - - -
0.6457 656 0.7408 - - -
0.6467 657 0.6196 - - -
0.6476 658 0.5659 - - -
0.6486 659 0.6353 - - -
0.6496 660 0.8432 - - -
0.6506 661 0.6573 - - -
0.6516 662 0.4918 - - -
0.6526 663 0.8571 - - -
0.6535 664 0.9105 - - -
0.6545 665 0.7499 - - -
0.6555 666 0.5277 - - -
0.6565 667 1.3046 - - -
0.6575 668 0.6112 - - -
0.6585 669 0.754 - - -
0.6594 670 1.3611 - - -
0.6604 671 1.3654 - - -
0.6614 672 0.6479 - - -
0.6624 673 0.9422 - - -
0.6634 674 0.5007 - - -
0.6644 675 0.4789 - - -
0.6654 676 0.3962 - - -
0.6663 677 0.8648 - - -
0.6673 678 0.8241 - - -
0.6683 679 0.7923 - - -
0.6693 680 0.5799 - - -
0.6703 681 0.8422 - - -
0.6713 682 0.5638 - - -
0.6722 683 0.9627 - - -
0.6732 684 0.7585 - - -
0.6742 685 0.488 - - -
0.6752 686 1.0991 - - -
0.6762 687 0.4808 - - -
0.6772 688 1.3717 - - -
0.6781 689 0.8225 - - -
0.6791 690 0.9377 - - -
0.6801 691 0.8016 - - -
0.6811 692 0.6257 - - -
0.6821 693 1.0941 - - -
0.6831 694 1.0878 - - -
0.6841 695 0.9231 - - -
0.6850 696 1.3284 - - -
0.6860 697 0.7274 - - -
0.6870 698 0.7851 - - -
0.6880 699 0.6807 - - -
0.6890 700 1.1592 - - -
0.6900 701 0.6347 - - -
0.6909 702 1.1326 - - -
0.6919 703 1.3797 - - -
0.6929 704 0.7782 - - -
0.6939 705 1.1746 - - -
0.6949 706 0.7624 - - -
0.6959 707 0.7216 - - -
0.6969 708 0.4666 - - -
0.6978 709 0.9398 - - -
0.6988 710 0.9532 - - -
0.6998 711 0.8285 - - -
0.7008 712 0.7042 - - -
0.7018 713 1.0057 - - -
0.7028 714 0.6696 - - -
0.7037 715 0.5701 - - -
0.7047 716 0.7894 - - -
0.7057 717 0.9327 - - -
0.7067 718 1.0283 - - -
0.7077 719 0.6419 - - -
0.7087 720 0.7716 - - -
0.7096 721 0.5968 - - -
0.7106 722 0.6547 - - -
0.7116 723 0.8344 - - -
0.7126 724 0.5448 - - -
0.7136 725 0.6075 - - -
0.7146 726 0.6142 - - -
0.7156 727 0.681 - - -
0.7165 728 0.6137 - - -
0.7175 729 1.4516 - - -
0.7185 730 0.9083 - - -
0.7195 731 0.6369 - - -
0.7205 732 0.8102 - - -
0.7215 733 0.7808 - - -
0.7224 734 0.512 - - -
0.7234 735 1.3589 - - -
0.7244 736 0.6555 - - -
0.7254 737 0.8732 - - -
0.7264 738 0.7591 - - -
0.7274 739 0.8237 - - -
0.7283 740 1.1272 - - -
0.7293 741 0.5353 - - -
0.7303 742 1.1713 - - -
0.7313 743 0.7034 - - -
0.7323 744 0.5118 - - -
0.7333 745 0.7816 - - -
0.7343 746 0.7935 - - -
0.7352 747 1.0736 - - -
0.7362 748 0.6528 - - -
0.7372 749 0.5556 - - -
0.7382 750 0.3339 - - -
0.7392 751 1.0371 - - -
0.7402 752 1.3273 - - -
0.7411 753 0.5307 - - -
0.7421 754 0.908 - - -
0.7431 755 0.5248 - - -
0.7441 756 0.6128 - - -
0.7451 757 1.5065 - - -
0.7461 758 1.2071 - - -
0.7470 759 0.9983 - - -
0.7480 760 1.0378 - - -
0.7490 761 1.0732 - - -
0.75 762 0.8046 - - -
0.7510 763 1.1666 - - -
0.7520 764 0.7042 - - -
0.7530 765 1.4759 0.5730 0.5105 0.6982
0.7539 766 0.5595 - - -
0.7549 767 0.3646 - - -
0.7559 768 0.6791 - - -
0.7569 769 0.6889 - - -
0.7579 770 0.9715 - - -
0.7589 771 1.2218 - - -
0.7598 772 0.3957 - - -
0.7608 773 1.5635 - - -
0.7618 774 0.9081 - - -
0.7628 775 0.8633 - - -
0.7638 776 0.9482 - - -
0.7648 777 0.8021 - - -
0.7657 778 0.9083 - - -
0.7667 779 0.9599 - - -
0.7677 780 0.798 - - -
0.7687 781 0.5817 - - -
0.7697 782 0.5287 - - -
0.7707 783 1.0251 - - -
0.7717 784 0.3995 - - -
0.7726 785 0.8753 - - -
0.7736 786 1.315 - - -
0.7746 787 0.8053 - - -
0.7756 788 0.6032 - - -
0.7766 789 0.8059 - - -
0.7776 790 0.9387 - - -
0.7785 791 0.8321 - - -
0.7795 792 0.5788 - - -
0.7805 793 0.7874 - - -
0.7815 794 0.5145 - - -
0.7825 795 1.5847 - - -
0.7835 796 1.1473 - - -
0.7844 797 0.968 - - -
0.7854 798 1.2691 - - -
0.7864 799 0.7046 - - -
0.7874 800 0.9358 - - -
0.7884 801 0.3945 - - -
0.7894 802 0.861 - - -
0.7904 803 1.0736 - - -
0.7913 804 1.0279 - - -
0.7923 805 0.6902 - - -
0.7933 806 0.7344 - - -
0.7943 807 0.8184 - - -
0.7953 808 0.8121 - - -
0.7963 809 0.3772 - - -
0.7972 810 0.2367 - - -
0.7982 811 0.5584 - - -
0.7992 812 0.3465 - - -
0.8002 813 0.9229 - - -
0.8012 814 0.7351 - - -
0.8022 815 0.9024 - - -
0.8031 816 0.4958 - - -
0.8041 817 0.4321 - - -
0.8051 818 0.6945 - - -
0.8061 819 0.6415 - - -
0.8071 820 0.5615 - - -
0.8081 821 0.9732 - - -
0.8091 822 1.018 - - -
0.8100 823 1.1392 - - -
0.8110 824 0.8539 - - -
0.8120 825 0.7081 - - -
0.8130 826 1.3176 - - -
0.8140 827 0.7387 - - -
0.8150 828 1.0975 - - -
0.8159 829 0.5883 - - -
0.8169 830 0.9677 - - -
0.8179 831 0.3869 - - -
0.8189 832 0.9083 - - -
0.8199 833 0.6939 - - -
0.8209 834 0.5046 - - -
0.8219 835 0.3594 - - -
0.8228 836 0.4609 - - -
0.8238 837 0.8132 - - -
0.8248 838 0.4493 - - -
0.8258 839 0.8566 - - -
0.8268 840 0.9958 - - -
0.8278 841 0.936 - - -
0.8287 842 1.4087 - - -
0.8297 843 1.2214 - - -
0.8307 844 0.7135 - - -
0.8317 845 1.1881 - - -
0.8327 846 1.1807 - - -
0.8337 847 0.5896 - - -
0.8346 848 0.8483 - - -
0.8356 849 0.5409 - - -
0.8366 850 0.8286 - - -
0.8376 851 1.136 - - -
0.8386 852 0.819 - - -
0.8396 853 0.3029 - - -
0.8406 854 0.5538 - - -
0.8415 855 0.4572 - - -
0.8425 856 1.1516 - - -
0.8435 857 0.4768 - - -
0.8445 858 0.5357 - - -
0.8455 859 0.8879 - - -
0.8465 860 0.9121 - - -
0.8474 861 1.052 - - -
0.8484 862 0.8009 - - -
0.8494 863 0.2474 - - -
0.8504 864 0.8517 - - -
0.8514 865 0.5678 - - -
0.8524 866 0.4882 - - -
0.8533 867 0.638 - - -
0.8543 868 0.8012 - - -
0.8553 869 0.4172 - - -
0.8563 870 0.6727 - - -
0.8573 871 0.6383 - - -
0.8583 872 1.0728 - - -
0.8593 873 1.3336 - - -
0.8602 874 1.217 - - -
0.8612 875 0.8018 - - -
0.8622 876 1.2326 - - -
0.8632 877 1.0898 - - -
0.8642 878 0.7192 - - -
0.8652 879 0.7779 - - -
0.8661 880 0.9291 - - -
0.8671 881 0.6333 - - -
0.8681 882 0.9028 - - -
0.8691 883 0.9904 - - -
0.8701 884 0.5775 - - -
0.8711 885 0.9783 - - -
0.8720 886 1.111 - - -
0.8730 887 1.0398 - - -
0.8740 888 0.742 - - -
0.875 889 0.7606 - - -
0.8760 890 0.6941 - - -
0.8770 891 1.0219 - - -
0.8780 892 0.8949 - - -
0.8789 893 1.0013 - - -
0.8799 894 1.1122 - - -
0.8809 895 1.1711 - - -
0.8819 896 0.7457 - - -
0.8829 897 0.5431 - - -
0.8839 898 0.8159 - - -
0.8848 899 1.6812 - - -
0.8858 900 0.8163 - - -
0.8868 901 0.9315 - - -
0.8878 902 0.883 - - -
0.8888 903 0.8014 - - -
0.8898 904 1.4277 - - -
0.8907 905 0.7507 - - -
0.8917 906 0.9757 - - -
0.8927 907 0.6366 - - -
0.8937 908 0.7567 - - -
0.8947 909 0.7853 - - -
0.8957 910 0.7611 - - -
0.8967 911 0.8417 - - -
0.8976 912 0.5825 - - -
0.8986 913 1.0816 - - -
0.8996 914 1.1768 - - -
0.9006 915 0.4894 - - -
0.9016 916 0.8739 - - -
0.9026 917 1.0047 - - -
0.9035 918 0.7733 0.6115 0.5123 0.7069
0.9045 919 1.1274 - - -
0.9055 920 1.5414 - - -
0.9065 921 1.0839 - - -
0.9075 922 0.5004 - - -
0.9085 923 0.8282 - - -
0.9094 924 0.8041 - - -
0.9104 925 0.6129 - - -
0.9114 926 0.8697 - - -
0.9124 927 0.7138 - - -
0.9134 928 1.0387 - - -
0.9144 929 0.6024 - - -
0.9154 930 0.6641 - - -
0.9163 931 0.8028 - - -
0.9173 932 0.8097 - - -
0.9183 933 0.6162 - - -
0.9193 934 0.4264 - - -
0.9203 935 0.3983 - - -
0.9213 936 0.7519 - - -
0.9222 937 1.0682 - - -
0.9232 938 0.6805 - - -
0.9242 939 1.1305 - - -
0.9252 940 1.5805 - - -
0.9262 941 0.7031 - - -
0.9272 942 0.6847 - - -
0.9281 943 0.3777 - - -
0.9291 944 0.7438 - - -
0.9301 945 1.1754 - - -
0.9311 946 0.6602 - - -
0.9321 947 1.2893 - - -
0.9331 948 0.8315 - - -
0.9341 949 1.0961 - - -
0.9350 950 0.7657 - - -
0.9360 951 0.8915 - - -
0.9370 952 0.6784 - - -
0.9380 953 0.6772 - - -
0.9390 954 0.9102 - - -
0.9400 955 0.8083 - - -
0.9409 956 0.5966 - - -
0.9419 957 1.06 - - -
0.9429 958 1.0412 - - -
0.9439 959 0.9467 - - -
0.9449 960 0.847 - - -
0.9459 961 0.875 - - -
0.9469 962 0.8303 - - -
0.9478 963 0.8641 - - -
0.9488 964 0.6343 - - -
0.9498 965 0.7566 - - -
0.9508 966 0.9767 - - -
0.9518 967 0.7518 - - -
0.9528 968 0.5381 - - -
0.9537 969 0.9959 - - -
0.9547 970 0.8374 - - -
0.9557 971 0.5459 - - -
0.9567 972 0.6633 - - -
0.9577 973 0.9526 - - -
0.9587 974 0.8487 - - -
0.9596 975 0.6572 - - -
0.9606 976 0.8039 - - -
0.9616 977 0.6213 - - -
0.9626 978 0.5483 - - -
0.9636 979 0.516 - - -
0.9646 980 0.8891 - - -
0.9656 981 0.7904 - - -
0.9665 982 1.2282 - - -
0.9675 983 1.0609 - - -
0.9685 984 0.8063 - - -
0.9695 985 0.5294 - - -
0.9705 986 0.7394 - - -
0.9715 987 0.5749 - - -
0.9724 988 0.9125 - - -
0.9734 989 0.961 - - -
0.9744 990 0.5947 - - -
0.9754 991 0.6246 - - -
0.9764 992 0.6492 - - -
0.9774 993 1.0508 - - -
0.9783 994 0.7613 - - -
0.9793 995 0.706 - - -
0.9803 996 0.9458 - - -
0.9813 997 0.836 - - -
0.9823 998 0.7472 - - -
0.9833 999 0.9071 - - -
0.9843 1000 1.0935 - - -
0.9852 1001 0.5822 - - -
0.9862 1002 0.6596 - - -
0.9872 1003 0.5536 - - -
0.9882 1004 1.0741 - - -
0.9892 1005 0.4586 - - -
0.9902 1006 0.9633 - - -
0.9911 1007 0.8739 - - -
0.9921 1008 0.665 - - -
0.9931 1009 0.9486 - - -
0.9941 1010 0.5323 - - -
0.9951 1011 0.6006 - - -
0.9961 1012 0.4568 - - -
0.9970 1013 0.7544 - - -
0.9980 1014 1.2437 - - -
0.9990 1015 0.8985 - - -
1.0 1016 0.3179 - - -
1.0010 1017 1.2519 - - -
1.0020 1018 0.9775 - - -
1.0030 1019 0.8967 - - -
1.0039 1020 0.5918 - - -
1.0049 1021 0.5798 - - -
1.0059 1022 0.5934 - - -
1.0069 1023 0.6872 - - -
1.0079 1024 0.9535 - - -
1.0089 1025 0.4967 - - -
1.0098 1026 1.116 - - -
1.0108 1027 1.0446 - - -
1.0118 1028 0.5189 - - -
1.0128 1029 0.8272 - - -
1.0138 1030 0.4901 - - -
1.0148 1031 0.8926 - - -
1.0157 1032 0.5543 - - -
1.0167 1033 0.2579 - - -
1.0177 1034 0.865 - - -
1.0187 1035 1.0559 - - -
1.0197 1036 0.714 - - -
1.0207 1037 0.5185 - - -
1.0217 1038 1.3209 - - -
1.0226 1039 0.3696 - - -
1.0236 1040 0.4839 - - -
1.0246 1041 0.7694 - - -
1.0256 1042 1.3417 - - -
1.0266 1043 0.8292 - - -
1.0276 1044 0.3835 - - -
1.0285 1045 0.3755 - - -
1.0295 1046 0.6396 - - -
1.0305 1047 0.364 - - -
1.0315 1048 0.9232 - - -
1.0325 1049 0.6706 - - -
1.0335 1050 0.8329 - - -
1.0344 1051 0.4093 - - -
1.0354 1052 0.7251 - - -
1.0364 1053 0.9254 - - -
1.0374 1054 0.6365 - - -
1.0384 1055 0.7136 - - -
1.0394 1056 0.6278 - - -
1.0404 1057 0.6769 - - -
1.0413 1058 0.7704 - - -
1.0423 1059 0.7701 - - -
1.0433 1060 0.9639 - - -
1.0443 1061 0.7518 - - -
1.0453 1062 0.7474 - - -
1.0463 1063 1.2196 - - -
1.0472 1064 1.1476 - - -
1.0482 1065 0.7139 - - -
1.0492 1066 1.0979 - - -
1.0502 1067 0.7113 - - -
1.0512 1068 0.6796 - - -
1.0522 1069 1.1551 - - -
1.0531 1070 0.7947 - - -
1.0541 1071 0.6783 0.6381 0.4883 0.7054
1.0551 1072 0.7295 - - -
1.0561 1073 0.4529 - - -
1.0571 1074 0.2625 - - -
1.0581 1075 1.2557 - - -
1.0591 1076 0.4826 - - -
1.0600 1077 1.2309 - - -
1.0610 1078 0.4545 - - -
1.0620 1079 0.6658 - - -
1.0630 1080 0.8651 - - -
1.0640 1081 1.0702 - - -
1.0650 1082 1.0225 - - -
1.0659 1083 0.6952 - - -
1.0669 1084 1.0918 - - -
1.0679 1085 0.5222 - - -
1.0689 1086 1.1311 - - -
1.0699 1087 0.6464 - - -
1.0709 1088 0.801 - - -
1.0719 1089 0.3743 - - -
1.0728 1090 0.6204 - - -
1.0738 1091 0.5369 - - -
1.0748 1092 0.6895 - - -
1.0758 1093 1.2127 - - -
1.0768 1094 0.8397 - - -
1.0778 1095 0.3694 - - -
1.0787 1096 0.7576 - - -
1.0797 1097 0.9494 - - -
1.0807 1098 0.8477 - - -
1.0817 1099 0.4869 - - -
1.0827 1100 0.6753 - - -
1.0837 1101 0.7351 - - -
1.0846 1102 0.8257 - - -
1.0856 1103 1.144 - - -
1.0866 1104 0.5526 - - -
1.0876 1105 0.6686 - - -
1.0886 1106 0.7683 - - -
1.0896 1107 0.3337 - - -
1.0906 1108 0.8686 - - -
1.0915 1109 0.66 - - -
1.0925 1110 0.9329 - - -
1.0935 1111 0.5129 - - -
1.0945 1112 0.8907 - - -
1.0955 1113 0.6305 - - -
1.0965 1114 0.3746 - - -
1.0974 1115 0.6791 - - -
1.0984 1116 1.4172 - - -
1.0994 1117 0.5184 - - -
1.1004 1118 0.4803 - - -
1.1014 1119 0.647 - - -
1.1024 1120 0.3867 - - -
1.1033 1121 0.4965 - - -
1.1043 1122 0.9508 - - -
1.1053 1123 0.709 - - -
1.1063 1124 0.9625 - - -
1.1073 1125 0.7724 - - -
1.1083 1126 0.4318 - - -
1.1093 1127 0.6653 - - -
1.1102 1128 0.781 - - -
1.1112 1129 0.6985 - - -
1.1122 1130 1.2949 - - -
1.1132 1131 0.6845 - - -
1.1142 1132 0.7022 - - -
1.1152 1133 0.8645 - - -
1.1161 1134 0.5331 - - -
1.1171 1135 0.5636 - - -
1.1181 1136 0.4511 - - -
1.1191 1137 0.605 - - -
1.1201 1138 0.6367 - - -
1.1211 1139 1.1988 - - -
1.1220 1140 0.5973 - - -
1.1230 1141 0.8703 - - -
1.1240 1142 0.48 - - -
1.125 1143 0.8903 - - -
1.1260 1144 0.6978 - - -
1.1270 1145 0.8008 - - -
1.1280 1146 0.9583 - - -
1.1289 1147 0.5823 - - -
1.1299 1148 0.974 - - -
1.1309 1149 1.1149 - - -
1.1319 1150 0.5558 - - -
1.1329 1151 0.9106 - - -
1.1339 1152 0.671 - - -
1.1348 1153 0.9424 - - -
1.1358 1154 0.6605 - - -
1.1368 1155 0.536 - - -
1.1378 1156 0.38 - - -
1.1388 1157 0.6255 - - -
1.1398 1158 1.435 - - -
1.1407 1159 0.5903 - - -
1.1417 1160 0.9517 - - -
1.1427 1161 0.4636 - - -
1.1437 1162 0.3721 - - -
1.1447 1163 0.6832 - - -
1.1457 1164 0.5893 - - -
1.1467 1165 0.6226 - - -
1.1476 1166 0.5715 - - -
1.1486 1167 1.0958 - - -
1.1496 1168 0.689 - - -
1.1506 1169 0.6443 - - -
1.1516 1170 0.9278 - - -
1.1526 1171 0.7009 - - -
1.1535 1172 0.6254 - - -
1.1545 1173 0.7537 - - -
1.1555 1174 0.7559 - - -
1.1565 1175 0.8148 - - -
1.1575 1176 0.2994 - - -
1.1585 1177 0.2148 - - -
1.1594 1178 1.0492 - - -
1.1604 1179 0.5191 - - -
1.1614 1180 1.3508 - - -
1.1624 1181 1.0396 - - -
1.1634 1182 0.5003 - - -
1.1644 1183 0.7736 - - -
1.1654 1184 0.3263 - - -
1.1663 1185 1.2059 - - -
1.1673 1186 1.3514 - - -
1.1683 1187 0.9086 - - -
1.1693 1188 0.7245 - - -
1.1703 1189 0.717 - - -
1.1713 1190 0.6741 - - -
1.1722 1191 1.2695 - - -
1.1732 1192 0.406 - - -
1.1742 1193 0.5408 - - -
1.1752 1194 0.4256 - - -
1.1762 1195 0.4827 - - -
1.1772 1196 0.5621 - - -
1.1781 1197 0.6734 - - -
1.1791 1198 0.4685 - - -
1.1801 1199 0.859 - - -
1.1811 1200 1.4388 - - -
1.1821 1201 1.1846 - - -
1.1831 1202 0.6824 - - -
1.1841 1203 1.2064 - - -
1.1850 1204 0.5668 - - -
1.1860 1205 0.5286 - - -
1.1870 1206 0.6316 - - -
1.1880 1207 0.8011 - - -
1.1890 1208 0.66 - - -
1.1900 1209 0.3735 - - -
1.1909 1210 0.7461 - - -
1.1919 1211 0.5508 - - -
1.1929 1212 0.6379 - - -
1.1939 1213 0.5204 - - -
1.1949 1214 0.7522 - - -
1.1959 1215 0.9415 - - -
1.1969 1216 0.6351 - - -
1.1978 1217 0.262 - - -
1.1988 1218 0.9048 - - -
1.1998 1219 0.6562 - - -
1.2008 1220 0.4502 - - -
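
The per-step training loss in the log above is noisy, so the trend is easier to read after smoothing. The following is a minimal sketch and not part of the original training code. It assumes each log row has the form `epoch step training_loss` followed by three evaluation values, with `-` marking steps where no evaluation ran. The helper names `parse_row` and `moving_average` are hypothetical, and the sample rows are copied from the log above.

```python
from typing import List, Optional, Tuple

# (epoch, step, training_loss, evaluation values or None where the log shows "-")
Row = Tuple[float, int, float, List[Optional[float]]]


def parse_row(line: str) -> Row:
    """Split one whitespace-separated log row into typed fields."""
    epoch, step, loss, *metrics = line.split()
    # "-" is assumed to mean that no evaluation was run at this step.
    evals = [None if m == "-" else float(m) for m in metrics]
    return float(epoch), int(step), float(loss), evals


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing mean over at most `window` of the most recent values."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Four rows copied from the log (steps 305 to 308).
SAMPLE = """\
0.3002 305 0.9578 - - -
0.3012 306 0.9254 0.4888 0.4789 0.7040
0.3022 307 0.7541 - - -
0.3031 308 0.7324 - - -
"""

rows = [parse_row(line) for line in SAMPLE.splitlines()]
losses = [loss for _, _, loss, _ in rows]
smoothed = moving_average(losses, window=2)

# Keep only the steps at which evaluation values were logged.
eval_rows = [(step, evals) for _, step, _, evals in rows if evals[0] is not None]
```

Applied to the full log, a larger window (for example 50 steps) would give a steadier curve, and `eval_rows` would collect only the periodic evaluation checkpoints.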

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.5.0+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.2
  • Tokenizers: 0.19.1
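
To check a local environment against the versions listed above, a small script along these lines may help. This is a sketch under assumptions: the PyPI distribution names (for example `torch` for PyTorch) are inferred rather than stated in this card, and `compare_versions` is a hypothetical helper. The Python version itself is not checked here.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in this model card.
# The distribution names on the left are assumptions, not taken from the card.
EXPECTED = {
    "sentence-transformers": "3.2.1",
    "transformers": "4.44.2",
    "torch": "2.5.0+cu121",
    "accelerate": "0.34.2",
    "datasets": "3.0.2",
    "tokenizers": "0.19.1",
}


def compare_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have}")
```

An exact string comparison is deliberately strict. A different CUDA build tag on PyTorch, for instance, is reported as a mismatch even though it may behave the same in practice.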

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}