tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:4224
  - loss:CosineSimilarityLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
  - source_sentence: >-
      Emerging Transparent Electrodes Based on Thin Films of Carbon Nanotubes,
      Graphene, and Metallic Nanostructures
    sentences:
      - >-
        We describe the synthesis of bilayer graphene thin films deposited on
        insulating silicon carbide and report the characterization of their
        electronic band structure using angle-resolved photoemission. By
        selectively adjusting the carrier concentration in each layer, changes
        in the Coulomb potential led to control of the gap between valence and
        conduction bands. This control over the band structure suggests the
        potential application of bilayer graphene to switching functions in
        atomic-scale electronic devices.
      - >-
        We have investigated pressure-induced Raman peak shifts for various
        carbon nanostructures with distinct differences in the degree of
        structural order. The high-frequency tangential vibrational modes of the
        hollow nanostructures, as well as those of graphite crystals and a
        macroscopic carbon fiber used as reference materials, were observed to
        shift to higher wave numbers. The hollow nanostructures and the carbon
        fiber displayed two distinct pressure regimes with transition pressures
        between 0.75 and 2.2 GPa, whereas the graphite crystals showed a linear
        pressure dependence up to hydrostatic pressures of 5 GPa. The observed
        peak shifts were reversible for all hollow nanostructures and graphite.
        Although the pressure-induced Raman peak shift in the low pressure
        regime could be used to identify the elastic properties of the
        macroscopic carbon fiber, a theoretical model shows that the observed
        deviations in the pressure coefficients of the hollow nanostructures in
        this regime can be explained entirely on the basis of geometric effects.
        The close match of all Raman peak shifts in the high pressure regime
        indicates a reversible flattening of the nanostructures at the
        transition point.
      - >-
        Among the different graphene synthesis methods, chemical vapor
        deposition of graphene on low cost copper foil shows great promise for
        large scale applications. Here, we present growth experiments to obtain
        high quality graphene and its clean transfer onto any substrates.
        Bilayer-free monolayer graphene was obtained by a careful pre-annealing
        step and by optimizing the H2 flow during growth. The as-grown graphene
        was transferred using an improved wet chemical graphene transfer
        process. Some major flaws in the conventional wet chemical, polymethyl
        methacrylate (PMMA) assisted, graphene transfer process are addressed.
        The transferred graphene on arbitrary substrates was found to be free of
        metallic contaminants, defects (cracks, holes or folds caused by water
        trapped beneath graphene) and PMMA residues. The high quality of the
        transferred graphene was further evidenced by angle resolved
        photoelectron spectroscopy studies, for which the linear dependency of
        the electronic band structure characteristic of graphene was measured at
        the Dirac point. This is the first Dirac cone observation on the CVD
        grown graphene transferred on some 3D bulk substrate.
  - source_sentence: >-
      Electronic structure, energetics and geometric structure of carbon
      nanotubes: A density-functional study
    sentences:
      - >-
        Few-layer graphene (FLG) samples prepared by two methods (chemical vapor
        deposition (CVD) followed by transfer onto SiO2/Si substrate and
        mechanical exfoliation) are characterized by combined optical contrast
        and micro-Raman mapping experiments. We examine the behavior of the
        integrated intensity ratio of the 2D and G bands (A2D/AG) and of the 2D
        band width (Γ2D) as a function of the number of layers (N). For our
        mechanically exfoliated FLG, A2D/AG decreases and Γ2D increases with N
        as expected for commensurately stacked FLG. For CVD FLG, both similar
        and opposite behaviors are observed and are ascribed to different
        stacking orders. For small (respectively, large) relative rotation angle
        between consecutive layers (θ), the values of the A2D/AG ratio is
        smaller (respectively, larger) and the 2D band is broader (respectively,
        narrower) than for single-layer graphene. Moreover, the A2D/AG ratio
        decreases (respectively, increases) and, conversely, Γ2D increases
        (respectively, decreases) as a function of N for small (respectively,
        large) θ. An intermediate behavior has also been found and is
        interpreted as the presence of both small and large θ within the studied
        area. These results confirm that neither A2D/AG nor Γ2D are definitive
        criteria to identify single-layer graphene, or to count N in FLG.
      - >-
        We present Raman spectra of epitaxial graphene layers grown on 6 root
        3x6 root 3 reconstructed silicon carbide surfaces during annealing at
        elevated temperature. In contrast to exfoliated graphene a significant
        phonon hardening is observed. We ascribe that phonon hardening to a
        minor part to the known electron transfer from the substrate to the
        epitaxial layer, and mainly to mechanical strain that builds up when the
        sample is cooled down after annealing. Due to the larger thermal
        expansion coefficient of silicon carbide compared to the in-plane
        expansion coefficient of graphite this strain is compressive at room
        temperature. (C) 2008 American Institute of Physics.
      - >-
        Based on the local density approximation (LDA) in the framework of the
        density-functional theory, we study the details of electronic structure,
        energetics and geometric structure of the chiral carbon nanotubes. For
        the electronic structure, we study all the chiral nanotubes with the
        diameters between 0.8 and 2.0 nm (154 nanotubes). This LDA result should
        give the important database to be compared with the experimental studies
        in the future. We plot the peak-to-peak energy separations of the
        density of states (DOS) as a function of the nanotube diameter (D). For
        the semiconducting nanotubes, we find the peak-to-peak separations can
        be classified into two types according to the chirality. This chirality
        dependence of the LDA result is opposite to that of the simple π
        tight-binding result. We also perform the geometry optimization of
        chiral carbon nanotubes with different chiral-angle series. From the
        total energy as a function of D, it is found that chiral nanotubes are
        less stable than zigzag nanotubes. We also find that the distribution of
        bond lengths depends on the chirality.
  - source_sentence: Resonant Raman spectra of graphene with point defects
    sentences:
      - "Manganese oxide catalysts were synthesized by direct reaction between manganese acetate and permanganate ions, under acidic and reflux conditions. Parameters such as pH (2.0–4.5) and template cation (Na+, K+ and Cs+) were studied. A pure cryptomelane-type manganese oxide was synthesized under specific conditions, and it was found that the template cation plays an important role on the formation of this kind of structure. Cryptomelane was found to be a very active oxidation catalyst, converting ethyl acetate into CO2 at low temperatures (220\_°C). This catalyst is very stable at least during 90\_h of reaction and its performance is not significantly affected by the presence of water vapour or CO2 in the feed stream. The catalyst performance can be improved by the presence of small amounts of Mn3O4."
      - >-
        A dynamically stretchable solid state supercapacitor using graphene
        woven fabric (GWF) as electrode materials is designed and evaluated. The
        electrode is developed after GWF film is transferred onto a
        pre-stretched polymer substrate. Polyaniline is deposited covering the
        GWF film through in-situ electropolymerization to improve the
        electrochemical properties of the electrode. The supercapacitor is
        assembled in sandwich structure and packaged in polymer and its
        electrochemical performance is investigated under both static and
        dynamic stretching modes. The stretchable supercapacitors possess
        excellent static and dynamic stretchability. The dynamic strain can be
        up to 30% with excellent galvanic stability even under high strain rates
        (up to 60%/s).
      - >-
        Heterogeneous electron transfer rate constants of a series of chemical
        systems are estimated using Cyclic Voltammetry (CV) and Electrochemical
        Impedance Spectroscopy (EIS), and critically compared to one another.
        Using aqueous, quasi-reversible redox systems, and carbon screen-printed
        electrodes, this work has been able to quantify rate constants using
        both techniques and have proved that the two methods sometimes result in
        measured rate constants that differ by as much as one order of
        magnitude. The method has been converted to estimate k0 values for
        irreversible electrochemical systems such as ascorbic acid and
        norepinephrine, yielding reasonable values for the electron transfer of
        their respective oxidation reactions. Such electrochemically
        irreversible cases are compared to data obtained via digital
        simulations. The work is limited to finite concentration ranges of
        electroactive species undergoing simple electron processes (‘E’ type
        reactions). The manuscript provides the field with a simple and
        effective way estimating electron transfer rate constants for
        irreversible electrochemical systems without using digital software
        packages, something which is not possible using either Nicholson or
        Laviron methods.
  - source_sentence: Band Structure of graphite
    sentences:
      - >-
        Rapid progress in identifying biomarkers that are hallmarks of disease
        has increased demand for high-performance detection technologies.
        Implementation of electrochemical methods in clinical analysis may
        provide an effective answer to the growing need for rapid, specific,
        inexpensive, and fully automated means of biomarker analysis. This
        Review summarizes advances from the past 5 years in the development of
        electrochemical sensors for clinically relevant biomolecules, including
        small molecules, nucleic acids, and proteins. Various sensing strategies
        are assessed according to their potential for reaching relevant limits
        of sensitivity, specificity, and degrees of multiplexing. Furthermore,
        we address the remaining challenges and opportunities to integrate
        electrochemical sensing platforms into point-of-care solutions.
      - >-
        The structure and the electrical, mechanical and optical properties of
        few-layer graphene (FLG) synthesized by chemical vapor deposition (CVD)
        on a Ni-coated substrate were studied. Atomic resolution transmission
        electron microscope (TEM) images show highly crystalline single-layer
        parts of the sample changing to multi-layer domains where crystal
        boundaries are connected by chemical bonds. This suggests two different
        growth mechanisms. CVD and carbon segregation participate in the growth
        process and are responsible for the different structural formations
        found. Measurements of the electrical and mechanical properties on the
        centimeter scale provide evidence of a large scale structural
        continuity: (1) in the temperature dependence of the electrical
        conductivity, a non-zero value near 0 K indicates the metallic character
        of electronic transport; (2) Young's modulus of a pristine polycarbonate
        film (1.37 GPa) improves significantly when covered with FLG (1.85 GPa).
        The latter indicates an extraordinary Young modulus value of the
        FLG-coating of TPa orders of magnitude. Raman and optical spectroscopy
        support the previous conclusions. The sample can be used as a flexible
        and transparent electrode and is suitable for use as special membranes
        to detect and study individual molecules in high-resolution TEM.
      - >-
        The site-dependent and spontaneous functionalization of 4-bromobenzene
        diazonium tetralluoroborate (4-BBDT) and its doping effect on a
        mechanically exfoliated graphene (MEG) were investigated. The spatially
        resolved Raman spectra obtained from both edge and basal region of MEG
        revealed that 4-BBDT molecules were noncovalently functionalized on the
        basal region of MEG, while they were covalently bonded to the edge of
        MEG. The chemical doping effect induced by noncovalently functionalized
        4-BBDT molecules on a basal plane region of MEG was successfully
        explicated by Raman spectroscopy. The position of Fermi level of MEG and
        the type of doping charge carrier induced by the noncovalently adsorbed
        4-BBDT molecules were determined from systematic G band and 2D band
        changes. The successful spectroscopic elucidation of the different
        bonding characters of 4-BBDT depending on the site of graphene is
        beneficial for the fundamental studies about the charge transfer
        phenomena of graphene as well as for the potential applications, such as
        electronic devices, hybridized composite structures, etc.
  - source_sentence: >-
      Panorama de l’existant sur les capteurs et analyseurs en ligne pour la
      mesure des parametres physico-chimiques dans l’eau
    sentences:
      - >-
        Le travail de compilation des différents capteurs et analyseurs a été
        réalisé à partir de différentes sources d'information comme l'annuaire
        du Guide de l'eau, les sites web des sociétés et les salons
        professionnels. 71 fabricants ont ainsi été recensés. Un classement a
        été effectué en considérant: les sondes in situ et les capteurs (1 à 3
        paramètres et 4 paramètres et plus), les analyseurs en ligne (avec et
        sans réactifs, in situ) et les appareils portables. Des retours
        d'expériences sur le fonctionnement des stations de mesure en continu
        ont été réalisés pour quatre types d'eau (les cours d'eau, les eaux
        souterraines, les eaux de rejets et les eaux marines) à travers des
        entretiens téléphoniques avec les gestionnaires des stations de mesure
        en France et via la littérature pour les stations situées en Europe. Il
        en ressort que la configuration de la grande majorité des stations est
        basée sur un pompage de l'eau dans un local technique par rapport aux
        stations autonomes in situ. Les paramètres qui sont le plus souvent
        mesurés sont le pH, la conductivité, l'oxygène dissous, la température,
        la turbidité, les nutriments (ammonium, nitrates, phosphates) et la
        matière organique (carbone organique, absorbance spécifique à 254 nm).
        En fonction des besoins, les micropolluants (notamment métaux,
        hydrocarbures et HAP), la chlorophylle et les cyanobactéries ainsi que
        la toxicité sont occasionnellement mesurés. D'une manière générale, les
        capteurs et analyseurs sont jugés robustes et fiables. Certaines
        difficultés ont pu être mises en évidence, par exemple les dérives pour
        les capteurs mesurant l'ammonium. La maintenance associée aux stations
        de mesure peut être très importante en termes de temps passé et de cout
        des réactifs. Des études en amont ont souvent été engagées pour vérifier
        la fiabilité des résultats obtenus, notamment à travers la comparaison
        avec des mesures de contrôle et des prélèvements suivis d'analyses en
        laboratoire. Enfin, certains gestionnaires ont mis en place des
        contrôles qualité rigoureux et fréquents, ceci afin de s'assurer du bon
        fonctionnement et de la stabilité des capteurs dans le temps.
      - >-
        Carbon nanotubes have attracted considerable interest for their unique
        electronic properties. They are fascinating candidates for fundamental
        studies of one dimensional materials as well as for future molecular
        electronics applications. The molecular orbitals of nanotubes are of
        particular importance as they govern the transport properties and the
        chemical reactivity of the system. Here, we show for the first time a
        complete experimental investigation of molecular orbitals of single wall
        carbon nanotubes using atomically resolved scanning tunneling
        spectroscopy. Local conductance measurements show spectacular
        carbon-carbon bond asymmetry at the Van Hove singularities for both
        semiconducting and metallic tubes, demonstrating the symmetry breaking
        of molecular orbitals in nanotubes. Whatever the tube, only two types of
        complementary orbitals are alternatively observed. An analytical
        tight-binding model describing the interference patterns of π orbitals
        confirmed by ab initio calculations, perfectly reproduces the
        experimental results.
      - >-
        Bilayer graphene is an intriguing material in that its electronic
        structure can be altered by changing the stacking order or the relative
        twist angle, yielding a new class of low-dimensional carbon system.
        Twisted bilayer graphene can be obtained by (i) thermal decomposition of
        SiC; (ii) chemical vapor deposition (CVD) on metal catalysts; (iii)
        folding graphene; or (iv) stacking graphene layers one atop the other,
        the latter of which suffers from interlayer contamination. Existing
        synthesis protocols, however, usually result in graphene with
        polycrystalline structures. The present study investigates bilayer
        graphene grown by ambient pressure CVD on polycrystalline Cu.
        Controlling the nucleation in early stage growth allows the constituent
        layers to form single hexagonal crystals. New Raman active modes are
        shown to result from the twist, with the angle determined by
        transmission electron microscopy. The successful growth of
        single-crystal bilayer graphene provides an attractive jumping-off point
        for systematic studies of interlayer coupling in misoriented few-layer
        graphene systems with well-defined geometry.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/all-distilroberta-v1 on 4,224 labeled sentence pairs with CosineSimilarityLoss. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

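Model Description

  • Model type: Sentence Transformer
  • Base model: sentence-transformers/all-distilroberta-v1
  • Maximum sequence length: 512 tokens
  • Output dimensionality: 768 dimensions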

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
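
The Pooling and Normalize modules listed above can be illustrated without the library. The following is a minimal sketch, not the library's implementation: it assumes toy 3-dimensional token embeddings (the real model uses 768 dimensions), and the helper name `mean_pool_and_normalize` is hypothetical.

```python
import math


def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average the embeddings of non-padding tokens, then scale to unit L2 norm.

    Hypothetical helper that mirrors pooling_mode_mean_tokens=True followed by
    Normalize(); it is an illustration, not the library code.
    """
    dim = len(token_embeddings[0])
    # Keep only the tokens whose attention mask is 1 (drop padding).
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask must keep at least one token")
    pooled = [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in pooled))
    if norm == 0.0:
        return pooled
    return [x / norm for x in pooled]


# Toy input: three real tokens and one padding token (mask 0).
tokens = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [2.0, 0.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_vector = mean_pool_and_normalize(tokens, mask)
print(sentence_vector)
```

Because the output has unit length, the dot product of two such vectors equals their cosine similarity.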

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TomDubois12/fine-tuned-model")
# Run inference
sentences = [
    'Panorama de l’existant sur les capteurs et analyseurs en ligne pour la mesure des parametres physico-chimiques dans l’eau',
    "Le travail de compilation des différents capteurs et analyseurs a été réalisé à partir de différentes sources d'information comme l'annuaire du Guide de l'eau, les sites web des sociétés et les salons professionnels. 71 fabricants ont ainsi été recensés. Un classement a été effectué en considérant: les sondes in situ et les capteurs (1 à 3 paramètres et 4 paramètres et plus), les analyseurs en ligne (avec et sans réactifs, in situ) et les appareils portables. Des retours d'expériences sur le fonctionnement des stations de mesure en continu ont été réalisés pour quatre types d'eau (les cours d'eau, les eaux souterraines, les eaux de rejets et les eaux marines) à travers des entretiens téléphoniques avec les gestionnaires des stations de mesure en France et via la littérature pour les stations situées en Europe. Il en ressort que la configuration de la grande majorité des stations est basée sur un pompage de l'eau dans un local technique par rapport aux stations autonomes in situ. Les paramètres qui sont le plus souvent mesurés sont le pH, la conductivité, l'oxygène dissous, la température, la turbidité, les nutriments (ammonium, nitrates, phosphates) et la matière organique (carbone organique, absorbance spécifique à 254 nm). En fonction des besoins, les micropolluants (notamment métaux, hydrocarbures et HAP), la chlorophylle et les cyanobactéries ainsi que la toxicité sont occasionnellement mesurés. D'une manière générale, les capteurs et analyseurs sont jugés robustes et fiables. Certaines difficultés ont pu être mises en évidence, par exemple les dérives pour les capteurs mesurant l'ammonium. La maintenance associée aux stations de mesure peut être très importante en termes de temps passé et de cout des réactifs. Des études en amont ont souvent été engagées pour vérifier la fiabilité des résultats obtenus, notamment à travers la comparaison avec des mesures de contrôle et des prélèvements suivis d'analyses en laboratoire. Enfin, certains gestionnaires ont mis en place des contrôles qualité rigoureux et fréquents, ceci afin de s'assurer du bon fonctionnement et de la stabilité des capteurs dans le temps.",
    'Bilayer graphene is an intriguing material in that its electronic structure can be altered by changing the stacking order or the relative twist angle, yielding a new class of low-dimensional carbon system. Twisted bilayer graphene can be obtained by (i) thermal decomposition of SiC; (ii) chemical vapor deposition (CVD) on metal catalysts; (iii) folding graphene; or (iv) stacking graphene layers one atop the other, the latter of which suffers from interlayer contamination. Existing synthesis protocols, however, usually result in graphene with polycrystalline structures. The present study investigates bilayer graphene grown by ambient pressure CVD on polycrystalline Cu. Controlling the nucleation in early stage growth allows the constituent layers to form single hexagonal crystals. New Raman active modes are shown to result from the twist, with the angle determined by transmission electron microscopy. The successful growth of single-crystal bilayer graphene provides an attractive jumping-off point for systematic studies of interlayer coupling in misoriented few-layer graphene systems with well-defined geometry.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
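
For a retrieval-style use of these scores, one option is to rank candidate abstracts against a title by cosine similarity. The sketch below is stdlib-only and uses small made-up vectors in place of real 768-dimensional embeddings; `rank_candidates` and the vector values are hypothetical, and it assumes the similarity function is cosine and that no vector is all zeros.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(candidate_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-ins for one title embedding and three abstract embeddings.
title_vec = [0.6, 0.8, 0.0]
abstract_vecs = [[0.0, 0.0, 1.0], [0.8, 0.6, 0.0], [0.6, 0.8, 0.0]]
ranking = rank_candidates(title_vec, abstract_vecs)
print(ranking)
```

With real embeddings, `title_vec` would be one row of `embeddings` and `abstract_vecs` the remaining rows.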

Training Details

Training Dataset

Unnamed Dataset

  • Size: 4,224 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | label |
    |---------|------------|------------|-------|
    | type    | string | string | int |
    | details | min: 6 tokens; mean: 21.55 tokens; max: 86 tokens | min: 2 tokens; mean: 177.38 tokens; max: 512 tokens | 0: ~67.00%; 1: ~33.00% |
  • Samples:
    | sentence_0 | sentence_1 | label |
    |------------|------------|-------|
    | High-Pressure Elastic Properties of Solid Argon to 70 GPa | The acoustic velocities, adiabatic elastic constants, bulk modulus, elastic anisotropy, Cauchy violation, and density in an ideal solid argon (Ar) have been determined at high pressures up to 70 GPa in a diamond anvil cell by making new approaches of Brillouin spectroscopy. These results place the first complete study for elastic properties of dense Ar and provide an improved basis for making the theoretical calculations of rare-gas solids over a wide range of compression. | 1 |
    | Direct Voltammetric Detection of DNA and pH Sensing on Epitaxial Graphene: An Insight into the Role of Oxygenated Defects | In this paper, we carried out detailed electrochemical studies of epitaxial graphene (EG) using inner-sphere and outer-sphere redox mediators. The EG sample was anodized systematically to investigate the effect of edge plane defects on the heterogeneous charge transfer kinetics and capacitive noise. We found that anodized EG, consisting of oxygen-related defects, is a superior biosensing platform for the detection of nucleic acids, uric acids (UA), dopamine (DA), and ascorbic acids (AA). Mixtures of nucleic acids (A, T, C, G) or biomolecules (AA, UA, DA) can be resolved as individual peaks using differential pulse voltammetry. In fact, an anodized EG voltammetric sensor can realize the simultaneous detection of all four DNA bases in double stranded DNA (dsDNA) without a prehydrolysis step, and it can also differentiate single stranded DNA from dsDNA. Our results show that graphene with high edge plane defects, as opposed to pristine graphene, is the choice platform in high resolution electrochemical sensing. | 1 |
    | Scanning Electrochemical Microscopy of Carbon Nanomaterials and Graphite | We present a comprehensive study of the chiral-index assignment of carbon nanotubes in aqueous suspensions by resonant Raman scattering of the radial breathing mode. We determine the energies of the first optical transition in metallic tubes and of the second optical transition in semiconducting tubes for more than 50 chiral indices. The assignment is unique and does not depend on empirical parameters. The systematics of the so-called branches in the Kataura plot are discussed; many properties of the tubes are similar for members of the same branch. We show how the radial breathing modes observed in a single Raman spectrum can be easily assigned based on these systematics. In addition, empirical fits provide the energies and radial breathing modes for all metallic and semiconducting nanotubes with diameters between 0.6 and 1.5 nm. We discuss the relation between the frequency of the radial breathing mode and tube diameter. Finally, from the Raman intensities we obtain information on the electron-phonon coupling. | 0 |
  • Loss: CosineSimilarityLoss with these parameters:
    {
        "loss_fct": "torch.nn.modules.loss.MSELoss"
    }
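
As an illustration of what this loss measures: with an MSE loss function, each pair contributes the squared difference between the cosine similarity of its two embeddings and its label. The following is a minimal stdlib sketch on made-up 2-dimensional vectors, not the library's training code; the function name `cosine_similarity_mse` and the toy pairs are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cosine_similarity_mse(pairs, labels):
    """Mean squared error between cosine(u, v) and the label of each pair."""
    errors = [(cosine(u, v) - y) ** 2 for (u, v), y in zip(pairs, labels)]
    return sum(errors) / len(errors)


# Toy embedding pairs standing in for (sentence_0, sentence_1) embeddings.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction, labeled as a match
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, labeled as a non-match
    ([1.0, 0.0], [0.6, 0.8]),  # partly similar, labeled as a non-match
]
labels = [1, 0, 0]
loss = cosine_similarity_mse(pairs, labels)
print(loss)
```

Only the third pair contributes to the loss here, because its cosine similarity is well above its label of 0.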
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
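
To make the learning-rate settings above concrete (learning_rate 5e-05, lr_scheduler_type linear, warmup_steps 0), here is a rough sketch of a linear decay schedule. The function `linear_lr` is hypothetical, and the total of 792 optimizer steps is an assumption derived from 4,224 samples, batch size 16, 3 epochs, and a single training process.

```python
def linear_lr(step, base_lr=5e-05, total_steps=792, warmup_steps=0):
    """Learning rate at a given optimizer step under linear warmup and decay.

    total_steps=792 is an assumption (4224 / 16 * 3 on one device).
    """
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr; unused here because warmup_steps is 0.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 396, 500, 792):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at its configured value, halves by the midpoint of training, and reaches zero at the last step.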

Training Logs

| Epoch  | Step | Training Loss |
|--------|------|---------------|
| 1.8939 | 500  | 0.0778        |
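
The epoch value in this log can be cross-checked against the dataset size and batch size reported in this card. The arithmetic below is a small sketch; the single-device setting (`num_devices = 1`) is an assumption, although the result matches the logged epoch.

```python
import math

dataset_size = 4224  # "Size: 4,224 training samples"
batch_size = 16      # per_device_train_batch_size
grad_accum = 1       # gradient_accumulation_steps
num_epochs = 3       # num_train_epochs
num_devices = 1      # assumption: a single training process

# Optimizer steps needed to see every sample once.
steps_per_epoch = math.ceil(dataset_size / (batch_size * num_devices * grad_accum))
total_steps = steps_per_epoch * num_epochs
epoch_at_step_500 = 500 / steps_per_epoch

print(steps_per_epoch, total_steps, round(epoch_at_step_500, 4))
# 264 792 1.8939
```

Step 500 therefore falls a little before the end of the second of three epochs.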

Framework Versions

  • Python: 3.12.7
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.1+cpu
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}