---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:4224
- loss:CosineSimilarityLoss
base_model: sentence-transformers/all-distilroberta-v1
widget:
- source_sentence: Emerging Transparent Electrodes Based on Thin Films of Carbon Nanotubes, Graphene, and Metallic Nanostructures
  sentences:
  - We describe the synthesis of bilayer graphene thin films deposited on insulating silicon carbide and report the characterization of their electronic band structure using angle-resolved photoemission. By selectively adjusting the carrier concentration in each layer, changes in the Coulomb potential led to control of the gap between valence and conduction bands. This control over the band structure suggests the potential application of bilayer graphene to switching functions in atomic-scale electronic devices.
  - We have investigated pressure-induced Raman peak shifts for various carbon nanostructures with distinct differences in the degree of structural order. The high-frequency tangential vibrational modes of the hollow nanostructures, as well as those of graphite crystals and a macroscopic carbon fiber used as reference materials, were observed to shift to higher wave numbers. The hollow nanostructures and the carbon fiber displayed two distinct pressure regimes with transition pressures between 0.75 and 2.2 GPa, whereas the graphite crystals showed a linear pressure dependence up to hydrostatic pressures of 5 GPa. The observed peak shifts were reversible for all hollow nanostructures and graphite. Although the pressure-induced Raman peak shift in the low pressure regime could be used to identify the elastic properties of the macroscopic carbon fiber, a theoretical model shows that the observed deviations in the pressure coefficients of the hollow nanostructures in this regime can be explained entirely on the basis of geometric effects. The close match of all Raman peak shifts in the high pressure regime indicates a reversible flattening of the nanostructures at the transition point.
  - Among the different graphene synthesis methods, chemical vapor deposition of graphene on low cost copper foil shows great promise for large scale applications. Here, we present growth experiments to obtain high quality graphene and its clean transfer onto any substrates. Bilayer-free monolayer graphene was obtained by a careful pre-annealing step and by optimizing the H2 flow during growth. The as-grown graphene was transferred using an improved wet chemical graphene transfer process. Some major flaws in the conventional wet chemical, polymethyl methacrylate (PMMA) assisted, graphene transfer process are addressed. The transferred graphene on arbitrary substrates was found to be free of metallic contaminants, defects (cracks, holes or folds caused by water trapped beneath graphene) and PMMA residues. The high quality of the transferred graphene was further evidenced by angle resolved photoelectron spectroscopy studies, for which the linear dependency of the electronic band structure characteristic of graphene was measured at the Dirac point. This is the first Dirac cone observation on the CVD grown graphene transferred on some 3D bulk substrate.
- source_sentence: 'Electronic structure, energetics and geometric structure of carbon nanotubes: A density-functional study'
  sentences:
  - Few-layer graphene (FLG) samples prepared by two methods (chemical vapor deposition (CVD) followed by transfer onto SiO2/Si substrate and mechanical exfoliation) are characterized by combined optical contrast and micro-Raman mapping experiments. We examine the behavior of the integrated intensity ratio of the 2D and G bands (A2D/AG) and of the 2D band width (Γ2D) as a function of the number of layers (N). For our mechanically exfoliated FLG, A2D/AG decreases and Γ2D increases with N as expected for commensurately stacked FLG. For CVD FLG, both similar and opposite behaviors are observed and are ascribed to different stacking orders. For small (respectively, large) relative rotation angle between consecutive layers (θ), the values of the A2D/AG ratio is smaller (respectively, larger) and the 2D band is broader (respectively, narrower) than for single-layer graphene. Moreover, the A2D/AG ratio decreases (respectively, increases) and, conversely, Γ2D increases (respectively, decreases) as a function of N for small (respectively, large) θ. An intermediate behavior has also been found and is interpreted as the presence of both small and large θ within the studied area. These results confirm that neither A2D/AG nor Γ2D are definitive criteria to identify single-layer graphene, or to count N in FLG.
  - We present Raman spectra of epitaxial graphene layers grown on 6 root 3x6 root 3 reconstructed silicon carbide surfaces during annealing at elevated temperature. In contrast to exfoliated graphene a significant phonon hardening is observed. We ascribe that phonon hardening to a minor part to the known electron transfer from the substrate to the epitaxial layer, and mainly to mechanical strain that builds up when the sample is cooled down after annealing. Due to the larger thermal expansion coefficient of silicon carbide compared to the in-plane expansion coefficient of graphite this strain is compressive at room temperature. (C) 2008 American Institute of Physics.
  - Based on the local density approximation (LDA) in the framework of the density-functional theory, we study the details of electronic structure, energetics and geometric structure of the chiral carbon nanotubes. For the electronic structure, we study all the chiral nanotubes with the diameters between 0.8 and 2.0 nm (154 nanotubes). This LDA result should give the important database to be compared with the experimental studies in the future. We plot the peak-to-peak energy separations of the density of states (DOS) as a function of the nanotube diameter (D). For the semiconducting nanotubes, we find the peak-to-peak separations can be classified into two types according to the chirality. This chirality dependence of the LDA result is opposite to that of the simple π tight-binding result. We also perform the geometry optimization of chiral carbon nanotubes with different chiral-angle series. From the total energy as a function of D, it is found that chiral nanotubes are less stable than zigzag nanotubes. We also find that the distribution of bond lengths depends on the chirality.
- source_sentence: Resonant Raman spectra of graphene with point defects
  sentences:
  - Manganese oxide catalysts were synthesized by direct reaction between manganese acetate and permanganate ions, under acidic and reflux conditions. Parameters such as pH (2.0–4.5) and template cation (Na+, K+ and Cs+) were studied. A pure cryptomelane-type manganese oxide was synthesized under specific conditions, and it was found that the template cation plays an important role on the formation of this kind of structure. Cryptomelane was found to be a very active oxidation catalyst, converting ethyl acetate into CO2 at low temperatures (220 °C). This catalyst is very stable at least during 90 h of reaction and its performance is not significantly affected by the presence of water vapour or CO2 in the feed stream. The catalyst performance can be improved by the presence of small amounts of Mn3O4.
  - A dynamically stretchable solid state supercapacitor using graphene woven fabric (GWF) as electrode materials is designed and evaluated. The electrode is developed after GWF film is transferred onto a pre-stretched polymer substrate. Polyaniline is deposited covering the GWF film through in-situ electropolymerization to improve the electrochemical properties of the electrode. The supercapacitor is assembled in sandwich structure and packaged in polymer and its electrochemical performance is investigated under both static and dynamic stretching modes. The stretchable supercapacitors possess excellent static and dynamic stretchability. The dynamic strain can be up to 30% with excellent galvanic stability even under high strain rates (up to 60%/s).
  - Heterogeneous electron transfer rate constants of a series of chemical systems are estimated using Cyclic Voltammetry (CV) and Electrochemical Impedance Spectroscopy (EIS), and critically compared to one another. Using aqueous, quasi-reversible redox systems, and carbon screen-printed electrodes, this work has been able to quantify rate constants using both techniques and have proved that the two methods sometimes result in measured rate constants that differ by as much as one order of magnitude. The method has been converted to estimate k0 values for irreversible electrochemical systems such as ascorbic acid and norepinephrine, yielding reasonable values for the electron transfer of their respective oxidation reactions. Such electrochemically irreversible cases are compared to data obtained via digital simulations. The work is limited to finite concentration ranges of electroactive species undergoing simple electron processes (‘E’ type reactions). The manuscript provides the field with a simple and effective way estimating electron transfer rate constants for irreversible electrochemical systems without using digital software packages, something which is not possible using either Nicholson or Laviron methods.
- source_sentence: Band Structure of graphite
  sentences:
  - Rapid progress in identifying biomarkers that are hallmarks of disease has increased demand for high-performance detection technologies. Implementation of electrochemical methods in clinical analysis may provide an effective answer to the growing need for rapid, specific, inexpensive, and fully automated means of biomarker analysis. This Review summarizes advances from the past 5 years in the development of electrochemical sensors for clinically relevant biomolecules, including small molecules, nucleic acids, and proteins. Various sensing strategies are assessed according to their potential for reaching relevant limits of sensitivity, specificity, and degrees of multiplexing. Furthermore, we address the remaining challenges and opportunities to integrate electrochemical sensing platforms into point-of-care solutions.
  - 'The structure and the electrical, mechanical and optical properties of few-layer graphene (FLG) synthesized by chemical vapor deposition (CVD) on a Ni-coated substrate were studied. Atomic resolution transmission electron microscope (TEM) images show highly crystalline single-layer parts of the sample changing to multi-layer domains where crystal boundaries are connected by chemical bonds. This suggests two different growth mechanisms. CVD and carbon segregation participate in the growth process and are responsible for the different structural formations found. Measurements of the electrical and mechanical properties on the centimeter scale provide evidence of a large scale structural continuity: (1) in the temperature dependence of the electrical conductivity, a non-zero value near 0 K indicates the metallic character of electronic transport; (2) Young''s modulus of a pristine polycarbonate film (1.37 GPa) improves significantly when covered with FLG (1.85 GPa). The latter indicates an extraordinary Young modulus value of the FLG-coating of TPa orders of magnitude. Raman and optical spectroscopy support the previous conclusions. The sample can be used as a flexible and transparent electrode and is suitable for use as special membranes to detect and study individual molecules in high-resolution TEM.'
  - The site-dependent and spontaneous functionalization of 4-bromobenzene diazonium tetralluoroborate (4-BBDT) and its doping effect on a mechanically exfoliated graphene (MEG) were investigated. The spatially resolved Raman spectra obtained from both edge and basal region of MEG revealed that 4-BBDT molecules were noncovalently functionalized on the basal region of MEG, while they were covalently bonded to the edge of MEG. The chemical doping effect induced by noncovalently functionalized 4-BBDT molecules on a basal plane region of MEG was successfully explicated by Raman spectroscopy. The position of Fermi level of MEG and the type of doping charge carrier induced by the noncovalently adsorbed 4-BBDT molecules were determined from systematic G band and 2D band changes. The successful spectroscopic elucidation of the different bonding characters of 4-BBDT depending on the site of graphene is beneficial for the fundamental studies about the charge transfer phenomena of graphene as well as for the potential applications, such as electronic devices, hybridized composite structures, etc.
- source_sentence: Panorama de l’existant sur les capteurs et analyseurs en ligne pour la mesure des parametres physico-chimiques dans l’eau
  sentences:
  - 'Le travail de compilation des différents capteurs et analyseurs a été réalisé à partir de différentes sources d''information comme l''annuaire du Guide de l''eau, les sites web des sociétés et les salons professionnels. 71 fabricants ont ainsi été recensés. Un classement a été effectué en considérant: les sondes in situ et les capteurs (1 à 3 paramètres et 4 paramètres et plus), les analyseurs en ligne (avec et sans réactifs, in situ) et les appareils portables. Des retours d''expériences sur le fonctionnement des stations de mesure en continu ont été réalisés pour quatre types d''eau (les cours d''eau, les eaux souterraines, les eaux de rejets et les eaux marines) à travers des entretiens téléphoniques avec les gestionnaires des stations de mesure en France et via la littérature pour les stations situées en Europe. Il en ressort que la configuration de la grande majorité des stations est basée sur un pompage de l''eau dans un local technique par rapport aux stations autonomes in situ. Les paramètres qui sont le plus souvent mesurés sont le pH, la conductivité, l''oxygène dissous, la température, la turbidité, les nutriments (ammonium, nitrates, phosphates) et la matière organique (carbone organique, absorbance spécifique à 254 nm). En fonction des besoins, les micropolluants (notamment métaux, hydrocarbures et HAP), la chlorophylle et les cyanobactéries ainsi que la toxicité sont occasionnellement mesurés. D''une manière générale, les capteurs et analyseurs sont jugés robustes et fiables. Certaines difficultés ont pu être mises en évidence, par exemple les dérives pour les capteurs mesurant l''ammonium. La maintenance associée aux stations de mesure peut être très importante en termes de temps passé et de cout des réactifs. Des études en amont ont souvent été engagées pour vérifier la fiabilité des résultats obtenus, notamment à travers la comparaison avec des mesures de contrôle et des prélèvements suivis d''analyses en laboratoire. Enfin, certains gestionnaires ont mis en place des contrôles qualité rigoureux et fréquents, ceci afin de s''assurer du bon fonctionnement et de la stabilité des capteurs dans le temps.'
  - Carbon nanotubes have attracted considerable interest for their unique electronic properties. They are fascinating candidates for fundamental studies of one dimensional materials as well as for future molecular electronics applications. The molecular orbitals of nanotubes are of particular importance as they govern the transport properties and the chemical reactivity of the system. Here, we show for the first time a complete experimental investigation of molecular orbitals of single wall carbon nanotubes using atomically resolved scanning tunneling spectroscopy. Local conductance measurements show spectacular carbon-carbon bond asymmetry at the Van Hove singularities for both semiconducting and metallic tubes, demonstrating the symmetry breaking of molecular orbitals in nanotubes. Whatever the tube, only two types of complementary orbitals are alternatively observed. An analytical tight-binding model describing the interference patterns of π orbitals confirmed by ab initio calculations, perfectly reproduces the experimental results.
  - Bilayer graphene is an intriguing material in that its electronic structure can be altered by changing the stacking order or the relative twist angle, yielding a new class of low-dimensional carbon system. Twisted bilayer graphene can be obtained by (i) thermal decomposition of SiC; (ii) chemical vapor deposition (CVD) on metal catalysts; (iii) folding graphene; or (iv) stacking graphene layers one atop the other, the latter of which suffers from interlayer contamination. Existing synthesis protocols, however, usually result in graphene with polycrystalline structures. The present study investigates bilayer graphene grown by ambient pressure CVD on polycrystalline Cu. Controlling the nucleation in early stage growth allows the constituent layers to form single hexagonal crystals. New Raman active modes are shown to result from the twist, with the angle determined by transmission electron microscopy. The successful growth of single-crystal bilayer graphene provides an attractive jumping-off point for systematic studies of interlayer coupling in misoriented few-layer graphene systems with well-defined geometry.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
---

# SentenceTransformer based on sentence-transformers/all-distilroberta-v1

This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/all-distilroberta-v1](https://huggingface.co/sentence-transformers/all-distilroberta-v1)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: RobertaModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TomDubois12/fine-tuned-model")
# Run inference
sentences = [
    'Panorama de l’existant sur les capteurs et analyseurs en ligne pour la mesure des parametres physico-chimiques dans l’eau',
    "Le travail de compilation des différents capteurs et analyseurs a été réalisé à partir de différentes sources d'information comme l'annuaire du Guide de l'eau, les sites web des sociétés et les salons professionnels. 71 fabricants ont ainsi été recensés. Un classement a été effectué en considérant: les sondes in situ et les capteurs (1 à 3 paramètres et 4 paramètres et plus), les analyseurs en ligne (avec et sans réactifs, in situ) et les appareils portables. Des retours d'expériences sur le fonctionnement des stations de mesure en continu ont été réalisés pour quatre types d'eau (les cours d'eau, les eaux souterraines, les eaux de rejets et les eaux marines) à travers des entretiens téléphoniques avec les gestionnaires des stations de mesure en France et via la littérature pour les stations situées en Europe. Il en ressort que la configuration de la grande majorité des stations est basée sur un pompage de l'eau dans un local technique par rapport aux stations autonomes in situ. Les paramètres qui sont le plus souvent mesurés sont le pH, la conductivité, l'oxygène dissous, la température, la turbidité, les nutriments (ammonium, nitrates, phosphates) et la matière organique (carbone organique, absorbance spécifique à 254 nm). En fonction des besoins, les micropolluants (notamment métaux, hydrocarbures et HAP), la chlorophylle et les cyanobactéries ainsi que la toxicité sont occasionnellement mesurés. D'une manière générale, les capteurs et analyseurs sont jugés robustes et fiables. Certaines difficultés ont pu être mises en évidence, par exemple les dérives pour les capteurs mesurant l'ammonium. La maintenance associée aux stations de mesure peut être très importante en termes de temps passé et de cout des réactifs. Des études en amont ont souvent été engagées pour vérifier la fiabilité des résultats obtenus, notamment à travers la comparaison avec des mesures de contrôle et des prélèvements suivis d'analyses en laboratoire. Enfin, certains gestionnaires ont mis en place des contrôles qualité rigoureux et fréquents, ceci afin de s'assurer du bon fonctionnement et de la stabilité des capteurs dans le temps.",
    'Bilayer graphene is an intriguing material in that its electronic structure can be altered by changing the stacking order or the relative twist angle, yielding a new class of low-dimensional carbon system. Twisted bilayer graphene can be obtained by (i) thermal decomposition of SiC; (ii) chemical vapor deposition (CVD) on metal catalysts; (iii) folding graphene; or (iv) stacking graphene layers one atop the other, the latter of which suffers from interlayer contamination. Existing synthesis protocols, however, usually result in graphene with polycrystalline structures. The present study investigates bilayer graphene grown by ambient pressure CVD on polycrystalline Cu. Controlling the nucleation in early stage growth allows the constituent layers to form single hexagonal crystals. New Raman active modes are shown to result from the twist, with the angle determined by transmission electron microscopy. The successful growth of single-crystal bilayer graphene provides an attractive jumping-off point for systematic studies of interlayer coupling in misoriented few-layer graphene systems with well-defined geometry.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 4,224 training samples
* Columns: sentence_0, sentence_1, and label
* Approximate statistics based on the first 1000 samples:

  |         | sentence_0 | sentence_1 | label |
  |:--------|:-----------|:-----------|:------|
  | type    | string     | string     | int   |
  | details |            |            |       |

* Samples:

  | sentence_0 | sentence_1 | label |
  |:-----------|:-----------|:------|
  | High-Pressure Elastic Properties of Solid Argon to 70 GPa | The acoustic velocities, adiabatic elastic constants, bulk modulus, elastic anisotropy, Cauchy violation, and density in an ideal solid argon (Ar) have been determined at high pressures up to 70 GPa in a diamond anvil cell by making new approaches of Brillouin spectroscopy. These results place the first complete study for elastic properties of dense Ar and provide an improved basis for making the theoretical calculations of rare-gas solids over a wide range of compression. | 1 |
  | Direct Voltammetric Detection of DNA and pH Sensing on Epitaxial Graphene: An Insight into the Role of Oxygenated Defects | In this paper, we carried out detailed electrochemical studies of epitaxial graphene (EG) using inner-sphere and outer-sphere redox mediators. The EG sample was anodized systematically to investigate the effect of edge plane defects on the heterogeneous charge transfer kinetics and capacitive noise. We found that anodized EG, consisting of oxygen-related defects, is a superior biosensing platform for the detection of nucleic acids, uric acids (UA), dopamine (DA), and ascorbic acids (AA). Mixtures of nucleic acids (A, T, C, G) or biomolecules (AA, UA, DA) can be resolved as individual peaks using differential pulse voltammetry. In fact, an anodized EG voltammetric sensor can realize the simultaneous detection of all four DNA bases in double stranded DNA (dsDNA) without a prehydrolysis step, and it can also differentiate single stranded DNA from dsDNA. Our results show that graphene with high edge plane defects, as opposed to pristine graphene, is the choice platform in high resolution electrochemical sensing. | 1 |
  | Scanning Electrochemical Microscopy of Carbon Nanomaterials and Graphite | We present a comprehensive study of the chiral-index assignment of carbon nanotubes in aqueous suspensions by resonant Raman scattering of the radial breathing mode. We determine the energies of the first optical transition in metallic tubes and of the second optical transition in semiconducting tubes for more than 50 chiral indices. The assignment is unique and does not depend on empirical parameters. The systematics of the so-called branches in the Kataura plot are discussed; many properties of the tubes are similar for members of the same branch. We show how the radial breathing modes observed in a single Raman spectrum can be easily assigned based on these systematics. In addition, empirical fits provide the energies and radial breathing modes for all metallic and semiconducting nanotubes with diameters between 0.6 and 1.5 nm. We discuss the relation between the frequency of the radial breathing mode and tube diameter. Finally, from the Raman intensities we obtain information on the electron-phonon coupling. | 0 |

* Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters:

  ```json
  {
      "loss_fct": "torch.nn.modules.loss.MSELoss"
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: no
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

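
As a rough cross-check, the hyperparameters above can be related to the step and epoch values reported in the training log. The sketch below uses only numbers stated in this card (4,224 samples, batch size 16, 3 epochs, `gradient_accumulation_steps` of 1, `dataloader_drop_last` False). It assumes a single training process, which the card does not state explicitly; the helper name `steps_per_epoch` is illustrative and not part of any library.

```python
import math

# Values stated in this card.
DATASET_SIZE = 4224          # "Size: 4,224 training samples"
PER_DEVICE_BATCH_SIZE = 16   # per_device_train_batch_size
NUM_EPOCHS = 3               # num_train_epochs

# Assumption: one training process (device count is not given in the card).
NUM_DEVICES = 1


def steps_per_epoch(dataset_size: int, batch_size: int, num_devices: int = 1) -> int:
    """Batches per epoch when the last partial batch is kept.

    With gradient_accumulation_steps = 1, each batch is one optimizer step.
    """
    return math.ceil(dataset_size / (batch_size * num_devices))


per_epoch = steps_per_epoch(DATASET_SIZE, PER_DEVICE_BATCH_SIZE, NUM_DEVICES)
total_steps = per_epoch * NUM_EPOCHS
epoch_at_step_500 = 500 / per_epoch  # fractional epoch reached at step 500

print(per_epoch)                    # 264
print(total_steps)                  # 792
print(round(epoch_at_step_500, 4))  # 1.8939
```

Under that assumption, step 500 falls at epoch 1.8939 of a 792-step run, which agrees with the single logged row in the training log.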
### Training Logs

| Epoch  | Step | Training Loss |
|:------:|:----:|:-------------:|
| 1.8939 | 500  | 0.0778        |

### Framework Versions

- Python: 3.12.7
- Sentence Transformers: 3.1.1
- Transformers: 4.45.2
- PyTorch: 2.5.1+cpu
- Accelerate: 1.1.1
- Datasets: 3.1.0
- Tokenizers: 0.20.3

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```