--- base_model: sentence-transformers/all-MiniLM-L6-v2 datasets: [] language: [] library_name: sentence-transformers pipeline_tag: sentence-similarity tags: - sentence-transformers - sentence-similarity - feature-extraction - generated_from_trainer - dataset_size:2860 - loss:CosineSimilarityLoss widget: - source_sentence: No, it is not true. The sex chromosomes of the father determine the sex of an unborn baby, not the mother. sentences: - The wall of the uterus expands outward like a balloon during ovum maturation. - The mother's emotional state during pregnancy can influence the sex of the baby, making her solely responsible for determining it. - Six - source_sentence: Answer not found in response. sentences: - nan - In living organisms, cells are likened to bricks in a building due to their role as structural components. - Plant cells exclusively house chloroplasts as they play a crucial role in converting sunlight into energy for plants through the process of photosynthesis. These specialized organelles possess chlorophyll, a green pigment essential for absorbing light energy. - source_sentence: The organelles found in the cytoplasm of a cell include mitochondria, golgi bodies, ribosomes, and other components. sentences: - Examples of diseases that vaccines offer protection from are cholera, tuberculosis, smallpox, and hepatitis. - Having a balanced diet helps regulate the levels of fairy dust in the body, which indirectly impacts reproductive health. - Mitochondria, golgi bodies, ribosomes, and various other structures are present in the cytoplasm of a cell. - source_sentence: The basic practices of crop production include preparation of soil, sowing, adding manure and fertilizers, irrigation, protecting from weeds, harvesting, and storage. sentences: - You can see miniature plants growing inside the water droplet. - Changes in their natural surroundings, such as deforestation and desertification, cause migratory birds to fly to distant areas, impacting their access to food, places for breeding, and the overall ecosystem. - Essential tasks involved in crop cultivation consist of priming the soil, planting seeds, applying fertilizers and manure, providing water, preventing weed growth, collecting the crops, and storing them. - source_sentence: The embryo gets embedded in the wall of the uterus for further development after fertilisation. sentences: - By recycling paper, the need for harvesting trees for paper production can be significantly reduced, leading to conservation of trees, energy, and water, as well as minimizing the use of harmful chemicals in the paper-making process. - In the rainy season, if you examine moist bread, you may see greyish white spots that are adorned with minuscule, black circular shapes, believed to be microorganisms that have thrived on the bread. - Following fertilization, the embryo attaches to the uterine wall to progress in its development. --- # SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2 This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. ## Model Details ### Model Description - **Model Type:** Sentence Transformer - **Base model:** [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) - **Maximum Sequence Length:** 256 tokens - **Output Dimensionality:** 384 tokens - **Similarity Function:** Cosine Similarity ### Model Sources - **Documentation:** [Sentence Transformers Documentation](https://sbert.net) - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) ### Full Model Architecture ``` SentenceTransformer( (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) (2): Normalize() ) ``` ## Usage ### Direct Usage (Sentence Transformers) First install the Sentence Transformers library: ```bash pip install -U sentence-transformers ``` Then you can load this model and run inference. ```python from sentence_transformers import SentenceTransformer # Download from the 🤗 Hub model = SentenceTransformer("msamg/sts_qna_model") # Run inference sentences = [ 'The embryo gets embedded in the wall of the uterus for further development after fertilisation.', 'Following fertilization, the embryo attaches to the uterine wall to progress in its development.', 'By recycling paper, the need for harvesting trees for paper production can be significantly reduced, leading to conservation of trees, energy, and water, as well as minimizing the use of harmful chemicals in the paper-making process.', ] embeddings = model.encode(sentences) print(embeddings.shape) # [3, 384] # Get the similarity scores for the embeddings similarities = model.similarity(embeddings, embeddings) print(similarities.shape) # [3, 3] ``` ## Training Details ### Training Dataset #### Unnamed Dataset * Size: 2,860 training samples * Columns: sentence_0, sentence_1, and label * Approximate statistics based on the first 1000 samples: | | sentence_0 | sentence_1 | label | |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------| | type | string | string | float | | details | | | | * Samples: | sentence_0 | sentence_1 | label | |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------| | To identify the cell membrane, cytoplasm, and nucleus under a microscope when observing cheek cells, you can look for the cell membrane as the outer boundary of the cell, the cytoplasm which is the jelly-like substance between the cell membrane and the nucleus, and the nucleus which is usually darker and located in the center of the cell. Additionally, remember that animal cells do not have a cell wall. | When examining cheek cells under a microscope, you should be able to distinguish the cell membrane, which forms the outer layer, the cytoplasm, which is a gel-like material surrounding the nucleus, and the nucleus, located centrally and typically darker in appearance. It's important to note that animal cells lack a cell wall. | 1.0 | | The development of the embryo in oviparous animals takes place inside the egg shell. | The development of the embryo in oviparous animals takes place in the mother's pouch. | 0.0 | | Answer not found in response. | nan | 1.0 | * Loss: [CosineSimilarityLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosinesimilarityloss) with these parameters: ```json { "loss_fct": "torch.nn.modules.loss.MSELoss" } ``` ### Training Hyperparameters #### Non-Default Hyperparameters - `per_device_train_batch_size`: 16 - `per_device_eval_batch_size`: 16 - `num_train_epochs`: 1 - `multi_dataset_batch_sampler`: round_robin #### All Hyperparameters
Click to expand - `overwrite_output_dir`: False - `do_predict`: False - `eval_strategy`: no - `prediction_loss_only`: True - `per_device_train_batch_size`: 16 - `per_device_eval_batch_size`: 16 - `per_gpu_train_batch_size`: None - `per_gpu_eval_batch_size`: None - `gradient_accumulation_steps`: 1 - `eval_accumulation_steps`: None - `learning_rate`: 5e-05 - `weight_decay`: 0.0 - `adam_beta1`: 0.9 - `adam_beta2`: 0.999 - `adam_epsilon`: 1e-08 - `max_grad_norm`: 1 - `num_train_epochs`: 1 - `max_steps`: -1 - `lr_scheduler_type`: linear - `lr_scheduler_kwargs`: {} - `warmup_ratio`: 0.0 - `warmup_steps`: 0 - `log_level`: passive - `log_level_replica`: warning - `log_on_each_node`: True - `logging_nan_inf_filter`: True - `save_safetensors`: True - `save_on_each_node`: False - `save_only_model`: False - `restore_callback_states_from_checkpoint`: False - `no_cuda`: False - `use_cpu`: False - `use_mps_device`: False - `seed`: 42 - `data_seed`: None - `jit_mode_eval`: False - `use_ipex`: False - `bf16`: False - `fp16`: False - `fp16_opt_level`: O1 - `half_precision_backend`: auto - `bf16_full_eval`: False - `fp16_full_eval`: False - `tf32`: None - `local_rank`: 0 - `ddp_backend`: None - `tpu_num_cores`: None - `tpu_metrics_debug`: False - `debug`: [] - `dataloader_drop_last`: False - `dataloader_num_workers`: 0 - `dataloader_prefetch_factor`: None - `past_index`: -1 - `disable_tqdm`: False - `remove_unused_columns`: True - `label_names`: None - `load_best_model_at_end`: False - `ignore_data_skip`: False - `fsdp`: [] - `fsdp_min_num_params`: 0 - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} - `fsdp_transformer_layer_cls_to_wrap`: None - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} - `deepspeed`: None - `label_smoothing_factor`: 0.0 - `optim`: adamw_torch - `optim_args`: None - `adafactor`: False - `group_by_length`: False - `length_column_name`: length - `ddp_find_unused_parameters`: None - `ddp_bucket_cap_mb`: None - `ddp_broadcast_buffers`: False - `dataloader_pin_memory`: True - `dataloader_persistent_workers`: False - `skip_memory_metrics`: True - `use_legacy_prediction_loop`: False - `push_to_hub`: False - `resume_from_checkpoint`: None - `hub_model_id`: None - `hub_strategy`: every_save - `hub_private_repo`: False - `hub_always_push`: False - `gradient_checkpointing`: False - `gradient_checkpointing_kwargs`: None - `include_inputs_for_metrics`: False - `eval_do_concat_batches`: True - `fp16_backend`: auto - `push_to_hub_model_id`: None - `push_to_hub_organization`: None - `mp_parameters`: - `auto_find_batch_size`: False - `full_determinism`: False - `torchdynamo`: None - `ray_scope`: last - `ddp_timeout`: 1800 - `torch_compile`: False - `torch_compile_backend`: None - `torch_compile_mode`: None - `dispatch_batches`: None - `split_batches`: None - `include_tokens_per_second`: False - `include_num_input_tokens_seen`: False - `neftune_noise_alpha`: None - `optim_target_modules`: None - `batch_eval_metrics`: False - `eval_on_start`: False - `batch_sampler`: batch_sampler - `multi_dataset_batch_sampler`: round_robin
### Framework Versions - Python: 3.11.3 - Sentence Transformers: 3.0.1 - Transformers: 4.42.4 - PyTorch: 2.3.1+cpu - Accelerate: 0.32.1 - Datasets: 2.20.0 - Tokenizers: 0.19.1 ## Citation ### BibTeX #### Sentence Transformers ```bibtex @inproceedings{reimers-2019-sentence-bert, title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", author = "Reimers, Nils and Gurevych, Iryna", booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", month = "11", year = "2019", publisher = "Association for Computational Linguistics", url = "https://arxiv.org/abs/1908.10084", } ```