|
--- |
|
base_model: Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation |
|
datasets: [] |
|
language: [] |
|
library_name: sentence-transformers |
|
pipeline_tag: sentence-similarity |
|
tags: |
|
- sentence-transformers |
|
- sentence-similarity |
|
- feature-extraction |
|
- generated_from_trainer |
|
- dataset_size:609 |
|
- loss:MegaBatchMarginLoss |
|
widget: |
|
- source_sentence: So which of the favors of your Lord would you deny |
|
sentences: |
|
- ' This is a straight path.' |
|
- Have they not traveled through the land and seen how was the end of those before |
|
them? Allah destroyed [everything] over them, and for the disbelievers is something |
|
comparable. |
|
- So which of the favors of your Lord would you deny? |
|
- source_sentence: So would you perhaps, if you turned away, cause corruption on earth |
|
and sever your [ties of] relationship |
|
sentences: |
|
- Said [the king to the women], "What was your condition when you sought to seduce |
|
Joseph?" They said, "Perfect is Allah! We know about him no evil." The wife of |
|
al-'Azeez said, "Now the truth has become evident. It was I who sought to seduce |
|
him, and indeed, he is of the truthful. |
|
- Then do they not reflect upon the Qur'an, or are there locks upon [their] hearts? |
|
- ' Allah has not created the heavens and the earth and what is between them except |
|
in truth and for a specified term. And indeed, many of the people, in [the matter |
|
of] the meeting with their Lord, are disbelievers.' |
|
- source_sentence: Then is he who will shield with his face the worst of the punishment |
|
on the Day of Resurrection [like one secure from it] |
|
sentences: |
|
- ' But you will never find in the way of Allah any change, and you will never find |
|
in the way of Allah any alteration.' |
|
- ' Then We made the sun for it an indication.' |
|
- ' And it will be said to the wrongdoers, "Taste what you used to earn."' |
|
- source_sentence: Then is it the judgement of [the time of] ignorance they desire |
|
sentences: |
|
- Or do you have a clear authority? |
|
- And they both raced to the door, and she tore his shirt from the back, and they |
|
found her husband at the door. She said, "What is the recompense of one who intended |
|
evil for your wife but that he be imprisoned or a painful punishment?" |
|
- ' But who is better than Allah in judgement for a people who are certain [in faith].' |
|
- source_sentence: Say, "Who provides for you from the heaven and the earth |
|
sentences: |
|
- Except for our first death, and we will not be punished?" |
|
- And gave a little and [then] refrained? |
|
- ' Or who controls hearing and sight and who brings the living out of the dead |
|
and brings the dead out of the living and who arranges [every] matter' |
|
--- |
|
|
|
# SentenceTransformer based on Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation |
|
|
|
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation](https://huggingface.co/Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
- **Model Type:** Sentence Transformer |
|
- **Base model:** [Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation](https://huggingface.co/Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation) <!-- at revision 46d1967d948e90dde4397f342ad6ddfc99caa96a --> |
|
- **Maximum Sequence Length:** 256 tokens |
|
- **Output Dimensionality:** 768 dimensions
|
- **Similarity Function:** Cosine Similarity |
|
|
|
|
### Model Sources |
|
|
|
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net) |
|
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) |
|
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) |
|
|
|
### Full Model Architecture |
|
|
|
``` |
|
SentenceTransformer( |
|
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel |
|
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) |
|
(2): Normalize() |
|
) |
|
``` |
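
The stated sequence length and embedding dimensionality can be checked programmatically. A small sketch (using the fine-tuned model id from the usage section below):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation-qa")

# SentenceTransformer subclasses nn.Sequential, so its modules can be iterated
for name, module in model.named_children():
    print(name, type(module).__name__)  # 0 Transformer / 1 Pooling / 2 Normalize

print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 768
```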
|
|
|
## Usage |
|
|
|
### Direct Usage (Sentence Transformers) |
|
|
|
First install the Sentence Transformers library: |
|
|
|
```bash |
|
pip install -U sentence-transformers |
|
``` |
|
|
|
Then you can load this model and run inference. |
|
```python |
|
from sentence_transformers import SentenceTransformer |
|
|
|
# Download from the 🤗 Hub |
|
model = SentenceTransformer("Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation-qa") |
|
# Run inference |
|
sentences = [ |
|
'Say, "Who provides for you from the heaven and the earth', |
|
' Or who controls hearing and sight and who brings the living out of the dead and brings the dead out of the living and who arranges [every] matter', |
|
'And gave a little and [then] refrained?', |
|
] |
|
embeddings = model.encode(sentences) |
|
print(embeddings.shape) |
|
# (3, 768)
|
|
|
# Get the similarity scores for the embeddings |
|
similarities = model.similarity(embeddings, embeddings) |
|
print(similarities.shape) |
|
# torch.Size([3, 3])
|
``` |
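
The same embeddings support semantic search over a corpus: embed the corpus once, then rank it against a query. A minimal sketch (the corpus and query below are illustrative, drawn from the widget examples above):

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation-qa")

# Corpus lines borrowed from the widget examples above
corpus = [
    "Then do they not reflect upon the Qur'an, or are there locks upon [their] hearts?",
    "Or do you have a clear authority?",
    "And gave a little and [then] refrained?",
]
query = "Do they not reflect upon the Qur'an"

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# model.similarity applies the configured similarity function (cosine)
scores = model.similarity(query_embedding, corpus_embeddings)  # shape (1, 3)
best = scores.argmax().item()
print(f"{scores[0, best]:.3f}  {corpus[best]}")
```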
|
|
|
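### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

Because the checkpoint wraps a plain `BertModel` (and sentence-transformers repositories expose the transformer weights in the standard layout), the three modules above can be reproduced with `transformers` directly. This is a minimal sketch, not part of the original training code: mean pooling over non-padding tokens followed by L2 normalization.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_id = "Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation-qa"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)

sentences = ["So which of the favors of your Lord would you deny?"]
batch = tokenizer(sentences, padding=True, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**batch).last_hidden_state  # (batch, seq_len, 768)

# Pooling: mean over non-padding tokens (pooling_mode_mean_tokens)
mask = batch["attention_mask"].unsqueeze(-1).float()
embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)

# Normalize: unit-length vectors, so dot product equals cosine similarity
embeddings = F.normalize(embeddings, p=2, dim=1)
print(embeddings.shape)  # torch.Size([1, 768])
```

</details>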
|
|
|
|
|
|
## Training Details |
|
|
|
### Training Dataset |
|
|
|
#### Unnamed Dataset |
|
|
|
|
|
* Size: 609 training samples |
|
* Columns: <code>sentence_0</code> and <code>sentence_1</code> |
|
* Approximate statistics based on the first 1000 samples (here, all 609):
|
| | sentence_0 | sentence_1 | |
|
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------| |
|
| type | string | string | |
|
| details | <ul><li>min: 3 tokens</li><li>mean: 29.19 tokens</li><li>max: 93 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 29.93 tokens</li><li>max: 141 tokens</li></ul> | |
|
* Samples: |
|
| sentence_0 | sentence_1 | |
|
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------| |
|
| <code>And then there came to them that which they were promised</code> | <code>Shall I inform you upon whom the devils descend?</code> | |
|
| <code>But when the truth came to them from Us, they said, "Why was he not given like that which was given to Moses</code> | <code>" Did they not disbelieve in that which was given to Moses before</code> | |
|
| <code>Have you not considered the assembly of the Children of Israel after [the time of] Moses when they said to a prophet of theirs, "Send to us a king, and we will fight in the way of Allah "</code> | <code> He said, "Would you perhaps refrain from fighting if fighting was prescribed for you</code> | |
|
* Loss: [<code>MegaBatchMarginLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#megabatchmarginloss) |
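
For reference, a hedged sketch of how a comparable run could be set up with sentence-transformers 3.x; the dataset below is a toy stand-in for the 609 pairs, with column names matching the table above. Note that `MegaBatchMarginLoss` mines the hardest in-batch negative for each anchor, so it generally benefits from batches much larger than the size-4 batches used for this run (see the hyperparameters below).

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MegaBatchMarginLoss

model = SentenceTransformer("Bofandra/fine-tuning-use-cmlm-multilingual-quran-translation")

# Toy stand-in for the 609 (sentence_0, sentence_1) pairs described above
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "And then there came to them that which they were promised",
        'But when the truth came to them from Us, they said, "Why was he not given like that which was given to Moses',
        "Have you not considered the assembly of the Children of Israel after [the time of] Moses",
    ],
    "sentence_1": [
        "Shall I inform you upon whom the devils descend?",
        '" Did they not disbelieve in that which was given to Moses before',
        ' He said, "Would you perhaps refrain from fighting if fighting was prescribed for you',
    ],
})

args = SentenceTransformerTrainingArguments(
    output_dir="output",
    num_train_epochs=1,
    per_device_train_batch_size=4,  # matches the non-default hyperparameters below
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=MegaBatchMarginLoss(model),
)
trainer.train()
```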
|
|
|
### Training Hyperparameters |
|
#### Non-Default Hyperparameters |
|
|
|
- `per_device_train_batch_size`: 4 |
|
- `per_device_eval_batch_size`: 4 |
|
- `num_train_epochs`: 1 |
|
- `multi_dataset_batch_sampler`: round_robin |
|
|
|
#### All Hyperparameters |
|
<details><summary>Click to expand</summary> |
|
|
|
- `overwrite_output_dir`: False |
|
- `do_predict`: False |
|
- `eval_strategy`: no |
|
- `prediction_loss_only`: True |
|
- `per_device_train_batch_size`: 4 |
|
- `per_device_eval_batch_size`: 4 |
|
- `per_gpu_train_batch_size`: None |
|
- `per_gpu_eval_batch_size`: None |
|
- `gradient_accumulation_steps`: 1 |
|
- `eval_accumulation_steps`: None |
|
- `learning_rate`: 5e-05 |
|
- `weight_decay`: 0.0 |
|
- `adam_beta1`: 0.9 |
|
- `adam_beta2`: 0.999 |
|
- `adam_epsilon`: 1e-08 |
|
- `max_grad_norm`: 1 |
|
- `num_train_epochs`: 1 |
|
- `max_steps`: -1 |
|
- `lr_scheduler_type`: linear |
|
- `lr_scheduler_kwargs`: {} |
|
- `warmup_ratio`: 0.0 |
|
- `warmup_steps`: 0 |
|
- `log_level`: passive |
|
- `log_level_replica`: warning |
|
- `log_on_each_node`: True |
|
- `logging_nan_inf_filter`: True |
|
- `save_safetensors`: True |
|
- `save_on_each_node`: False |
|
- `save_only_model`: False |
|
- `restore_callback_states_from_checkpoint`: False |
|
- `no_cuda`: False |
|
- `use_cpu`: False |
|
- `use_mps_device`: False |
|
- `seed`: 42 |
|
- `data_seed`: None |
|
- `jit_mode_eval`: False |
|
- `use_ipex`: False |
|
- `bf16`: False |
|
- `fp16`: False |
|
- `fp16_opt_level`: O1 |
|
- `half_precision_backend`: auto |
|
- `bf16_full_eval`: False |
|
- `fp16_full_eval`: False |
|
- `tf32`: None |
|
- `local_rank`: 0 |
|
- `ddp_backend`: None |
|
- `tpu_num_cores`: None |
|
- `tpu_metrics_debug`: False |
|
- `debug`: [] |
|
- `dataloader_drop_last`: False |
|
- `dataloader_num_workers`: 0 |
|
- `dataloader_prefetch_factor`: None |
|
- `past_index`: -1 |
|
- `disable_tqdm`: False |
|
- `remove_unused_columns`: True |
|
- `label_names`: None |
|
- `load_best_model_at_end`: False |
|
- `ignore_data_skip`: False |
|
- `fsdp`: [] |
|
- `fsdp_min_num_params`: 0 |
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
- `deepspeed`: None |
|
- `label_smoothing_factor`: 0.0 |
|
- `optim`: adamw_torch |
|
- `optim_args`: None |
|
- `adafactor`: False |
|
- `group_by_length`: False |
|
- `length_column_name`: length |
|
- `ddp_find_unused_parameters`: None |
|
- `ddp_bucket_cap_mb`: None |
|
- `ddp_broadcast_buffers`: False |
|
- `dataloader_pin_memory`: True |
|
- `dataloader_persistent_workers`: False |
|
- `skip_memory_metrics`: True |
|
- `use_legacy_prediction_loop`: False |
|
- `push_to_hub`: False |
|
- `resume_from_checkpoint`: None |
|
- `hub_model_id`: None |
|
- `hub_strategy`: every_save |
|
- `hub_private_repo`: False |
|
- `hub_always_push`: False |
|
- `gradient_checkpointing`: False |
|
- `gradient_checkpointing_kwargs`: None |
|
- `include_inputs_for_metrics`: False |
|
- `eval_do_concat_batches`: True |
|
- `fp16_backend`: auto |
|
- `push_to_hub_model_id`: None |
|
- `push_to_hub_organization`: None |
|
- `mp_parameters`: |
|
- `auto_find_batch_size`: False |
|
- `full_determinism`: False |
|
- `torchdynamo`: None |
|
- `ray_scope`: last |
|
- `ddp_timeout`: 1800 |
|
- `torch_compile`: False |
|
- `torch_compile_backend`: None |
|
- `torch_compile_mode`: None |
|
- `dispatch_batches`: None |
|
- `split_batches`: None |
|
- `include_tokens_per_second`: False |
|
- `include_num_input_tokens_seen`: False |
|
- `neftune_noise_alpha`: None |
|
- `optim_target_modules`: None |
|
- `batch_eval_metrics`: False |
|
- `eval_on_start`: False |
|
- `batch_sampler`: batch_sampler |
|
- `multi_dataset_batch_sampler`: round_robin |
|
|
|
</details> |
|
|
|
### Framework Versions |
|
- Python: 3.10.12 |
|
- Sentence Transformers: 3.0.1 |
|
- Transformers: 4.42.3 |
|
- PyTorch: 2.3.0+cu121 |
|
- Accelerate: 0.31.0 |
|
- Datasets: 2.20.0 |
|
- Tokenizers: 0.19.1 |
|
|
|
## Citation |
|
|
|
### BibTeX |
|
|
|
#### Sentence Transformers |
|
```bibtex |
|
@inproceedings{reimers-2019-sentence-bert, |
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
month = "11", |
|
year = "2019", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://arxiv.org/abs/1908.10084", |
|
} |
|
``` |
|
|
|
#### MegaBatchMarginLoss |
|
```bibtex |
|
@inproceedings{wieting-gimpel-2018-paranmt, |
|
title = "{P}ara{NMT}-50{M}: Pushing the Limits of Paraphrastic Sentence Embeddings with Millions of Machine Translations", |
|
author = "Wieting, John and Gimpel, Kevin", |
|
editor = "Gurevych, Iryna and Miyao, Yusuke", |
|
booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)", |
|
month = jul, |
|
year = "2018", |
|
address = "Melbourne, Australia", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://aclanthology.org/P18-1042", |
|
doi = "10.18653/v1/P18-1042", |
|
pages = "451--462", |
|
} |
|
``` |
|
|
|