|
--- |
|
language: |
|
- en |
|
license: apache-2.0 |
|
tags: |
|
- sentence-transformers |
|
- sentence-similarity |
|
- feature-extraction |
|
- generated_from_trainer |
|
- dataset_size:154 |
|
- loss:MatryoshkaLoss |
|
- loss:MultipleNegativesRankingLoss |
|
base_model: sentence-transformers/msmarco-distilbert-base-v4 |
|
widget: |
|
- source_sentence: Hey, what career opportunities do you provide?
|
sentences: |
|
- TechChefz Digital is present in two countries. Its headquarters is in Noida, India, |
|
with additional offices in Delaware, United States, and Gauram Nagar, Delhi, India. |
|
- 'Customer Experience & Marketing Technology |
|
|
|
Covering journey science, content architecture, personalization, campaign management, |
|
and conversion rate optimization, driving customer experiences and engagements |
|
|
|
|
|
Enterprise Platforms & Systems Integration |
|
|
|
Platform selection services in CMS, e-commerce, and learning management systems, |
|
with a focus on marketplace commerce |
|
|
|
|
|
Analytics, Data Science & Business Intelligence |
|
|
|
Engage in analytics, data science, and machine learning to derive insights. Implement |
|
intelligent search, recommendation engines, and predictive models for optimization |
|
and enhanced decision-making. TechChefz Digital seeks passionate individuals to |
|
join our innovative team. We offer dynamic work environments fostering creativity |
|
and expertise. Whether you''re seasoned or fresh, exciting career opportunities |
|
await in technology, consulting, design, and more. Join us in shaping digital |
|
transformation and unlocking possibilities for clients and the industry. |
|
|
|
7+ Years Industry Experience |
|
|
|
|
|
300+ Enthusiasts |
|
|
|
|
|
80% Employee Retention Rate |
|
|
|
' |
|
- 'How long does it take to develop an e-commerce website? |
|
|
|
The development time for an e-commerce website can vary widely depending on its |
|
complexity, features, and the platform chosen. A basic online store might take |
|
a few weeks to set up, while a custom, feature-rich site could take several months |
|
to develop. Clear communication of your requirements and timely decision-making |
|
can help streamline the process.' |
|
- source_sentence: What technologies are used for web development? |
|
sentences: |
|
- 'Our Featured Insights |
|
|
|
Simplifying Image Loading in React with Lazy Loading and Intersection Observer |
|
API |
|
|
|
|
|
What Is React Js? |
|
|
|
|
|
The Role of Artificial Intelligence (AI) in Personalizing Digital Marketing Campaigns |
|
|
|
|
|
Mastering Personalization in Digital Marketing: Tailoring Campaigns for Success |
|
|
|
|
|
How Customer Experience Drives Your Business Growth |
|
|
|
|
|
Which is the best CMS for your Digital Transformation Journey? |
|
|
|
|
|
The Art of Test Case Creation Templates' |
|
- 'DISCOVER TECHSTACK |
|
|
|
Empowering solutions |
|
|
|
with cutting-edge technology stacks |
|
|
|
Web & Mobile Development |
|
|
|
Crafting dynamic and engaging online experiences tailored to your brand''s vision |
|
and objectives. |
|
|
|
Content Management Systems |
|
|
|
3D, AR & VR |
|
|
|
Learning Management System |
|
|
|
Commerce |
|
|
|
Analytics |
|
|
|
Personalization & Marketing Cloud |
|
|
|
Cloud & DevSecOps |
|
|
|
Tech Stack |
|
|
|
HTML, JS, CSS |
|
|
|
React JS |
|
|
|
Angular JS |
|
|
|
Vue JS |
|
|
|
Next JS |
|
|
|
React Native |
|
|
|
Flutter |
|
|
|
Node JS |
|
|
|
Python |
|
|
|
Frappe |
|
|
|
Java |
|
|
|
Spring Boot |
|
|
|
Go Lang |
|
|
|
Mongo DB |
|
|
|
PostgreSQL |
|
|
|
MySQL' |
|
- 'Can you help migrate our existing infrastructure to a DevOps model? |
|
|
|
Yes, we specialize in transitioning traditional IT infrastructure to a DevOps |
|
model. Our process includes assessing your current setup, planning the migration, |
|
implementing the necessary tools and practices, and providing ongoing support |
|
to ensure a smooth transition.' |
|
- source_sentence: Where is TechChefz based? |
|
sentences: |
|
- 'CLIENT TESTIMONIALS |
|
|
|
Worked with TCZ on two business critical website development projects. The TCZ |
|
team is a group of experts in their respective domains and have helped us with |
|
excellent end-to-end development of a website right from the conceptualization |
|
to implementation and maintenance. By Dr. Kunal Joshi - Healthcare Marketing & |
|
Strategy Professional |
|
|
|
|
|
TCZ helped us with our new website launch in a seamless manner. Through all our |
|
discussions, they made sure to have the website designed as we had envisioned |
|
it to be. Thank you team TCZ. |
|
|
|
By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics ' |
|
- TechChefz Digital is present in two countries. Its headquarters is in Noida, India, |
|
with additional offices in Delaware, United States, and Gauram Nagar, Delhi, India. |
|
- " What we do\n\nDigital Strategy\nCreating digital frameworks that transform\ |
|
\ your digital enterprise and produce a return on investment.\n\nPlatform Selection\n\ |
|
Helping you select the optimal digital experience, commerce, cloud and marketing\ |
|
\ platform for your enterprise.\n\nPlatform Builds\nDeploying next-gen scalable\ |
|
\ and agile enterprise digital platforms, along with multi-platform integrations.\n\ |
|
\nProduct Builds\nHelp you ideate, strategize, and engineer your product with\ |
|
\ the help of our enterprise frameworks.\n\nTeam Augmentation\nHelp you scale up and\
|
\ augment your existing team to solve your hiring challenges with our easy to\ |
|
\ deploy staff augmentation offerings.\nManaged Services\nOperate and monitor\
|
\ your business-critical applications, data, and IT workloads, along with Application\ |
|
\ maintenance and operations\n" |
|
- source_sentence: Will you assess our current infrastructure before migrating? |
|
sentences: |
|
- 'Introducing the world of Global EdTech Firm. |
|
|
|
|
|
In this project, we implemented a comprehensive digital platform strategy to unify
|
user experience across platforms, integrating diverse tech stacks and specialized |
|
platforms to enhance customer engagement and streamline operations. |
|
|
|
Develop tailored online tutoring and learning hub platforms, leveraging AI/ML |
|
for personalized learning experiences, thus accelerating user journeys and improving |
|
conversion rates. |
|
|
|
Provide managed services for seamless application support and platform stabilization, |
|
optimizing operational efficiency and enabling scalable B2B subscriptions for |
|
schools and districts, facilitating easy onboarding and growth across US states.
|
|
|
|
|
We also achieved 200% Improvement in Courses & Content being delivered to Students. |
|
50% Increase in Student’s Retention, 150% Increase in Teacher & Tutor Retention.'
|
- TechChefz Digital has established its presence in two countries, showcasing its |
|
global reach and influence. The company’s headquarters is strategically located |
|
in Noida, India, serving as the central hub for its operations and leadership. |
|
In addition to the headquarters, TechChefz Digital has expanded its footprint |
|
with offices in Delaware, United States, allowing the company to cater to the |
|
North American market with ease and efficiency. |
|
- 'Can you help migrate our existing infrastructure to a DevOps model? |
|
|
|
Yes, we specialize in transitioning traditional IT infrastructure to a DevOps |
|
model. Our process includes assessing your current setup, planning the migration, |
|
implementing the necessary tools and practices, and providing ongoing support |
|
to ensure a smooth transition.' |
|
- source_sentence: What steps do you take to understand a business's needs? |
|
sentences: |
|
- 'How do you customize your DevOps solutions for different industries? |
|
|
|
We understand that each industry has unique challenges and requirements. Our approach |
|
involves a thorough analysis of your business needs, industry standards, and regulatory |
|
requirements to tailor a DevOps solution that meets your specific objectives' |
|
- "Inception: Pioneering the Digital Frontier In our foundational year, TechChefz\ |
|
\ embarked on a journey of digital transformation, laying the groundwork for our\ |
|
\ future endeavors. We began working on Cab Accelerator Apps akin to Uber and\ |
|
\ Ola, deploying them across Europe, Africa, and Australia, marking our initial\ |
|
\ foray into global markets. Alongside, we successfully delivered technology trainings\ |
|
\ across USA & India. \n\nAccelerating Momentum: A year of strategic partnerships & Transformative\
|
\ Projects. In 2018, TechChefz continued to build on its strong foundation, expanding\ |
|
\ its global footprint and forging strategic partnerships. Our collaboration with\ |
|
\ digital agencies and system integrators propelled us into enterprise accounts,\ |
|
\ focusing on digital experience development. This year marked significant collaborations\ |
|
\ with leading automotive brands and financial institutions, enhancing our portfolio\ |
|
\ and establishing TechChefz as a trusted partner in the industry. \n " |
|
- 'Our Vision Be a partner for industry verticals on the inevitable journey towards |
|
enterprise transformation and future readiness, by harnessing the growing power |
|
of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies, |
|
with immediacy of impact and swiftness of outcome. Our Mission
|
|
|
To decode data, and code new intelligence into products and automation, engineer, |
|
develop and deploy systems and applications that redefine experiences and realign |
|
business growth.' |
|
pipeline_tag: sentence-similarity |
|
library_name: sentence-transformers |
|
metrics: |
|
- cosine_accuracy@1 |
|
- cosine_accuracy@3 |
|
- cosine_accuracy@5 |
|
- cosine_accuracy@10 |
|
- cosine_precision@1 |
|
- cosine_precision@3 |
|
- cosine_precision@5 |
|
- cosine_precision@10 |
|
- cosine_recall@1 |
|
- cosine_recall@3 |
|
- cosine_recall@5 |
|
- cosine_recall@10 |
|
- cosine_ndcg@10 |
|
- cosine_mrr@10 |
|
- cosine_map@100 |
|
model-index: |
|
- name: msmarco-distilbert-base-v4 Matryoshka
|
results: |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: dim 768 |
|
type: dim_768 |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.03896103896103896 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.4805194805194805 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.5714285714285714 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.6493506493506493 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.03896103896103896 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.1601731601731602 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.11428571428571425 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.06493506493506492 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.03896103896103896 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.4805194805194805 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.5714285714285714 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.6493506493506493 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.3349468392248154 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.23376623376623376 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.24652168791713625 |
|
name: Cosine Map@100 |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: dim 512 |
|
type: dim_512 |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.025974025974025976 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.4935064935064935 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.5844155844155844 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.6493506493506493 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.025974025974025976 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.1645021645021645 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.11688311688311684 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.06493506493506492 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.025974025974025976 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.4935064935064935 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.5844155844155844 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.6493506493506493 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.3381817622000061 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.23697691197691195 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.2485755814005223 |
|
name: Cosine Map@100 |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: dim 256 |
|
type: dim_256 |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.4675324675324675 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.5194805194805194 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.6233766233766234 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.15584415584415587 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.1038961038961039 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.062337662337662324 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.4675324675324675 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.5194805194805194 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.6233766233766234 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.3379715765084199 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.24577922077922074 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.2597360814073472 |
|
name: Cosine Map@100 |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: dim 128 |
|
type: dim_128 |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.44155844155844154 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.5584415584415584 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.6623376623376623 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.14718614718614723 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.11168831168831166 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.0662337662337662 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.05194805194805195 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.44155844155844154 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.5584415584415584 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.6623376623376623 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.34288867015255386 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.24065656565656557 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.2507978917088375 |
|
name: Cosine Map@100 |
|
- task: |
|
type: information-retrieval |
|
name: Information Retrieval |
|
dataset: |
|
name: dim 64 |
|
type: dim_64 |
|
metrics: |
|
- type: cosine_accuracy@1 |
|
value: 0.06493506493506493 |
|
name: Cosine Accuracy@1 |
|
- type: cosine_accuracy@3 |
|
value: 0.4155844155844156 |
|
name: Cosine Accuracy@3 |
|
- type: cosine_accuracy@5 |
|
value: 0.5064935064935064 |
|
name: Cosine Accuracy@5 |
|
- type: cosine_accuracy@10 |
|
value: 0.5974025974025974 |
|
name: Cosine Accuracy@10 |
|
- type: cosine_precision@1 |
|
value: 0.06493506493506493 |
|
name: Cosine Precision@1 |
|
- type: cosine_precision@3 |
|
value: 0.13852813852813856 |
|
name: Cosine Precision@3 |
|
- type: cosine_precision@5 |
|
value: 0.1012987012987013 |
|
name: Cosine Precision@5 |
|
- type: cosine_precision@10 |
|
value: 0.05974025974025971 |
|
name: Cosine Precision@10 |
|
- type: cosine_recall@1 |
|
value: 0.06493506493506493 |
|
name: Cosine Recall@1 |
|
- type: cosine_recall@3 |
|
value: 0.4155844155844156 |
|
name: Cosine Recall@3 |
|
- type: cosine_recall@5 |
|
value: 0.5064935064935064 |
|
name: Cosine Recall@5 |
|
- type: cosine_recall@10 |
|
value: 0.5974025974025974 |
|
name: Cosine Recall@10 |
|
- type: cosine_ndcg@10 |
|
value: 0.32285221821950844 |
|
name: Cosine Ndcg@10 |
|
- type: cosine_mrr@10 |
|
value: 0.23481240981240978 |
|
name: Cosine Mrr@10 |
|
- type: cosine_map@100 |
|
value: 0.24816289395996594 |
|
name: Cosine Map@100 |
|
--- |
|
|
|
# msmarco-distilbert-base-v4 Matryoshka
|
|
|
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. |
|
|
|
## Model Details |
|
|
|
### Model Description |
|
- **Model Type:** Sentence Transformer |
|
- **Base model:** [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4) <!-- at revision 19f0f4c73dc418bad0e0fc600611e808b7448a28 --> |
|
- **Maximum Sequence Length:** 512 tokens |
|
- **Output Dimensionality:** 768 dimensions |
|
- **Similarity Function:** Cosine Similarity |
|
<!-- - **Training Dataset:** Unknown --> |
|
- **Language:** en |
|
- **License:** apache-2.0 |
|
|
|
### Model Sources |
|
|
|
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net) |
|
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers) |
|
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers) |
|
|
|
### Full Model Architecture |
|
|
|
``` |
|
SentenceTransformer( |
|
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel |
|
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True}) |
|
) |
|
``` |
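
These properties can also be verified directly from Python. A minimal sketch, using the model id from the Usage section below:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")

print(model.max_seq_length)                      # 512
print(model.get_sentence_embedding_dimension())  # 768
print(model[1])                                  # the mean-pooling module shown above
```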
|
|
|
## Usage |
|
|
|
### Direct Usage (Sentence Transformers) |
|
|
|
First install the Sentence Transformers library: |
|
|
|
```bash |
|
pip install -U sentence-transformers |
|
``` |
|
|
|
Then you can load this model and run inference. |
|
```python |
|
from sentence_transformers import SentenceTransformer |
|
|
|
# Download from the 🤗 Hub |
|
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4") |
|
# Run inference |
|
sentences = [ |
|
"What steps do you take to understand a business's needs?", |
|
'How do you customize your DevOps solutions for different industries?\nWe understand that each industry has unique challenges and requirements. Our approach involves a thorough analysis of your business needs, industry standards, and regulatory requirements to tailor a DevOps solution that meets your specific objectives', |
|
    'Our Vision Be a partner for industry verticals on the inevitable journey towards enterprise transformation and future readiness, by harnessing the growing power of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies, with immediacy of impact and swiftness of outcome. Our Mission\nTo decode data, and code new intelligence into products and automation, engineer, develop and deploy systems and applications that redefine experiences and realign business growth.',
|
] |
|
embeddings = model.encode(sentences) |
|
print(embeddings.shape) |
|
# [3, 768] |
|
|
|
# Get the similarity scores for the embeddings |
|
similarities = model.similarity(embeddings, embeddings) |
|
print(similarities.shape) |
|
# [3, 3] |
|
``` |
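
Because this model was trained with `MatryoshkaLoss` over dimensions 768, 512, 256, 128 and 64, you can truncate embeddings for cheaper storage and faster search at a modest accuracy cost (see the evaluation tables below). A minimal sketch using the `truncate_dim` argument, available in Sentence Transformers v2.4+:

```python
from sentence_transformers import SentenceTransformer

# Load with a smaller Matryoshka dimension; encode() then returns
# only the first 256 components of each embedding.
model = SentenceTransformer(
    "Shashwat13333/msmarco-distilbert-base-v4", truncate_dim=256
)

embeddings = model.encode([
    "Where is TechChefz based?",
    "TechChefz Digital is headquartered in Noida, India.",
])
print(embeddings.shape)
# (2, 256)
```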
|
|
|
<!-- |
|
### Direct Usage (Transformers) |
|
|
|
<details><summary>Click to see the direct usage in Transformers</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Downstream Usage (Sentence Transformers) |
|
|
|
You can finetune this model on your own dataset. |
|
|
|
<details><summary>Click to expand</summary> |
|
|
|
</details> |
|
--> |
|
|
|
<!-- |
|
### Out-of-Scope Use |
|
|
|
*List how the model may foreseeably be misused and address what users ought not to do with the model.* |
|
--> |
|
|
|
## Evaluation |
|
|
|
### Metrics |
|
|
|
#### Information Retrieval |
|
|
|
* Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64` |
|
* Evaluated with [<code>InformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) |
|
|
|
| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 | |
|
|:--------------------|:-----------|:-----------|:----------|:-----------|:-----------| |
|
| cosine_accuracy@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 | |
|
| cosine_accuracy@3 | 0.4805 | 0.4935 | 0.4675 | 0.4416 | 0.4156 | |
|
| cosine_accuracy@5 | 0.5714 | 0.5844 | 0.5195 | 0.5584 | 0.5065 | |
|
| cosine_accuracy@10 | 0.6494 | 0.6494 | 0.6234 | 0.6623 | 0.5974 | |
|
| cosine_precision@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 | |
|
| cosine_precision@3 | 0.1602 | 0.1645 | 0.1558 | 0.1472 | 0.1385 | |
|
| cosine_precision@5 | 0.1143 | 0.1169 | 0.1039 | 0.1117 | 0.1013 | |
|
| cosine_precision@10 | 0.0649 | 0.0649 | 0.0623 | 0.0662 | 0.0597 | |
|
| cosine_recall@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 | |
|
| cosine_recall@3 | 0.4805 | 0.4935 | 0.4675 | 0.4416 | 0.4156 | |
|
| cosine_recall@5 | 0.5714 | 0.5844 | 0.5195 | 0.5584 | 0.5065 | |
|
| cosine_recall@10 | 0.6494 | 0.6494 | 0.6234 | 0.6623 | 0.5974 | |
|
| **cosine_ndcg@10** | **0.3349** | **0.3382** | **0.3380** | **0.3429** | **0.3229** |
|
| cosine_mrr@10 | 0.2338 | 0.237 | 0.2458 | 0.2407 | 0.2348 | |
|
| cosine_map@100 | 0.2465 | 0.2486 | 0.2597 | 0.2508 | 0.2482 | |
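
Each `dim_*` variant scores the same retrieval task with embeddings truncated to one of the Matryoshka dimensions. The sketch below shows how such an evaluation can be set up; the query and two-document corpus are placeholders, since the actual evaluation split is not published with this card:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")

# Placeholder evaluation data (not the real split).
queries = {"q1": "Where is TechChefz based?"}
corpus = {
    "d1": "TechChefz Digital is headquartered in Noida, India.",
    "d2": "Yes, we specialize in transitioning traditional IT infrastructure to a DevOps model.",
}
relevant_docs = {"q1": {"d1"}}  # which corpus ids are relevant for each query

# truncate_dim evaluates embeddings cut down to a Matryoshka dimension.
evaluator = InformationRetrievalEvaluator(
    queries, corpus, relevant_docs, name="dim_256", truncate_dim=256
)
results = evaluator(model)
print(results["dim_256_cosine_ndcg@10"])
```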
|
|
|
<!-- |
|
## Bias, Risks and Limitations |
|
|
|
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.* |
|
--> |
|
|
|
<!-- |
|
### Recommendations |
|
|
|
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.* |
|
--> |
|
|
|
## Training Details |
|
|
|
### Training Dataset |
|
|
|
#### Unnamed Dataset |
|
|
|
|
|
* Size: 154 training samples |
|
* Columns: <code>anchor</code> and <code>positive</code> |
|
* Approximate statistics based on the first 154 samples: |
|
| | anchor | positive | |
|
|:--------|:----------------------------------------------------------------------------------|:------------------------------------------------------------------------------------| |
|
| type | string | string | |
|
| details | <ul><li>min: 7 tokens</li><li>mean: 12.43 tokens</li><li>max: 20 tokens</li></ul> | <ul><li>min: 20 tokens</li><li>mean: 126.6 tokens</li><li>max: 378 tokens</li></ul> | |
|
* Samples: |
|
| anchor | positive | |
|
|:---------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| |
|
| <code>What kind of websites can you help us with?</code> | <code>CLIENT TESTIMONIALS<br>Worked with TCZ on two business critical website development projects. The TCZ team is a group of experts in their respective domains and have helped us with excellent end-to-end development of a website right from the conceptualization to implementation and maintenance. By Dr. Kunal Joshi - Healthcare Marketing & Strategy Professional<br><br>TCZ helped us with our new website launch in a seamless manner. Through all our discussions, they made sure to have the website designed as we had envisioned it to be. Thank you team TCZ.<br>By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics </code> | |
|
| <code>What does DevSecOps mean?</code> | <code>How do you ensure the security of our DevOps pipeline?<br>Security is a top priority in our DevOps solutions. We implement DevSecOps practices, integrating security measures into the CI/CD pipeline from the outset. This includes automated security scans, compliance checks, and vulnerability assessments to ensure your infrastructure is secure</code> | |
|
| <code>do you work with tech like nlp ?</code> | <code>What AI solutions does Techchefz specialize in?<br>We specialize in a range of AI solutions including recommendation engines, NLP, computer vision, customer segmentation, predictive analytics, operational efficiency through machine learning, risk management, and conversational AI for customer service.</code> | |
|
* Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters: |
|
```json |
|
{ |
|
"loss": "MultipleNegativesRankingLoss", |
|
"matryoshka_dims": [ |
|
768, |
|
512, |
|
256, |
|
128, |
|
64 |
|
], |
|
"matryoshka_weights": [ |
|
1, |
|
1, |
|
1, |
|
1, |
|
1 |
|
], |
|
"n_dims_per_step": -1 |
|
} |
|
``` |
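
This configuration corresponds to wrapping `MultipleNegativesRankingLoss` in `MatryoshkaLoss`, so the ranking loss is applied at every truncated embedding size. A minimal sketch:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# In-batch negatives: every other positive in the batch acts as a negative.
inner_loss = MultipleNegativesRankingLoss(model)

# Apply the inner loss at each truncated dimension (weights default to 1 each),
# so the leading components of every embedding remain useful on their own.
train_loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
)
```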
|
|
|
### Training Hyperparameters |
|
#### Non-Default Hyperparameters |
|
|
|
- `eval_strategy`: epoch |
|
- `gradient_accumulation_steps`: 4 |
|
- `learning_rate`: 1e-05 |
|
- `weight_decay`: 0.01 |
|
- `num_train_epochs`: 4 |
|
- `lr_scheduler_type`: cosine |
|
- `warmup_ratio`: 0.1 |
|
- `fp16`: True |
|
- `load_best_model_at_end`: True |
|
- `optim`: adamw_torch_fused |
|
- `push_to_hub`: True |
|
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1 |
|
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1 |
|
- `batch_sampler`: no_duplicates |
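
As a rough guide, these settings map onto the Sentence Transformers v3 training API as sketched below. The one-row dataset is a placeholder for the real 154-pair dataset, and the evaluation and Hub arguments are omitted so the snippet runs standalone:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

# Placeholder (anchor, positive) pair; the actual dataset has 154 such rows.
train_dataset = Dataset.from_dict({
    "anchor": ["Where is TechChefz based?"],
    "positive": ["TechChefz Digital is headquartered in Noida, India."],
})

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="msmarco-distilbert-base-v4-matryoshka",
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,              # effective batch size of 32
    learning_rate=1e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,                                  # requires a GPU
    optim="adamw_torch_fused",                  # CUDA-only fused AdamW
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoids duplicate in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```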
|
|
|
#### All Hyperparameters |
|
<details><summary>Click to expand</summary> |
|
|
|
- `overwrite_output_dir`: False |
|
- `do_predict`: False |
|
- `eval_strategy`: epoch |
|
- `prediction_loss_only`: True |
|
- `per_device_train_batch_size`: 8 |
|
- `per_device_eval_batch_size`: 8 |
|
- `per_gpu_train_batch_size`: None |
|
- `per_gpu_eval_batch_size`: None |
|
- `gradient_accumulation_steps`: 4 |
|
- `eval_accumulation_steps`: None |
|
- `torch_empty_cache_steps`: None |
|
- `learning_rate`: 1e-05 |
|
- `weight_decay`: 0.01 |
|
- `adam_beta1`: 0.9 |
|
- `adam_beta2`: 0.999 |
|
- `adam_epsilon`: 1e-08 |
|
- `max_grad_norm`: 1.0 |
|
- `num_train_epochs`: 4 |
|
- `max_steps`: -1 |
|
- `lr_scheduler_type`: cosine |
|
- `lr_scheduler_kwargs`: {} |
|
- `warmup_ratio`: 0.1 |
|
- `warmup_steps`: 0 |
|
- `log_level`: passive |
|
- `log_level_replica`: warning |
|
- `log_on_each_node`: True |
|
- `logging_nan_inf_filter`: True |
|
- `save_safetensors`: True |
|
- `save_on_each_node`: False |
|
- `save_only_model`: False |
|
- `restore_callback_states_from_checkpoint`: False |
|
- `no_cuda`: False |
|
- `use_cpu`: False |
|
- `use_mps_device`: False |
|
- `seed`: 42 |
|
- `data_seed`: None |
|
- `jit_mode_eval`: False |
|
- `use_ipex`: False |
|
- `bf16`: False |
|
- `fp16`: True |
|
- `fp16_opt_level`: O1 |
|
- `half_precision_backend`: auto |
|
- `bf16_full_eval`: False |
|
- `fp16_full_eval`: False |
|
- `tf32`: None |
|
- `local_rank`: 0 |
|
- `ddp_backend`: None |
|
- `tpu_num_cores`: None |
|
- `tpu_metrics_debug`: False |
|
- `debug`: [] |
|
- `dataloader_drop_last`: False |
|
- `dataloader_num_workers`: 0 |
|
- `dataloader_prefetch_factor`: None |
|
- `past_index`: -1 |
|
- `disable_tqdm`: False |
|
- `remove_unused_columns`: True |
|
- `label_names`: None |
|
- `load_best_model_at_end`: True |
|
- `ignore_data_skip`: False |
|
- `fsdp`: [] |
|
- `fsdp_min_num_params`: 0 |
|
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False} |
|
- `fsdp_transformer_layer_cls_to_wrap`: None |
|
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None} |
|
- `deepspeed`: None |
|
- `label_smoothing_factor`: 0.0 |
|
- `optim`: adamw_torch_fused |
|
- `optim_args`: None |
|
- `adafactor`: False |
|
- `group_by_length`: False |
|
- `length_column_name`: length |
|
- `ddp_find_unused_parameters`: None |
|
- `ddp_bucket_cap_mb`: None |
|
- `ddp_broadcast_buffers`: False |
|
- `dataloader_pin_memory`: True |
|
- `dataloader_persistent_workers`: False |
|
- `skip_memory_metrics`: True |
|
- `use_legacy_prediction_loop`: False |
|
- `push_to_hub`: True |
|
- `resume_from_checkpoint`: None |
|
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1 |
|
- `hub_strategy`: every_save |
|
- `hub_private_repo`: None |
|
- `hub_always_push`: False |
|
- `gradient_checkpointing`: False |
|
- `gradient_checkpointing_kwargs`: None |
|
- `include_inputs_for_metrics`: False |
|
- `include_for_metrics`: [] |
|
- `eval_do_concat_batches`: True |
|
- `fp16_backend`: auto |
|
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1 |
|
- `push_to_hub_organization`: None |
|
- `mp_parameters`: |
|
- `auto_find_batch_size`: False |
|
- `full_determinism`: False |
|
- `torchdynamo`: None |
|
- `ray_scope`: last |
|
- `ddp_timeout`: 1800 |
|
- `torch_compile`: False |
|
- `torch_compile_backend`: None |
|
- `torch_compile_mode`: None |
|
- `dispatch_batches`: None |
|
- `split_batches`: None |
|
- `include_tokens_per_second`: False |
|
- `include_num_input_tokens_seen`: False |
|
- `neftune_noise_alpha`: None |
|
- `optim_target_modules`: None |
|
- `batch_eval_metrics`: False |
|
- `eval_on_start`: False |
|
- `use_liger_kernel`: False |
|
- `eval_use_gather_object`: False |
|
- `average_tokens_across_devices`: False |
|
- `prompts`: None |
|
- `batch_sampler`: no_duplicates |
|
- `multi_dataset_batch_sampler`: proportional |
|
|
|
</details> |
|
|
|
### Training Logs |
|
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 | |
|
|:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:| |
|
| 0.2 | 1 | 4.0076 | - | - | - | - | - | |
|
| 1.0 | 5 | 4.8662 | 0.3288 | 0.3390 | 0.3208 | 0.3246 | 0.2749 | |
|
| 2.0 | 10 | 4.1825 | 0.3288 | 0.3456 | 0.3306 | 0.3405 | 0.2954 | |
|
| 3.0 | 15 | 3.048 | 0.3329 | 0.3313 | 0.3346 | 0.3392 | 0.3227 | |
|
| **4.0** | **20** | **2.5029** | **0.3349** | **0.3382** | **0.3380** | **0.3429** | **0.3229** |
|
|
|
* The bold row denotes the saved checkpoint. |
|
|
|
### Framework Versions |
|
- Python: 3.11.11 |
|
- Sentence Transformers: 3.3.1 |
|
- Transformers: 4.47.1 |
|
- PyTorch: 2.5.1+cu124 |
|
- Accelerate: 1.2.1 |
|
- Datasets: 3.2.0 |
|
- Tokenizers: 0.21.0 |
|
|
|
## Citation |
|
|
|
### BibTeX |
|
|
|
#### Sentence Transformers |
|
```bibtex |
|
@inproceedings{reimers-2019-sentence-bert, |
|
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks", |
|
author = "Reimers, Nils and Gurevych, Iryna", |
|
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing", |
|
month = "11", |
|
year = "2019", |
|
publisher = "Association for Computational Linguistics", |
|
url = "https://arxiv.org/abs/1908.10084", |
|
} |
|
``` |
|
|
|
#### MatryoshkaLoss |
|
```bibtex |
|
@misc{kusupati2024matryoshka, |
|
title={Matryoshka Representation Learning}, |
|
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi}, |
|
year={2024}, |
|
eprint={2205.13147}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.LG} |
|
} |
|
``` |
|
|
|
#### MultipleNegativesRankingLoss |
|
```bibtex |
|
@misc{henderson2017efficient, |
|
title={Efficient Natural Language Response Suggestion for Smart Reply}, |
|
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil}, |
|
year={2017}, |
|
eprint={1705.00652}, |
|
archivePrefix={arXiv}, |
|
primaryClass={cs.CL} |
|
} |
|
``` |
|
|
|
<!-- |
|
## Glossary |
|
|
|
*Clearly define terms in order to be accessible across audiences.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Authors |
|
|
|
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.* |
|
--> |
|
|
|
<!-- |
|
## Model Card Contact |
|
|
|
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.* |
|
--> |