---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:154
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/msmarco-distilbert-base-v4
widget:
- source_sentence: Hey, what career opportunities do you provide?
  sentences:
  - TechChefz Digital is present in two countries. Its headquarters is in Noida,
    India, with additional offices in Delaware, United States, and Gauram Nagar,
    Delhi, India.
  - 'Customer Experience & Marketing Technology Covering journey science, content
    architecture, personalization, campaign management, and conversion rate optimization,
    driving customer experiences and engagements Enterprise Platforms & Systems Integration
    Platform selection services in CMS, e-commerce, and learning management systems,
    with a focus on marketplace commerce Analytics, Data Science & Business Intelligence
    Engage in analytics, data science, and machine learning to derive insights. Implement
    intelligent search, recommendation engines, and predictive models for optimization
    and enhanced decision-making. TechChefz Digital seeks passionate individuals
    to join our innovative team. We offer dynamic work environments fostering creativity
    and expertise. Whether you''re seasoned or fresh, exciting career opportunities
    await in technology, consulting, design, and more. Join us in shaping digital
    transformation and unlocking possibilities for clients and the industry. 7+ Years
    Industry Experience 300+ Enthusiasts 80% Employee Retention Rate'
  - 'How long does it take to develop an e-commerce website? The development time
    for an e-commerce website can vary widely depending on its complexity, features,
    and the platform chosen. A basic online store might take a few weeks to set up,
    while a custom, feature-rich site could take several months to develop. Clear
    communication of your requirements and timely decision-making can help streamline
    the process.'
- source_sentence: What technologies are used for web development?
  sentences:
  - 'Our Featured Insights Simplifying Image Loading in React with Lazy Loading and
    Intersection Observer API What Is React Js? The Role of Artificial Intelligence
    (AI) in Personalizing Digital Marketing Campaigns Mastering Personalization in
    Digital Marketing: Tailoring Campaigns for Success How Customer Experience Drives
    Your Business Growth Which is the best CMS for your Digital Transformation Journey?
    The Art of Test Case Creation Templates'
  - 'DISCOVER TECHSTACK Empowering solutions with cutting-edge technology stacks
    Web & Mobile Development Crafting dynamic and engaging online experiences tailored
    to your brand''s vision and objectives. Content Management Systems 3D, AR & VR
    Learning Management System Commerce Analytics Personalization & Marketing Cloud
    Cloud & DevSecOps Tech Stack HTML, JS, CSS React JS Angular JS Vue JS Next JS
    React Native Flutter Node JS Python Frappe Java Spring Boot Go Lang Mongo DB
    PostgreSQL MySQL'
  - 'Can you help migrate our existing infrastructure to a DevOps model? Yes, we
    specialize in transitioning traditional IT infrastructure to a DevOps model.
    Our process includes assessing your current setup, planning the migration, implementing
    the necessary tools and practices, and providing ongoing support to ensure a
    smooth transition.'
- source_sentence: Where is TechChefz based?
  sentences:
  - 'CLIENT TESTIMONIALS Worked with TCZ on two business critical website development
    projects. The TCZ team is a group of experts in their respective domains and
    have helped us with excellent end-to-end development of a website right from
    the conceptualization to implementation and maintenance. By Dr. Kunal Joshi -
    Healthcare Marketing & Strategy Professional TCZ helped us with our new website
    launch in a seamless manner. Through all our discussions, they made sure to have
    the website designed as we had envisioned it to be. Thank you team TCZ. By Dr.
    Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics'
  - TechChefz Digital is present in two countries. Its headquarters is in Noida,
    India, with additional offices in Delaware, United States, and Gauram Nagar,
    Delhi, India.
  - |-
    What we do

    Digital Strategy
    Creating digital frameworks that transform your digital enterprise and produce a return on investment.

    Platform Selection
    Helping you select the optimal digital experience, commerce, cloud and marketing platform for your enterprise.

    Platform Builds
    Deploying next-gen scalable and agile enterprise digital platforms, along with multi-platform integrations.

    Product Builds
    Help you ideate, strategize, and engineer your product with help of our enterprise frameworks.

    Team Augmentation
    Help you scale up and augment your existing team to solve your hiring challenges with our easy to deploy staff augmentation offerings.

    Managed Services
    Operate and monitor your business-critical applications, data, and IT workloads, along with application maintenance and operations.
- source_sentence: Will you assess our current infrastructure before migrating?
  sentences:
  - 'Introducing the world of Global EdTech Firm. In this project, we implemented
    a comprehensive digital platform strategy to unify user experience across platforms,
    integrating diverse tech stacks and specialized platforms to enhance customer
    engagement and streamline operations. Develop tailored online tutoring and learning
    hub platforms, leveraging AI/ML for personalized learning experiences, thus accelerating
    user journeys and improving conversion rates. Provide managed services for seamless
    application support and platform stabilization, optimizing operational efficiency
    and enabling scalable B2B subscriptions for schools and districts, facilitating
    easy onboarding and growth across the US States. We also achieved 200% Improvement
    in Courses & Content being delivered to Students, 50% Increase in Student’s Retention,
    and 150% Increase in Teacher & Tutor Retention.'
  - TechChefz Digital has established its presence in two countries, showcasing its
    global reach and influence. The company’s headquarters is strategically located
    in Noida, India, serving as the central hub for its operations and leadership.
    In addition to the headquarters, TechChefz Digital has expanded its footprint
    with offices in Delaware, United States, allowing the company to cater to the
    North American market with ease and efficiency.
  - 'Can you help migrate our existing infrastructure to a DevOps model? Yes, we
    specialize in transitioning traditional IT infrastructure to a DevOps model.
    Our process includes assessing your current setup, planning the migration, implementing
    the necessary tools and practices, and providing ongoing support to ensure a
    smooth transition.'
- source_sentence: What steps do you take to understand a business's needs?
  sentences:
  - 'How do you customize your DevOps solutions for different industries? We understand
    that each industry has unique challenges and requirements. Our approach involves
    a thorough analysis of your business needs, industry standards, and regulatory
    requirements to tailor a DevOps solution that meets your specific objectives.'
  - |-
    Inception: Pioneering the Digital Frontier
    In our foundational year, TechChefz embarked on a journey of digital transformation, laying the groundwork for our future endeavors. We began working on Cab Accelerator Apps akin to Uber and Ola, deploying them across Europe, Africa, and Australia, marking our initial foray into global markets. Alongside, we successfully delivered technology trainings across USA & India.

    Accelerating Momentum: A Year of Strategic Partnerships & Transformative Projects
    In 2018, TechChefz continued to build on its strong foundation, expanding its global footprint and forging strategic partnerships. Our collaboration with digital agencies and system integrators propelled us into enterprise accounts, focusing on digital experience development. This year marked significant collaborations with leading automotive brands and financial institutions, enhancing our portfolio and establishing TechChefz as a trusted partner in the industry.
  - 'Our Vision Be a partner for industry verticals on the inevitable journey towards
    enterprise transformation and future readiness, by harnessing the growing power
    of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies,
    with immediacy of impact and swiftness of outcome. Our Mission To decode data,
    and code new intelligence into products and automation, engineer, develop and
    deploy systems and applications that redefine experiences and realign business
    growth.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: msmarco-distilbert-base-v4 Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.03896103896103896
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4805194805194805
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5714285714285714
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6493506493506493
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.03896103896103896
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.1601731601731602
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11428571428571425
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06493506493506492
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.03896103896103896
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4805194805194805
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5714285714285714
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6493506493506493
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3349468392248154
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23376623376623376
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.24652168791713625
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.025974025974025976
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4935064935064935
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5844155844155844
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6493506493506493
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.025974025974025976
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.1645021645021645
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11688311688311684
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06493506493506492
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.025974025974025976
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4935064935064935
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5844155844155844
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6493506493506493
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3381817622000061
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23697691197691195
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2485755814005223
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.05194805194805195
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4675324675324675
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5194805194805194
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6233766233766234
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.05194805194805195
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.15584415584415587
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1038961038961039
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.062337662337662324
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.05194805194805195
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4675324675324675
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5194805194805194
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6233766233766234
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3379715765084199
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.24577922077922074
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2597360814073472
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.05194805194805195
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.44155844155844154
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5584415584415584
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6623376623376623
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.05194805194805195
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.14718614718614723
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11168831168831166
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0662337662337662
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.05194805194805195
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.44155844155844154
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5584415584415584
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6623376623376623
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.34288867015255386
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.24065656565656557
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2507978917088375
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.06493506493506493
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4155844155844156
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5064935064935064
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5974025974025974
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.06493506493506493
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.13852813852813856
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1012987012987013
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.05974025974025971
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.06493506493506493
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4155844155844156
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5064935064935064
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5974025974025974
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.32285221821950844
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23481240981240978
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.24816289395996594
      name: Cosine Map@100
---

# msmarco-distilbert-base-v4 Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")
# Run inference
sentences = [
    "What steps do you take to understand a business's needs?",
    'How do you customize your DevOps solutions for different industries?\nWe understand that each industry has unique challenges and requirements. Our approach involves a thorough analysis of your business needs, industry standards, and regulatory requirements to tailor a DevOps solution that meets your specific objectives',
    'Our Vision Be a partner for industry verticals on the inevitable journey towards enterprise transformation and future readiness, by harnessing the growing power of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies, with immediacy of impact and swiftness of outcome.Our Mission\nTo decode data, and code new intelligence into products and automation, engineer, develop and deploy systems and applications that redefine experiences and realign business growth.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
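Because the model was trained with a Matryoshka loss, its embeddings can also be truncated to any of the trained dimensions (768, 512, 256, 128, or 64) for faster search at a modest quality cost. A minimal sketch, assuming a sentence-transformers release that supports the `truncate_dim` constructor argument (2.7.0 or later):

```python
from sentence_transformers import SentenceTransformer

# Load the model so that every embedding is truncated to 256 dimensions.
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4", truncate_dim=256)

embeddings = model.encode([
    "Where is TechChefz based?",
    "What steps do you take to understand a business's needs?",
])
print(embeddings.shape)
# (2, 256)
```

The Evaluation section below reports how each truncation dimension performs on the held-out queries.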
## Evaluation

### Metrics

#### Information Retrieval

* Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | dim_768    | dim_512    | dim_256   | dim_128    | dim_64     |
|:--------------------|:-----------|:-----------|:----------|:-----------|:-----------|
| cosine_accuracy@1   | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_accuracy@3   | 0.4805     | 0.4935     | 0.4675    | 0.4416     | 0.4156     |
| cosine_accuracy@5   | 0.5714     | 0.5844     | 0.5195    | 0.5584     | 0.5065     |
| cosine_accuracy@10  | 0.6494     | 0.6494     | 0.6234    | 0.6623     | 0.5974     |
| cosine_precision@1  | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_precision@3  | 0.1602     | 0.1645     | 0.1558    | 0.1472     | 0.1385     |
| cosine_precision@5  | 0.1143     | 0.1169     | 0.1039    | 0.1117     | 0.1013     |
| cosine_precision@10 | 0.0649     | 0.0649     | 0.0623    | 0.0662     | 0.0597     |
| cosine_recall@1     | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_recall@3     | 0.4805     | 0.4935     | 0.4675    | 0.4416     | 0.4156     |
| cosine_recall@5     | 0.5714     | 0.5844     | 0.5195    | 0.5584     | 0.5065     |
| cosine_recall@10    | 0.6494     | 0.6494     | 0.6234    | 0.6623     | 0.5974     |
| **cosine_ndcg@10**  | **0.3349** | **0.3382** | **0.338** | **0.3429** | **0.3229** |
| cosine_mrr@10       | 0.2338     | 0.237      | 0.2458    | 0.2407     | 0.2348     |
| cosine_map@100      | 0.2465     | 0.2486     | 0.2597    | 0.2508     | 0.2482     |
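These numbers can be reproduced with the evaluator linked above, run once per truncation dimension. A minimal sketch; the `queries`, `corpus`, and `relevant_docs` below are hypothetical stand-ins, since the actual held-out split is not published with this card:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")

# Toy stand-in data: query ids -> text, doc ids -> text, query ids -> relevant doc ids.
queries = {"q1": "Where is TechChefz based?"}
corpus = {
    "d1": "TechChefz Digital is present in two countries. Its headquarters is in Noida, India.",
    "d2": "DISCOVER TECHSTACK Empowering solutions with cutting-edge technology stacks",
}
relevant_docs = {"q1": {"d1"}}

# `truncate_dim` evaluates the model at one Matryoshka dimension (assumed supported).
evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_256",
    truncate_dim=256,
)
results = evaluator(model)
print(results)  # e.g. {'dim_256_cosine_ndcg@10': ..., 'dim_256_cosine_mrr@10': ...}
```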
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 154 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 154 samples:
  |      | anchor | positive |
  |:-----|:-------|:---------|
  | type | string | string   |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | What kind of websites can you help us with? | CLIENT TESTIMONIALS<br>Worked with TCZ on two business critical website development projects. The TCZ team is a group of experts in their respective domains and have helped us with excellent end-to-end development of a website right from the conceptualization to implementation and maintenance. By Dr. Kunal Joshi - Healthcare Marketing & Strategy Professional<br><br>TCZ helped us with our new website launch in a seamless manner. Through all our discussions, they made sure to have the website designed as we had envisioned it to be. Thank you team TCZ.<br>By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics |
  | What does DevSecOps mean? | How do you ensure the security of our DevOps pipeline?<br>Security is a top priority in our DevOps solutions. We implement DevSecOps practices, integrating security measures into the CI/CD pipeline from the outset. This includes automated security scans, compliance checks, and vulnerability assessments to ensure your infrastructure is secure |
  | do you work with tech like nlp ? | What AI solutions does Techchefz specialize in?<br>We specialize in a range of AI solutions including recommendation engines, NLP, computer vision, customer segmentation, predictive analytics, operational efficiency through machine learning, risk management, and conversational AI for customer service. |
* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
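For reference, a loss with this exact configuration can be constructed as in the sketch below, assuming only that the base model has been loaded as in the Usage section:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# Wrap the in-batch-negatives loss so it is applied at every Matryoshka dimension.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all dimensions at every step
)
```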
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `push_to_hub`: True
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1
- `batch_sampler`: no_duplicates

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 4
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
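Tying the key non-default values together, a comparable training run can be set up with the sentence-transformers v3 Trainer API. A minimal sketch; the two anchor/positive pairs are hypothetical stand-ins for the 154-pair dataset (which is not published with this card), and the evaluation, checkpointing, and hub-push options are omitted for brevity:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# Toy stand-in for the real anchor/positive training pairs.
train_dataset = Dataset.from_dict({
    "anchor": [
        "Where is TechChefz based?",
        "What does DevSecOps mean?",
    ],
    "positive": [
        "TechChefz Digital is headquartered in Noida, India.",
        "We integrate security measures into the CI/CD pipeline from the outset.",
    ],
})

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="msmarco-distilbert-base-v4_1",
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,  # requires a GPU; set to False on CPU
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate positives as in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```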
### Training Logs

| Epoch   | Step   | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 0.2     | 1      | 4.0076        | -                      | -                      | -                      | -                      | -                     |
| 1.0     | 5      | 4.8662        | 0.3288                 | 0.3390                 | 0.3208                 | 0.3246                 | 0.2749                |
| 2.0     | 10     | 4.1825        | 0.3288                 | 0.3456                 | 0.3306                 | 0.3405                 | 0.2954                |
| 3.0     | 15     | 3.048         | 0.3329                 | 0.3313                 | 0.3346                 | 0.3392                 | 0.3227                |
| **4.0** | **20** | **2.5029**    | **0.3349**             | **0.3382**             | **0.338**              | **0.3429**             | **0.3229**            |

* The bold row denotes the saved checkpoint.

### Framework Versions

- Python: 3.11.11
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```