---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:154
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/msmarco-distilbert-base-v4
widget:
- source_sentence: Hey, what career opportunities do you provide?
  sentences:
  - TechChefz Digital is present in two countries. Its headquarters is in Noida,
    India, with additional offices in Delaware, United States, and Gauram Nagar,
    Delhi, India.
  - 'Customer Experience & Marketing Technology Covering journey science, content
    architecture, personalization, campaign management, and conversion rate optimization,
    driving customer experiences and engagements Enterprise Platforms & Systems Integration
    Platform selection services in CMS, e-commerce, and learning management systems,
    with a focus on marketplace commerce Analytics, Data Science & Business Intelligence
    Engage in analytics, data science, and machine learning to derive insights. Implement
    intelligent search, recommendation engines, and predictive models for optimization
    and enhanced decision-making. TechChefz Digital seeks passionate individuals
    to join our innovative team. We offer dynamic work environments fostering creativity
    and expertise. Whether you''re seasoned or fresh, exciting career opportunities
    await in technology, consulting, design, and more. Join us in shaping digital
    transformation and unlocking possibilities for clients and the industry. 7+ Years
    Industry Experience 300+ Enthusiasts 80% Employee Retention Rate'
  - 'How long does it take to develop an e-commerce website? The development time
    for an e-commerce website can vary widely depending on its complexity, features,
    and the platform chosen. A basic online store might take a few weeks to set up,
    while a custom, feature-rich site could take several months to develop. Clear
    communication of your requirements and timely decision-making can help streamline
    the process.'
- source_sentence: What technologies are used for web development?
  sentences:
  - 'Our Featured Insights Simplifying Image Loading in React with Lazy Loading and
    Intersection Observer API What Is React Js? The Role of Artificial Intelligence
    (AI) in Personalizing Digital Marketing Campaigns Mastering Personalization in
    Digital Marketing: Tailoring Campaigns for Success How Customer Experience Drives
    Your Business Growth Which is the best CMS for your Digital Transformation Journey?
    The Art of Test Case Creation Templates'
  - 'DISCOVER TECHSTACK Empowering solutions with cutting-edge technology stacks
    Web & Mobile Development Crafting dynamic and engaging online experiences tailored
    to your brand''s vision and objectives. Content Management Systems 3D, AR & VR
    Learning Management System Commerce Analytics Personalization & Marketing Cloud
    Cloud & DevSecOps Tech Stack HTML, JS, CSS React JS Angular JS Vue JS Next JS
    React Native Flutter Node JS Python Frappe Java Spring Boot Go Lang Mongo DB
    PostgreSQL MySQL'
  - 'Can you help migrate our existing infrastructure to a DevOps model? Yes, we
    specialize in transitioning traditional IT infrastructure to a DevOps model.
    Our process includes assessing your current setup, planning the migration, implementing
    the necessary tools and practices, and providing ongoing support to ensure a
    smooth transition.'
- source_sentence: Where is TechChefz based?
  sentences:
  - 'CLIENT TESTIMONIALS Worked with TCZ on two business critical website development
    projects. The TCZ team is a group of experts in their respective domains and
    have helped us with excellent end-to-end development of a website right from
    the conceptualization to implementation and maintenance. By Dr. Kunal Joshi -
    Healthcare Marketing & Strategy Professional TCZ helped us with our new website
    launch in a seamless manner. Through all our discussions, they made sure to have
    the website designed as we had envisioned it to be. Thank you team TCZ. By Dr.
    Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics'
  - TechChefz Digital is present in two countries. Its headquarters is in Noida,
    India, with additional offices in Delaware, United States, and Gauram Nagar,
    Delhi, India.
  - |-
    What we do

    Digital Strategy
    Creating digital frameworks that transform your digital enterprise and produce a return on investment.

    Platform Selection
    Helping you select the optimal digital experience, commerce, cloud and marketing platform for your enterprise.

    Platform Builds
    Deploying next-gen scalable and agile enterprise digital platforms, along with multi-platform integrations.

    Product Builds
    Help you ideate, strategize, and engineer your product with help of our enterprise frameworks.

    Team Augmentation
    Help you scale up and augment your existing team to solve your hiring challenges with our easy to deploy staff augmentation offerings.

    Managed Services
    Operate and monitor your business-critical applications, data, and IT workloads, along with application maintenance and operations.
- source_sentence: Will you assess our current infrastructure before migrating?
  sentences:
  - 'Introducing the world of Global EdTech Firm. In this project, we implemented
    a comprehensive digital platform strategy to unify user experience across platforms,
    integrating diverse tech stacks and specialized platforms to enhance customer
    engagement and streamline operations. Develop tailored online tutoring and learning
    hub platforms, leveraging AI/ML for personalized learning experiences, thus accelerating
    user journeys and improving conversion rates. Provide managed services for seamless
    application support and platform stabilization, optimizing operational efficiency
    and enabling scalable B2B subscriptions for schools and districts, facilitating
    easy onboarding and growth across the US States. We also achieved 200% Improvement
    in Courses & Content being delivered to Students, 50% Increase in Student’s Retention,
    and 150% Increase in Teacher & Tutor Retention.'
  - TechChefz Digital has established its presence in two countries, showcasing its
    global reach and influence. The company’s headquarters is strategically located
    in Noida, India, serving as the central hub for its operations and leadership.
    In addition to the headquarters, TechChefz Digital has expanded its footprint
    with offices in Delaware, United States, allowing the company to cater to the
    North American market with ease and efficiency.
  - 'Can you help migrate our existing infrastructure to a DevOps model? Yes, we
    specialize in transitioning traditional IT infrastructure to a DevOps model.
    Our process includes assessing your current setup, planning the migration, implementing
    the necessary tools and practices, and providing ongoing support to ensure a
    smooth transition.'
- source_sentence: What steps do you take to understand a business's needs?
  sentences:
  - 'How do you customize your DevOps solutions for different industries? We understand
    that each industry has unique challenges and requirements. Our approach involves
    a thorough analysis of your business needs, industry standards, and regulatory
    requirements to tailor a DevOps solution that meets your specific objectives.'
  - |-
    Inception: Pioneering the Digital Frontier
    In our foundational year, TechChefz embarked on a journey of digital transformation, laying the groundwork for our future endeavors. We began working on Cab Accelerator Apps akin to Uber and Ola, deploying them across Europe, Africa, and Australia, marking our initial foray into global markets. Alongside, we successfully delivered technology trainings across USA & India.

    Accelerating Momentum: A Year of Strategic Partnerships & Transformative Projects
    In 2018, TechChefz continued to build on its strong foundation, expanding its global footprint and forging strategic partnerships. Our collaboration with digital agencies and system integrators propelled us into enterprise accounts, focusing on digital experience development. This year marked significant collaborations with leading automotive brands and financial institutions, enhancing our portfolio and establishing TechChefz as a trusted partner in the industry.
  - 'Our Vision Be a partner for industry verticals on the inevitable journey towards
    enterprise transformation and future readiness, by harnessing the growing power
    of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies,
    with immediacy of impact and swiftness of outcome. Our Mission To decode data,
    and code new intelligence into products and automation, engineer, develop and
    deploy systems and applications that redefine experiences and realign business
    growth.'
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: msmarco-distilbert-base-v4 Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.03896103896103896
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4805194805194805
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5714285714285714
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6493506493506493
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.03896103896103896
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.1601731601731602
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11428571428571425
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06493506493506492
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.03896103896103896
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4805194805194805
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5714285714285714
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6493506493506493
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3349468392248154
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23376623376623376
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.24652168791713625
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.025974025974025976
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4935064935064935
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5844155844155844
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6493506493506493
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.025974025974025976
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.1645021645021645
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11688311688311684
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.06493506493506492
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.025974025974025976
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4935064935064935
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5844155844155844
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6493506493506493
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3381817622000061
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23697691197691195
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2485755814005223
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.05194805194805195
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4675324675324675
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5194805194805194
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6233766233766234
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.05194805194805195
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.15584415584415587
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1038961038961039
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.062337662337662324
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.05194805194805195
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4675324675324675
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5194805194805194
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6233766233766234
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.3379715765084199
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.24577922077922074
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2597360814073472
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.05194805194805195
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.44155844155844154
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5584415584415584
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.6623376623376623
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.05194805194805195
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.14718614718614723
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.11168831168831166
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.0662337662337662
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.05194805194805195
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.44155844155844154
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5584415584415584
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.6623376623376623
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.34288867015255386
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.24065656565656557
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.2507978917088375
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.06493506493506493
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.4155844155844156
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.5064935064935064
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.5974025974025974
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.06493506493506493
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.13852813852813856
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.1012987012987013
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.05974025974025971
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.06493506493506493
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.4155844155844156
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.5064935064935064
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.5974025974025974
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.32285221821950844
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.23481240981240978
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.24816289395996594
      name: Cosine Map@100
---

# msmarco-distilbert-base-v4 Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [sentence-transformers/msmarco-distilbert-base-v4](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-v4)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.

```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")
# Run inference
sentences = [
    "What steps do you take to understand a business's needs?",
    'How do you customize your DevOps solutions for different industries?\nWe understand that each industry has unique challenges and requirements. Our approach involves a thorough analysis of your business needs, industry standards, and regulatory requirements to tailor a DevOps solution that meets your specific objectives',
    'Our Vision Be a partner for industry verticals on the inevitable journey towards enterprise transformation and future readiness, by harnessing the growing power of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies, with immediacy of impact and swiftness of outcome.Our Mission\nTo decode data, and code new intelligence into products and automation, engineer, develop and deploy systems and applications that redefine experiences and realign business growth.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
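Because the model was trained with a Matryoshka loss, its embeddings can also be truncated to any of the trained dimensions (768, 512, 256, 128, or 64) for faster search at a modest quality cost. A minimal sketch, assuming a sentence-transformers release that supports the `truncate_dim` constructor argument (2.7.0 or later):

```python
from sentence_transformers import SentenceTransformer

# Load the model so that every embedding is truncated to 256 dimensions.
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4", truncate_dim=256)

embeddings = model.encode([
    "Where is TechChefz based?",
    "What steps do you take to understand a business's needs?",
])
print(embeddings.shape)
# (2, 256)
```

The Evaluation section below reports how each truncation dimension performs on the held-out queries.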
## Evaluation

### Metrics

#### Information Retrieval

* Datasets: `dim_768`, `dim_512`, `dim_256`, `dim_128` and `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric              | dim_768    | dim_512    | dim_256   | dim_128    | dim_64     |
|:--------------------|:-----------|:-----------|:----------|:-----------|:-----------|
| cosine_accuracy@1   | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_accuracy@3   | 0.4805     | 0.4935     | 0.4675    | 0.4416     | 0.4156     |
| cosine_accuracy@5   | 0.5714     | 0.5844     | 0.5195    | 0.5584     | 0.5065     |
| cosine_accuracy@10  | 0.6494     | 0.6494     | 0.6234    | 0.6623     | 0.5974     |
| cosine_precision@1  | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_precision@3  | 0.1602     | 0.1645     | 0.1558    | 0.1472     | 0.1385     |
| cosine_precision@5  | 0.1143     | 0.1169     | 0.1039    | 0.1117     | 0.1013     |
| cosine_precision@10 | 0.0649     | 0.0649     | 0.0623    | 0.0662     | 0.0597     |
| cosine_recall@1     | 0.039      | 0.026      | 0.0519    | 0.0519     | 0.0649     |
| cosine_recall@3     | 0.4805     | 0.4935     | 0.4675    | 0.4416     | 0.4156     |
| cosine_recall@5     | 0.5714     | 0.5844     | 0.5195    | 0.5584     | 0.5065     |
| cosine_recall@10    | 0.6494     | 0.6494     | 0.6234    | 0.6623     | 0.5974     |
| **cosine_ndcg@10**  | **0.3349** | **0.3382** | **0.338** | **0.3429** | **0.3229** |
| cosine_mrr@10       | 0.2338     | 0.237      | 0.2458    | 0.2407     | 0.2348     |
| cosine_map@100      | 0.2465     | 0.2486     | 0.2597    | 0.2508     | 0.2482     |
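These numbers can be reproduced with the evaluator linked above, run once per truncation dimension. A minimal sketch; the `queries`, `corpus`, and `relevant_docs` below are hypothetical stand-ins, since the actual held-out split is not published with this card:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")

# Toy stand-in data: query ids -> text, doc ids -> text, query ids -> relevant doc ids.
queries = {"q1": "Where is TechChefz based?"}
corpus = {
    "d1": "TechChefz Digital is present in two countries. Its headquarters is in Noida, India.",
    "d2": "DISCOVER TECHSTACK Empowering solutions with cutting-edge technology stacks",
}
relevant_docs = {"q1": {"d1"}}

# `truncate_dim` evaluates the model at one Matryoshka dimension (assumed supported).
evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="dim_256",
    truncate_dim=256,
)
results = evaluator(model)
print(results)  # e.g. {'dim_256_cosine_ndcg@10': ..., 'dim_256_cosine_mrr@10': ...}
```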
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 154 training samples
* Columns: `anchor` and `positive`
* Approximate statistics based on the first 154 samples:
  |      | anchor | positive |
  |:-----|:-------|:---------|
  | type | string | string   |
* Samples:
  | anchor | positive |
  |:-------|:---------|
  | What kind of websites can you help us with? | CLIENT TESTIMONIALS<br>Worked with TCZ on two business critical website development projects. The TCZ team is a group of experts in their respective domains and have helped us with excellent end-to-end development of a website right from the conceptualization to implementation and maintenance. By Dr. Kunal Joshi - Healthcare Marketing & Strategy Professional<br><br>TCZ helped us with our new website launch in a seamless manner. Through all our discussions, they made sure to have the website designed as we had envisioned it to be. Thank you team TCZ.<br>By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics |
  | What does DevSecOps mean? | How do you ensure the security of our DevOps pipeline?<br>Security is a top priority in our DevOps solutions. We implement DevSecOps practices, integrating security measures into the CI/CD pipeline from the outset. This includes automated security scans, compliance checks, and vulnerability assessments to ensure your infrastructure is secure |
  | do you work with tech like nlp ? | What AI solutions does Techchefz specialize in?<br>We specialize in a range of AI solutions including recommendation engines, NLP, computer vision, customer segmentation, predictive analytics, operational efficiency through machine learning, risk management, and conversational AI for customer service. |
* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```
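For reference, a loss with this exact configuration can be constructed as in the sketch below, assuming only that the base model has been loaded as in the Usage section:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# Wrap the in-batch-negatives loss so it is applied at every Matryoshka dimension.
inner_loss = MultipleNegativesRankingLoss(model)
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256, 128, 64],
    matryoshka_weights=[1, 1, 1, 1, 1],
    n_dims_per_step=-1,  # train on all dimensions at every step
)
```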
### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `gradient_accumulation_steps`: 4
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `push_to_hub`: True
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1
- `batch_sampler`: no_duplicates

#### All Hyperparameters

<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 8
- `per_device_eval_batch_size`: 8
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 4
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 1e-05
- `weight_decay`: 0.01
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: True
- `resume_from_checkpoint`: None
- `hub_model_id`: Shashwat13333/msmarco-distilbert-base-v4_1
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: msmarco-distilbert-base-v4_1
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

</details>
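Tying the key non-default values together, a comparable training run can be set up with the sentence-transformers v3 Trainer API. A minimal sketch; the two anchor/positive pairs are hypothetical stand-ins for the 154-pair dataset (which is not published with this card), and the evaluation, checkpointing, and hub-push options are omitted for brevity:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("sentence-transformers/msmarco-distilbert-base-v4")

# Toy stand-in for the real anchor/positive training pairs.
train_dataset = Dataset.from_dict({
    "anchor": [
        "Where is TechChefz based?",
        "What does DevSecOps mean?",
    ],
    "positive": [
        "TechChefz Digital is headquartered in Noida, India.",
        "We integrate security measures into the CI/CD pipeline from the outset.",
    ],
})

loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[768, 512, 256, 128, 64],
)

args = SentenceTransformerTrainingArguments(
    output_dir="msmarco-distilbert-base-v4_1",
    num_train_epochs=4,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-5,
    weight_decay=0.01,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,  # requires a GPU; set to False on CPU
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate positives as in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```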
### Training Logs

| Epoch   | Step   | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-------:|:------:|:-------------:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 0.2     | 1      | 4.0076        | -                      | -                      | -                      | -                      | -                     |
| 1.0     | 5      | 4.8662        | 0.3288                 | 0.3390                 | 0.3208                 | 0.3246                 | 0.2749                |
| 2.0     | 10     | 4.1825        | 0.3288                 | 0.3456                 | 0.3306                 | 0.3405                 | 0.2954                |
| 3.0     | 15     | 3.048         | 0.3329                 | 0.3313                 | 0.3346                 | 0.3392                 | 0.3227                |
| **4.0** | **20** | **2.5029**    | **0.3349**             | **0.3382**             | **0.338**              | **0.3429**             | **0.3229**            |

* The bold row denotes the saved checkpoint.

### Framework Versions

- Python: 3.11.11
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```