metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:154
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: sentence-transformers/msmarco-distilbert-base-v4
widget:
- source_sentence: Hey, what career oppotunities do you provide?
sentences:
- >-
TechChefz Digital is present in two countries. Its headquarters is in
Noida, India, with additional offices in Delaware, United States, and
Gauram Nagar, Delhi, India.
- >
Customer Experience & Marketing Technology
Covering journey science, content architecture, personalization,
campaign management, and conversion rate optimization, driving customer
experiences and engagements
Enterprise Platforms & Systems Integration
Platform selection services in CMS, e-commerce, and learning management
systems, with a focus on marketplace commerce
Analytics, Data Science & Business Intelligence
Engage in analytics, data science, and machine learning to derive
insights. Implement intelligent search, recommendation engines, and
predictive models for optimization and enhanced decision-making.
TechChefz Digital seeks passionate individuals to join our innovative
team. We offer dynamic work environments fostering creativity and
expertise. Whether you're seasoned or fresh, exciting career
opportunities await in technology, consulting, design, and more. Join us
in shaping digital transformation and unlocking possibilities for
clients and the industry.
7+ Years Industry Experience
300+ Enthusiasts
80% Employee Retention Rate
- >-
How long does it take to develop an e-commerce website?
The development time for an e-commerce website can vary widely depending
on its complexity, features, and the platform chosen. A basic online
store might take a few weeks to set up, while a custom, feature-rich
site could take several months to develop. Clear communication of your
requirements and timely decision-making can help streamline the process.
- source_sentence: What technologies are used for web development?
sentences:
- >-
Our Featured Insights
Simplifying Image Loading in React with Lazy Loading and Intersection
Observer API
What Is React Js?
The Role of Artificial Intelligence (AI) in Personalizing Digital
Marketing Campaigns
Mastering Personalization in Digital Marketing: Tailoring Campaigns for
Success
How Customer Experience Drives Your Business Growth
Which is the best CMS for your Digital Transformation Journey?
The Art of Test Case Creation Templates
- >-
DISCOVER TECHSTACK
Empowering solutions
with cutting-edge technology stacks
Web & Mobile Development
Crafting dynamic and engaging online experiences tailored to your
brand's vision and objectives.
Content Management Systems
3D, AR & VR
Learning Management System
Commerce
Analytics
Personalization & Marketing Cloud
Cloud & DevSecOps
Tech Stack
HTML, JS, CSS
React JS
Angular JS
Vue JS
Next JS
React Native
Flutter
Node JS
Python
Frappe
Java
Spring Boot
Go Lang
Mongo DB
PostgreSQL
MySQL
- >-
Can you help migrate our existing infrastructure to a DevOps model?
Yes, we specialize in transitioning traditional IT infrastructure to a
DevOps model. Our process includes assessing your current setup,
planning the migration, implementing the necessary tools and practices,
and providing ongoing support to ensure a smooth transition.
- source_sentence: Where is TechChefz based?
sentences:
- >-
CLIENT TESTIMONIALS
Worked with TCZ on two business critical website development projects.
The TCZ team is a group of experts in their respective domains and have
helped us with excellent end-to-end development of a website right from
the conceptualization to implementation and maintenance. By Dr. Kunal
Joshi - Healthcare Marketing & Strategy Professional
TCZ helped us with our new website launch in a seamless manner. Through
all our discussions, they made sure to have the website designed as we
had envisioned it to be. Thank you team TCZ.
By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab
Dynamics
- >-
TechChefz Digital is present in two countries. Its headquarters is in
Noida, India, with additional offices in Delaware, United States, and
Gauram Nagar, Delhi, India.
- >2
What we do
Digital Strategy
Creating digital frameworks that transform your digital enterprise and
produce a return on investment.
Platform Selection
Helping you select the optimal digital experience, commerce, cloud and
marketing platform for your enterprise.
Platform Builds
Deploying next-gen scalable and agile enterprise digital platforms,
along with multi-platform integrations.
Product Builds
Help you ideate, strategize, and engineer your product with help of our
enterprise frameworks
Team Augmentation
Help you scale up and augment your existing team to solve your hiring
challenges with our easy to deploy staff augmentation offerings .
Managed Services
Operate and monitor your business-critical applications, data, and IT
workloads, along with Application maintenance and operations
- source_sentence: Will you assess our current infrastructure before migrating?
sentences:
- >-
Introducing the world of Global EdTech Firm.
In this project, We implemented a comprehensive digital platform
strategy to unify user experience across platforms, integrating diverse
tech stacks and specialized platforms to enhance customer engagement and
streamline operations.
Develop tailored online tutoring and learning hub platforms, leveraging
AI/ML for personalized learning experiences, thus accelerating user
journeys and improving conversion rates.
Provide managed services for seamless application support and platform
stabilization, optimizing operational efficiency and enabling scalable
B2B subscriptions for schools and districts, facilitating easy
onboarding and growth across the US States.
We also achieved 200% Improvement in Courses & Content being delivered
to Students. 50% Increase in Student’s Retention 150%, Increase in
Teacher & Tutor Retention.
- >-
TechChefz Digital has established its presence in two countries,
showcasing its global reach and influence. The company’s headquarters is
strategically located in Noida, India, serving as the central hub for
its operations and leadership. In addition to the headquarters,
TechChefz Digital has expanded its footprint with offices in Delaware,
United States, allowing the company to cater to the North American
market with ease and efficiency.
- >-
Can you help migrate our existing infrastructure to a DevOps model?
Yes, we specialize in transitioning traditional IT infrastructure to a
DevOps model. Our process includes assessing your current setup,
planning the migration, implementing the necessary tools and practices,
and providing ongoing support to ensure a smooth transition.
- source_sentence: What steps do you take to understand a business's needs?
sentences:
- >-
How do you customize your DevOps solutions for different industries?
We understand that each industry has unique challenges and requirements.
Our approach involves a thorough analysis of your business needs,
industry standards, and regulatory requirements to tailor a DevOps
solution that meets your specific objectives
- >-
Inception: Pioneering the Digital Frontier In our foundational year,
TechChefz embarked on a journey of digital transformation, laying the
groundwork for our future endeavors. We began working on Cab Accelerator
Apps akin to Uber and Ola, deploying them across Europe, Africa, and
Australia, marking our initial foray into global markets. Alongside, we
successfully delivered technology trainings across USA & India.
Accelerating Momentum: A year of strategic partnerships & Transformative
Projects. In 2018, TechChefz continued to build on its strong
foundation, expanding its global footprint and forging strategic
partnerships. Our collaboration with digital agencies and system
integrators propelled us into enterprise accounts, focusing on digital
experience development. This year marked significant collaborations with
leading automotive brands and financial institutions, enhancing our
portfolio and establishing TechChefz as a trusted partner in the
industry.
- >-
Our Vision Be a partner for industry verticals on the inevitable journey
towards enterprise transformation and future readiness, by harnessing
the growing power of Artificial Intelligence, Machine Learning, Data
Science and emerging methodologies, with immediacy of impact and
swiftness of outcome.Our Mission
To decode data, and code new intelligence into products and automation,
engineer, develop and deploy systems and applications that redefine
experiences and realign business growth.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: BGE base Financial Matryoshka
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.03896103896103896
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4805194805194805
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5714285714285714
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.6493506493506493
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.03896103896103896
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.1601731601731602
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.11428571428571425
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.06493506493506492
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.03896103896103896
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4805194805194805
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5714285714285714
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6493506493506493
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.3349468392248154
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.23376623376623376
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.24652168791713625
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.025974025974025976
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4935064935064935
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5844155844155844
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.6493506493506493
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.025974025974025976
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.1645021645021645
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.11688311688311684
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.06493506493506492
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.025974025974025976
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4935064935064935
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5844155844155844
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6493506493506493
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.3381817622000061
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.23697691197691195
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.2485755814005223
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.05194805194805195
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4675324675324675
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5194805194805194
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.6233766233766234
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.05194805194805195
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.15584415584415587
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1038961038961039
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.062337662337662324
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.05194805194805195
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4675324675324675
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5194805194805194
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6233766233766234
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.3379715765084199
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.24577922077922074
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.2597360814073472
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.05194805194805195
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.44155844155844154
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5584415584415584
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.6623376623376623
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.05194805194805195
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.14718614718614723
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.11168831168831166
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.0662337662337662
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.05194805194805195
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.44155844155844154
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5584415584415584
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.6623376623376623
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.34288867015255386
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.24065656565656557
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.2507978917088375
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.06493506493506493
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.4155844155844156
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.5064935064935064
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.5974025974025974
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.06493506493506493
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.13852813852813856
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.1012987012987013
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.05974025974025971
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.06493506493506493
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.4155844155844156
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.5064935064935064
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5974025974025974
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.32285221821950844
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.23481240981240978
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.24816289395996594
name: Cosine Map@100
BGE base Financial Matryoshka
This is a sentence-transformers model finetuned from sentence-transformers/msmarco-distilbert-base-v4. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/msmarco-distilbert-base-v4
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
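The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation; the function name `mean_pool` and the toy 4-dimensional vectors are illustrative assumptions (the real model uses 768 dimensions).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0 (padding)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    kept = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            kept += 1
            for i, value in enumerate(vector):
                summed[i] += value
    if kept == 0:
        raise ValueError("attention_mask masks out every token")
    return [total / kept for total in summed]


# Toy input (hypothetical): three token vectors, the last one is padding.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```

The padded position is ignored, so its values do not leak into the sentence embedding.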
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Shashwat13333/msmarco-distilbert-base-v4")
# Run inference
sentences = [
"What steps do you take to understand a business's needs?",
'How do you customize your DevOps solutions for different industries?\nWe understand that each industry has unique challenges and requirements. Our approach involves a thorough analysis of your business needs, industry standards, and regulatory requirements to tailor a DevOps solution that meets your specific objectives',
'Our Vision Be a partner for industry verticals on the inevitable journey towards enterprise transformation and future readiness, by harnessing the growing power of Artificial Intelligence, Machine Learning, Data Science and emerging methodologies, with immediacy of impact and swiftness of outcome.Our Mission\nTo decode data, and code new intelligence into products and automation, engineer, develop and deploy systems and applications that redefine experiences and realign business growth.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
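Because the model was trained with MatryoshkaLoss, the leading dimensions of each embedding are meant to remain useful on their own. The sketch below shows, under stated assumptions, what truncating to a smaller dimension and comparing by cosine similarity amounts to. It uses toy 4-dimensional vectors and hypothetical helper names (`truncate_and_normalize`, `cosine_matrix`) instead of real model output. Recent Sentence Transformers releases also expose a `truncate_dim` argument on `SentenceTransformer` that is intended for the same purpose.

```python
import math
from typing import List, Sequence


def truncate_and_normalize(vector: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in head]


def cosine_matrix(vectors: Sequence[Sequence[float]], dim: int) -> List[List[float]]:
    """Pairwise cosine similarity using only the first `dim` dimensions."""
    unit = [truncate_and_normalize(v, dim) for v in vectors]
    return [[sum(a * b for a, b in zip(u, w)) for w in unit] for u in unit]


# Toy stand-ins for model.encode(...) output (illustrative values only).
toy_embeddings = [
    [3.0, 4.0, 1.0, 0.0],
    [4.0, 3.0, 0.0, 1.0],
    [1.0, 0.0, 2.0, 2.0],
]
full = cosine_matrix(toy_embeddings, dim=4)
short = cosine_matrix(toy_embeddings, dim=2)
print(len(full), len(full[0]))  # 3 3
print(round(short[0][1], 2))  # 0.96
```

Re-normalizing after truncation matters: the truncated vectors are no longer unit length, so a plain dot product would not be a cosine similarity.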
Evaluation
Metrics
Information Retrieval
- Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
- Evaluated with InformationRetrievalEvaluator
Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
---|---|---|---|---|---|
cosine_accuracy@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 |
cosine_accuracy@3 | 0.4805 | 0.4935 | 0.4675 | 0.4416 | 0.4156 |
cosine_accuracy@5 | 0.5714 | 0.5844 | 0.5195 | 0.5584 | 0.5065 |
cosine_accuracy@10 | 0.6494 | 0.6494 | 0.6234 | 0.6623 | 0.5974 |
cosine_precision@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 |
cosine_precision@3 | 0.1602 | 0.1645 | 0.1558 | 0.1472 | 0.1385 |
cosine_precision@5 | 0.1143 | 0.1169 | 0.1039 | 0.1117 | 0.1013 |
cosine_precision@10 | 0.0649 | 0.0649 | 0.0623 | 0.0662 | 0.0597 |
cosine_recall@1 | 0.039 | 0.026 | 0.0519 | 0.0519 | 0.0649 |
cosine_recall@3 | 0.4805 | 0.4935 | 0.4675 | 0.4416 | 0.4156 |
cosine_recall@5 | 0.5714 | 0.5844 | 0.5195 | 0.5584 | 0.5065 |
cosine_recall@10 | 0.6494 | 0.6494 | 0.6234 | 0.6623 | 0.5974 |
cosine_ndcg@10 | 0.3349 | 0.3382 | 0.338 | 0.3429 | 0.3229 |
cosine_mrr@10 | 0.2338 | 0.237 | 0.2458 | 0.2407 | 0.2348 |
cosine_map@100 | 0.2465 | 0.2486 | 0.2597 | 0.2508 | 0.2482 |
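In the table above, accuracy@k and recall@k are identical and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every query has exactly one relevant document; this is an inference from the numbers, not something the card states. The sketch below computes the metrics under that assumption. The function name `retrieval_metrics` and the toy ranks are hypothetical.

```python
import math
from typing import Dict, List


def retrieval_metrics(ranks: List[int], k: int) -> Dict[str, float]:
    """Metrics at cutoff k, assuming ONE relevant document per query.

    ranks[i] is the 1-based rank of that document for query i,
    or 0 if it was not retrieved at all.
    """
    n = len(ranks)
    in_top_k = [r for r in ranks if 0 < r <= k]
    accuracy = len(in_top_k) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one hit among k results
        "recall": accuracy,         # one relevant doc, so found or not
        "mrr": sum(1.0 / r for r in in_top_k) / n,
        # Ideal DCG is 1 with a single relevant doc, so NDCG = DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r in in_top_k) / n,
    }


# Toy ranks for four queries (illustrative only).
toy = retrieval_metrics([1, 2, 3, 0], k=3)
print(toy["accuracy"], toy["precision"])  # 0.75 0.25

# Consistency check against the dim_768 column reported in this card.
reported_accuracy_at_3 = 0.4805194805194805
reported_precision_at_3 = 0.1601731601731602
print(abs(reported_accuracy_at_3 / 3 - reported_precision_at_3) < 1e-12)  # True
```

Read this way, the very low accuracy@1 next to a much higher accuracy@3 suggests the correct passage is often ranked second or third, not first.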
Training Details
Training Dataset
Unnamed Dataset
- Size: 154 training samples
- Columns: anchor and positive
- Approximate statistics based on the first 154 samples:
 | anchor | positive |
---|---|---|
type | string | string |
details | min: 7 tokens, mean: 12.43 tokens, max: 20 tokens | min: 20 tokens, mean: 126.6 tokens, max: 378 tokens |
- Samples:
anchor | positive |
---|---|
What kind of websites can you help us with? | CLIENT TESTIMONIALS Worked with TCZ on two business critical website development projects. The TCZ team is a group of experts in their respective domains and have helped us with excellent end-to-end development of a website right from the conceptualization to implementation and maintenance. By Dr. Kunal Joshi - Healthcare Marketing & Strategy Professional TCZ helped us with our new website launch in a seamless manner. Through all our discussions, they made sure to have the website designed as we had envisioned it to be. Thank you team TCZ. By Dr. Sarita Ahlawat - Managing Director and Co-Founder, Botlab Dynamics |
What does DevSecOps mean? | How do you ensure the security of our DevOps pipeline? Security is a top priority in our DevOps solutions. We implement DevSecOps practices, integrating security measures into the CI/CD pipeline from the outset. This includes automated security scans, compliance checks, and vulnerability assessments to ensure your infrastructure is secure |
do you work with tech like nlp ? | What AI solutions does Techchefz specialize in? We specialize in a range of AI solutions including recommendation engines, NLP, computer vision, customer segmentation, predictive analytics, operational efficiency through machine learning, risk management, and conversational AI for customer service. |
- Loss: MatryoshkaLoss with these parameters:
{
  "loss": "MultipleNegativesRankingLoss",
  "matryoshka_dims": [768, 512, 256, 128, 64],
  "matryoshka_weights": [1, 1, 1, 1, 1],
  "n_dims_per_step": -1
}
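To make the loss configuration above more concrete, here is a rough pure-Python sketch of the idea: an in-batch-negatives ranking loss (the role played by MultipleNegativesRankingLoss) is evaluated on the first d dimensions of each embedding for every d in matryoshka_dims, and the results are combined with matryoshka_weights. This is an illustration under assumptions, not the library's code. The similarity scale of 20.0, the helper names, and the toy 4-dimensional vectors with dims [4, 2] are choices made for the example; the card's actual dims are [768, 512, 256, 128, 64].

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: List[List[float]], positives: List[List[float]], scale: float = 20.0) -> float:
    """Cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


def matryoshka_loss(anchors, positives, dims: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of the base loss over truncated embeddings."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )


# Toy batch of two (anchor, positive) pairs; values are illustrative.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]
aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned < swapped)  # True
```

With n_dims_per_step set to -1, all listed dimensions contribute at every training step, which is what the sum over `dims` mirrors here.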
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- gradient_accumulation_steps: 4
- learning_rate: 1e-05
- weight_decay: 0.01
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- fp16: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- push_to_hub: True
- hub_model_id: Shashwat13333/msmarco-distilbert-base-v4_1
- push_to_hub_model_id: msmarco-distilbert-base-v4_1
- batch_sampler: no_duplicates
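The learning_rate, lr_scheduler_type and warmup_ratio settings above describe a linear warmup followed by a cosine decay. The sketch below reproduces that general shape in plain Python so the schedule can be inspected. It is an approximation written for illustration, not the Trainer's scheduler; the total of 20 optimizer steps is taken from the Training Logs in this card, and rounding the warmup length up is an assumption.

```python
import math


def lr_at_step(step: int, base_lr: float = 1e-5, total_steps: int = 20,
               warmup_ratio: float = 0.1) -> float:
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


schedule = [lr_at_step(s) for s in range(21)]
print(max(schedule))  # 1e-05
```

With only 20 steps, the warmup covers just two of them, so most of the run is spent on the decaying part of the curve.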
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 4
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 1e-05
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: True
- resume_from_checkpoint: None
- hub_model_id: Shashwat13333/msmarco-distilbert-base-v4_1
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: msmarco-distilbert-base-v4_1
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
---|---|---|---|---|---|---|---|
0.2 | 1 | 4.0076 | - | - | - | - | - |
1.0 | 5 | 4.8662 | 0.3288 | 0.3390 | 0.3208 | 0.3246 | 0.2749 |
2.0 | 10 | 4.1825 | 0.3288 | 0.3456 | 0.3306 | 0.3405 | 0.2954 |
3.0 | 15 | 3.048 | 0.3329 | 0.3313 | 0.3346 | 0.3392 | 0.3227 |
4.0 | 20 | 2.5029 | 0.3349 | 0.3382 | 0.338 | 0.3429 | 0.3229 |
- The saved checkpoint is the final row (epoch 4.0, step 20); its NDCG@10 values match those in the evaluation table.
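The Step column in the training logs can be reconciled with the dataset size and hyperparameters by simple arithmetic. The sketch below assumes a single training device and that the no_duplicates batch sampler yields the same number of batches as a plain sampler would; neither is stated in the card. The function name `optimizer_steps` is hypothetical.

```python
import math
from typing import Tuple


def optimizer_steps(num_samples: int, per_device_batch_size: int,
                    grad_accum_steps: int, num_epochs: int) -> Tuple[int, int]:
    """Return (optimizer steps per epoch, total optimizer steps)."""
    # The last partial batch is kept because dataloader_drop_last is False.
    batches_per_epoch = math.ceil(num_samples / per_device_batch_size)
    # One optimizer step is taken every `grad_accum_steps` batches.
    steps_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return steps_per_epoch, steps_per_epoch * num_epochs


# Values from this card: 154 samples, batch size 8, accumulation 4, 4 epochs.
steps_per_epoch, total_steps = optimizer_steps(154, 8, 4, 4)
print(steps_per_epoch, total_steps)  # 5 20
```

Five steps per epoch agrees with the log, where step 1 corresponds to epoch 0.2 and step 20 to epoch 4.0. The effective batch size under these assumptions is 8 x 4 = 32 samples per optimizer step.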
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 3.3.1
- Transformers: 4.47.1
- PyTorch: 2.5.1+cu124
- Accelerate: 1.2.1
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}