SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5 on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
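
The three modules above amount to: run the transformer, take the first ([CLS]) token vector as the sentence embedding, then scale it to unit length. A minimal sketch of the last two steps, using a hypothetical 3-token, 4-dimensional output in place of the real 768-dimensional BertModel output:

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy stand-in for the Transformer output: 3 tokens x 4 dimensions.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because the output is unit length, the dot product of two embeddings equals their cosine similarity.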

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Tejasw1/votum-acts-v1")
# Run inference
sentences = [
    'Represent this sentence for searching relevant passages: According to **Section 13(2)(a) of the Central Goods and Services Tax Act, 2017**, what is the time of supply of services if the invoice is issued within the prescribed period?',
    'Document: (1) The liability to pay tax on services shall arise at the time ofsupply, as determined in accordance with the provisions of this section.(2) The time of supply of services shall be the earliest of the following dates, namely:--(a) the date of issue of invoice by the supplier, if the invoice is issued within the periodprescribed under 1*** sub-section (2) of section 31 or the date of receipt of payment, whichever is earlier; or(b) the date of provision of service, if the invoice is not issued within the period prescribed under sub-section (2) of 1*** section 31 or the date of receipt of payment, whichever is earlier; or(c) the date on which the recipient shows the receipt of services in his books of account, in a case where the provisions of clause (a) or clause (b) do not apply:Provided that where the supplier of taxable service receives an amount up to one thousand rupees inexcess of the amount indicated in the tax invoice, the time of supply to the extent of such excess amountshall, at the option of the said supplier, be the date of issue of invoice relating to such excess amount.Explanation.--For the purposes of clauses (a) and (b)--(i) the supply shall be deemed to have been made to the extent it is covered by the invoice or, asthe case may be, the payment;(ii) "the date of receipt of payment" shall be the date on which the payment is entered in thebooks of account of the supplier or the date on which the payment is credited to his bank account,whichever is earlier.(3) In case of supplies in respect of which tax is paid or liable to be paid on reverse charge basis, thetime of supply shall be the earlier of the following dates, namely:--(a) the date of payment as entered in the books of account of the recipient or the date on which thepayment is debited in his bank account, whichever is earlier; or(b) the date immediately following sixty days from the date of issue of invoice or any otherdocument, by whatever name called, in lieu thereof by the supplier:Provided that where it is not possible to determine the time of supply under clause (a) or clause (b),the time of supply shall be the date of entry in the books of account of the recipient of supply:Provided further that in case of supply by associated enterprises, where the supplier of service islocated outside India, the time of supply shall be the date of entry in the books of account of the recipientof supply or the date of payment, whichever is earlier.(4) In case of supply of vouchers by a supplier, the time of supply shall be--(a) the date of issue of voucher, if the supply is identifiable at that point; or(b) the date of redemption of voucher, in all other cases.(5) Where it is not possible to determine the time of supply under the provisions of sub-section (2) orsub-section (3) or sub-section (4), the time of supply shall--(a) in a case where a periodical return has to be filed, be the date on which such return is to befiled; or(b) in any other case, be the date on which the tax is paid.(6) The time of supply to the extent it relates to an addition in the value of supply by way of interest,late fee or penalty for delayed payment of any consideration shall be the date on which the supplierreceives such addition in value.\t\t\t\t\t\t\t\t\t1. The words, brackets and figure “sub-section (2) of” omitted by Act 31 of 2018, s. 7 (w.e.f. 1-2-2019).',
    'Document: Article (51) Prohibitions                                                                                                        Public Welfare Associations and their Members may not do the following:\r1. Practice any Public Welfare activity other than those stipulated in its By-laws.\r2. Practice any political or partisan activity, collecting information, interfering in politics or matters affecting the security of the State and its law of government, or using its Office for that purpose, or provoking sectarian, racial, or religious disputes.\r3. Affiliate, join, participate in, or deal with any illegal Associations or entities, or any natural or Legal Person belonging to it, whether inside or outside the State, or financing or providing support to them in any way.\r4. Deal with, financing, or providing support to any illegal Association, terrorist Association, or entity, or any natural or Legal Person belonging to any of them.\r5. Form secret societies, companies, or formations of a secret, military, or paramilitary nature, or calling for favouring, supporting, or financing violence or terrorist organisations.\r6. Practice activities that would disturb public order, public morals, Emirati customs and traditions, or threaten the national security of the State.\r7. Call for discrimination between citizens or residents of the State on the basis of gender, origin, colour, language, religion or belief, or any activity that calls for racism, incitement to hatred, or other reasons that are contrary to the Constitution and the legislation in force in the State.\r8. Participate in supporting or financing the electoral campaigns of any candidate in elections and referendums, or presenting a candidate in those elections on behalf of the Association.\r9. Grant any professional or applied certificates without authorisation from the Competent Authorities in the State, or without an official partnership with one of the specialised universities or the Competent Authorities, and in accordance with the rules regulating this in the State.\r10. Practice any Public Welfare Activities outside the spatial scope of the licence issued to him by the Competent Authority.\r11. Practice any Activities that require a licence or approval from a governmental entity, before obtaining a licence or approval from that entity and the Competent Authority.\r12. Aim to make a profit for the Members of a Public Welfare Association, or engaging in an activity aimed at that, or distributing the Funds of a Public Welfare Association to its Members, employees, or those responsible for its management.\r13. Conduct opinion polls, publishing or making their results available, or conducting field research or presenting their results, without obtaining prior approval from the Ministry and the relevant authorities in the State.\r14. Conclude agreement in any form with a foreign party outside the State before the Ministry approval, as well as any amendment to it.\r15. Deal in any way with embassies, consulates and diplomatic missions without obtaining permission from the Competent Authority, and without the approval of the Ministry of Foreign Affairs in accordance with the procedures followed in this regard.\r16. Open branches or Offices outside the State.\r17. Interfere in the work of any State or Local Government Authority.\r18. Represent any individual or group before the Court in any lawsuits related to the interests of these individuals or groups.\r19. Raise and disseminate information that urges non-respect for the Constitution, laws and legislation in force in the State, non-respect for judicial rulings, or prevention of their implementation.\r20. Publish information, news, or propaganda that would prejudice public order or harm the public interest, public security, or public morals.\r21. Hold courses, workshops, Meetings or seminars, whether inside or outside the State, that would harm public order, harm the public interest or public security, or harm public morals.\r22. Work in any way under political cover.\r23. Any other prohibitions in implementation of the legislation in force in the State.\r24. Any other prohibitions determined by the Competent Authority, pursuant to the Resolutions issued by it in this regard.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
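
The model was trained with MatryoshkaLoss over 768 and 512 dimensions and is evaluated at both sizes, so embeddings can plausibly be shortened to their first 512 values with little loss in retrieval quality. Sentence Transformers exposes this through the `truncate_dim` argument when loading a model. The sketch below shows the underlying operation on hypothetical 4-dimensional vectors: keep a prefix, renormalize, then compare with cosine similarity.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else list(head)

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings standing in for 768-dimensional model outputs.
query = [0.5, 0.5, 0.5, 0.5]
document = [0.5, 0.5, -0.5, 0.5]

full = cosine_similarity(query, document)
short = cosine_similarity(
    truncate_and_normalize(query, 2),
    truncate_and_normalize(document, 2),
)
print(round(full, 4), round(short, 4))
```

Scores at the shorter size generally differ from the full-size scores, so thresholds tuned at 768 dimensions should be rechecked at 512.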

Evaluation

Metrics

Information Retrieval

Metric dim_768 dim_512
cosine_accuracy@1 0.0178 0.0153
cosine_accuracy@3 0.2606 0.2606
cosine_accuracy@5 0.4958 0.4941
cosine_accuracy@10 0.7156 0.7114
cosine_precision@1 0.0178 0.0153
cosine_precision@3 0.0869 0.0869
cosine_precision@5 0.0992 0.0988
cosine_precision@10 0.0716 0.0711
cosine_recall@1 0.0178 0.0153
cosine_recall@3 0.2606 0.2606
cosine_recall@5 0.4958 0.4941
cosine_recall@10 0.7156 0.7114
cosine_ndcg@10 0.3202 0.3172
cosine_mrr@10 0.1982 0.1957
cosine_map@100 0.2108 0.2082
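
The table is consistent with every query having exactly one relevant document: accuracy@k equals recall@k, and precision@k equals recall@k divided by k (for example, 0.2606 / 3 ≈ 0.0869). Under that assumption the metrics reduce to simple functions of the rank at which the relevant document appears. The function below is an illustrative sketch with made-up ranks, not the evaluator used to produce the table.

```python
import math

def retrieval_metrics(ranks, k):
    """Compute top-k metrics when each query has one relevant document.

    ranks[i] is the 1-based rank of query i's relevant document,
    or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most one relevant item in the top k
        "recall": accuracy,         # one relevant item per query
        "mrr": sum(1 / r for r, hit in zip(ranks, hits) if hit) / n,
        # The ideal DCG is 1 with a single relevant document.
        "ndcg": sum(1 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n,
    }

# Hypothetical ranks for four queries.
metrics = retrieval_metrics([1, 3, 4, None], k=3)
print(metrics)

# Consistency check against the dim_768 column of the table.
print(round(0.2606 / 3, 4), round(0.4958 / 5, 4), round(0.7156 / 10, 4))
# 0.0869 0.0992 0.0716
```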

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 22,370 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
              anchor         positive
    type      string         string
    min       22 tokens      4 tokens
    mean      41.92 tokens   264.59 tokens
    max       77 tokens      512 tokens
  • Samples:
    • anchor: Represent this sentence for searching relevant passages: Under Section 52(1)(d) of the Bharatiya Sakshya Adhiniyam, 2023, are courts required to take judicial notice of the seals of all courts and tribunals?
      positive: Document: (1) The Court shall take judicial notice of thefollowing facts, namely:--(a) all laws in force in the territory of India including laws having extra-territorial operation;(b) international treaty, agreement or convention with country or countries by India, or decisionsmade by India at international associations or other bodies;(c) the course of proceeding of the Constituent Assembly of India, of Parliament of India and ofthe State Legislatures;(d) the seals of all Courts and Tribunals;(e) the seals of Courts of Admiralty and Maritime Jurisdiction, Notaries Public, and all sealswhich any person is authorised to use by the Constitution, or by an Act of Parliament or StateLegislatures, or Regulations having the force of law in India;(f) the accession to office, names, titles, functions, and signatures of the persons filling for thetime being any public office in any State, if the fact of their appointment to such office is notified inany Official Gazette;(g) the existence, title...
    • anchor: Represent this sentence for searching relevant passages: Is it permissible for a bankruptcy trustee to appoint the bankrupt to supervise the management of the estate, carry on his business, or assist in administering the estate under the Insolvency and Bankruptcy Code, 2016, Section 153?
      positive: Document: The bankruptcy trustee for the purposes of thisChapter may after procuring the approval of the committee of creditors,— (a) carry on any business of the bankrupt as far as may be necessary for winding it upbeneficially;(b) bring, institute or defend any legal action or proceedings relating to the property comprised inthe estate of the bankrupt;(c) accept as consideration for the sale of any property a sum of money due at a future timesubject to certain stipulations such as security;(d) mortgage or pledge any property for the purpose of raising money for the payment of thedebts of the bankrupt;(e) where any right, option or other power forms part of the estate of the bankrupt, makepayments or incur liabilities with a view to obtaining, for the benefit of the creditors, any propertywhich is the subject of such right, option or power;(f) refer to arbitration or compromise on such terms as may be agreed, any debts subsisting orsupposed to subsist between the bankrupt and any pers...
    • anchor: Represent this sentence for searching relevant passages: What insurance requirements are imposed on Federal Agencies occupying Union Owned Properties under Article (23) of the Federal Decree Concerning the Union Owned Properties?
      positive: Document: Article (23) Obligations of the Federal Authorities that occupy any of the Union Owned Properties 1. In addition to the obligations stipulated herein, every Federal Agency that occupies, manages, or supervises the management of any of the Union Owned Properties shall comply, as follows: a. Provide a report showing the legal and surveying status of that property, estimating its value, and indicating its architectural and constructional condition, along with attaching its construction plan and any data or any facts, documents or papers related in any way to the sources of its ownership or occupancy, within a period not exceeding (6) six months from the effective date herein. His authority shall provide the Ministry with a copy of this report immediately upon completion of its preparation, and it shall renew this data and provide the Ministry with a copy of it whenever necess...
  • Loss: MatryoshkaLoss with these parameters:
    {
        "loss": "MultipleNegativesRankingLoss",
        "matryoshka_dims": [
            768,
            512
        ],
        "matryoshka_weights": [
            1,
            1
        ],
        "n_dims_per_step": -1
    }
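
    These parameters describe a weighted sum of MultipleNegativesRankingLoss evaluated on the first 768 and the first 512 embedding dimensions. The sketch below illustrates that idea in plain Python on toy 4-dimensional vectors with prefixes of 4 and 2. The scale of 20 and the use of cosine similarity are assumptions taken to match the library defaults, and the function names are hypothetical rather than the library's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch cross-entropy: anchor i should score positive i above
    every other positive in the batch."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the peak for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over each embedding prefix length."""
    return sum(
        w * mnr_loss([a[:d] for a in anchors], [p[:d] for p in positives])
        for d, w in zip(dims, weights)
    )

# Toy batch of two (anchor, positive) pairs.
anchors = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
positives = [[1.0, 0.0, 0.0, 0.5], [0.0, 1.0, 0.5, 0.0]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(f"{loss:.2e}")
```

    With `n_dims_per_step` set to -1, every listed dimension contributes to each training step, which is what the sum over all of `dims` mirrors.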
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • gradient_accumulation_steps: 8
  • learning_rate: 2e-05
  • num_train_epochs: 4
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • bf16: True
  • tf32: True
  • load_best_model_at_end: True
  • optim: adamw_torch_fused
  • prompts: {'anchor': 'Represent this sentence for searching relevant passages: ', 'positive': 'Document: '}
  • batch_sampler: no_duplicates
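
The step count and learning-rate curve implied by these settings can be worked out by hand. The sketch below assumes a single device and the per-device batch size of 8 listed under All Hyperparameters. The helper names are hypothetical, and the schedule is the standard half-cosine decay after linear warmup rather than a copy of the trainer's code.

```python
import math

def training_steps(num_samples, per_device_batch, grad_accum, epochs):
    """Optimizer steps for a run, assuming one device."""
    batches_per_epoch = math.ceil(num_samples / per_device_batch)
    steps_per_epoch = batches_per_epoch // grad_accum
    return steps_per_epoch * epochs

def cosine_lr(step, total_steps, base_lr, warmup_ratio):
    """Linear warmup to base_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = training_steps(num_samples=22370, per_device_batch=8, grad_accum=8, epochs=4)
print(total)  # 1396

for step in (0, 70, 140, 768, total):
    print(step, f"{cosine_lr(step, total, base_lr=2e-05, warmup_ratio=0.1):.2e}")
```

The computed total of 1396 matches the last step in the training logs. Gradient accumulation raises the effective batch size to 64 pairs per optimizer step, but each forward pass still sees only 8 pairs, so the in-batch negatives available to the loss come from those 8.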

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 8
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: {'anchor': 'Represent this sentence for searching relevant passages: ', 'positive': 'Document: '}
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss dim_768_cosine_ndcg@10 dim_512_cosine_ndcg@10
0.0286 10 0.5889 - -
0.0572 20 0.4858 - -
0.0858 30 0.4432 - -
0.1144 40 0.3437 - -
0.1430 50 0.2103 - -
0.1716 60 0.1903 - -
0.2002 70 0.1414 - -
0.2288 80 0.1627 - -
0.2574 90 0.1609 - -
0.2860 100 0.0968 - -
0.3146 110 0.1367 - -
0.3432 120 0.1228 - -
0.3718 130 0.0891 - -
0.4004 140 0.1116 - -
0.4290 150 0.1173 - -
0.4576 160 0.1162 - -
0.4862 170 0.1124 - -
0.5148 180 0.1014 - -
0.5434 190 0.0767 - -
0.5720 200 0.0745 - -
0.6006 210 0.0691 - -
0.6292 220 0.094 - -
0.6578 230 0.0692 - -
0.6864 240 0.0471 - -
0.7151 250 0.0647 - -
0.7437 260 0.077 - -
0.7723 270 0.0551 - -
0.8009 280 0.0538 - -
0.8295 290 0.0863 - -
0.8581 300 0.0698 - -
0.8867 310 0.0599 - -
0.9153 320 0.0494 - -
0.9439 330 0.0746 - -
0.9725 340 0.0544 - -
0.9982 349 - 0.3143 0.3102
1.0021 350 0.06 - -
1.0307 360 0.09 - -
1.0593 370 0.0597 - -
1.0880 380 0.0613 - -
1.1166 390 0.0589 - -
1.1452 400 0.0309 - -
1.1738 410 0.0378 - -
1.2024 420 0.0417 - -
1.2310 430 0.0417 - -
1.2596 440 0.0412 - -
1.2882 450 0.0214 - -
1.3168 460 0.0374 - -
1.3454 470 0.0388 - -
1.3740 480 0.0188 - -
1.4026 490 0.0247 - -
1.4312 500 0.0275 - -
1.4598 510 0.0336 - -
1.4884 520 0.017 - -
1.5170 530 0.0234 - -
1.5456 540 0.0163 - -
1.5742 550 0.0193 - -
1.6028 560 0.0209 - -
1.6314 570 0.0252 - -
1.6600 580 0.02 - -
1.6886 590 0.0199 - -
1.7172 600 0.0162 - -
1.7458 610 0.0246 - -
1.7744 620 0.0133 - -
1.8030 630 0.017 - -
1.8316 640 0.0241 - -
1.8602 650 0.018 - -
1.8888 660 0.0186 - -
1.9174 670 0.0121 - -
1.9460 680 0.0264 - -
1.9746 690 0.0112 - -
1.9975 698 - 0.3174 0.3161
2.0043 700 0.0159 - -
2.0329 710 0.0295 - -
2.0615 720 0.0197 - -
2.0901 730 0.0252 - -
2.1187 740 0.019 - -
2.1473 750 0.0074 - -
2.1759 760 0.0122 - -
2.2045 770 0.0116 - -
2.2331 780 0.0113 - -
2.2617 790 0.0132 - -
2.2903 800 0.0112 - -
2.3189 810 0.0167 - -
2.3475 820 0.0078 - -
2.3761 830 0.0079 - -
2.4047 840 0.0072 - -
2.4333 850 0.008 - -
2.4619 860 0.0135 - -
2.4905 870 0.0087 - -
2.5191 880 0.0066 - -
2.5477 890 0.0052 - -
2.5763 900 0.0077 - -
2.6049 910 0.0084 - -
2.6335 920 0.0096 - -
2.6621 930 0.0067 - -
2.6907 940 0.0072 - -
2.7193 950 0.0061 - -
2.7479 960 0.0132 - -
2.7765 970 0.0061 - -
2.8051 980 0.0058 - -
2.8338 990 0.01 - -
2.8624 1000 0.0084 - -
2.8910 1010 0.0082 - -
2.9196 1020 0.0055 - -
2.9482 1030 0.0073 - -
2.9768 1040 0.0074 - -
2.9968 1047 - 0.323 0.3161
3.0064 1050 0.0086 - -
3.0350 1060 0.0127 - -
3.0636 1070 0.0083 - -
3.0922 1080 0.0111 - -
3.1208 1090 0.0091 - -
3.1494 1100 0.0037 - -
3.1780 1110 0.0074 - -
3.2066 1120 0.005 - -
3.2353 1130 0.006 - -
3.2639 1140 0.0071 - -
3.2925 1150 0.0062 - -
3.3211 1160 0.008 - -
3.3497 1170 0.0042 - -
3.3783 1180 0.003 - -
3.4069 1190 0.0049 - -
3.4355 1200 0.004 - -
3.4641 1210 0.0062 - -
3.4927 1220 0.0056 - -
3.5213 1230 0.0048 - -
3.5499 1240 0.0034 - -
3.5785 1250 0.0045 - -
3.6071 1260 0.0041 - -
3.6357 1270 0.0048 - -
3.6643 1280 0.0045 - -
3.6929 1290 0.0044 - -
3.7215 1300 0.0047 - -
3.7501 1310 0.0061 - -
3.7787 1320 0.0037 - -
3.8073 1330 0.0045 - -
3.8359 1340 0.0068 - -
3.8645 1350 0.0048 - -
3.8931 1360 0.0056 - -
3.9217 1370 0.0049 - -
3.9503 1380 0.0055 - -
3.9789 1390 0.004 - -
3.9961 1396 - 0.3202 0.3172
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.5
  • Sentence Transformers: 3.3.1
  • Transformers: 4.46.3
  • PyTorch: 2.4.1+cu121
  • Accelerate: 0.34.2
  • Datasets: 3.0.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}