modernbert-embed-base-biencoder-human-rights

This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/modernbert-embed-base
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
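
The Pooling and Normalize modules listed above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration suggests; the helper names `mean_pool` and `l2_normalize` and the toy vectors are hypothetical and are not part of the library.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    return [value / max(count, 1) for value in total]


def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


# Toy input: three 2-dimensional token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # padding token is ignored
embedding = l2_normalize(pooled)      # unit-length sentence embedding
print(pooled)
print(embedding)
```

Because the final vectors have unit length, the cosine similarity of two embeddings reduces to their dot product.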

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sdiazlor/modernbert-embed-base-biencoder-human-rights")
# Run inference
sentences = [
    # The passage is one string, written as adjacent literals that Python concatenates.
    (
        "**US Civil Rights Act of 1964**\n\nThe landmark legislation outlawed segregation in public facilities, employment, and education. It protected individuals from discrimination based on race, color, religion, sex, and national origin. Title VII prohibits employment discrimination, Title II addressed public accommodations, and Title VI ensured equal access to education and federal funding.\n\n**Brown v. Board of Education (1954)**\n\nThe US Supreme Court decision declared segregation in public schools unconstitutional. The court ruled that separate educational facilities are inherently unequal, leading to the desegregation of schools across the US. This decision was a significant milestone in the Civil Rights Movement.\n\n**Canadian Charter of Rights and Freedoms**\n\nThe Canadian Charter, implemented in 1982, enshrines fundamental freedoms, including freedom of expression and equality before the law. Section 15 ensures equal protection and benefit of the law for all individuals, regardless of their identity.\n\n**Mandela's Fight against Apartheid**\n\nNelson Mandela played a pivotal role in the fight against apartheid in South Africa. His release from prison in 1990 marked a turning point in the struggle for equality and democracy. The African National Congress's efforts led to the establishment of a democratic government in 1994.\n\n**UN Declaration on Human Rights**\n\nThe Universal Declaration of Human Rights, adopted in 1948, outlines fundamental human rights and freedoms. Article 26 states that everyone has the right to education, while Article 7 emphasizes the prohibition of discrimination. These principles serve as a foundation for human rights globally.\n\n**Racial Discrimination Act 1975 (Australia)**\n\nThis Australian legislation makes it unlawful to discriminate against individuals based on their race, color, descent, or national or ethnic origin. "
        "The Act also prohibits indirect discrimination and promotes equal opportunity.\n\n**Civil Rights Act of 1967 (Canada)**\n\nThe Canadian Act prohibited discrimination in the provision of goods and services, accommodation, and employment. It was a significant step towards promoting equality and protecting the rights of marginalized groups in Canada.\n\n**Marbury v. Madison (1803)**\n\nIn this landmark US Supreme Court case, the court established the principle of judicial review. The decision ensured that the judiciary has the power to review and strike down laws that are deemed unconstitutional, safeguarding individual rights and liberties.\n\n**Equal Protection Clause**\n\nThe 14th Amendment to the US Constitution guarantees equal protection under the law for all citizens, regardless of their status. This clause has been instrumental in protecting the rights of marginalized groups and ensuring equal justice for all.\n\n**Women's Rights Movement**\n\nThe movement for women's suffrage and equality gained momentum in the late 19th and early 20th centuries. Key figures such as Elizabeth Cady Stanton and Susan B. Anthony led the charge for women's right to vote and equal rights in education and employment.\n\n**International Convention on the Elimination of All Forms of Racial Discrimination**\n\nAdopted in 1965, this international treaty obliges states to eliminate racial discrimination in all its forms. It promotes equality and encourages states to take proactive measures to prevent and combat racial discrimination.\n\n**The Unrepresented Nations and Peoples Organization (UNPO)**\n\nThis international organization advocates for the rights of unrepresented peoples and nations. The UNPO works towards promoting equality and self-determination for marginalized communities globally.\n\n**US Voting Rights Act of 1965**\n\nThis legislation protected the voting rights of African Americans and other minority groups. "
        "It eliminated literacy tests and ensured equal access to voting booths, contributing to increased voter turnout and representation.\n\n**Gideon v. Wainwright (1963)**\n\nIn this US Supreme Court case, the court ruled that indigent defendants have a right to an attorney in criminal cases. The decision ensured that individuals have access to equal justice, regardless of their financial situation.\n\n**Women's Right to Education**\n\nThe Convention on the Elimination of All Forms of Discrimination against Women (CEDAW) ensures equal access to education for women. The treaty promotes women's rights and encourages states to eliminate all forms of discrimination against women."
    ),
    'What is the significance of the landmark legislation that outlawed segregation in public facilities, employment, and education in the US?',
    'What is the primary implication of the landmark legislation that outlawed racial segregation in public facilities, employment, and education across major international airlines and transportation systems in the US?',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
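
To make the similarity step more concrete, here is a small pure-Python sketch of ranking candidate texts against a passage by cosine similarity. The vectors are hypothetical stand-ins for `model.encode(...)` output, and `cosine_similarity` and `rank_by_similarity` are illustrative helpers, not library functions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical 3-dimensional vectors; real embeddings have 768 dimensions.
passage = [0.6, 0.8, 0.0]
questions = [
    [0.5, 0.85, 0.1],  # stands in for the on-topic question
    [0.9, 0.1, 0.4],   # stands in for the off-topic question
]

ranking = rank_by_similarity(passage, questions)
for index, score in ranking:
    print(index, round(score, 4))
```

With real embeddings, the same ranking can be read from a row of the matrix returned by `model.similarity(embeddings, embeddings)`.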

Evaluation

Metrics

Triplet

Metric           Value
cosine_accuracy  0.9819
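
As a rough illustration of what a triplet cosine accuracy measures, the sketch below counts the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. This is an assumption about the metric's definition rather than the library's exact implementation, and the toy triplets are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def triplet_cosine_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if cosine_similarity(anchor, positive) > cosine_similarity(anchor, negative)
    )
    return correct / len(triplets)


# Hypothetical 2-dimensional triplets: (anchor, positive, negative).
triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),   # positive is closer
    ([1.0, 0.0], [0.1, 0.9], [0.8, 0.2]),   # negative is closer (a miss)
    ([0.0, 1.0], [0.2, 0.8], [1.0, 0.1]),   # positive is closer
    ([1.0, 1.0], [1.0, 0.9], [-1.0, 0.5]),  # positive is closer
]

accuracy = triplet_cosine_accuracy(triplets)
print(accuracy)
```

Under this definition, the reported 0.9819 is consistent with 163 of the 166 evaluation triplets being ordered correctly.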

Training Details

Training Dataset

Unnamed Dataset

  • Size: 662 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 662 samples:
    column    type    min       mean           max
    anchor    string  8 tokens  324.21 tokens  2194 tokens
    positive  string  7 tokens  23.84 tokens   79 tokens
    negative  string  8 tokens  36.85 tokens   146 tokens
  • Samples:
    Sample 1
    anchor:
    Final judgments

    1. The judgment of the Grand Chamber shall be final.

    2. The judgment of a Chamber shall become final

    (a) when the parties declare that they will not request that the

    case be referred to the Grand Chamber; or

    (b) three months after the date of the judgment, if reference of the case to the Grand Chamber has not been requested; or

    (c) when the panel of the Grand Chamber rejects the request

    to refer under Article 43.

    3. The final judgment shall be published.

    25

    ARTICLE 45
    positive: What is the final judgment in a Chamber of the Grand Chamber?
    negative: The judgment of the Grand Chamber shall be final for the Grand Prix.

    Sample 2
    anchor:
    (b) any service of a military character or, in case of conscientious objectors in countries where they are recognised, service exacted instead of compulsory military service;

    (c) any service exacted in case of an emergency or calamity

    threatening the life or well-being of the community;

    (d) any work or service which forms part of normal civic

    obligations.

    7
    positive: Is the service of a military character or service exacted in case of an emergency or calamity considered a civic obligation?
    negative: Any service of a military character or service exacted in case of a natural disaster threatening the economy is considered a civic duty.

    Sample 3
    anchor:
    Signature and ratification

    1. This Convention shall be open to the signature of the members of the Council of Europe. It shall be ratified. Ratifications shall be deposited with the Secretary General of the Council of Europe.

    2. The European Union may accede to this Convention.

    31

    3. The present Convention shall come into force after the deposit of ten instruments of ratification.
    positive: What are the requirements for signature and ratification of this Convention?
    negative: The Secretary General of the Council of Europe shall deposit the instruments of ratification for the new international treaty on environmental protection.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
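
The loss configuration above can be read as a hinge on Euclidean distances: the anchor should be closer to the positive than to the negative by at least the margin. A minimal sketch under that assumption follows; `triplet_loss` and `euclidean_distance` are hypothetical helpers, and the unit-length toy vectors are illustrative.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)."""
    gap = euclidean_distance(anchor, positive) - euclidean_distance(anchor, negative)
    return max(gap + margin, 0.0)


# Hypothetical unit-length embeddings.
anchor = [1.0, 0.0]
positive = [0.8, 0.6]
negative = [-1.0, 0.0]

loss = triplet_loss(anchor, positive, negative)
# Best case for unit vectors: positive equals anchor, negative is opposite.
best_case = triplet_loss(anchor, anchor, negative)
print(round(loss, 4), best_case)
```

One observation, offered tentatively: if the embeddings entering the loss are unit length (the architecture ends in a Normalize module), their Euclidean distance is at most 2, so with a margin of 5 the per-triplet loss cannot fall below 3. That would be consistent with the validation losses of roughly 3.5 in the training logs.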
    

Evaluation Dataset

Unnamed Dataset

  • Size: 166 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 166 samples:
    column    type    min        mean           max
    anchor    string  16 tokens  351.63 tokens  2268 tokens
    positive  string  10 tokens  23.37 tokens   59 tokens
    negative  string  14 tokens  36.6 tokens    133 tokens
  • Samples:
    Sample 1
    anchor:
    United States - Landmark Cases

    The landmark case of Brown v. Board of Education (1954) declared segregation in public schools unconstitutional. The ruling effectively overturned Plessy v. Ferguson (1896) and its "separate but equal" doctrine. The Civil Rights Act of 1964 prohibited discrimination in employment, public accommodations, and voting rights.

    Canada - Bill of Rights

    The Canadian Bill of Rights (1960) protects individuals from arbitrary state action, including racial and religious discrimination. It restricts the government's ability to infringe on fundamental freedoms, such as freedom of association and speech. The Canadian Human Rights Act (1977) prohibited discrimination in employment, housing, and services.

    India - Fundamental Rights

    The Indian Constitution (1950) guarantees fundamental rights, including equality, freedom of speech, and the right to life. The Scheduled Castes and Scheduled Tribes (Prevention of Atrocities) Act (1989) aims to protect vulner...
    positive: What are some landmark cases in the United States that declared segregation in public institutions unconstitutional?
    negative: What are some notable cases in the United States that declared the segregation of public institutions constitutional?

    Sample 2
    anchor:
    2. The Convention shall extend to the territory or territories named in the notification as from the thirtieth day after the receipt of this notification by the Secretary General of the Council of Europe.

    3. The provisions of this Convention shall be applied in such territories with due regard, however, to local requirements.
    positive: What day does the Convention extend to the territory or territories as from the thirtieth day after the receipt of a notification by the Secretary General?
    negative: The Convention shall extend to the territory of a private island as from the thirtieth day after the receipt of a notification by the developer's project manager.

    Sample 3
    anchor:
    Advisory opinions

    1. The Court may, at the request of the Committee of Ministers, give advisory opinions on legal questions concerning the interpretation of the Convention and the Protocols thereto.
    positive: What opinions does the Court give at the request of the Committee of Ministers?
    negative: The Committee of Experts may provide advisory opinions on technical questions concerning the interpretation of the Convention and the Protocols thereto.
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • gradient_accumulation_steps: 4
  • learning_rate: 2e-05
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • use_mps_device: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
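
To show how the learning-rate settings above interact, the following sketch computes a linear warmup followed by cosine decay. It assumes the usual shape of a cosine schedule with warmup, a total of 123 optimizer steps (taken from the last step in the training logs), and a single device, so the effective batch size is 4 × 4 = 16. The function name `cosine_lr_with_warmup` is hypothetical.

```python
import math


def cosine_lr_with_warmup(step, total_steps, peak_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


TOTAL_STEPS = 123  # assumption: the final step reported in the training logs

# per_device_train_batch_size * gradient_accumulation_steps, assuming one device
effective_batch_size = 4 * 4

for step in (0, 13, 68, 123):
    print(step, cosine_lr_with_warmup(step, TOTAL_STEPS))
```

Under these assumptions the rate climbs from 0 to 2e-05 over the first 13 steps, passes about half of the peak near step 68, and approaches 0 at step 123.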

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: True
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss  cosine_accuracy
1.0     42    -              3.6559           0.9699
2.0     84    -              3.5678           0.9880
2.3855  100   14.374         -                -
2.9398  123   -              3.4984           0.9819
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.11.4
  • Sentence Transformers: 3.3.1
  • Transformers: 4.49.0.dev0
  • PyTorch: 2.4.0
  • Accelerate: 0.34.0
  • Datasets: 2.21.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

Safetensors

  • Model size: 149M params
  • Tensor type: F32