Model Description

This is a sentence-transformers model: it maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for tasks like clustering or semantic search. The 'Calcu_Disease_Similarity' model encodes two disease terms and computes their semantic similarity. It was fine-tuned on the disease-related dataset 'MeSHDS' and achieves a high F1 score in distinguishing experimentally validated miRNA-target interactions (MTIs) from predicted MTIs by considering disease similarity.

If you use this model in your research, please cite the following paper:

@article {Chen2024.05.17.594604,
    author = {Chen, Baiming},
    title = {Refining Protein-Level MicroRNA Target Interactions in Disease from Prediction Databases Using Sentence-BERT},
    elocation-id = {2024.05.17.594604},
    year = {2024},
    doi = {10.1101/2024.05.17.594604},
    publisher = {Cold Spring Harbor Laboratory},
    URL = {https://www.biorxiv.org/content/early/2024/09/18/2024.05.17.594604},
    eprint = {https://www.biorxiv.org/content/early/2024/09/18/2024.05.17.594604.full.pdf},
    journal = {bioRxiv}
}

Key Features:

  • Fine-tuned to compute semantic similarity between disease names.
  • Achieves an F1 score of 0.88 in distinguishing MTIs validated experimentally at the protein level (western blot, reporter assay) from predicted MTIs.
  • Built for applications in understanding miRNA-gene regulatory networks, disease diagnosis, treatment, and drug discovery.

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
  (2): Normalize()
)
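The last two modules in the architecture listing (mean-token pooling followed by normalization) can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the helper names `mean_pool` and `l2_normalize` and the toy 2-dimensional vectors are assumptions made for illustration only (the real model uses 384 dimensions).

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            for i, value in enumerate(vec):
                total[i] += value
            count += 1
    # Guard against an all-padding input to avoid division by zero.
    return [t / max(count, 1) for t in total]

def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: three token vectors, the last one being padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)  # mean of the first two vectors only
unit = l2_normalize(pooled)       # unit-length sentence embedding
```

Because the final embedding has unit length, the dot product of two such embeddings is the same as their cosine similarity.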

Usage (Sentence-Transformers)

pip install -U sentence-transformers

After installing the package, download all the files from the "Files and versions" section into a local folder named 'Calcu_Disease_Similarity'. You can then load the model and compute disease similarity as shown below:

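# Load the pre-trained SBERT model
from sentence_transformers import SentenceTransformer, util

# Replace 'your/path/to/Calcu_Disease_Similarity' with the actual path to the model folder
model = SentenceTransformer('your/path/to/Calcu_Disease_Similarity')

# Example usage
disease1 = "lung cancer"
disease2 = "pulmonary fibrosis"

def sts(sentence_a: str, sentence_b: str) -> float:
    """Return the semantic similarity score between two disease terms."""
    query_emb = model.encode(sentence_a)
    doc_emb = model.encode(sentence_b)
    # The model ends with a Normalize() layer, so the dot product of two
    # embeddings equals their cosine similarity.
    [score] = util.dot_score(query_emb, doc_emb)[0].tolist()
    return score

similarity = sts(disease1, disease2)
print(f"Similarity between '{disease1}' and '{disease2}': {similarity:.4f}")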
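A pairwise scorer like `sts` above can also be used to rank several candidate disease terms against a query. The following is a minimal sketch under stated assumptions: `rank_by_similarity` is a hypothetical helper, not part of sentence-transformers, and `toy_score` is a stand-in word-overlap scorer used only so the example runs without the model files. In practice you would pass `sts` in its place.

```python
from typing import Callable, List, Tuple

def rank_by_similarity(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score every candidate against the query and sort from most to least similar."""
    scored = [(candidate, score_fn(query, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

def toy_score(a: str, b: str) -> float:
    """Stand-in scorer (Jaccard word overlap); replace with `sts` to use the model."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

ranked = rank_by_similarity(
    "lung cancer",
    ["breast cancer", "pulmonary fibrosis", "lung cancer"],
    toy_score,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

With the real model, the ranking reflects semantic similarity rather than shared words, so the order may differ from what the stand-in scorer produces.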

Additional Information

License

This model is licensed under the CC BY-NC 4.0 International license. If you use this model, please adhere to the license requirements.

Questions or Issues

If you encounter any issues or have any questions while using the model, feel free to reach out to the author for assistance. Thank you for your support and for using this model!
