youkad committed on
Commit dd21426 · 1 Parent(s): 825a784

Update README.md

  1. README.md +38 -1
README.md CHANGED
@@ -15,4 +15,41 @@ python_version: 3.10.4
15
  license: mit
16
  ---
17
 
18
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
18
+ # ProteinBind
19
+
20
+ [![View on GitHub](https://img.shields.io/badge/-View%20on%20GitHub-000?style=flat&logo=github&logoColor=white&link=https://github.com/svm-ai/svm-hackathon)](https://github.com/svm-ai/svm-hackathon)
21
+ [![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/BIOML-SVM/SVM)
22
+
23
+ ## ML-Driven Bioinformatics for Protein Mutation Analysis
24
+
25
+ This repository contains the source code and resources for a bioinformatics project that aims to identify how gene and protein mutations alter function and which mutations may be pathogenic. The approach is ML-driven and uses a multimodal contrastive learning framework inspired by Meta AI's ImageBind model.
26
+
27
+ ## Project Goal
28
+
29
+ Our goal is to develop a method that predicts the effect of sequence variation on the function of genes and proteins. This information is critical for understanding gene and protein function, designing new proteins, and aiding drug discovery. Modeling these effects can help select patients for clinical trials and adapt existing drug-like molecules for patients who have the same disease but carry different mutations and currently have no treatment.
30
+
31
+ ## Model Description
32
+
33
+ Our model uses contrastive learning across several modalities including amino acid (AA) sequences, Gene Ontology (GO) annotations, multiple sequence alignment (MSA), 3D structure, text annotations, and DNA sequences.
34
+
35
+ We use the following encoder for each modality:
36
+
37
+ - AA sequences: ESM v1/v2 by Meta AI
38
+ - Text annotations: Sentence-BERT (SBERT)
39
+ - 3D structure: ESMFold by Meta AI
40
+ - DNA nucleotide sequence: Nucleotide Transformer
41
+ - MSA sequence: MSA Transformer
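As a rough illustration of how outputs from different encoders could be compared, the sketch below projects per-modality embeddings into one shared space and L2-normalizes them. It is not the repository's implementation: the embedding widths, the shared dimension, the random projection weights (standing in for trainable projection heads), and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-modality embedding widths; real encoder output sizes
# depend on the checkpoint that is used.
ENCODER_DIMS = {"aa_sequence": 1280, "text": 384, "dna": 512}
SHARED_DIM = 256  # assumed size of the joint embedding space


def make_projection(in_dim, out_dim):
    """Random linear map standing in for a trainable projection head."""
    return rng.normal(scale=in_dim ** -0.5, size=(in_dim, out_dim))


def project(embedding, weights):
    """Map encoder outputs into the shared space and L2-normalize each row."""
    z = embedding @ weights
    return z / np.linalg.norm(z, axis=-1, keepdims=True)


heads = {name: make_projection(dim, SHARED_DIM) for name, dim in ENCODER_DIMS.items()}

# Random stand-ins for encoder outputs for a batch of 4 proteins.
batch = {name: rng.normal(size=(4, dim)) for name, dim in ENCODER_DIMS.items()}
shared = {name: project(batch[name], heads[name]) for name in ENCODER_DIMS}

# Cosine similarity between every AA-sequence and text embedding in the batch.
similarity = shared["aa_sequence"] @ shared["text"].T
print(similarity.shape)
```

Once every modality lives in the same normalized space, a dot product between any two modalities is a cosine similarity, which is the quantity a contrastive loss operates on.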
42
+
43
+
44
+ Contrastive training uses the NT-Xent (normalized temperature-scaled cross-entropy) loss.
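For reference, a minimal NumPy sketch of NT-Xent in its usual two-view form follows. It assumes row `i` of each input is the same protein seen through two modalities; the function name, the default temperature, and the toy inputs are illustrative and not taken from this repository.

```python
import numpy as np


def nt_xent(z_a, z_b, temperature=0.5):
    """NT-Xent loss for two batches of paired embeddings.

    Row i of z_a and row i of z_b form a positive pair; all other rows
    from both batches act as negatives.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity via dot product

    logits = (z @ z.T) / temperature
    np.fill_diagonal(logits, -np.inf)  # a sample is never its own negative

    # Index of the positive partner for each of the 2n rows.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax evaluated at the positive column.
    row_max = logits.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))
    log_prob = logits[np.arange(2 * n), positives] - log_norm
    return float(-log_prob.mean())


# Toy usage: "text" embeddings that are noisy copies of "sequence" embeddings.
rng = np.random.default_rng(0)
seq_emb = rng.normal(size=(8, 32))
text_emb = seq_emb + 0.05 * rng.normal(size=(8, 32))
print(nt_xent(seq_emb, text_emb))
```

Lower temperatures sharpen the softmax and penalize hard negatives more strongly; the value used here is only a placeholder.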
45
+
46
+ ## Getting Started
47
+
48
+ Clone the repository and install the necessary dependencies. Some files in this repository are tracked with Git Large File Storage (Git LFS), so install Git LFS before cloning.
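If the repository is cloned without Git LFS, tracked files arrive as small text pointer stubs that begin with `version https://git-lfs.github.com/spec/v1`. The helper below is a hypothetical convenience script, not part of this repository, that lists files still holding such stubs.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file is a Git LFS pointer stub rather than real content."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_unfetched(root="."):
    """List files under root that still contain LFS pointers."""
    return sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and ".git" not in p.parts and is_lfs_pointer(p)
    )
```

Calling `find_unfetched()` from the repository root returns an empty list when all LFS objects have been downloaded; otherwise `git lfs pull` should fetch the missing content.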
49
+
50
+ ## Contributing
51
+ Contributions are welcome! Please read the contributing guidelines before getting started.
52
+
53
+ ## License
54
+
55
+ This project is licensed under the terms of the MIT license.