---
datasets:
- EvaKlimentova/knots_AF
license: apache-2.0
---
# M2 - small CNN trained on embeddings
The model is trained on [ProtBert-BFD](https://huggingface.co/Rostlab/prot_bert_bfd) embeddings of the [knots_AF dataset](https://huggingface.co/datasets/EvaKlimentova/knots_AF) to distinguish knotted from unknotted proteins based on their amino acid sequence.
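
The snippet below is a minimal sketch of how ProtBert-BFD embeddings can be computed for a single sequence with the `transformers` library. It is not the exact pipeline used to train M2: the helper names `prepare_sequence` and `embed` are hypothetical, and returning per-residue vectors (rather than a pooled one) is an assumption. The preprocessing (space-separated residues, rare amino acids U, Z, O, B mapped to X) follows the ProtBert-BFD model card.

```python
import re


def prepare_sequence(sequence: str) -> str:
    """Format a raw amino acid string for the ProtBert tokenizer.

    Hypothetical helper: strips whitespace, upper-cases, maps the rare
    residues U, Z, O, B to X, and separates residues with spaces.
    """
    cleaned = re.sub(r"\s+", "", sequence).upper()
    cleaned = re.sub(r"[UZOB]", "X", cleaned)
    return " ".join(cleaned)


def embed(sequence: str, model_name: str = "Rostlab/prot_bert_bfd"):
    """Return per-residue embeddings as a tensor of shape (L, 1024).

    Assumption: per-residue output; the M2 input format may differ.
    Requires `torch` and `transformers`, and downloads the model weights.
    """
    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name, do_lower_case=False)
    model = BertModel.from_pretrained(model_name).eval()
    inputs = tokenizer(prepare_sequence(sequence), return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, L + 2, 1024)
    return hidden[0, 1:-1]  # drop the [CLS] and [SEP] positions
```

Calling `embed("MKVLAAGIVG")` would then yield one 1024-dimensional vector per residue, which a downstream classifier such as a small CNN could consume.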
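Performance on the test set, overall and per protein family (TPR is the fraction of knotted proteins classified correctly, TNR the fraction of unknotted proteins classified correctly):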
| Subset | Dataset size | Unknotted set size | Accuracy | TPR | TNR |
|:----------------------------:|:------------:|:------------------:|:--------:|:------:|:------:|
| All | 39412 | 19718 | 0.9690 | 0.9569 | 0.9811 |
| SPOUT | 7371 | 550 | 0.9712 | 0.9815 | 0.8436 |
| TDD | 612 | 24 | 0.9673 | 0.9796 | 0.6667 |
| DUF | 716 | 429 | 0.9413 | 0.8955 | 0.9720 |
| AdoMet synthase | 1794 | 240 | 0.9727 | 0.9755 | 0.9542 |
| Carbonic anhydrase | 1531 | 539 | 0.8870 | 0.8619 | 0.9332 |
| UCH | 477 | 125 | 0.8700 | 0.8892 | 0.8160 |
| ATCase/OTCase | 3799 | 3352 | 0.9932 | 0.9418 | 1.0000 |
| ribosomal-mitochondrial | 147 | 41 | 0.8163 | 0.8319 | 0.7805 |
| membrane | 8309 | 1577 | 0.9740 | 0.9857 | 0.9239 |
| VIT | 14347 | 12639 | 0.9742 | 0.8214 | 0.9948 |
| biosynthesis of lantibiotics | 392 | 286 | 0.9388 | 0.8019 | 0.9895 |
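
The columns of the table are related: accuracy is the class-size-weighted average of TPR and TNR, with knotted proteins as the positive class. The sketch below recomputes accuracy from the other columns as a consistency check; the function name `accuracy_from_rates` is hypothetical, and small deviations are expected because the reported rates are rounded to four decimals.

```python
def accuracy_from_rates(total: int, unknotted: int, tpr: float, tnr: float) -> float:
    """Recompute accuracy from class sizes and per-class rates.

    Knotted proteins are the positive class, so their count is
    total - unknotted.
    """
    knotted = total - unknotted
    correct = tpr * knotted + tnr * unknotted
    return correct / total


# Values copied from the "All" and "VIT" rows of the table.
overall = accuracy_from_rates(39412, 19718, 0.9569, 0.9811)
vit = accuracy_from_rates(14347, 12639, 0.8214, 0.9948)

print(f"All: {overall:.4f}")
print(f"VIT: {vit:.4f}")
```

Both values agree with the reported accuracies (0.9690 and 0.9742) to within rounding. For strongly imbalanced subsets such as TDD, where only 24 of 612 proteins are unknotted, accuracy mostly reflects the TPR, so the TNR column is worth reading alongside it.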