AmelieSchreiber/esm2_t12_35M_lora_binding_sites_v2_cp1

Token Classification

protein language model


AmelieSchreiber committed on Sep 13, 2023

Commit 400bbb0 • 1 Parent(s): 49e961f

Update README.md

Files changed (1)

README.md +3 -1

@@ -26,7 +26,9 @@ and [here](https://huggingface.co/docs/transformers/model_doc/esm) for more deta
 the binary token classification task of predicting binding sites (and active sites) of protein sequences based on sequence alone.
 The model may be underfit and undertrained, however it still achieved better performance on the test set in terms of loss, accuracy,
 precision, recall, F1 score, ROC_AUC, and Matthews Correlation Coefficient (MCC) compared to the models trained on the smaller
-dataset [found here](https://huggingface.co/datasets/AmelieSchreiber/binding_sites_random_split_by_family) of ~209K protein sequences.
+dataset [found here](https://huggingface.co/datasets/AmelieSchreiber/binding_sites_random_split_by_family) of ~209K protein sequences. Note,
+this model has a high recall, meaning it is likely to detect binding sites, but it has a low precision, meaning the model will likely return
+false positives as well.
 ## Training procedure
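
The note added in this commit (high recall, low precision) can be made concrete with a small per-residue calculation. The following is a minimal sketch, not the author's evaluation code: the function name `precision_recall` and the toy label lists are hypothetical, with `1` marking a binding-site residue and `0` a non-binding residue.

```python
def precision_recall(true_labels, predicted_labels):
    """Return (precision, recall) for binary per-residue labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical 10-residue sequence with 3 real binding sites.
true_labels = [0, 1, 1, 0, 0, 0, 1, 0, 0, 0]
# A prediction that finds every real site but also flags 3 extra residues.
predicted_labels = [1, 1, 1, 0, 1, 0, 1, 1, 0, 0]

precision, recall = precision_recall(true_labels, predicted_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case every true binding site is recovered (recall of 1.0), but only half of the flagged residues are real sites (precision of 0.5), which is the behavior the note describes.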