ProtBert-BFD finetuned on Rosetta 20AA dataset

This model is ProtBert-BFD finetuned to predict Rosetta fold energy from a protein sequence. It was trained on a dataset of 100k sequences, each 20 amino acids long (20AA).

Current model in this repo: prot_bert_bfd-finetuned-032722_1752

Performance

  • 20AA sequences (1k eval set):
    MAE: 0.090115, R2: 0.991208, MSE: 0.013034, RMSE: 0.114165

  • 40AA sequences (10k eval set):
    MAE: 0.537456, R2: 0.659122, MSE: 0.448607, RMSE: 0.669781

  • 60AA sequences (10k eval set):
    MAE: 0.629267, R2: 0.506747, MSE: 0.622476, RMSE: 0.788972
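
For readers who want to reproduce these numbers on their own predictions, the four reported metrics (MAE, R2, MSE, RMSE) can be computed as in the sketch below. It assumes the standard definitions of each metric; the function name `regression_metrics` and the example values are hypothetical and do not come from the evaluation script used for this model.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, R2, MSE and RMSE for two equal-length lists of floats.

    Hypothetical helper: uses the standard textbook definitions, which may
    differ in detail from the evaluation code behind the numbers above.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n

    # R2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "r2": r2, "mse": mse, "rmse": math.sqrt(mse)}


# Made-up energies, for illustration only.
example = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(example)
```

Because RMSE is the square root of MSE, the two columns above carry the same information; R2 is the most useful one for comparing across the 20AA, 40AA and 60AA sets, since it is normalized by the variance of the target.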

prot_bert_bfd from ProtTrans

The starting pretrained model, prot_bert_bfd, is from ProtTrans. It was pretrained on 2.1 billion protein sequences from BFD using a masked language modeling (MLM) objective. It was introduced in this paper and first released in this repository.
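
The ProtTrans prot_bert_bfd tokenizer expects uppercase amino-acid letters separated by spaces, with the rare residues U, Z, O and B mapped to X. The sketch below shows that preprocessing step, followed by one possible way to run a prediction. The function names, the `model_path` argument, and the idea that this checkpoint loads through `AutoModelForSequenceClassification` with a single regression output are all assumptions, not something this model card confirms.

```python
import re


def format_sequence(sequence: str) -> str:
    """Format a raw protein sequence the way the ProtBert tokenizer expects."""
    cleaned = re.sub(r"\s+", "", sequence).upper()  # drop whitespace, uppercase
    cleaned = re.sub(r"[UZOB]", "X", cleaned)       # map rare residues to X
    return " ".join(cleaned)                        # one token per residue


def predict_energy(sequence: str, model_path: str) -> float:
    """Hypothetical inference helper (not called in this example).

    Assumes `model_path` points at a checkpoint with a single-output
    regression head; that layout is an assumption about this repo.
    """
    # Imported lazily so format_sequence works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path, do_lower_case=False)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()

    inputs = tokenizer(format_sequence(sequence), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.squeeze().item()


print(format_sequence("mktu"))
```

Given the metrics reported in this card, predictions for sequences much longer than 20 residues should be treated with caution.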

Created by Ladislav Rampasek
