---
tags:
- generated_from_trainer
datasets:
- jnlpba
widget:
- text: "The widespread circular form of DNA molecules inside cells creates very serious topological problems during replication. Due to the helical structure of the double helix the parental strands of circular DNA form a link of very high order, and yet they have to be unlinked before the cell division."
- text: "It consists of 25 exons encoding a 1,278-amino acid glycoprotein that is composed of 13 transmembrane domains"
metrics:
- precision
- recall
- f1
- accuracy
model-index:
- name: biobert-finetuned-ner
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: jnlpba
      type: jnlpba
      config: jnlpba
      split: train
      args: jnlpba
    metrics:
    - name: Precision
      type: precision
      value: 0.6550939663699308
    - name: Recall
      type: recall
      value: 0.7646040175479104
    - name: F1
      type: f1
      value: 0.7056253995312167
    - name: Accuracy
      type: accuracy
      value: 0.9107839603371846
---
|
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
|
# biobert-finetuned-ner

This model is a fine-tuned version of [dmis-lab/biobert-base-cased-v1.2](https://huggingface.co/dmis-lab/biobert-base-cased-v1.2) on the jnlpba dataset for token classification.
It achieves the following results on the evaluation set after the final (fifth) epoch:
- Loss: 0.5113
- Precision: 0.6551
- Recall: 0.7646
- F1: 0.7056
- Accuracy: 0.9108
|
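The reported F1 is consistent with being the harmonic mean of the reported precision and recall. A quick check, using the full-precision values from this card's metadata, might look like the sketch below. The helper name `f1_from` is illustrative and not part of any library.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the model-index metadata of this card.
precision = 0.6550939663699308
recall = 0.7646040175479104
reported_f1 = 0.7056253995312167

computed_f1 = f1_from(precision, recall)
print(round(computed_f1, 4))  # → 0.7056
```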
## Model description

More information needed

## Intended uses & limitations

More information needed
|
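The card does not yet document usage. As a starting point, a minimal sketch of running the model through the Transformers `pipeline` API might look like the following. The repository ID `your-username/biobert-finetuned-ner` is a placeholder (the card gives only the model name, not its Hub namespace), and `load_ner` and `summarize_entities` are illustrative helpers, not library functions.

```python
from typing import Dict, List, Tuple


def load_ner(model_id: str):
    """Build a token-classification pipeline for the given Hub model ID."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def summarize_entities(predictions: List[Dict]) -> List[Tuple[str, str, float]]:
    """Reduce aggregated pipeline output to (text, label, score) tuples."""
    return [
        (p["word"], p["entity_group"], round(float(p["score"]), 3))
        for p in predictions
    ]


# Example call (needs network access and the real Hub path of this model):
# ner = load_ner("your-username/biobert-finetuned-ner")  # placeholder ID
# print(summarize_entities(ner("It consists of 25 exons encoding a glycoprotein")))
```

The label names returned depend on the model's configuration, which this card does not list, so downstream code should read them from the output rather than hard-code them.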
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
|
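To make the `linear` scheduler setting concrete, here is a rough sketch of the learning rate it implies at each epoch boundary. It assumes zero warmup steps (no warmup setting is listed among the hyperparameters) and takes 2,319 steps per epoch and 11,595 total steps from the training results table. `linear_lr` is an illustrative helper, not a library function.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 11595,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


steps_per_epoch = 2319  # from the Step column of the results table
for epoch in range(6):
    step = epoch * steps_per_epoch
    print(f"epoch {epoch}: step {step:5d}, lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate drops by one fifth of its initial value per epoch, reaching 1.6e-05 after the first epoch and zero at step 11,595.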
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Precision | Recall | F1     | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:--------:|
| 0.1815        | 1.0   | 2319  | 0.2706          | 0.6538    | 0.7704 | 0.7073 | 0.9160   |
| 0.1226        | 2.0   | 4638  | 0.3230          | 0.6524    | 0.7675 | 0.7053 | 0.9118   |
| 0.0813        | 3.0   | 6957  | 0.3974          | 0.6483    | 0.7611 | 0.7002 | 0.9101   |
| 0.0521        | 4.0   | 9276  | 0.4529          | 0.6575    | 0.7652 | 0.7073 | 0.9121   |
| 0.0356        | 5.0   | 11595 | 0.5113          | 0.6551    | 0.7646 | 0.7056 | 0.9108   |
|
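The training results show training loss falling every epoch while validation loss rises, with F1 staying within a narrow band. One way to inspect this is to load the rows and pick the epoch that is best under each criterion. The snippet below is only a sketch: the numbers are copied from the training results table, and the tuple layout and variable names are illustrative.

```python
# (epoch, training_loss, validation_loss, f1) copied from the training results table.
rows = [
    (1, 0.1815, 0.2706, 0.7073),
    (2, 0.1226, 0.3230, 0.7053),
    (3, 0.0813, 0.3974, 0.7002),
    (4, 0.0521, 0.4529, 0.7073),
    (5, 0.0356, 0.5113, 0.7056),
]

# Epoch with the lowest validation loss.
best_by_loss = min(rows, key=lambda r: r[2])

# Epoch with the highest F1; max() keeps the earliest epoch on ties.
best_by_f1 = max(rows, key=lambda r: r[3])

# Spread between the best and worst F1 across epochs.
f1_spread = max(r[3] for r in rows) - min(r[3] for r in rows)

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1: epoch", best_by_f1[0])
print("F1 spread:", round(f1_spread, 4))
```

On these figures, epoch 1 has the lowest validation loss and ties epoch 4 for the highest F1 (0.7073), while the headline metrics come from epoch 5. The card does not describe how the final checkpoint was chosen, so this is an observation about the table rather than a statement of the selection procedure.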
### Framework versions

- Transformers 4.21.1
- PyTorch 1.12.1+cu113
- Datasets 2.4.0
- Tokenizers 0.12.1