Mardiyyah/bioformer-ner-model


	---
	license: apache-2.0
	base_model: bioformers/bioformer-16L
	tags:
	- generated_from_trainer
	metrics:
	- f1
	- precision
	- recall
	- accuracy
	model-index:
	- name: cl_ct_custom_model
	  results: []
	datasets:
	- tner/bionlp2004
	language:
	- en
	pipeline_tag: token-classification
	inference: true
	library_name: transformers
	---


	# cl_ct_custom_model

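	This model is a fine-tuned version of [bioformers/bioformer-16L](https://huggingface.co/bioformers/bioformer-16L) on the [tner/bionlp2004](https://huggingface.co/datasets/tner/bionlp2004) dataset.
	It achieves the following results on the evaluation set (these figures match the micro-averaged scores under Test Results rather than the final validation row under Training results):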

	- Loss: 0.2590
	- F1: 0.7609
	- Precision: 0.7112
	- Recall: 0.8181
	- Accuracy: 0.9229
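
As a consistency check, F1 is the harmonic mean of precision and recall, so the reported value can be recomputed from the other two. A minimal sketch using only the figures listed above:

```python
# Recompute F1 from the reported precision and recall.
precision = 0.7112
recall = 0.8181

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # agrees with the reported 0.7609
```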

	## Model description

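	A token-classification (named entity recognition) model for English biomedical text, fine-tuned from bioformers/bioformer-16L. The reports below cover five entity classes: DNA, RNA, Cell Line, Cell Type, and Protein.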

	## Intended uses & limitations

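	Intended for tagging DNA, RNA, cell line, cell type, and protein mentions in English biomedical text. Quality varies by class: on the test set, Cell Line precision is 0.50 (F1 0.61), the weakest of the five. More information needed on other limitations.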
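
The sketch below shows one way the model might be loaded for inference with the `transformers` pipeline API. It is an illustration, not the author's documented usage: the helper names `load_ner_pipeline` and `group_by_label` are hypothetical, and the label strings the model emits depend on its config, which this card does not list.

```python
from collections import defaultdict

MODEL_ID = "Mardiyyah/bioformer-ner-model"


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily so group_by_label can be used without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def group_by_label(entities):
    """Collect entity strings per label from aggregated pipeline output."""
    grouped = defaultdict(list)
    for ent in entities:
        grouped[ent["entity_group"]].append(ent["word"])
    return dict(grouped)
```

With the model available, a call such as `group_by_label(load_ner_pipeline()("IL-2 gene expression requires NF-kappa B activation."))` would return a dictionary from label to the list of matched spans.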

	## Training and evaluation data

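	Training and evaluation use the [tner/bionlp2004](https://huggingface.co/datasets/tner/bionlp2004) dataset. The evaluation report below has a total support of 10,753 and the test report a total support of 16,397. More information needed.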

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 16
	- eval_batch_size: 8
	- seed: 3407
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 64
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 10
	- mixed_precision_training: Native AMP
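
The total train batch size follows from the per-device batch size and the gradient accumulation steps. The sketch below restates the list as keyword arguments and recomputes that total. The key names follow `transformers.TrainingArguments` parameters, but the mapping from this list to those parameters is an assumption, as is the single-device setup.

```python
# Hyperparameters from the list above, keyed by the corresponding
# transformers.TrainingArguments parameter names (the mapping is an assumption).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 3407,
    "gradient_accumulation_steps": 4,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # "Native AMP" mixed precision
}

# Assumption: the card does not state the device count; 16 * 4 = 64 implies one.
num_devices = 1

# Samples contributing to each optimizer step.
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)

print(total_train_batch_size)  # 64
```

These keyword arguments could be unpacked into `TrainingArguments` together with an `output_dir` of your choosing.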

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss | F1     | Precision | Recall | Accuracy |
	|:-------------:|:------:|:----:|:---------------:|:------:|:---------:|:------:|:--------:|
	| 0.4568        | 0.9971 | 259  | 0.2146          | 0.8139 | 0.7920    | 0.8370 | 0.9326   |
	| 0.2115        | 1.9981 | 519  | 0.1907          | 0.8349 | 0.8125    | 0.8586 | 0.9379   |
	| 0.1802        | 2.9990 | 779  | 0.1912          | 0.8407 | 0.8178    | 0.8650 | 0.9394   |
	| 0.164         | 4.0    | 1039 | 0.1869          | 0.8449 | 0.8255    | 0.8652 | 0.9401   |
	| 0.1518        | 4.9971 | 1298 | 0.1819          | 0.8525 | 0.8348    | 0.8710 | 0.9428   |
	| 0.1424        | 5.9981 | 1558 | 0.1842          | 0.8506 | 0.8351    | 0.8666 | 0.9422   |
	| 0.134         | 6.9990 | 1818 | 0.1869          | 0.8539 | 0.8373    | 0.8712 | 0.9428   |
	| 0.128         | 8.0    | 2078 | 0.1889          | 0.8540 | 0.8374    | 0.8712 | 0.9429   |
	| 0.1241        | 8.9971 | 2337 | 0.1892          | 0.8559 | 0.8401    | 0.8724 | 0.9432   |
	| 0.1199        | 9.9711 | 2590 | 0.1899          | 0.8552 | 0.8392    | 0.8718 | 0.9431   |
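
The fractional epoch values in the table follow from gradient accumulation: each logged step is one optimizer update, which consumes four micro-batches. The sketch below reproduces the Epoch column from the Step column. The figure of 1039 micro-batches per epoch is inferred from the row at step 1039 (epoch 4.0) and is not stated in the card.

```python
GRAD_ACCUM_STEPS = 4

# Inferred from the row "epoch 4.0, step 1039": 1039 steps * 4 micro-batches
# span exactly 4 epochs. Not stated in the card.
BATCHES_PER_EPOCH = 1039


def epoch_at_step(step: int) -> float:
    """Fraction of epochs completed after `step` optimizer updates."""
    return step * GRAD_ACCUM_STEPS / BATCHES_PER_EPOCH


for step in (259, 519, 779, 1039, 1298, 2590):
    print(step, round(epoch_at_step(step), 4))
```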

	## Eval Classification report

	| Class        | Precision | Recall | F1-Score | Support |
	|--------------|-----------|--------|----------|---------|
	| DNA          | 0.78      | 0.84   | 0.81     | 2494    |
	| RNA          | 0.83      | 0.89   | 0.86     | 238     |
	| Cell Line    | 0.81      | 0.85   | 0.83     | 1050    |
	| Cell Type    | 0.74      | 0.79   | 0.77     | 775     |
	| Protein      | 0.88      | 0.90   | 0.89     | 6196    |
	| Micro Avg    | 0.84      | 0.87   | 0.86     | 10753   |
	| Macro Avg    | 0.81      | 0.86   | 0.83     | 10753   |
	| Weighted Avg | 0.84      | 0.87   | 0.86     | 10753   |


	## Test Results

	| Class        | Precision | Recall | F1-Score | Support |
	|--------------|-----------|--------|----------|---------|
	| DNA          | 0.74      | 0.79   | 0.76     | 2210    |
	| RNA          | 0.73      | 0.76   | 0.75     | 287     |
	| Cell Line    | 0.50      | 0.76   | 0.61     | 1057    |
	| Cell Type    | 0.75      | 0.68   | 0.71     | 2761    |
	| Protein      | 0.72      | 0.87   | 0.79     | 10082   |
	| Micro Avg    | 0.71      | 0.82   | 0.76     | 16397   |
	| Macro Avg    | 0.69      | 0.77   | 0.72     | 16397   |
	| Weighted Avg | 0.72      | 0.82   | 0.76     | 16397   |
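
The macro and weighted averages can be approximately recomputed from the per-class rows: the macro average is the unweighted mean over classes, and the weighted average weights each class by its support. A minimal sketch using the rounded values from the test table, so results can differ from the table in the last digit:

```python
# (precision, recall, f1, support) per class, copied from the test table.
rows = {
    "DNA": (0.74, 0.79, 0.76, 2210),
    "RNA": (0.73, 0.76, 0.75, 287),
    "Cell Line": (0.50, 0.76, 0.61, 1057),
    "Cell Type": (0.75, 0.68, 0.71, 2761),
    "Protein": (0.72, 0.87, 0.79, 10082),
}

total_support = sum(r[3] for r in rows.values())


def macro(i: int) -> float:
    """Unweighted mean of column i across classes."""
    return sum(r[i] for r in rows.values()) / len(rows)


def weighted(i: int) -> float:
    """Support-weighted mean of column i across classes."""
    return sum(r[i] * r[3] for r in rows.values()) / total_support


for name, idx in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro(idx):.2f} weighted={weighted(idx):.2f}")
print("support:", total_support)
```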


	### Framework versions

	- Transformers 4.43.4
	- PyTorch 2.4.1+cu121
	- Datasets 2.20.0
	- Tokenizers 0.19.1