metadata

license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
  - generated_from_trainer
metrics:
  - precision
  - recall
  - f1
  - accuracy
model-index:
  - name: distilbert-base-uncased-finetuned-FiNER
    results: []
datasets:
  - nlpaueb/finer-139
language:
  - en
pipeline_tag: token-classification

distilbert-base-uncased-finetuned-FiNER

This model is a fine-tuned version of distilbert/distilbert-base-uncased, trained on a subset of the nlpaueb/finer-139 dataset. The subset was generated by filtering the dataset to keep only samples that contain at least one of the following NER tags:

'O',
'B-DebtInstrumentBasisSpreadOnVariableRate1',
'B-DebtInstrumentFaceAmount',
'B-LineOfCreditFacilityMaximumBorrowingCapacity',
'B-DebtInstrumentInterestRateStatedPercentage'

The model was then fine-tuned to detect only the four aforementioned entity tags, plus the "O" (outside) tag.
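The filtering and relabeling described above could be sketched as follows. This is a minimal illustration, not the author's actual preprocessing code: the helper names (`build_remap`, `has_target_entity`, `remap_tags`) are hypothetical, and the sketch assumes the filter is meant to keep samples with at least one of the four entity tags, since "O" alone would match almost every sample. It also assumes that every original tag outside the kept list is relabeled as "O".

```python
# Tags kept by this model, in the order listed in this card.
KEPT_TAGS = [
    "O",
    "B-DebtInstrumentBasisSpreadOnVariableRate1",
    "B-DebtInstrumentFaceAmount",
    "B-LineOfCreditFacilityMaximumBorrowingCapacity",
    "B-DebtInstrumentInterestRateStatedPercentage",
]
O_ID = 0  # index of "O" in KEPT_TAGS


def build_remap(all_tag_names):
    """Map each original tag id to its index in KEPT_TAGS; other tags map to "O"."""
    new_id = {name: i for i, name in enumerate(KEPT_TAGS)}
    return {old_id: new_id.get(name, O_ID) for old_id, name in enumerate(all_tag_names)}


def has_target_entity(tag_ids, remap):
    """True if the sample contains at least one of the four kept entity tags."""
    return any(remap[t] != O_ID for t in tag_ids)


def remap_tags(tag_ids, remap):
    """Rewrite a sample's tag ids into the reduced five-label space."""
    return [remap[t] for t in tag_ids]


# Toy label list standing in for the full FiNER-139 tag set (hypothetical order).
toy_tags = ["O", "B-SomeOtherTag", "B-DebtInstrumentFaceAmount"]
remap = build_remap(toy_tags)
print(remap_tags([0, 1, 2], remap))      # [0, 0, 2]
print(has_target_entity([0, 1], remap))  # False
```

With the datasets library, `has_target_entity` could be passed to `Dataset.filter` and `remap_tags` to `Dataset.map`, assuming the tag ids are stored in a column named `ner_tags`.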

It achieves the following results on the evaluation set:

Loss: 0.0336
Precision: 0.9154
Recall: 0.9327
F1: 0.9240
Accuracy: 0.9917
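As a quick consistency check, the reported F1 can be recomputed from the reported precision and recall. The sketch below assumes F1 is the harmonic mean of those two values, which is the usual definition; it does not reproduce the evaluation itself.

```python
# Values copied from the evaluation results above.
precision = 0.9154
recall = 0.9327

# F1 as the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.4f}")  # 0.9240
```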

Model description

The model is based on distilbert/distilbert-base-uncased, with all default parameters.

Intended uses & limitations

The model published here was trained for demo purposes only.
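For a quick demo, the model could be loaded through the Transformers `pipeline` API along the following lines. This is a sketch under assumptions: `tag_text` and `summarize` are hypothetical helper names, the Hub repository id is left as a parameter because this card does not state it, and the code assumes the checkpoint's config carries the tag names listed above.

```python
def summarize(predictions):
    """Keep non-"O" predictions as (word, tag, rounded score) tuples."""
    return [
        (p["word"], p["entity"], round(float(p["score"]), 3))
        for p in predictions
        if p["entity"] != "O"
    ]


def tag_text(text, model_id):
    """Run token classification on `text` with the model stored at `model_id`."""
    # Imported lazily so `summarize` can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline("token-classification", model=model_id)
    return summarize(tagger(text))


# Hand-written predictions in the pipeline's output format (not real model output).
example = [
    {"word": "the", "entity": "O", "score": 0.999},
    {"word": "4", "entity": "B-DebtInstrumentInterestRateStatedPercentage", "score": 0.98},
]
print(summarize(example))
```

Calling `tag_text(sentence, model_id)` with the repository id of this model downloads the weights on first use. The pipeline already drops "O" predictions by default, so the filter in `summarize` is only a safeguard.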

Training and evaluation data

The model uses the original train/validation/test splits from nlpaueb/finer-139, after filtering each split to keep only samples that contain at least one of the following NER tags:

'O',
'B-DebtInstrumentBasisSpreadOnVariableRate1',
'B-DebtInstrumentFaceAmount',
'B-LineOfCreditFacilityMaximumBorrowingCapacity',
'B-DebtInstrumentInterestRateStatedPercentage'

Training procedure

The training procedure is documented at https://github.com/bodias/DistilBERT-FiNER

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 6
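The hyperparameters above map onto `transformers.TrainingArguments` roughly as shown below. This is a sketch, not the author's training script: `make_training_args` and `output_dir` are hypothetical, and the batch sizes are assumed to be per-device values on a single device.

```python
# Hyperparameters from the list above, keyed by TrainingArguments field names.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes a single device
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 6,
}


def make_training_args(output_dir):
    """Build TrainingArguments from HYPERPARAMS; `output_dir` is caller-chosen."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```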

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy
0.0354	1.0	1773	0.0375	0.8639	0.8993	0.8812	0.9870
0.0242	2.0	3546	0.0296	0.8929	0.9159	0.9042	0.9895
0.0166	3.0	5319	0.0297	0.9079	0.9208	0.9143	0.9907
0.0117	4.0	7092	0.0303	0.9101	0.9293	0.9196	0.9913
0.0086	5.0	8865	0.0328	0.9065	0.9331	0.9196	0.9913
0.0062	6.0	10638	0.0336	0.9154	0.9327	0.9240	0.9917
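The step counts in the table follow from 1773 optimizer steps per epoch over 6 epochs. The sketch below recomputes the total and the learning rate at each epoch boundary under a linear schedule. It assumes decay to zero with no warmup steps, which is not stated in this card, and `linear_lr` is a hypothetical helper.

```python
STEPS_PER_EPOCH = 1773  # from the Step column after epoch 1
NUM_EPOCHS = 6
BASE_LR = 2e-5
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 10638, matching the last row


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR):
    """Learning rate after `step` steps under linear decay to zero, no warmup."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


for epoch in range(1, NUM_EPOCHS + 1):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch}: step {step}, lr {linear_lr(step):.2e}")
```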

Framework versions

Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2