metadata
base_model: distilbert-base-cased
datasets:
  - conll2003
license: mit
metrics:
  - precision
  - recall
  - f1
  - accuracy
tags:
  - generated_from_trainer
model-index:
  - name: distilbert-finetuned-ner
    results:
      - task:
          type: token-classification
          name: Token Classification
        dataset:
          name: conll2003
          type: conll2003
          config: conll2003
          split: validation
          args: conll2003
        metrics:
          - type: precision
            value: 1
            name: Precision
          - type: recall
            value: 1
            name: Recall
          - type: f1
            value: 1
            name: F1
          - type: accuracy
            value: 1
            name: Accuracy

distilbert-finetuned-ner

This model is a fine-tuned version of distilbert-base-cased on the conll2003 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0711
  • Precision: 1.0
  • Recall: 1.0
  • F1: 1.0
  • Accuracy: 1.0

Model description

The distilbert-finetuned-ner model is designed for Named Entity Recognition (NER). It is based on DistilBERT, a distilled version of BERT that is smaller, faster, and lighter. The DistilBERT authors report that it retains about 97% of BERT's language-understanding performance (as measured on the GLUE benchmark) while being 60% faster and 40% smaller, which makes it a practical choice for production deployment.

Intended Uses & Limitations

How to use

You can use this model with the Transformers pipeline for NER:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
model_id = "amanpatkar/distilbert-finetuned-ner"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# Build a token-classification (NER) pipeline
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "My name is Aman Patkar and I live in Gurugram, India."

# Each result is a dict describing one tagged token (tag, score, character offsets)
ner_results = nlp(example)
print(ner_results)
```
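By default the pipeline returns one prediction per token, so a name that is split into several word pieces appears as several entries. The following is a minimal sketch of merging those entries into whole entities by their character offsets. The function name `group_entities` and the sample `tokens` list are illustrative: the list is hand-written in the shape the pipeline returns (`entity`, `word`, `start`, `end`), not actual output from this model.

```python
def group_entities(text, tokens):
    """Merge token-level NER predictions into entity spans via character offsets."""
    spans = []
    for tok in tokens:
        if tok["entity"] == "O":
            continue  # skip non-entity tokens if they are present
        prefix, _, label = tok["entity"].partition("-")
        if prefix == "I" and spans and spans[-1]["label"] == label:
            # An I- tag of the same type extends the current entity
            spans[-1]["end"] = tok["end"]
        else:
            # A B- tag (or a stray I- tag) starts a new entity
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


example = "My name is Aman Patkar and I live in Gurugram, India."

# Hand-written sample in the pipeline's output shape (hypothetical predictions)
tokens = [
    {"entity": "B-PER", "word": "Aman", "start": 11, "end": 15},
    {"entity": "I-PER", "word": "Pat", "start": 16, "end": 19},
    {"entity": "I-PER", "word": "##kar", "start": 19, "end": 22},
    {"entity": "B-LOC", "word": "Gu", "start": 37, "end": 39},
    {"entity": "I-LOC", "word": "##rugram", "start": 39, "end": 45},
    {"entity": "B-LOC", "word": "India", "start": 47, "end": 52},
]

entities = group_entities(example, tokens)
for ent in entities:
    print(ent["label"], ent["text"])
```

In practice the pipeline can perform a similar merge itself when it is created with `aggregation_strategy="simple"`; the sketch above only shows what that grouping amounts to.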

Intended Uses

  • Named Entity Recognition (NER): Extracting entities such as names, locations, organizations, and miscellaneous entities from text.
  • Information Extraction: Automatically identifying and classifying key information in documents.
  • Text Preprocessing: Supplying entity annotations as input features for downstream tasks such as sentiment analysis and text summarization.

Limitations

  • Domain Specificity: The model is trained on the CoNLL-2003 dataset, which primarily consists of newswire data. Performance may degrade on text from different domains.
  • Language Limitation: This model is trained on English text. It may not perform well on text in other languages.
  • Precision in Complex Sentences: While the model performs well on standard sentences, complex sentence structures or ambiguous contexts might pose challenges.

Training and evaluation data

The model is fine-tuned on the CoNLL-2003 dataset, a widely used benchmark for training and evaluating NER systems. The dataset includes four types of named entities: Persons (PER), Organizations (ORG), Locations (LOC), and Miscellaneous (MISC). Each token carries one of the following tags:

| Abbreviation | Description |
|---|---|
| O | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity |
| I-MISC | Inside a miscellaneous entity |
| B-PER | Beginning of a person's name |
| I-PER | Inside a person's name |
| B-ORG | Beginning of an organization |
| I-ORG | Inside an organization |
| B-LOC | Beginning of a location |
| I-LOC | Inside a location |
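The tag set can be written down as a label mapping plus a consistency check. This is a sketch under two assumptions: that the labels follow the IOB2 convention of the Hugging Face `conll2003` dataset, in which every entity starts with a B- tag, and that the label order matches that dataset's feature definition. The order actually used by this checkpoint should be read from `model.config.id2label`; the helper name `is_valid_iob2` is illustrative.

```python
# Label order assumed to match the Hugging Face conll2003 dataset (not confirmed
# for this checkpoint; check model.config.id2label)
LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]
id2label = dict(enumerate(LABELS))
label2id = {label: idx for idx, label in id2label.items()}


def is_valid_iob2(tags):
    """Return True if every I-X tag directly follows a B-X or I-X tag of the same type."""
    prev_type = None  # entity type of the previous tag, None after "O"
    for tag in tags:
        if tag == "O":
            prev_type = None
            continue
        prefix, _, ent_type = tag.partition("-")
        if prefix == "I" and prev_type != ent_type:
            return False
        prev_type = ent_type
    return True


print(is_valid_iob2(["B-PER", "I-PER", "O", "B-LOC"]))
print(is_valid_iob2(["O", "I-PER"]))
```

A check like this can be useful when post-processing raw predictions, since a token classifier predicts each tag independently and may emit an I- tag with no preceding B- tag.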

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
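To make the `linear` scheduler setting concrete, here is a minimal sketch of the learning rate it implies at each optimizer step. It assumes no warmup (no warmup value is listed above) and takes the total of 5268 steps from the training results table. The figure of 14,041 training sentences is the commonly cited size of the CoNLL-2003 training split and is not stated in this card. The function name `linear_lr` is illustrative.

```python
import math


def linear_lr(step, base_lr=2e-5, total_steps=5268, warmup_steps=0):
    """Learning rate under linear decay to zero, with optional linear warmup."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Steps per epoch implied by batch size 8 (training-set size is an assumption)
steps_per_epoch = math.ceil(14041 / 8)
total_steps = steps_per_epoch * 3

# Learning rate at the start of training and at the end of each epoch
for step in (0, steps_per_epoch, 2 * steps_per_epoch, total_steps):
    print(step, linear_lr(step, total_steps=total_steps))
```

Under these assumptions the step count works out to 1756 per epoch, which agrees with the step column in the training results.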

Training results

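| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|---|---|---|
| 0.0908 | 1.0 | 1756 | 0.0887 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0467 | 2.0 | 3512 | 0.0713 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0276 | 3.0 | 5268 | 0.0711 | 1.0 | 1.0 | 1.0 | 1.0 |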
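Scores of exactly 1.0 in every column are unusual for CoNLL-2003, so it may be worth recomputing them on the validation split. The sketch below shows how entity-level precision, recall, and F1 differ from token-level accuracy for a single sentence. It assumes IOB2 tags and exact-match span scoring, the approach taken by libraries such as seqeval; the card does not say how its metrics were computed. The function names and the sample tag sequences are illustrative.

```python
def extract_spans(tags):
    """Return the set of (type, start, end) entity spans in an IOB2 tag sequence."""
    spans, start, ent_type = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # trailing "O" closes the last span
        prefix, _, tag_type = tag.partition("-")
        continues = prefix == "I" and tag_type == ent_type
        if ent_type is not None and not continues:
            spans.add((ent_type, start, i))
            start, ent_type = None, None
        if prefix == "B" or (prefix == "I" and ent_type is None):
            start, ent_type = i, tag_type  # a stray I- tag is treated as a span start
    return spans


def entity_scores(gold, pred):
    """Entity-level precision/recall/F1 plus token-level accuracy for one sentence."""
    gold_spans, pred_spans = extract_spans(gold), extract_spans(pred)
    correct = len(gold_spans & pred_spans)  # spans must match in type and boundaries
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical example: the prediction truncates the two-token person name
gold = ["B-PER", "I-PER", "O", "O", "B-LOC", "O", "B-LOC"]
pred = ["B-PER", "O", "O", "O", "B-LOC", "O", "B-LOC"]
scores = entity_scores(gold, pred)
print(scores)
```

A single wrong token costs a whole entity at the span level while barely moving token accuracy, so the two kinds of metric normally diverge on real data.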

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.3.1
  • Datasets 2.20.0
  • Tokenizers 0.19.1