LukeGPT88 committed
Commit 59db909 · verified · 1 Parent(s): ffbf15d

Update README.md

Files changed (1):
  1. README.md +8 -18

README.md CHANGED
@@ -7,7 +7,7 @@ metrics:
  - f1
  - accuracy
  model-index:
- - name: distilbert-NER
+ - name: distilbert-base-eng-cased-ner
    results: []
  datasets:
  - conll2003
@@ -20,7 +20,7 @@ pipeline_tag: token-classification
 
  ## Model description
 
- **distilbert-NER** is the fine-tuned version of **DistilBERT**, which is a distilled variant of the BERT model. DistilBERT has fewer parameters than BERT, making it smaller, faster, and more efficient. distilbert-NER is specifically fine-tuned for the task of **Named Entity Recognition (NER)**.
+ **distilbert-base-eng-cased-ner** is the fine-tuned version of **DistilBERT**, which is a distilled variant of the BERT model. DistilBERT has fewer parameters than BERT, making it smaller, faster, and more efficient. distilbert-NER is specifically fine-tuned for the task of **Named Entity Recognition (NER)**.
 
  This model accurately identifies the same four types of entities as its BERT counterparts: location (LOC), organizations (ORG), person (PER), and Miscellaneous (MISC). Although it is a more compact model, distilbert-NER demonstrates a robust performance in NER tasks, balancing between size, speed, and accuracy.
 
@@ -36,8 +36,8 @@ This model can be utilized with the Transformers *pipeline* for NER, similar to
  from transformers import AutoTokenizer, AutoModelForTokenClassification
  from transformers import pipeline
 
- tokenizer = AutoTokenizer.from_pretrained("LukeGPT88/distilbert-NER")
- model = AutoModelForTokenClassification.from_pretrained("LukeGPT88/distilbert-NER")
+ tokenizer = AutoTokenizer.from_pretrained("LukeGPT88/distilbert-base-eng-cased-ner")
+ model = AutoModelForTokenClassification.from_pretrained("LukeGPT88/distilbert-base-eng-cased-ner")
 
  nlp = pipeline("ner", model=model, tokenizer=tokenizer)
  example = "My name is Wolfgang and I live in Berlin"
@@ -48,7 +48,7 @@ print(ner_results)
 
  #### Limitations and bias
 
- The performance of distilbert-NER is linked to its training on the CoNLL-2003 dataset. Therefore, it might show limited effectiveness on text data that significantly differs from this training set. Users should be aware of potential biases inherent in the training data and the possibility of entity misclassification in complex sentences.
+ The performance of distilbert-base-eng-cased-ner is linked to its training on the CoNLL-2003 dataset. Therefore, it might show limited effectiveness on text data that significantly differs from this training set. Users should be aware of potential biases inherent in the training data and the possibility of entity misclassification in complex sentences.
 
 
  ## Training data
@@ -85,20 +85,10 @@ Train |946 |14,987 |203,621
  Dev |216 |3,466 |51,362
  Test |231 |3,684 |46,435
 
- ## Training procedure
+ ## Training procedure and Eval Results
 
- This model was trained on a single NVIDIA V100 GPU with recommended hyperparameters from the [original BERT paper](https://arxiv.org/pdf/1810.04805) which trained & evaluated the model on CoNLL-2003 NER task.
-
- ## Eval results
- | Metric | Score |
- |------------|-------|
- | Loss | 0.0710|
- | Precision | 0.9202|
- | Recall | 0.9232|
- | F1 | 0.9217|
- | Accuracy | 0.9810|
-
- The training and validation losses demonstrate a decrease over epochs, signaling effective learning. The precision, recall, and F1 scores are competitive, showcasing the model's robustness in NER tasks.
+ Training and evaluation results come from the model on
+ https://huggingface.co/dslim/distilbert-NER
 
  ### BibTeX entry and citation info
 
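
The usage snippet in the card is cut off in the diff before the pipeline is actually invoked (only the hunk context shows `print(ner_results)`). Below is a minimal end-to-end sketch of the same pipeline call, assuming the `LukeGPT88/distilbert-base-eng-cased-ner` checkpoint referenced in the diff is available on the Hub; the `aggregation_strategy` argument is an optional addition not present in the card's snippet.

```python
# Minimal sketch of the card's usage example, completed end to end.
# Assumes the LukeGPT88/distilbert-base-eng-cased-ner checkpoint is reachable on the Hub;
# actual scores and spans depend on the published weights.
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_id = "LukeGPT88/distilbert-base-eng-cased-ner"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# aggregation_strategy="simple" is optional; it groups word-piece tags into whole entities.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

example = "My name is Wolfgang and I live in Berlin"
ner_results = nlp(example)
for entity in ner_results:
    # Each aggregated result has an entity group (PER, ORG, LOC, MISC), a confidence
    # score, the matched text, and character offsets into the input string.
    print(entity["entity_group"], entity["word"], entity["start"], entity["end"], entity["score"])
```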
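The model description lists the four CoNLL-2003 entity types (PER, ORG, LOC, MISC). A quick way to confirm the checkpoint's exact tag set is to read its config; the mapping shown in the comment is the conventional IOB2 scheme for CoNLL-2003 and is an assumption, not a quote from the card.

```python
# Sketch: inspect the checkpoint's tag set instead of assuming it.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("LukeGPT88/distilbert-base-eng-cased-ner")
print(config.id2label)
# For a CoNLL-2003 NER head this is conventionally the IOB2 set, e.g.:
# {0: "O", 1: "B-PER", 2: "I-PER", 3: "B-ORG", 4: "I-ORG",
#  5: "B-LOC", 6: "I-LOC", 7: "B-MISC", 8: "I-MISC"}
```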
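The removed "Eval results" table reports entity-level precision, recall, and F1, which for CoNLL-2003-style NER are conventionally computed with seqeval. The sketch below uses made-up tag sequences purely to show the call pattern; it is not the evaluation pipeline used for this model.

```python
# Sketch: entity-level precision/recall/F1 in the style of the removed "Eval results" table.
# The tag sequences here are illustrative stand-ins, not real model output.
from seqeval.metrics import precision_score, recall_score, f1_score

y_true = [["B-PER", "O", "O", "B-LOC", "I-LOC", "O"]]
y_pred = [["B-PER", "O", "O", "B-LOC", "O", "O"]]

print("precision:", precision_score(y_true, y_pred))  # correct entities / predicted entities
print("recall:   ", recall_score(y_true, y_pred))     # correct entities / gold entities
print("f1:       ", f1_score(y_true, y_pred))
```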