metadata
language: bn
tags:
- bengali-ner
- bengali
- bangla
- NER
license: mit
datasets:
- wikiann
- xtreme
Multilingual BERT Bengali Named Entity Recognition
mBERT-Bengali-NER is a transformer-based Bengali NER model built with the bert-base-multilingual-uncased model and the WikiANN dataset.
How to Use
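from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline
# Load the fine-tuned tokenizer and token-classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("sagorsarker/mbert-bengali-ner")
model = AutoModelForTokenClassification.from_pretrained("sagorsarker/mbert-bengali-ner")
# aggregation_strategy="simple" merges word pieces into whole entities;
# it replaces the deprecated grouped_entities=True argument
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
# English: "I am Zahid and I live in Dhaka."
example = "আমি জাহিদ এবং আমি ঢাকায় বাস করি।"
ner_results = nlp(example)
print(ner_results)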
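When entities are grouped, the pipeline returns a list of dictionaries that include the keys `entity_group`, `score`, and `word` (plus character offsets). The following is a minimal sketch of one way to organize that output by entity type. The helper name `entities_by_type`, the `min_score` parameter, and the sample records are illustrative assumptions, not part of the model or the library, and the sample scores are hand-written rather than real model output.

```python
from collections import defaultdict


def entities_by_type(results, min_score=0.0):
    """Group pipeline results by entity type, keeping only confident ones.

    `results` is expected to be a list of dicts with at least the keys
    "entity_group", "score", and "word".
    """
    grouped = defaultdict(list)
    for item in results:
        if item["score"] >= min_score:
            grouped[item["entity_group"]].append(item["word"])
    return dict(grouped)


# Hand-written sample in the grouped output format (NOT real model output).
sample_results = [
    {"entity_group": "PER", "score": 0.99, "word": "জাহিদ"},
    {"entity_group": "LOC", "score": 0.42, "word": "ঢাকায়"},
]

# Keep every entity.
print(entities_by_type(sample_results))

# Keep only entities scored at 0.5 or higher (hypothetical threshold).
print(entities_by_type(sample_results, min_score=0.5))
```

In practice, `sample_results` would be replaced by `ner_results` from the snippet above, and the threshold chosen by inspecting scores on held-out text.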
Label and ID Mapping
Label ID | Label |
---|---|
0 | O |
1 | B-PER |
2 | I-PER |
3 | B-ORG |
4 | I-ORG |
5 | B-LOC |
6 | I-LOC |
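The labels follow the BIO scheme: `B-` marks the first token of an entity, `I-` a continuation, and `O` a token outside any entity. As a sketch of how predicted label IDs could be turned back into entity spans, the snippet below hard-codes the mapping from the table and merges consecutive tags. The function name `decode_spans` and the example tokens and IDs are assumptions chosen for illustration; the IDs are hand-assigned, not model predictions.

```python
# Mapping copied from the table above.
ID2LABEL = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}


def decode_spans(tokens, label_ids):
    """Merge BIO-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_type, current_tokens = None, []
    for token, label_id in zip(tokens, label_ids):
        prefix, _, ent_type = ID2LABEL[label_id].partition("-")
        # An I- tag of the same type extends the open span.
        if prefix == "I" and ent_type == current_type:
            current_tokens.append(token)
            continue
        # Anything else closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
        if prefix == "O":
            current_type, current_tokens = None, []
        else:
            # A B- tag (or a stray I- tag) starts a new span.
            current_type, current_tokens = ent_type, [token]
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hand-assigned label IDs for the example sentence (illustrative only).
tokens = ["আমি", "জাহিদ", "এবং", "আমি", "ঢাকায়", "বাস", "করি"]
label_ids = [0, 1, 0, 0, 5, 0, 0]
print(decode_spans(tokens, label_ids))
```

This sketch works on whole-word tokens. With the model's subword tokenizer, word pieces would need to be merged first, which the pipeline's entity grouping already handles.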
Training Details
- mBERT-Bengali-NER was trained on the WikiANN dataset.
- Training used the transformers-token-classification script.
- The model was trained for a total of 5 epochs.
- Training was run on a Kaggle GPU.
Evaluation Results
Model | F1 | Precision | Recall | Accuracy | Loss |
---|---|---|---|---|---|
mBERT-Bengali-NER | 0.97105 | 0.96769 | 0.97443 | 0.97682 | 0.12511 |
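F1 is the harmonic mean of precision and recall, so the reported figures can be checked against each other. The short sketch below recomputes F1 from the precision and recall in the table. The helper name `f1_score` is an assumption for illustration, and the snippet says nothing about how the original evaluation script computed its metrics.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the evaluation table.
precision, recall = 0.96769, 0.97443
f1 = f1_score(precision, recall)
print(round(f1, 5))  # 0.97105, matching the reported F1
```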