Bengali Named Entity Recognition
This model fine-tunes bert-base-multilingual-cased on the WikiANN dataset to perform named entity recognition (NER) on Bengali text.
Label IDs and their corresponding label names
Label ID | Label Name |
---|---|
0 | O |
1 | B-PER |
2 | I-PER |
3 | B-ORG |
4 | I-ORG |
5 | B-LOC |
6 | I-LOC |
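The table above follows the usual BIO scheme: a `B-` tag opens an entity and the `I-` tags that follow continue it. As a minimal sketch, assuming one predicted label ID per word, the IDs can be grouped into entity spans as shown here. The names `ID2LABEL` and `decode_spans` are illustrative and are not part of the model repository.

```python
# Label mapping copied from the table above.
ID2LABEL = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}


def decode_spans(label_ids):
    """Group BIO label IDs into (entity_type, start, end) spans, end exclusive."""
    spans = []
    current_type, start = None, None
    for i, label_id in enumerate(label_ids):
        prefix, _, entity_type = ID2LABEL[label_id].partition("-")
        # An I- tag continues the open span only if the entity type matches.
        if prefix == "I" and entity_type == current_type:
            continue
        # Anything else closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, start, i))
            current_type, start = None, None
        # A B- tag (or a stray I- tag) opens a new span.
        if prefix in ("B", "I"):
            current_type, start = entity_type, i
    # Close a span that runs to the end of the sequence.
    if current_type is not None:
        spans.append((current_type, start, len(label_ids)))
    return spans


# Hand-written label IDs for illustration, not real model output.
print(decode_spans([1, 2, 2, 0, 5]))
```

Treating a stray `I-` tag as the start of a new span is a lenient design choice; a stricter decoder could discard such tags instead.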
Results
Name | Overall F1 | LOC F1 | ORG F1 | PER F1 |
---|---|---|---|---|
Train set | 0.997927 | 0.998246 | 0.996613 | 0.998769 |
Validation set | 0.970187 | 0.969212 | 0.956831 | 0.982079 |
Test set | 0.9673011 | 0.967120 | 0.963614 | 0.970938 |
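The model card does not state how these F1 scores were computed. NER results are commonly reported as entity-level F1, where a predicted entity counts as correct only if its type and boundaries both match the gold annotation. The sketch below shows that calculation under this assumption; `span_f1` is a hypothetical helper, and the spans are hand-written for illustration.

```python
def span_f1(gold_spans, predicted_spans):
    """Entity-level F1 over (entity_type, start, end) spans, exact match only."""
    gold, predicted = set(gold_spans), set(predicted_spans)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative spans: the PER span matches, the second span has the wrong type.
gold = [("PER", 0, 3), ("LOC", 4, 5)]
predicted = [("PER", 0, 3), ("ORG", 4, 5)]
print(span_f1(gold, predicted))
```

Per-type scores such as the LOC, ORG, and PER columns can be obtained the same way by filtering both lists to a single entity type before calling the function.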
Example
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline
# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Suchandra/bengali_language_NER")
model = AutoModelForTokenClassification.from_pretrained("Suchandra/bengali_language_NER")
# Build a token-classification (NER) pipeline from the loaded model.
nlp = pipeline("ner", model=model, tokenizer=tokenizer)
example = "মারভিন দি মারসিয়ান"
# Each result is a dict describing one tagged token.
ner_results = nlp(example)
print(ner_results)
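The `ner` pipeline returns a list of dictionaries with keys such as `entity`, `score`, and `word`. If the model configuration does not carry readable label names, the `entity` field appears as `LABEL_<id>`; whether that is the case for this model is an assumption here. The sketch below maps such generic names back to the label table and leaves readable names untouched. The helper `readable` and the sample input are hypothetical.

```python
# Label names in ID order, copied from the label table in this card.
ID_TO_NAME = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]


def readable(ner_results):
    """Return (word, label, score) rows, resolving generic LABEL_<id> names."""
    rows = []
    for item in ner_results:
        tag = item["entity"]
        # Generic names like "LABEL_3" are mapped through the label table.
        if tag.startswith("LABEL_"):
            tag = ID_TO_NAME[int(tag[len("LABEL_"):])]
        rows.append((item["word"], tag, round(float(item["score"]), 3)))
    return rows


# Hand-written stand-in for pipeline output, not real model predictions.
sample = [
    {"entity": "LABEL_1", "score": 0.99, "index": 1, "word": "token_a", "start": 0, "end": 7},
    {"entity": "I-PER", "score": 0.98, "index": 2, "word": "token_b", "start": 8, "end": 15},
]
for row in readable(sample):
    print(row)
```

With real output, call `readable(ner_results)` on the list returned by the pipeline.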