## City-Country-NER
A `bert-base-uncased` model fine-tuned on a custom dataset to detect `Country` and `City` names in a sentence.
### Custom Dataset
We applied weak supervision to the [Ultra-Fine Entity Typing](https://www.cs.utexas.edu/~eunsol/html_pages/open_entity.html) dataset to add `City` and `Country` labels, then ran extra preprocessing to remove false labels.
The model predicts 3 different tags:
| **Predicted Tag**| **Meaning** |
|------------------|-------------|
| LABEL_0 | Others |
| LABEL_2 | Country |
| LABEL_3 | City |
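
To give a feel for what weak supervision of this kind can look like, here is a minimal sketch that tags words by dictionary lookup. It is not the pipeline used to build the dataset: the gazetteers `COUNTRIES` and `CITIES` and the function `weak_label` are hypothetical names chosen for illustration, and only the tag ids come from the table above.

```python
# Hypothetical gazetteers; the lists actually used for this model are not documented here.
COUNTRIES = {"france", "japan", "belgium"}
CITIES = {"paris", "tokyo", "ghent"}

# Tag ids follow the table above (LABEL_0, LABEL_2, LABEL_3).
OTHER, COUNTRY, CITY = 0, 2, 3


def weak_label(sentence):
    """Return (word, tag_id) pairs using a plain dictionary lookup per word."""
    labeled = []
    for word in sentence.split():
        # Normalise the word before lookup; the original token is kept in the output.
        key = word.strip(".,!?;:").lower()
        if key in COUNTRIES:
            tag = COUNTRY
        elif key in CITIES:
            tag = CITY
        else:
            tag = OTHER
        labeled.append((word, tag))
    return labeled


print(weak_label("I flew from Paris to Japan."))
```

A lookup like this produces noisy labels (for example, it cannot tell a person named Paris from the city), which is why a cleanup pass to remove false labels is needed afterwards.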
### How to use the fine-tuned model
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "ml6team/bert-base-uncased-city-country-ner"

# use_auth_token=True sends the locally stored Hugging Face token with the download request.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_auth_token=True)
model = AutoModelForTokenClassification.from_pretrained(model_name, use_auth_token=True)

# aggregation_strategy="simple" merges adjacent tokens that share a tag into one span.
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(nlp("My name is Kermit and I live in London."))
```
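
With `aggregation_strategy="simple"`, the pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start` and `end`. The sketch below shows one possible way to turn that output into readable `City` and `Country` matches using the tag table from this card. The helper `extract_places`, its `min_score` threshold, and the sample scores are assumptions made for illustration, not actual model output.

```python
# Mapping taken from the tag table in this model card.
TAG_NAMES = {"LABEL_0": "Others", "LABEL_2": "Country", "LABEL_3": "City"}


def extract_places(entities, min_score=0.5):
    """Keep only City/Country spans whose score reaches min_score (hypothetical threshold)."""
    places = []
    for ent in entities:
        name = TAG_NAMES.get(ent["entity_group"])
        if name in ("Country", "City") and ent["score"] >= min_score:
            places.append((ent["word"], name))
    return places


# Made-up sample in the shape the pipeline returns; the scores are illustrative only.
sample = [
    {"entity_group": "LABEL_0", "score": 0.99, "word": "my name is kermit and i live in", "start": 0, "end": 31},
    {"entity_group": "LABEL_3", "score": 0.98, "word": "london", "start": 32, "end": 38},
    {"entity_group": "LABEL_0", "score": 0.99, "word": ".", "start": 38, "end": 39},
]

print(extract_places(sample))
```

In practice, `sample` would be replaced by the result of calling the `nlp` pipeline on a sentence.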