City-Country-NER
A `bert-base-uncased` model fine-tuned on a custom dataset to detect `Country` and `City` names in a given sentence.
Custom Dataset
We used weak supervision to add `City` and `Country` labels to the [Ultra-Fine Entity Typing](https://www.cs.utexas.edu/~eunsol/html_pages/open_entity.html) dataset, then applied some extra preprocessing to remove false labels.
The model predicts 3 different tags:
Predicted Tag | Meaning |
---|---|
LABEL_0 | Others |
LABEL_2 | Country |
LABEL_3 | City |
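Since the raw tags are not self-describing, a small lookup can make predictions easier to read. The sketch below copies the three tags from the table above; the names `TAG_MEANINGS` and `tag_meaning` are hypothetical helpers, not part of the model or of `transformers`. Tags that the table does not list are returned unchanged rather than guessed.

```python
# Tag meanings copied from the table above.
TAG_MEANINGS = {
    "LABEL_0": "Others",
    "LABEL_2": "Country",
    "LABEL_3": "City",
}


def tag_meaning(tag: str) -> str:
    """Return the readable meaning of a predicted tag.

    Hypothetical helper: tags not listed in the table are returned as-is.
    """
    return TAG_MEANINGS.get(tag, tag)


print(tag_meaning("LABEL_3"))  # City
```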
How to use the finetuned model?
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_name = "ml6team/bert-base-uncased-city-country-ner"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")  # merge word pieces into entity spans
print(nlp("My name is Kermit and I live in London."))
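Because the base model is uncased, the `word` field returned by the pipeline is typically lowercased. A minimal sketch of one way to recover the original casing is to slice the input sentence with the `start` and `end` character offsets that the pipeline reports for each entity group. The function name `recover_surface_forms` and the `sample_entities` list are hypothetical: the sample only imitates the shape of the pipeline output (`entity_group`, `score`, `word`, `start`, `end`) and its scores are made up for illustration, not real model predictions.

```python
from typing import Dict, List, Tuple


def recover_surface_forms(
    sentence: str,
    entities: List[Dict],
    skip_tag: str = "LABEL_0",
    min_score: float = 0.0,
) -> List[Tuple[str, str]]:
    """Return (tag, original-cased text) pairs for non-"Others" entities.

    Hypothetical helper: assumes each entity dict carries the
    `entity_group`, `score`, `start`, and `end` keys.
    """
    results = []
    for ent in entities:
        # Drop the "Others" tag and anything below the score threshold.
        if ent["entity_group"] == skip_tag or ent["score"] < min_score:
            continue
        # Slice the original sentence so the casing is preserved.
        results.append((ent["entity_group"], sentence[ent["start"]:ent["end"]]))
    return results


sentence = "My name is Kermit and I live in London."

# Hand-written sample in the shape of the pipeline output (illustrative only).
sample_entities = [
    {"entity_group": "LABEL_0", "score": 0.99,
     "word": "my name is kermit and i live in", "start": 0, "end": 31},
    {"entity_group": "LABEL_3", "score": 0.98,
     "word": "london", "start": 32, "end": 38},
    {"entity_group": "LABEL_0", "score": 0.99,
     "word": ".", "start": 38, "end": 39},
]

print(recover_surface_forms(sentence, sample_entities))
# [('LABEL_3', 'London')]
```

With the real model, the list returned by `nlp(sentence)` would take the place of `sample_entities`.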