
City-Country-NER

A bert-base-uncased model finetuned on a custom dataset to detect Country and City names from a given sentence.

Custom Dataset

We used weak supervision to annotate the Ultra-Fine Entity Typing dataset (https://www.cs.utexas.edu/~eunsol/html_pages/open_entity.html) with City and Country labels. We then applied additional preprocessing to remove false labels.
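This README does not say how the weak labels were produced. One common way to get such labels is gazetteer matching, where tokens are tagged by looking them up in lists of known names. The sketch below illustrates that general idea only and is not the authors' pipeline: the gazetteers, the `weak_label` function, and the `max_span` parameter are all hypothetical, and the tag names are taken from the tag table in this README.

```python
# Hypothetical gazetteers: tiny, hand-picked lists used only for illustration.
COUNTRIES = {"france", "japan", "united kingdom"}
CITIES = {"paris", "tokyo", "london", "new york"}


def weak_label(tokens, max_span=3):
    """Tag each token as LABEL_2 (Country), LABEL_3 (City) or LABEL_0 (Others).

    Uses a greedy longest-match lookup against the gazetteers above, so
    multi-word names such as "New York" are tagged as a whole.
    """
    labels = ["LABEL_0"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for span in range(min(max_span, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + span]).lower()
            if phrase in COUNTRIES:
                tag = "LABEL_2"
            elif phrase in CITIES:
                tag = "LABEL_3"
            else:
                continue
            labels[i:i + span] = [tag] * span
            i += span
            matched = True
            break
        if not matched:
            i += 1
    return labels


print(weak_label("I flew from New York to Japan".split()))
```

Labels obtained this way are noisy (for example, "Paris" can also be a person's name), which is why a cleanup step like the false-label removal mentioned above is usually needed.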

The model predicts 3 different tags:

| Predicted Tag | Meaning |
|---------------|---------|
| LABEL_0       | Others  |
| LABEL_2       | Country |
| LABEL_3       | City    |

How to use the finetuned model?

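from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# use_auth_token=True sends your stored Hugging Face token (set via `huggingface-cli login`).
# It is only required if the repository is private; recent transformers releases name this argument `token`.
tokenizer = AutoTokenizer.from_pretrained("ml6team/bert-base-uncased-city-country-ner", use_auth_token=True)
model = AutoModelForTokenClassification.from_pretrained("ml6team/bert-base-uncased-city-country-ner", use_auth_token=True)

# aggregation_strategy="simple" groups adjacent tokens that share a tag into a single span.
nlp = pipeline('ner', model=model, tokenizer=tokenizer, aggregation_strategy="simple")
print(nlp("My name is Kermit and I live in London."))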
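
With an aggregation strategy set, the pipeline returns a list of dictionaries with `entity_group`, `score`, `word`, `start` and `end` keys. Because the tags are generic `LABEL_*` names, a small post-processing step can make the result easier to read. The following is a minimal sketch: `TAG_MEANINGS` mirrors the tag table in this README, while `extract_locations` and `min_score` are hypothetical helpers, and the `sample` list is hand-written for illustration rather than real model output.

```python
# Mapping taken from the tag table in this README.
TAG_MEANINGS = {"LABEL_0": "Others", "LABEL_2": "Country", "LABEL_3": "City"}


def extract_locations(entities, min_score=0.0):
    """Keep only Country/City spans from aggregated `ner` pipeline output."""
    locations = []
    for entity in entities:
        meaning = TAG_MEANINGS.get(entity["entity_group"])
        if meaning in ("Country", "City") and entity["score"] >= min_score:
            locations.append(
                {"text": entity["word"], "type": meaning, "score": float(entity["score"])}
            )
    return locations


# Hand-written example in the pipeline's output format (NOT real model output).
sample = [
    {"entity_group": "LABEL_0", "score": 0.99, "word": "my name is kermit and i live in", "start": 0, "end": 31},
    {"entity_group": "LABEL_3", "score": 0.98, "word": "london", "start": 32, "end": 38},
    {"entity_group": "LABEL_0", "score": 0.99, "word": ".", "start": 38, "end": 39},
]
print(extract_locations(sample))
```

In practice the helper would be called as `extract_locations(nlp(text))`, and raising `min_score` trades recall for precision.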