Portuguese NER BERT-CRF CoNLL-2003

This is a BERT model fine-tuned for Named Entity Recognition (NER). It uses a Conditional Random Field (CRF) layer as the decoder.

The model follows the CoNLL-2003 labeling scheme for NER. The HAREM Default and HAREM Selective labeling schemes are also offered as options.
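To make the labeling scheme concrete, the sketch below groups token-level tags into entity spans. It assumes the four CoNLL-2003 entity types (PER, ORG, LOC, MISC) written as BIO tags such as `B-ORG` and `I-LOC`; the exact tag strings this model emits are an assumption, the helper `bio_to_spans` is hypothetical, and the example tags are hand-written for illustration rather than real model output.

```python
from typing import List, Optional, Tuple

# CoNLL-2003 entity types. The exact tag strings emitted by the model
# are an assumption made for this sketch.
CONLL2003_TYPES = ("PER", "ORG", "LOC", "MISC")


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity being built, if there is one.
        nonlocal current_tokens, current_type
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts here (an I- tag with no matching B- is tolerated).
            flush()
            current_tokens, current_type = [token], entity_type
        elif prefix == "I":
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity.
            flush()
    flush()
    return spans


# Hand-written illustrative tags, not real model output.
example_tokens = ["FCPorto", "vence", "o", "Benfica", "por", "5", "-", "0",
                  "no", "Estádio", "do", "Dragão"]
example_tags = ["B-ORG", "O", "O", "B-ORG", "O", "O", "O", "O",
                "O", "B-LOC", "I-LOC", "I-LOC"]
example_spans = bio_to_spans(example_tokens, example_tags)
print(example_spans)
```

Joining tokens with a space is a simplification; recovering the original surface string exactly would require character offsets.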

How to Use

You can use this model through the Transformers library's NER pipeline, or load it as a regular Transformers model from the Hugging Face ecosystem.

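from transformers import pipeline
import torch
import nltk

# Load the NER pipeline; trust_remote_code=True is required because the
# model repository ships custom code.
ner_classifier = pipeline(
    "ner",
    model="arubenruben/NER-PT-BERT-CRF-Conll2003",
    device=torch.device("cuda:0") if torch.cuda.is_available() else torch.device("cpu"),
    trust_remote_code=True
)

text = "FCPorto vence o Benfica por 5-0 no Estádio do Dragão"
# Split the text into word and punctuation tokens before classification.
tokens = nltk.wordpunct_tokenize(text)
result = ner_classifier(tokens)
print(result)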

Demo

A notebook is available for testing the code.

PT-Pump-Up

This model is integrated into the PT-Pump-Up project.

Evaluation

Testing Data

The model was tested on the Portuguese portion of the WikiNEural dataset.

Results

F1-Score: 0.951
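The card does not state whether this score was computed per entity or per token. As a rough illustration only, the sketch below computes an entity-level F1 under the assumption that an entity counts as correct when its boundaries and type both match exactly. The function name `entity_f1` and the example spans are hypothetical and are not taken from the model's evaluation.

```python
from typing import Set, Tuple

# An entity is represented as (start_token, end_token_exclusive, entity_type).
Entity = Tuple[int, int, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> float:
    """Entity-level F1: a prediction counts only on an exact span and type match."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data: the second predicted entity has the wrong type.
gold_entities = {(0, 1, "ORG"), (3, 4, "ORG"), (9, 12, "LOC")}
predicted_entities = {(0, 1, "ORG"), (3, 4, "PER"), (9, 12, "LOC")}
example_f1 = entity_f1(gold_entities, predicted_entities)
print(round(example_f1, 3))
```

Here two of the three predictions match, so precision and recall are both 2/3 and so is the F1.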

Citation

A citation will be made available soon; no BibTeX entry exists yet.

