Model Specification
- This is the state-of-the-art Twitter NER model (74.35% entity-level F1) on Tweebank V2's NER benchmark (also called Tweebank-NER), trained on a corpus combining the Tweebank-NER and WNUT 17 training data.
- For more details about the TweebankNLP project, please refer to our paper and GitHub page.
- In the paper, it is referred to as HuggingFace-BERTweet (TB2+W17).
How to use the model
- PRE-PROCESSING: when you apply the model to tweets, make sure the tweets are first preprocessed with the TweetTokenizer; the model performs best on input in that form.
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")
model = AutoModelForTokenClassification.from_pretrained("TweebankNLP/bertweet-tb2_wnut17-ner")
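The pre-processing and loading steps above can be combined into a short tagging sketch. It rests on some assumptions: `normalize_tweet` and `predict_tags` are hypothetical helper names, not part of the released code, and `normalize_tweet` only approximates the BERTweet-style normalization (masking user mentions as `@USER` and links as `HTTPURL`) with a plain whitespace split. It does not replace the full TweetTokenizer pipeline, which also handles emoji and punctuation. The 128-token limit is likewise an assumption based on the BERTweet base model.

```python
import re

# Placeholder tokens used by BERTweet-style normalization.
USER_TOKEN = "@USER"
URL_TOKEN = "HTTPURL"


def normalize_tweet(text):
    """Roughly normalize a tweet by masking user mentions and URLs.

    Simplified approximation: splits on whitespace only.
    """
    tokens = []
    for token in text.split():
        lowered = token.lower()
        if token.startswith("@") and len(token) > 1:
            tokens.append(USER_TOKEN)
        elif lowered.startswith("http") or lowered.startswith("www"):
            tokens.append(URL_TOKEN)
        else:
            tokens.append(token)
    return " ".join(tokens)


def predict_tags(text, model_name="TweebankNLP/bertweet-tb2_wnut17-ner"):
    """Return (subword, label) pairs for one tweet.

    Heavy imports are kept inside the function so that normalize_tweet
    can be used without torch/transformers installed. The model is
    downloaded on first use.
    """
    import torch
    from transformers import AutoTokenizer, AutoModelForTokenClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name)
    model.eval()

    # Assumed limit: BERTweet was trained with sequences of up to 128 tokens.
    inputs = tokenizer(
        normalize_tweet(text),
        return_tensors="pt",
        truncation=True,
        max_length=128,
    )
    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the highest-scoring label for every subword position.
    label_ids = logits.argmax(dim=-1)[0].tolist()
    subwords = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return [(tok, model.config.id2label[i]) for tok, i in zip(subwords, label_ids)]
```

Calling `predict_tags("@jack is visiting Boston https://t.co/abc")` returns one pair per subword, including the special start and end tokens, so subword labels still need to be merged back to word level before computing entity-level scores.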
References
If you use this repository in your research, please kindly cite our paper:
@article{jiang2022tweetnlp,
title={Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis},
author={Jiang, Hang and Hua, Yining and Beeferman, Doug and Roy, Deb},
journal={In Proceedings of the 13th Language Resources and Evaluation Conference (LREC)},
year={2022}
}