Model description
Cased, fine-tuned BERT model for Hungarian, trained on (manually annotated) parliamentary pre-agenda speeches scraped from parlament.hu.
Intended uses & limitations
The model can be used like any other (cased) BERT model. It has been tested on recognizing positive, negative, and neutral sentences in (parliamentary) pre-agenda speeches, where (see the sketch after this list):
- 'Label_0': Neutral
- 'Label_1': Positive
- 'Label_2': Negative
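As a quick illustration, here is a minimal sketch (not part of the original card) using the `text-classification` pipeline; the example sentence is hypothetical, and the returned label strings are assumed to follow the default `LABEL_n` naming that maps onto the index scheme above.

```python
from transformers import pipeline

# Hypothetical example sentence ("This decision is unacceptable.").
classifier = pipeline("text-classification", model="poltextlab/HunEmBERT3")
print(classifier("Ez a döntés elfogadhatatlan."))
# e.g. [{'label': 'LABEL_2', 'score': ...}]  -> Negative
```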
Training
A fine-tuned version of the original huBERT model (SZTAKI-HLT/hubert-base-cc), trained on the HunEmPoli corpus.
| Category | Count | Ratio | Sentiment | Count | Ratio |
|---|---|---|---|---|---|
| Neutral | 351 | 1.85% | Neutral | 351 | 1.85% |
| Fear | 162 | 0.85% | Negative | 11180 | 58.84% |
| Sadness | 4258 | 22.41% | | | |
| Anger | 643 | 3.38% | | | |
| Disgust | 6117 | 32.19% | | | |
| Success | 6602 | 34.74% | Positive | 7471 | 39.32% |
| Joy | 441 | 2.32% | | | |
| Trust | 428 | 2.25% | | | |
| Sum | 19002 | | | | |
Eval results
| Class | Precision | Recall | F-Score |
|---|---|---|---|
| Neutral | 0.83 | 0.71 | 0.76 |
| Positive | 0.87 | 0.91 | 0.90 |
| Negative | 0.94 | 0.91 | 0.93 |
| Macro AVG | 0.88 | 0.85 | 0.86 |
| Weighted AVG | 0.91 | 0.91 | 0.91 |
Usage
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT3")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT3")
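Building on the snippet above, a minimal inference sketch (the example sentence is hypothetical and not from the card) that maps the predicted class index to the labels listed under "Intended uses & limitations":

```python
import torch

# Hypothetical Hungarian example sentence ("This proposal is a great achievement.").
text = "Ez a javaslat nagyszerű eredmény."

inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Map the predicted index to the label scheme described above.
labels = {0: "Neutral", 1: "Positive", 2: "Negative"}
print(labels[int(logits.argmax(dim=-1))])
```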
BibTeX entry and citation info
If you use the model, please cite the following paper:
BibTeX:
@ARTICLE{10149341,
author={{\"U}veges, Istv{\'a}n and Ring, Orsolya},
journal={IEEE Access},
title={HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication},
year={2023},
volume={11},
number={},
pages={60267-60278},
doi={10.1109/ACCESS.2023.3285536}
}