license: mit
language:
- en
metrics:
- accuracy
- precision
- recall
model-index:
- name: PolicyBERTa-7d
  results: []
widget:
- text: Russia must end the war.
- text: Democratic institutions must be supported.
- text: The state must fight political corruption.
- text: Our energy economy must be nationalised.
- text: We must increase social spending.
PolicyBERTa-7d
This model is a fine-tuned version of roberta-base on data from the Manifesto Project. It was inspired by the model from Laurer (2020).
It achieves the following results on the evaluation set:
- Loss: 0.8549
- Accuracy: 0.7059
- F1-micro: 0.7059
- F1-macro: 0.6683
- F1-weighted: 0.7033
- Precision: 0.7059
- Recall: 0.7059
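
Accuracy, F1-micro, Precision and Recall are identical above. That is what one would expect if precision and recall are micro-averaged, because in single-label classification every wrong prediction counts as exactly one false positive and one false negative. The card does not state the averaging mode, so treat this as an inference from the numbers. A minimal sketch with made-up labels (the helper `micro_scores` is hypothetical and not part of the training code):

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Return (accuracy, micro-precision, micro-recall, micro-F1)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    precision = tp_sum / (tp_sum + fp_sum)
    recall = tp_sum / (tp_sum + fn_sum)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp_sum / len(y_true)
    return accuracy, precision, recall, f1


# Made-up labels in the range 0..6, for illustration only
y_true = [0, 3, 4, 4, 2, 6, 3, 1]
y_pred = [0, 3, 4, 3, 2, 5, 3, 4]
scores = micro_scores(y_true, y_pred)
print(scores)
```

Macro and weighted F1 average per-class scores instead, which is why they differ from accuracy.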
Model description
This model was trained on 115,943 manually annotated sentences to classify text into one of seven political categories: "external relations", "freedom and democracy", "political system", "economy", "welfare and quality of life", "fabric of society" and "social groups".
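
As a rough illustration of how a seven-way classifier output becomes a category name, the sketch below applies a softmax to a vector of logits and picks the most probable class. The index order in `ID2LABEL` is an assumption taken from the label tables in this card; the `id2label` mapping in the published model config may differ, and the logits are made up.

```python
import math

# Assumed index order (from the label tables in this card); verify against
# the model config before relying on it.
ID2LABEL = {
    0: "external relations",
    1: "freedom and democracy",
    2: "political system",
    3: "economy",
    4: "welfare and quality of life",
    5: "fabric of society",
    6: "social groups",
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most probable category name and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]


example_logits = [0.1, -1.2, 0.3, 2.5, 1.0, -0.4, 0.0]  # made-up values
label, prob = predict_label(example_logits)
print(label, round(prob, 3))
```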
Intended uses & limitations
The model output reproduces the limitations of the dataset in terms of country coverage, time span, domain definitions and potential annotator biases, as any supervised machine learning model would. Applying the model to other kinds of data (other text types, other countries, etc.) will likely reduce performance.
from transformers import pipeline
import pandas as pd

classifier = pipeline(
    task="text-classification",
    model="niksmer/PolicyBERTa-7d")

# Load the text data you want to classify
# (assumes text.csv has a column named "text"; adjust to your file)
df_text = pd.read_csv("text.csv")
# Inference on a list of strings
output = classifier(df_text["text"].tolist())
# Print output
pd.DataFrame(output).head()
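
Because the model was trained on individual sentences, longer documents are probably best split into sentences before classification. The sketch below uses a deliberately naive regular expression; `split_sentences` is a hypothetical helper, and a proper sentence tokenizer would handle abbreviations and similar edge cases better.

```python
import re


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


document = "Russia must end the war. We must increase social spending!"
sentences = split_sentences(document)
print(sentences)
```

The resulting list of strings can be passed to the classifier in place of a single long text.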
Training and evaluation data
PolicyBERTa-7d was trained on the English-speaking subset of the Manifesto Project Dataset (MPDS2020a). The training set comprises 115,943 sentences from 163 political manifestos in 7 English-speaking countries (Australia, Canada, Ireland, New Zealand, South Africa, United Kingdom, United States). The manifestos were published between 1992 and 2020.
Country | Count manifestos | Count sentences | Time span |
---|---|---|---|
Australia | 18 | 14,887 | 2010-2016 |
Ireland | 23 | 24,966 | 2007-2016 |
Canada | 14 | 12,344 | 2004-2008 & 2015 |
New Zealand | 46 | 35,079 | 1993-2017 |
South Africa | 29 | 13,334 | 1994-2019 |
USA | 9 | 13,188 | 1992 & 2004-2020 |
United Kingdom | 34 | 30,936 | 1997-2019 |
Canadian manifestos between 2004 and 2008 are used as test data.
The Manifesto Project manually annotates individual sentences from political party manifestos, assigning each to one of 7 main political domains: 'Economy', 'External Relations', 'Fabric of Society', 'Freedom and Democracy', 'Political System', 'Welfare and Quality of Life' and 'Social Groups'. See the codebook for the exact definition of each domain.
Train data
The train data was highly imbalanced.
Label | Description | Count |
---|---|---|
0 | external relations | 7,640 |
1 | freedom and democracy | 5,880 |
2 | political system | 11,234 |
3 | economy | 29,218 |
4 | welfare and quality of life | 37,200 |
5 | fabric of society | 13,594 |
6 | social groups | 11,177 |
Overall count: 115,943
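
To put a number on the imbalance, the sketch below computes each class's share of the training data and inverse-frequency weights of the form N / (K * n_c). The card does not say whether any re-weighting was used during training, so the weights are shown only as one common way to quantify and counteract imbalance.

```python
# Training counts per category, copied from the table above
TRAIN_COUNTS = {
    "external relations": 7_640,
    "freedom and democracy": 5_880,
    "political system": 11_234,
    "economy": 29_218,
    "welfare and quality of life": 37_200,
    "fabric of society": 13_594,
    "social groups": 11_177,
}

total = sum(TRAIN_COUNTS.values())
n_classes = len(TRAIN_COUNTS)

# Share of each class in the training data
shares = {label: n / total for label, n in TRAIN_COUNTS.items()}

# Inverse-frequency weights: rarer classes receive larger weights
weights = {label: total / (n_classes * n) for label, n in TRAIN_COUNTS.items()}

# Ratio between the largest and the smallest class
imbalance_ratio = max(TRAIN_COUNTS.values()) / min(TRAIN_COUNTS.values())

for label in TRAIN_COUNTS:
    print(f"{label:30s} share={shares[label]:.3f} weight={weights[label]:.2f}")
print(f"largest/smallest class ratio: {imbalance_ratio:.2f}")
```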
Validation data
The validation set was created by random sampling.
Label | Description | Count |
---|---|---|
0 | external relations | 1,345 |
1 | freedom and democracy | 1,043 |
2 | political system | 2,038 |
3 | economy | 5,140 |
4 | welfare and quality of life | 6,554 |
5 | fabric of society | 2,384 |
6 | social groups | 1,957 |
Overall count: 20,461
Test data
The test dataset contains ten Canadian manifestos published between 2004 and 2008.
Label | Description | Count |
---|---|---|
0 | external relations | 824 |
1 | freedom and democracy | 296 |
2 | political system | 1,041 |
3 | economy | 2,188 |
4 | welfare and quality of life | 2,654 |
5 | fabric of society | 940 |
6 | social groups | 387 |
Overall count: 8,330
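
As a quick consistency check on the figures in this card, the sketch below sums the three split sizes, compares the result with the per-country sentence counts, and derives the relative size of each split. All numbers are copied from the tables; nothing here is new data.

```python
# Split sizes as reported in this card
SPLITS = {"train": 115_943, "validation": 20_461, "test": 8_330}

# Sentence counts per country as reported in this card
COUNTRY_SENTENCES = {
    "Australia": 14_887,
    "Ireland": 24_966,
    "Canada": 12_344,
    "New Zealand": 35_079,
    "South Africa": 13_334,
    "USA": 13_188,
    "United Kingdom": 30_936,
}

total_sentences = sum(SPLITS.values())
country_total = sum(COUNTRY_SENTENCES.values())
fractions = {name: n / total_sentences for name, n in SPLITS.items()}

print(total_sentences, country_total)
for name, frac in fractions.items():
    print(f"{name:10s} {frac:.1%}")
```

The two totals agree, so the train, validation and test sets together account for every sentence in the country table.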
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
training_args = TrainingArguments(
    warmup_steps=0,
    weight_decay=0.1,
    learning_rate=1e-05,
    fp16=True,
    evaluation_strategy="epoch",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    overwrite_output_dir=True,
    per_device_eval_batch_size=16,
    save_strategy="no",
    logging_dir="logs",
    logging_strategy="steps",
    logging_steps=10,
    push_to_hub=True,
    hub_strategy="end")
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1-micro | F1-macro | F1-weighted | Precision | Recall |
---|---|---|---|---|---|---|---|---|---|
0.9154 | 1.0 | 1812 | 0.8984 | 0.6785 | 0.6785 | 0.6383 | 0.6772 | 0.6785 | 0.6785 |
0.8374 | 2.0 | 3624 | 0.8569 | 0.6957 | 0.6957 | 0.6529 | 0.6914 | 0.6957 | 0.6957 |
0.7053 | 3.0 | 5436 | 0.8582 | 0.7019 | 0.7019 | 0.6594 | 0.6967 | 0.7019 | 0.7019 |
0.7178 | 4.0 | 7248 | 0.8488 | 0.7030 | 0.7030 | 0.6662 | 0.7011 | 0.7030 | 0.7030 |
0.6688 | 5.0 | 9060 | 0.8549 | 0.7059 | 0.7059 | 0.6683 | 0.7033 | 0.7059 | 0.7059 |
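
The step counts in the table can be related to the batch size with a little arithmetic. With 115,943 training sentences and a per-device batch size of 16, a single device would need far more than 1,812 steps per epoch, so the effective batch size appears to have been about 64. The card does not state the number of devices or any gradient accumulation, so this is an inference; the sketch also assumes the last partial batch is kept.

```python
import math

TRAIN_SENTENCES = 115_943   # training set size from this card
PER_DEVICE_BATCH = 16       # per_device_train_batch_size
STEPS_PER_EPOCH = 1_812     # from the training results table


def steps_per_epoch(n_examples, effective_batch_size):
    """Optimizer steps per epoch when the last partial batch is kept."""
    return math.ceil(n_examples / effective_batch_size)


# Steps a single device with batch size 16 would need
single_device_steps = steps_per_epoch(TRAIN_SENTENCES, PER_DEVICE_BATCH)

# Average number of examples consumed per reported step
implied_batch = TRAIN_SENTENCES / STEPS_PER_EPOCH

# Multipliers of the per-device batch size that reproduce the reported steps
matching_multipliers = [
    k for k in range(1, 9)
    if steps_per_epoch(TRAIN_SENTENCES, PER_DEVICE_BATCH * k) == STEPS_PER_EPOCH
]

print(single_device_steps, round(implied_batch, 2), matching_multipliers)
```

A multiplier of 4 would be consistent with, for example, four devices or four gradient accumulation steps.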
Validation evaluation
Model | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
---|---|---|---|
PolicyBERTa-7d | 0.71 | 0.67 | 0.70 |
Test evaluation
Model | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
---|---|---|---|
PolicyBERTa-7d | 0.65 | 0.60 | 0.65 |
Evaluation per category
Label | Validation F1-Score | Test F1-Score |
---|---|---|
external relations | 0.76 | 0.70 |
freedom and democracy | 0.61 | 0.55 |
political system | 0.55 | 0.55 |
economy | 0.74 | 0.67 |
welfare and quality of life | 0.77 | 0.72 |
fabric of society | 0.67 | 0.60 |
social groups | 0.58 | 0.41 |
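
The aggregate scores reported earlier can be recovered from this per-category table: macro F1 is the unweighted mean of the per-category scores, and weighted F1 weights each category by its number of sentences. The sketch below uses the rounded scores from the table and the class counts from the validation and test data tables, so it only reproduces the aggregates to two decimals.

```python
# label: (validation F1, test F1, validation count, test count)
PER_CATEGORY = {
    "external relations": (0.76, 0.70, 1_345, 824),
    "freedom and democracy": (0.61, 0.55, 1_043, 296),
    "political system": (0.55, 0.55, 2_038, 1_041),
    "economy": (0.74, 0.67, 5_140, 2_188),
    "welfare and quality of life": (0.77, 0.72, 6_554, 2_654),
    "fabric of society": (0.67, 0.60, 2_384, 940),
    "social groups": (0.58, 0.41, 1_957, 387),
}


def macro_f1(scores):
    """Unweighted mean of per-category F1 scores."""
    return sum(scores) / len(scores)


def weighted_f1(scores, supports):
    """Mean of per-category F1 scores weighted by class size."""
    return sum(s * n for s, n in zip(scores, supports)) / sum(supports)


val_f1 = [v[0] for v in PER_CATEGORY.values()]
test_f1 = [v[1] for v in PER_CATEGORY.values()]
val_n = [v[2] for v in PER_CATEGORY.values()]
test_n = [v[3] for v in PER_CATEGORY.values()]

val_macro = round(macro_f1(val_f1), 2)
test_macro = round(macro_f1(test_f1), 2)
val_weighted = round(weighted_f1(val_f1, val_n), 2)
test_weighted = round(weighted_f1(test_f1, test_n), 2)

print(val_macro, val_weighted, test_macro, test_weighted)
```

The gap between macro and weighted F1 on the test set reflects the weaker scores on smaller categories such as social groups.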
Framework versions
- Transformers 4.16.2
- Pytorch 1.9.0+cu102
- Datasets 1.8.0
- Tokenizers 0.10.3