---
license: mit
language:
  - en
metrics:
  - accuracy
  - precision
  - recall
model-index:
  - name: PolicyBERTa-7d
    results: []
widget:
  - text: Russia must end the war.
  - text: Democratic institutions must be supported.
  - text: The state must fight political corruption.
  - text: Our energy economy must be nationalised.
  - text: We must increase social spending.
---

# PolicyBERTa-7d

This model is a fine-tuned version of roberta-base on data from the Manifesto Project. It was inspired by the model from Laurer (2020).

It achieves the following results on the evaluation set:

- Loss: 0.8549
- Accuracy: 0.7059
- F1-micro: 0.7059
- F1-macro: 0.6683
- F1-weighted: 0.7033
- Precision: 0.7059
- Recall: 0.7059
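Accuracy, micro-F1, precision and recall coincide above because this is single-label multi-class classification with micro averaging: every misclassified sentence counts as exactly one false positive (for the predicted class) and one false negative (for the true class), so the pooled precision, recall and accuracy are all the same number. A minimal self-contained check in plain Python (toy labels, no external libraries):

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def micro_f1(y_true, y_pred):
    # Pool TP/FP/FN over all classes: each correct prediction is one TP;
    # each error is one FP (for the predicted class) and one FN (for the
    # true class), so fp == fn == number of errors.
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    fp = fn = len(y_true) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy 7-class example: 8 of 10 sentences classified correctly.
y_true = [0, 1, 2, 3, 4, 5, 6, 0, 1, 2]
y_pred = [0, 1, 2, 3, 4, 5, 5, 0, 2, 2]
assert abs(micro_f1(y_true, y_pred) - accuracy(y_true, y_pred)) < 1e-12
```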

## Model description

This model was trained on 115,943 manually annotated sentences to classify text into one of seven political categories: "external relations", "freedom and democracy", "political system", "economy", "welfare and quality of life", "fabric of society" and "social groups".

## Intended uses & limitations

As with any supervised machine learning model, the output reproduces the limitations of the training dataset in terms of country coverage, time span, domain definitions and potential annotator biases. Applying the model to other kinds of data (other text types, other countries, etc.) will reduce performance.

```python
from transformers import pipeline
import pandas as pd

classifier = pipeline(
    task="text-classification",
    model="niksmer/PolicyBERTa-7d")

# Load the text data you want to classify
# (assumes a CSV file with a "text" column)
df = pd.read_csv("text.csv")

# Inference: pass a list of strings to the pipeline
output = classifier(df["text"].tolist())

# Print output
pd.DataFrame(output).head()
```
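Depending on how the model config was pushed to the Hub, the pipeline may return generic ids such as `LABEL_3` rather than category names. The ids follow the label order of the training-data table below; the helper here is a hedged sketch (the `LABEL_n` format is the transformers default, not something this card confirms — check `model.config.id2label` to be sure):

```python
# Label ids in the order used for this model's seven Manifesto Project domains.
ID2LABEL = {
    0: "external relations",
    1: "freedom and democracy",
    2: "political system",
    3: "economy",
    4: "welfare and quality of life",
    5: "fabric of society",
    6: "social groups",
}

def readable(rows):
    """Replace pipeline labels like 'LABEL_3' with human-readable names."""
    return [
        {"label": ID2LABEL[int(r["label"].rsplit("_", 1)[-1])],
         "score": r["score"]}
        for r in rows
    ]

readable([{"label": "LABEL_3", "score": 0.91}])
```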

## Training and evaluation data

PolicyBERTa-7d was trained on the English-speaking subset of the Manifesto Project Dataset (MPDS2020a). The model was trained on 115,943 sentences from 163 political manifestos in 7 English-speaking countries (Australia, Canada, Ireland, New Zealand, South Africa, United Kingdom, United States). The manifestos were published between 1992 and 2020.

| Country | Count manifestos | Count sentences | Time span |
|---|---|---|---|
| Australia | 18 | 14,887 | 2010-2016 |
| Ireland | 23 | 24,966 | 2007-2016 |
| Canada | 14 | 12,344 | 2004-2008 & 2015 |
| New Zealand | 46 | 35,079 | 1993-2017 |
| South Africa | 29 | 13,334 | 1994-2019 |
| USA | 9 | 13,188 | 1992 & 2004-2020 |
| United Kingdom | 34 | 30,936 | 1997-2019 |

Canadian manifestos between 2004 and 2008 are used as test data.

The Manifesto Project manually annotates individual sentences from political party manifestos in 7 main political domains: 'Economy', 'External Relations', 'Fabric of Society', 'Freedom and Democracy', 'Political System', 'Welfare and Quality of Life' and 'Social Groups' - see the codebook for the exact definitions of each domain.

### Train data

The train data was highly imbalanced.

| Label | Description | Count |
|---|---|---|
| 0 | external relations | 7,640 |
| 1 | freedom and democracy | 5,880 |
| 2 | political system | 11,234 |
| 3 | economy | 29,218 |
| 4 | welfare and quality of life | 37,200 |
| 5 | fabric of society | 13,594 |
| 6 | social groups | 11,177 |

Overall count: 115,943
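The card does not say whether this imbalance was compensated for during training. If you retrain on similar data, one common mitigation is inverse-frequency class weights fed into a weighted loss; the sketch below derives such weights from the counts above (the weighting scheme itself is an assumption for illustration, not something this model is documented to use):

```python
# Sentence counts per label, taken from the train-data table above.
counts = {0: 7640, 1: 5880, 2: 11234, 3: 29218, 4: 37200, 5: 13594, 6: 11177}
total = sum(counts.values())  # 115,943 sentences overall

# Inverse-frequency weights, normalised so that a perfectly balanced
# dataset would give every class a weight of 1.0.
weights = {label: total / (len(counts) * n) for label, n in counts.items()}

# Rare classes are up-weighted, frequent ones down-weighted:
# "freedom and democracy" (1) ends up above 1.0,
# "welfare and quality of life" (4) below it.
assert weights[1] > 1.0 > weights[4]
```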

### Validation data

The validation set was sampled at random.

| Label | Description | Count |
|---|---|---|
| 0 | external relations | 1,345 |
| 1 | freedom and democracy | 1,043 |
| 2 | political system | 2,038 |
| 3 | economy | 5,140 |
| 4 | welfare and quality of life | 6,554 |
| 5 | fabric of society | 2,384 |
| 6 | social groups | 1,957 |

Overall count: 20,461

### Test data

The test dataset contains ten Canadian manifestos published between 2004 and 2008.

| Label | Description | Count |
|---|---|---|
| 0 | external relations | 824 |
| 1 | freedom and democracy | 296 |
| 2 | political system | 1,041 |
| 3 | economy | 2,188 |
| 4 | welfare and quality of life | 2,654 |
| 5 | fabric of society | 940 |
| 6 | social groups | 387 |

Overall count: 8,330

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

```python
training_args = TrainingArguments(
    warmup_steps=0,
    weight_decay=0.1,
    learning_rate=1e-05,
    fp16=True,
    evaluation_strategy="epoch",
    num_train_epochs=5,
    per_device_train_batch_size=16,
    overwrite_output_dir=True,
    per_device_eval_batch_size=16,
    save_strategy="no",
    logging_dir="logs",
    logging_strategy="steps",
    logging_steps=10,
    push_to_hub=True,
    hub_strategy="end")
```

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1-micro | F1-macro | F1-weighted | Precision | Recall |
|---|---|---|---|---|---|---|---|---|---|
| 0.9154 | 1.0 | 1812 | 0.8984 | 0.6785 | 0.6785 | 0.6383 | 0.6772 | 0.6785 | 0.6785 |
| 0.8374 | 2.0 | 3624 | 0.8569 | 0.6957 | 0.6957 | 0.6529 | 0.6914 | 0.6957 | 0.6957 |
| 0.7053 | 3.0 | 5436 | 0.8582 | 0.7019 | 0.7019 | 0.6594 | 0.6967 | 0.7019 | 0.7019 |
| 0.7178 | 4.0 | 7248 | 0.8488 | 0.7030 | 0.7030 | 0.6662 | 0.7011 | 0.7030 | 0.7030 |
| 0.6688 | 5.0 | 9060 | 0.8549 | 0.7059 | 0.7059 | 0.6683 | 0.7033 | 0.7059 | 0.7059 |

### Validation evaluation

| Model | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
|---|---|---|---|
| PolicyBERTa-7d | 0.71 | 0.67 | 0.70 |

### Test evaluation

| Model | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
|---|---|---|---|
| PolicyBERTa-7d | 0.65 | 0.60 | 0.65 |

### Evaluation per category

| Label | Validation F1-Score | Test F1-Score |
|---|---|---|
| external relations | 0.76 | 0.70 |
| freedom and democracy | 0.61 | 0.55 |
| political system | 0.55 | 0.55 |
| economy | 0.74 | 0.67 |
| welfare and quality of life | 0.77 | 0.72 |
| fabric of society | 0.67 | 0.60 |
| social groups | 0.58 | 0.41 |
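As a quick sanity check, the macro F1 scores reported earlier are simply the unweighted means of these per-category scores:

```python
# Per-category F1 scores in label order 0..6, from the table above.
validation_f1 = [0.76, 0.61, 0.55, 0.74, 0.77, 0.67, 0.58]
test_f1 = [0.70, 0.55, 0.55, 0.67, 0.72, 0.60, 0.41]

def macro(scores):
    """Macro F1: the unweighted mean over classes, rounded to 2 decimals."""
    return round(sum(scores) / len(scores), 2)

assert macro(validation_f1) == 0.67  # matches the reported validation macro F1
assert macro(test_f1) == 0.60        # matches the reported test macro F1
```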

### Framework versions

- Transformers 4.16.2
- Pytorch 1.9.0+cu102
- Datasets 1.8.0
- Tokenizers 0.10.3