	---
	license: mit
	language:
	- en
	metrics:
	- accuracy
	- precision
	- recall
	model-index:
	- name: PolicyBERTa-7d
	  results: []
	widget:
	- text: "Russia must end the war."
	- text: "Democratic institutions must be supported."
	- text: "The state must fight political corruption."
	- text: "Our energy economy must be nationalised."
	- text: "We must increase social spending."

	---

	# PolicyBERTa-7d

	This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on data from the [Manifesto Project](https://manifesto-project.wzb.eu/). It was inspired by the model from [Laurer (2020)](https://huggingface.co/MoritzLaurer/policy-distilbert-7d).

	It achieves the following results on the evaluation set:
	- Loss: 0.8549
	- Accuracy: 0.7059
	- F1-micro: 0.7059
	- F1-macro: 0.6683
	- F1-weighted: 0.7033
	- Precision: 0.7059
	- Recall: 0.7059
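
Accuracy, F1-micro, precision and recall are identical in the list above. That is what one would expect if precision and recall are micro-averaged: in single-label classification every misclassified sentence counts once as a false positive and once as a false negative, so the micro-averaged scores all collapse to accuracy. The card does not state the averaging mode, so this reading is an assumption. The sketch below illustrates the effect on made-up labels; `averaged_scores` is a hypothetical helper, not part of the original evaluation code.

```python
from collections import Counter


def averaged_scores(y_true, y_pred):
    """Accuracy plus micro- and macro-averaged scores for single-label data."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(y_true, y_pred):
        if true == pred:
            tp[true] += 1
        else:
            fp[pred] += 1  # the wrongly predicted class gains a false positive
            fn[true] += 1  # the true class gains a false negative

    n_tp, n_fp, n_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    labels = sorted(set(y_true) | set(y_pred))
    # Per-class F1 = 2TP / (2TP + FP + FN); the macro score is their plain mean.
    per_class_f1 = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in labels]

    return {
        "accuracy": n_tp / len(y_true),
        "micro_precision": n_tp / (n_tp + n_fp),
        "micro_recall": n_tp / (n_tp + n_fn),
        "micro_f1": 2 * n_tp / (2 * n_tp + n_fp + n_fn),
        "macro_f1": sum(per_class_f1) / len(per_class_f1),
    }


# Made-up labels for illustration only (not model output).
toy_true = [0, 0, 1, 1, 2, 2, 2, 3]
toy_pred = [0, 1, 1, 1, 2, 2, 0, 3]
print(averaged_scores(toy_true, toy_pred))
```

Only the macro-averaged score differs from accuracy here, because it weights every class equally regardless of size.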

	## Model description

	This model was trained on 115,943 manually annotated sentences to classify text into one of seven political categories: "external relations", "freedom and democracy", "political system", "economy", "welfare and quality of life", "fabric of society" and "social groups".


	## Intended uses & limitations

	The model reproduces the limitations of its training data in terms of country coverage, time span, domain definitions and potential annotator biases, as any supervised machine learning model would. Applying the model to other kinds of data (other text types, other countries, etc.) will likely reduce performance.

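	```python
	from transformers import pipeline
	import pandas as pd

	classifier = pipeline(
	    task="text-classification",
	    model="niksmer/PolicyBERTa-7d",
	)

	# Load the text data you want to classify.
	# Assumption: text.csv has a column named "text" with one sentence per row.
	texts = pd.read_csv("text.csv")["text"].astype(str).tolist()

	# Inference: the pipeline expects a string or a list of strings, not a DataFrame
	output = classifier(texts)

	# Show the first predictions (one label and score per input sentence)
	pd.DataFrame(output).head()
	```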
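
Because performance is expected to drop on out-of-domain text, one possible safeguard is to set aside low-confidence predictions for manual review. The sketch below assumes the usual `text-classification` pipeline output (one dict with `label` and `score` per input). The helper name `split_by_confidence`, the 0.5 threshold and the example predictions are all hypothetical and not taken from the model.

```python
def split_by_confidence(texts, predictions, threshold=0.5):
    """Separate predictions into confident and uncertain rows by score."""
    confident, uncertain = [], []
    for text, pred in zip(texts, predictions):
        row = {"text": text, "label": pred["label"], "score": pred["score"]}
        if pred["score"] >= threshold:
            confident.append(row)
        else:
            uncertain.append(row)  # candidates for manual review
    return confident, uncertain


# Made-up predictions in pipeline output format (labels and scores are illustrative).
example_texts = [
    "We must increase social spending.",
    "The weather was pleasant on election day.",
]
example_predictions = [
    {"label": "welfare and quality of life", "score": 0.91},
    {"label": "fabric of society", "score": 0.38},
]

confident, uncertain = split_by_confidence(example_texts, example_predictions)
print(len(confident), "confident,", len(uncertain), "flagged for review")
```

A suitable threshold depends on the application and would need to be chosen on annotated data from the target domain.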

	## Training and evaluation data

	PolicyBERTa-7d was trained on the English-language subset of the [Manifesto Project Dataset (MPDS2020a)](https://manifesto-project.wzb.eu/datasets). The model was trained on 115,943 sentences from 163 political manifestos in 7 English-speaking countries (Australia, Canada, Ireland, New Zealand, South Africa, United Kingdom, United States). The manifestos were published between 1992 and 2020.

	| Country        | Manifestos | Sentences | Time span        |
	|----------------|------------|-----------|------------------|
	| Australia      | 18         | 14,887    | 2010-2016        |
	| Ireland        | 23         | 24,966    | 2007-2016        |
	| Canada         | 14         | 12,344    | 2004-2008 & 2015 |
	| New Zealand    | 46         | 35,079    | 1993-2017        |
	| South Africa   | 29         | 13,334    | 1994-2019        |
	| USA            | 9          | 13,188    | 1992 & 2004-2020 |
	| United Kingdom | 34         | 30,936    | 1997-2019        |

	The ten Canadian manifestos published between 2004 and 2008 are held out as test data; the table above includes them.
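
As a quick consistency check, the per-country figures can be summed and compared with the split sizes reported in the sections below. The numbers are copied from this card; the variable names are only illustrative.

```python
# (manifestos, sentences) per country, copied from the table above
dataset = {
    "Australia": (18, 14_887),
    "Ireland": (23, 24_966),
    "Canada": (14, 12_344),
    "New Zealand": (46, 35_079),
    "South Africa": (29, 13_334),
    "USA": (9, 13_188),
    "United Kingdom": (34, 30_936),
}

# Sentence counts of the train, validation and test splits reported in this card
splits = {"train": 115_943, "validation": 20_461, "test": 8_330}

total_manifestos = sum(m for m, _ in dataset.values())
total_sentences = sum(s for _, s in dataset.values())
test_manifestos = 10  # Canadian manifestos, 2004-2008

print("manifestos in table:", total_manifestos)
print("manifestos outside the test set:", total_manifestos - test_manifestos)
print("sentences in table:", total_sentences)
print("sentences across splits:", sum(splits.values()))
```

The table therefore describes the full dataset: the 163 manifestos mentioned above are the ones left after removing the ten test manifestos, and the three splits together account for every sentence in the table.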


	The Manifesto Project manually annotates individual sentences from political party manifestos into 7 main political domains: 'Economy', 'External Relations', 'Fabric of Society', 'Freedom and Democracy', 'Political System', 'Welfare and Quality of Life' and 'Social Groups'. See the [codebook](https://manifesto-project.wzb.eu/down/papers/handbook_2021_version_5.pdf) for the exact definitions of each domain.

	### Train data

	The train data was highly imbalanced.

	| Label | Description                 | Count  |
	|-------|-----------------------------|--------|
	| 0     | external relations          | 7,640  |
	| 1     | freedom and democracy       | 5,880  |
	| 2     | political system            | 11,234 |
	| 3     | economy                     | 29,218 |
	| 4     | welfare and quality of life | 37,200 |
	| 5     | fabric of society           | 13,594 |
	| 6     | social groups               | 11,177 |

	Overall count: 115,943
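
To make the imbalance concrete, the sketch below computes each label's share of the train data and the inverse-frequency ("balanced") class weights `n_samples / (n_classes * count)`. The card does not say that class weighting was used during training, so the weights are shown only as one common way to quantify and counteract such an imbalance.

```python
# Train counts per label id, copied from the table above
train_counts = {
    0: 7_640,   # external relations
    1: 5_880,   # freedom and democracy
    2: 11_234,  # political system
    3: 29_218,  # economy
    4: 37_200,  # welfare and quality of life
    5: 13_594,  # fabric of society
    6: 11_177,  # social groups
}

n_samples = sum(train_counts.values())
n_classes = len(train_counts)

# Share of the train data per label
shares = {label: count / n_samples for label, count in train_counts.items()}

# "Balanced" heuristic: rare classes get weights above 1, frequent ones below 1
class_weights = {
    label: n_samples / (n_classes * count) for label, count in train_counts.items()
}

for label in train_counts:
    print(f"label {label}: share={shares[label]:.3f}, weight={class_weights[label]:.2f}")
```

The largest class (label 4) is more than six times the size of the smallest (label 1).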

	### Validation data

	The validation set was created by random sampling.

	| Label | Description                 | Count |
	|-------|-----------------------------|-------|
	| 0     | external relations          | 1,345 |
	| 1     | freedom and democracy       | 1,043 |
	| 2     | political system            | 2,038 |
	| 3     | economy                     | 5,140 |
	| 4     | welfare and quality of life | 6,554 |
	| 5     | fabric of society           | 2,384 |
	| 6     | social groups               | 1,957 |

	Overall count: 20,461

	### Test data

	The test dataset contains ten Canadian manifestos published between 2004 and 2008.

	| Label | Description                 | Count |
	|-------|-----------------------------|-------|
	| 0     | external relations          | 824   |
	| 1     | freedom and democracy       | 296   |
	| 2     | political system            | 1,041 |
	| 3     | economy                     | 2,188 |
	| 4     | welfare and quality of life | 2,654 |
	| 5     | fabric of society           | 940   |
	| 6     | social groups               | 387   |

	Overall count: 8,330
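
The test set comes from a single country and period, so its label distribution need not match the train data. The sketch below compares label shares between the two splits and computes the accuracy of a trivial majority-class baseline on the test set, which gives some context for the test scores reported further down. All counts are copied from this card; the baseline is an illustration, not something the author evaluated.

```python
labels = [
    "external relations",
    "freedom and democracy",
    "political system",
    "economy",
    "welfare and quality of life",
    "fabric of society",
    "social groups",
]
train_counts = [7_640, 5_880, 11_234, 29_218, 37_200, 13_594, 11_177]
test_counts = [824, 296, 1_041, 2_188, 2_654, 940, 387]

train_total, test_total = sum(train_counts), sum(test_counts)

# Difference in label share (test minus train), per label
shift = {
    name: test / test_total - train / train_total
    for name, train, test in zip(labels, train_counts, test_counts)
}
for name, delta in shift.items():
    print(f"{name:28s} {delta:+.3f}")

# Accuracy when always predicting the most frequent train label
majority_index = train_counts.index(max(train_counts))
majority_baseline = test_counts[majority_index] / test_total
print(f"majority-class baseline on test: {majority_baseline:.3f}")
```

The largest shift is for "social groups", whose share in the test set is roughly half of its share in the train data.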

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
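	```python
	from transformers import TrainingArguments

	training_args = TrainingArguments(
	    # Assumption: output_dir is not given in the source, but TrainingArguments
	    # requires it in Transformers 4.16; this value is a placeholder.
	    output_dir="PolicyBERTa-7d",
	    warmup_steps=0,
	    weight_decay=0.1,
	    learning_rate=1e-05,
	    fp16=True,  # mixed precision; requires a CUDA device
	    evaluation_strategy="epoch",  # named eval_strategy in recent Transformers releases
	    num_train_epochs=5,
	    per_device_train_batch_size=16,
	    overwrite_output_dir=True,
	    per_device_eval_batch_size=16,
	    save_strategy="no",
	    logging_dir="logs",
	    logging_strategy="steps",
	    logging_steps=10,
	    push_to_hub=True,
	    hub_strategy="end",
	)
	```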

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1-micro | F1-macro | F1-weighted | Precision | Recall |
	|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:-----------:|:---------:|:------:|
	| 0.9154        | 1.0   | 1812 | 0.8984          | 0.6785   | 0.6785   | 0.6383   | 0.6772      | 0.6785    | 0.6785 |
	| 0.8374        | 2.0   | 3624 | 0.8569          | 0.6957   | 0.6957   | 0.6529   | 0.6914      | 0.6957    | 0.6957 |
	| 0.7053        | 3.0   | 5436 | 0.8582          | 0.7019   | 0.7019   | 0.6594   | 0.6967      | 0.7019    | 0.7019 |
	| 0.7178        | 4.0   | 7248 | 0.8488          | 0.7030   | 0.7030   | 0.6662   | 0.7011      | 0.7030    | 0.7030 |
	| 0.6688        | 5.0   | 9060 | 0.8549          | 0.7059   | 0.7059   | 0.6683   | 0.7033      | 0.7059    | 0.7059 |
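
The step counts in the table can be related to the train set size. With 115,943 train sentences and `per_device_train_batch_size=16`, a single device would need 7,247 steps per epoch, whereas the table reports 1,812. That is consistent with an effective batch size of 64, for example four devices with 16 examples each. The card does not state how many devices were used, so `n_devices_assumed = 4` below is an assumption.

```python
import math

train_sentences = 115_943
per_device_batch = 16
epochs = 5
reported_steps_per_epoch = 1_812

# Steps per epoch if training ran on a single device
single_device_steps = math.ceil(train_sentences / per_device_batch)

# Effective batch size implied by the reported step count
implied_batch = train_sentences / reported_steps_per_epoch

# Assumption: four devices, i.e. an effective batch size of 64
n_devices_assumed = 4
steps_per_epoch = math.ceil(train_sentences / (per_device_batch * n_devices_assumed))

print("single-device steps per epoch:", single_device_steps)
print(f"implied effective batch size: {implied_batch:.1f}")
print("steps per epoch with 4 devices:", steps_per_epoch)
print("total steps over 5 epochs:", steps_per_epoch * epochs)
```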

	### Validation evaluation

	| Model          | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
	|----------------|----------------|----------------|-------------------|
	| PolicyBERTa-7d | 0.71           | 0.67           | 0.70              |



	### Test evaluation

	| Model          | Micro F1-Score | Macro F1-Score | Weighted F1-Score |
	|----------------|----------------|----------------|-------------------|
	| PolicyBERTa-7d | 0.65           | 0.60           | 0.65              |


	### Evaluation per category

	| Label                       | Validation F1-Score | Test F1-Score |
	|-----------------------------|---------------------|---------------|
	| external relations          | 0.76                | 0.70          |
	| freedom and democracy       | 0.61                | 0.55          |
	| political system            | 0.55                | 0.55          |
	| economy                     | 0.74                | 0.67          |
	| welfare and quality of life | 0.77                | 0.72          |
	| fabric of society           | 0.67                | 0.60          |
	| social groups               | 0.58                | 0.41          |
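
The macro and weighted F1 scores reported earlier can be approximately reproduced from these per-category scores: the macro score is their unweighted mean, and the weighted score is their mean weighted by the number of sentences per label in the respective split. The sketch below uses the values from this card; since the per-category scores are rounded to two decimals, the result only matches up to rounding.

```python
# Per-category F1 scores (label order 0..6), copied from the table above
f1_validation = [0.76, 0.61, 0.55, 0.74, 0.77, 0.67, 0.58]
f1_test = [0.70, 0.55, 0.55, 0.67, 0.72, 0.60, 0.41]

# Sentences per label in each split, copied from the data tables of this card
support_validation = [1_345, 1_043, 2_038, 5_140, 6_554, 2_384, 1_957]
support_test = [824, 296, 1_041, 2_188, 2_654, 940, 387]


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores) / len(scores)


def weighted_f1(scores, support):
    """Mean of the per-class F1 scores, weighted by class size."""
    return sum(s * n for s, n in zip(scores, support)) / sum(support)


for name, scores, support in [
    ("validation", f1_validation, support_validation),
    ("test", f1_test, support_test),
]:
    print(
        f"{name}: macro={macro_f1(scores):.2f}, "
        f"weighted={weighted_f1(scores, support):.2f}"
    )
```

The macro score sits below the weighted score in both splits because the weakest categories ("social groups", "political system", "freedom and democracy") count as much as the large, better-classified ones.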



	### Framework versions

	- Transformers 4.16.2
	- PyTorch 1.9.0+cu102
	- Datasets 1.8.0
	- Tokenizers 0.10.3