This is a transformers model trained on the U.S. Comparative Agendas Project (CAP) dataset, annotated with a top-level taxonomy covering 20 policy areas, as well as an "Others" category for non-policy-related text. The model is designed to identify policy and non-policy issues in political discourse.
This model was trained specifically for additional analyses presented in the paper cited below.
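A minimal sketch of how the classifier might be applied to text, assuming the model is loaded through the `text-classification` pipeline of the `transformers` library. The repository ID `MODEL_ID` is a hypothetical placeholder (this card does not state it), and the helper names `load_classifier` and `classify_texts` are illustrative only. The label strings returned depend on the model's configuration and may not match the names in the table below exactly.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/your-cap-policy-model"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline for the given model ID.

    transformers is imported lazily so the rest of this sketch can be
    used (and tested) without downloading a model.
    """
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_texts(classifier: Callable, texts: List[str]) -> List[Dict]:
    """Attach the top predicted label and its score to each input text.

    Assumes `classifier` returns one {"label": ..., "score": ...} dict
    per input, which is the default output of a text-classification
    pipeline called on a list of strings.
    """
    predictions = classifier(texts)
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]
```

With a real repository ID in place, usage would look like `classify_texts(load_classifier(), ["We must lower the cost of prescription drugs."])`.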
Model performance
The model's F1 scores on an unseen test set are as follows:
Label | F1 score |
---|---|
Macroeconomics | 0.8303 |
Civil rights | 0.7676 |
Health | 0.8886 |
Agriculture | 0.8439 |
Labor | 0.7818 |
Education | 0.9005 |
Environment | 0.8481 |
Energy | 0.8629 |
Immigration | 0.8682 |
Transportation | 0.8731 |
Law and crime | 0.8207 |
Social welfare | 0.7957 |
Housing | 0.8462 |
Domestic commerce | 0.8421 |
Defense | 0.8627 |
Technology | 0.8333 |
Foreign trade | 0.8269 |
International affairs | 0.8907 |
Government operations | 0.8777 |
Public lands | 0.8758 |
Others | 0.6543 |
Macro average | 0.8573 |
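For readers unfamiliar with the metric, the sketch below illustrates the conventional definition of per-class F1 and its macro average (the unweighted mean of the per-class scores). It is not the authors' evaluation code: the card does not say how the figures above were aggregated, and the gold and predicted labels here are hypothetical toy data, not the model's test set.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_f1(gold: Sequence[str], pred: Sequence[str]) -> Dict[str, float]:
    """Compute F1 for every label that appears in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1  # correct prediction for label g
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of per-class F1 scores."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels, for illustration only.
gold = ["Health", "Health", "Energy", "Others"]
pred = ["Health", "Energy", "Energy", "Others"]
scores = per_class_f1(gold, pred)
print(scores)
print(macro_f1(scores))
```

Because the macro average weights every class equally, a weak class such as "Others" pulls it down as much as any large class would.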
Citation
If you find this model useful for your work, please consider citing:
@article{aroyehun2024computational,
title={Computational analysis of US Congressional speeches reveals a shift from evidence to intuition},
author={Aroyehun, Segun Taofeek and Simchon, Almog and Carrella, Fabio and Lasser, Jana and Lewandowsky, Stephan and Garcia, David},
journal={arXiv preprint arXiv:2405.07323},
year={2024}
}