---
license: mit
language:
- en
---

This is a transformers model trained on the U.S. Comparative Agendas Project (CAP) dataset, annotated with a top-level taxonomy covering 20 policy areas, as well as an "Others" category for non-policy-related text. The model is designed to identify policy and non-policy issues in political discourse. It was trained specifically for additional analyses presented in this [paper](https://doi.org/10.48550/arXiv.2405.07323).

## Model performance

The model's performance on an unseen test set is as follows:

| Label                 |   F1 score |
|:----------------------|-----------:|
| Macroeconomics        |     0.8303 |
| Civil rights          |     0.7676 |
| Health                |     0.8886 |
| Agriculture           |     0.8439 |
| Labor                 |     0.7818 |
| Education             |     0.9005 |
| Environment           |     0.8481 |
| Energy                |     0.8629 |
| Immigration           |     0.8682 |
| Transportation        |     0.8731 |
| Law and crime         |     0.8207 |
| Social welfare        |     0.7957 |
| Housing               |     0.8462 |
| Domestic commerce     |     0.8421 |
| Defense               |     0.8627 |
| Technology            |     0.8333 |
| Foreign trade         |     0.8269 |
| International affairs |     0.8907 |
| Government operations |     0.8777 |
| Public lands          |     0.8758 |
| Others                |     0.6543 |
| **Macro average**     | **0.8573** |
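
## Usage

Since the model assigns one of the policy labels (or "Others") to a piece of text, a minimal sketch of how it might be called is given here. It assumes the checkpoint loads as a standard sequence-classification model through the `transformers` `pipeline` API. `MODEL_ID` is a hypothetical placeholder, not the real repository name, and the helper names `top_label` and `classify` are illustrative only. Whether the returned labels match the names in the table above depends on the checkpoint's configuration.

```python
from typing import Dict, List

# Hypothetical placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/your-cap-policy-model"


def top_label(scores: List[Dict[str, float]]) -> str:
    """Return the label with the highest score from a list of {label, score} dicts."""
    return max(scores, key=lambda s: s["score"])["label"]


def classify(texts: List[str], model_id: str = MODEL_ID) -> List[str]:
    """Predict one label per input text (assumes a text-classification checkpoint)."""
    # Imported lazily so top_label can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline to return scores for every label.
    classifier = pipeline("text-classification", model=model_id, top_k=None)
    # For a list of inputs, the pipeline returns one list of score dicts per text.
    return [top_label(scores) for scores in classifier(texts, truncation=True)]
```

Calling `classify(["We must lower the cost of prescription drugs."])` with a valid model ID would return a list containing one predicted label per input text.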

## Citation

If you find this model useful for your work, please consider citing:

```bibtex
@article{aroyehun2024computational,
  title={Computational analysis of US Congressional speeches reveals a shift from evidence to intuition},
  author={Aroyehun, Segun Taofeek and Simchon, Almog and Carrella, Fabio and Lasser, Jana and Lewandowsky, Stephan and Garcia, David},
  journal={arXiv preprint arXiv:2405.07323},
  year={2024}
}
```