---
license: mit
language:
- en
---
This is a Transformers model trained on the U.S. Comparative Agendas Project (CAP) dataset. The data is annotated with a top-level taxonomy of 20 policy areas, plus an "Others" category for text that is not policy-related. The model is designed to identify policy and non-policy issues in political discourse.
This model was trained specifically for additional analyses presented in this [paper](https://doi.org/10.48550/arXiv.2405.07323).
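
A minimal usage sketch, assuming the checkpoint loads as a standard sequence-classification model through the `transformers` text-classification pipeline. The `model_id` argument is a placeholder for this repository's ID on the Hub, and the helper names `best_label` and `classify` are hypothetical. Whether the returned labels are readable names (such as "Health") or generic ones (such as `LABEL_2`) depends on the checkpoint's configuration, which is not stated here.

```python
from typing import Any, Dict, List


def best_label(scores: List[Dict[str, Any]]) -> str:
    """Return the label with the highest score from one text's score list."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(texts: List[str], model_id: str) -> List[str]:
    """Predict one top-level label per input text.

    model_id is a placeholder: pass this repository's ID on the Hub.
    """
    # Imported lazily so best_label can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for scores over all labels, not just the top one.
    classifier = pipeline("text-classification", model=model_id, top_k=None)
    # truncation=True cuts inputs that exceed the model's maximum length.
    all_scores = classifier(texts, truncation=True)
    return [best_label(scores) for scores in all_scores]
```

Calling `classify(["We must lower prescription drug prices."], model_id)` returns one label per input text. Long speeches are truncated to the model's maximum input length, so splitting them into shorter passages first may give more reliable labels.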
## Model performance
Model performance on the unseen test set is as follows:
| Label | F1 score |
|:----------------------|-----------:|
| Macroeconomics | 0.8303 |
| Civil rights | 0.7676 |
| Health | 0.8886 |
| Agriculture | 0.8439 |
| Labor | 0.7818 |
| Education | 0.9005 |
| Environment | 0.8481 |
| Energy | 0.8629 |
| Immigration | 0.8682 |
| Transportation | 0.8731 |
| Law and crime | 0.8207 |
| Social welfare | 0.7957 |
| Housing | 0.8462 |
| Domestic commerce | 0.8421 |
| Defense | 0.8627 |
| Technology | 0.8333 |
| Foreign trade | 0.8269 |
| International affairs | 0.8907 |
| Government operations | 0.8777 |
| Public lands | 0.8758 |
| Others | 0.6543 |
| **Macro average** | **0.8573** |
## Citation
If you find this model useful for your work, please consider citing:
```bibtex
@article{aroyehun2024computational,
title={Computational analysis of US Congressional speeches reveals a shift from evidence to intuition},
author={Aroyehun, Segun Taofeek and Simchon, Almog and Carrella, Fabio and Lasser, Jana and Lewandowsky, Stephan and Garcia, David},
journal={arXiv preprint arXiv:2405.07323},
year={2024}
}
```