---
license: mit
---

## Model description

An xlm-roberta-large model fine-tuned on ~1.6 million annotated statements contained in the Manifesto Corpus (version 2023a). The model can be used to categorize any type of text into 56 different political topics according to the Manifesto Project's coding scheme (Handbook 4).

## How to use

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("manifesto-project/xlm-roberta-political-56topics-sentence-2023a")
tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-large")

sentence = "We will restore funding to the Global Environment Facility and the Intergovernmental Panel on Climate Change, to support critical climate science research around the world"

inputs = tokenizer(sentence,
                   return_tensors="pt",
                   max_length=200,  # we limited the input to 200 tokens during finetuning
                   padding="max_length",
                   truncation=True
                   )

logits = model(**inputs).logits

# Convert logits to percentages, keyed by category label and sorted from most to least likely
probabilities = torch.softmax(logits, dim=1).tolist()[0]
probabilities = {model.config.id2label[index]: round(probability * 100, 2) for index, probability in enumerate(probabilities)}
probabilities = dict(sorted(probabilities.items(), key=lambda item: item[1], reverse=True))
print(probabilities)
# {'501 - Environmental Protection: Positive': 67.28, '411 - Technology and Infrastructure': 15.19, '107 - Internationalism: Positive': 13.63, '416 - Anti-Growth Economy: Positive': 2.02...

predicted_class = model.config.id2label[logits.argmax().item()]
print(predicted_class)
# 501 - Environmental Protection: Positive
```

## Model Performance

The model was evaluated on a test set of 199,046 annotated manifesto statements.

### Overall

| | Accuracy | Top2_Acc | Top3_Acc | Precision | Recall | F1_Macro | MCC | Cross-Entropy |
|---|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| [Sentence Model](https://huggingface.co/manifesto-project/xlm-roberta-political-56topics-sentence-2023a) | 0.57 | 0.73 | 0.81 | 0.49 | 0.43 | 0.45 | 0.55 | 1.5 |
| [Context Model](https://huggingface.co/manifesto-project/xlm-roberta-political-56topics-sentence-2023a) | 0.64 | 0.81 | 0.88 | 0.54 | 0.52 | 0.53 | 0.62 | 1.15 |
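
The Top2_Acc and Top3_Acc columns presumably report top-k accuracy: the share of statements whose annotated category is among the k highest-scoring predictions. The evaluation code is not part of this card, so the following is only a minimal sketch of that usual definition. The helper `top_k_accuracy` and the toy scores are hypothetical and unrelated to the model's real outputs; it is kept dependency-free so it runs without `torch`.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   k: int = 1) -> float:
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one example is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row (ties go to the lower index).
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy scores for 4 examples over 3 hypothetical classes (not real model output).
toy_scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
toy_labels = [0, 1, 0, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
top3 = top_k_accuracy(toy_scores, toy_labels, k=3)
print(top1, top2, top3)  # 0.25 0.75 1.0
```

With real predictions, each row of `scores` would be the logits or softmax probabilities for one statement, and `labels` the integer ids of the annotated categories in the order used by `model.config.id2label`.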