Propaganda Techniques Analysis BERT

This is a BERT-based model that predicts propaganda techniques in English-language news articles. The model is described in the paper cited in the BibTeX entry at the end of this card.

Model description

Definitions of the propaganda techniques are available here: https://propaganda.qcri.org/annotations/definitions.html

You can also try the model in action here: https://www.tanbih.org/prta

How to use

>>> import torch
>>> from transformers import BertTokenizerFast
>>> # Assumes model.py from the model repository is on the import path
>>> # (the relative form `.model` only works from inside a package).
>>> from model import BertForTokenAndSequenceJointClassification
>>>
>>> tokenizer = BertTokenizerFast.from_pretrained('bert-base-cased')
>>> model = BertForTokenAndSequenceJointClassification.from_pretrained(
...     "QCRI/PropagandaTechniquesAnalysis-en-BERT",
...     revision="v0.1.0",
... )
>>>
>>> inputs = tokenizer.encode_plus("Hello, my dog is cute", return_tensors="pt")
>>> outputs = model(**inputs)
>>> # Sentence-level prediction
>>> sequence_class_index = torch.argmax(outputs.sequence_logits, dim=-1)
>>> sequence_class = model.sequence_tags[sequence_class_index[0]]
>>> # Token-level predictions; [1:-1] drops the [CLS] and [SEP] positions
>>> token_class_index = torch.argmax(outputs.token_logits, dim=-1)
>>> tokens = tokenizer.convert_ids_to_tokens(inputs.input_ids[0][1:-1])
>>> tags = [model.token_tags[i] for i in token_class_index[0].tolist()[1:-1]]
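The snippet above ends with two parallel lists, `tokens` (WordPiece strings) and `tags` (one predicted label per token). To read the result as labelled text fragments, one option is to merge the WordPiece pieces back into words and then group neighbouring words that share a label. The helper below is a minimal sketch of that post-processing, not part of the released model code. The function name `group_tagged_spans`, the assumption that `"O"` marks tokens outside any technique, and the tag string in the example are all hypothetical; check `model.token_tags` for the actual label set.

```python
from typing import List, Tuple


def group_tagged_spans(
    tokens: List[str],
    tags: List[str],
    outside_tag: str = "O",  # assumed label for "no technique"; verify against model.token_tags
) -> List[Tuple[str, str]]:
    """Merge WordPiece tokens into words and group adjacent words with the same tag.

    Returns a list of (fragment_text, tag) pairs, skipping words tagged `outside_tag`.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    # Step 1: glue '##' continuation pieces onto the preceding word.
    # Each word keeps the tag predicted for its first piece.
    words: List[List[str]] = []
    for token, tag in zip(tokens, tags):
        if token.startswith("##") and words:
            words[-1][0] += token[2:]
        else:
            words.append([token, tag])

    # Step 2: join consecutive words that carry the same non-outside tag.
    spans: List[Tuple[str, str]] = []
    prev_tag = outside_tag
    for word, tag in words:
        if tag == outside_tag:
            prev_tag = outside_tag
            continue
        if tag == prev_tag and spans:
            text, _ = spans[-1]
            spans[-1] = (text + " " + word, tag)
        else:
            spans.append((word, tag))
        prev_tag = tag
    return spans


# Hand-written example input (not real model output).
example_tokens = ["These", "trait", "##ors", "must", "be", "stopped"]
example_tags = ["O", "Loaded_Language", "Loaded_Language", "O", "O", "O"]
example_spans = group_tagged_spans(example_tokens, example_tags)
```

With the outputs of the usage snippet, the call would be `group_tagged_spans(tokens, tags)`. Words are joined with a single space, so punctuation spacing in the returned fragments is approximate; mapping back to exact character offsets would need the tokenizer's offset mapping instead.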

BibTeX entry and citation info

@inproceedings{da-san-martino-etal-2019-fine,
    title = "Fine-Grained Analysis of Propaganda in News Article",
    author = "Da San Martino, Giovanni  and
      Yu, Seunghak  and
      Barr{\'o}n-Cede{\~n}o, Alberto  and
      Petrov, Rostislav  and
      Nakov, Preslav",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1565",
    doi = "10.18653/v1/D19-1565",
    pages = "5636--5646",
    abstract = "Propaganda aims at influencing people{'}s mindset with the purpose of advancing a specific agenda. Previous work has addressed propaganda detection at document level, typically labelling all articles from a propagandistic news outlet as propaganda. Such noisy gold labels inevitably affect the quality of any learning system trained on them. A further issue with most existing systems is the lack of explainability. To overcome these limitations, we propose a novel task: performing fine-grained analysis of texts by detecting all fragments that contain propaganda techniques as well as their type. In particular, we create a corpus of news articles manually annotated at fragment level with eighteen propaganda techniques and propose a suitable evaluation measure. We further design a novel multi-granularity neural network, and we show that it outperforms several strong BERT-based baselines.",
}