
CoNTACT

Model description

Contextual Neural Transformer Adapted to COVID-19 Tweets (CoNTACT) is a Dutch RobBERT model (pdelobelle/robbert-v2-dutch-base) adapted to the domain of COVID-19 tweets. The model was developed at CLiPS by Jens Lemmens, Jens Van Nooten, Tim Kreutz and Walter Daelemans. A full description of the model, the data used and the experiments conducted can be found in this arXiv preprint: https://arxiv.org/abs/2203.07362

Intended use

The model was developed with the intention of achieving high results on NLP tasks involving Dutch social media messages related to COVID-19.

How to use

CoNTACT should be fine-tuned on a downstream task. This can be done by passing clips/contact as the --model_name_or_path argument in the Hugging Face Transformers example scripts, or by loading CoNTACT (as shown below) and fine-tuning it using your own code:

from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('clips/contact')
tokenizer = AutoTokenizer.from_pretrained('clips/contact')

...
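
For a classification task, one possible starting point is to load CoNTACT with a sequence classification head instead of the bare encoder. The sketch below is an illustration only, not the authors' fine-tuning code: the helper names `problem_type_for` and `load_contact_classifier` are hypothetical, and the number of labels depends on your own dataset.

```python
def problem_type_for(multilabel):
    """Map a task type to the `problem_type` string that Transformers uses
    to select the loss of a sequence classification head."""
    return "multi_label_classification" if multilabel else "single_label_classification"


def load_contact_classifier(num_labels, multilabel=False, model_name="clips/contact"):
    """Load the tokenizer and CoNTACT with a freshly initialised classification head.

    The head is untrained, so the returned model still has to be fine-tuned.
    """
    # Imported here so that the helper above can be used without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        num_labels=num_labels,
        problem_type=problem_type_for(multilabel),
    )
    return tokenizer, model


# Example (downloads the model from the Hub):
# tokenizer, model = load_contact_classifier(num_labels=2)
```

The returned model can then be trained with the Transformers `Trainer` or with a custom training loop.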

Training data

CoNTACT was trained on 2.8M Dutch tweets related to COVID-19 that were posted in 2021.

Training procedure

The model's pre-training phase was extended by performing Masked Language Modeling (MLM) on the training data described above. This was done for 4 epochs, using the largest batch size that fit in memory (32).
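
To make the MLM objective concrete, here is a minimal pure-Python sketch of how one training example can be corrupted. It assumes the standard BERT/RoBERTa recipe (15% of tokens selected, of which 80% become the mask token, 10% a random token and 10% stay unchanged) and a RoBERTa-style `<mask>` token. Neither the masking rate nor the replacement split is stated in this card, and the example sentence and the `mask_tokens` helper are hypothetical.

```python
import random

MASK_TOKEN = "<mask>"  # assumed RoBERTa-style mask token


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for one MLM example.

    labels[i] holds the original token at positions the model must predict
    and None everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)         # 80%: replace with the mask token
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the original token
        else:
            inputs.append(tok)
            labels.append(None)  # ignored by the loss
    return inputs, labels


# Hypothetical whitespace-tokenised tweet; a real setup would use the model's subword tokenizer.
tokens = "het vaccin is vanaf vandaag beschikbaar".split()
inputs, labels = mask_tokens(tokens, vocab=tokens, rng=random.Random(0))
```

In practice this step is usually delegated to a library data collator rather than written by hand.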

Evaluation

The model was evaluated on two tasks using data from two social media platforms: Twitter and Facebook. Task 1 involved the binary classification of COVID-19 vaccine stance (hesitant vs. not hesitant), whereas task 2 consisted of the multilabel, multiclass classification of arguments for vaccine hesitancy. CoNTACT outperformed out-of-the-box RobBERT in virtually all our experiments, and with statistical significance in most cases.
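
The two tasks differ in how classifier outputs are turned into predictions. The sketch below illustrates the usual convention, assuming a softmax over two logits for the binary stance task and an independent sigmoid with a 0.5 threshold per label for the multilabel task. The threshold, the logit values and the argument label names are placeholders, not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def predict_stance(logits, labels=("not hesitant", "hesitant")):
    """Binary task: exactly one label, the one with the highest probability."""
    probs = softmax(logits)
    return labels[probs.index(max(probs))]


def predict_arguments(logits, labels, threshold=0.5):
    """Multilabel task: every label whose own probability reaches the threshold."""
    return [label for label, x in zip(labels, logits) if sigmoid(x) >= threshold]


# Placeholder logits and label names, for illustration only.
stance = predict_stance([0.2, 1.5])
arguments = predict_arguments([2.0, -1.0, 0.3], ["argument_a", "argument_b", "argument_c"])
```

A message can therefore receive several argument labels at once, or none, while it always receives exactly one stance label.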

How to cite

@misc{lemmens2022contact,
    title={CoNTACT: A Dutch COVID-19 Adapted BERT for Vaccine Hesitancy and Argumentation Detection},
    author={Jens Lemmens and Jens Van Nooten and Tim Kreutz and Walter Daelemans},
    year={2022},
    eprint={2203.07362},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}