Model: covid-19-vaccination-tweet-stance
Overview
This model is a text classifier that determines the stance of a tweet toward COVID-19 vaccination. It classifies tweets into three categories: in-favor, against, and neutral-or-unclear. Note that the classifier only works for tweets that are related to COVID-19 vaccination. To classify whether a tweet is related to COVID-19 vaccination in the first place, please refer to covid-19-vaccination-tweet-relevance.
Usage
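from transformers import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("seantw/covid-19-vaccination-tweet-stance")
# Load the model with its classification head so that it outputs stance logits
model = AutoModelForSequenceClassification.from_pretrained("seantw/covid-19-vaccination-tweet-stance")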
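As a minimal sketch of end-to-end inference, the snippet below tokenizes a batch of tweets, runs the classifier, and maps the highest-scoring logit to the stance names listed under "Output Label Index". The helper names `logits_to_stance` and `classify` are hypothetical and not part of this repository. The sketch assumes that `torch` and `transformers` are installed and that the checkpoint loads with a three-way classification head.

```python
import math

# Label names as listed under "Output Label Index" on this card.
ID2STANCE = {0: "neutral-or-unclear", 1: "in-favor", 2: "against"}
MODEL_ID = "seantw/covid-19-vaccination-tweet-stance"


def logits_to_stance(logits):
    """Turn one row of raw logits into (stance name, softmax probability)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2STANCE[best], probs[best]


def classify(tweets):
    """Download the checkpoint and classify a list of tweet strings."""
    # Imported here so the helper above works without these packages installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()
    inputs = tokenizer(tweets, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits
    return [logits_to_stance(row) for row in logits.tolist()]
```

Calling `classify(["Just got my second dose today."])` would return a list with one `(stance, probability)` pair per tweet. The tweets passed in should already be known to concern COVID-19 vaccination.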
Training corpus
The training corpus consists of 5,000 tweets, randomly sampled daily from December 2020 to June 2022. All of these tweets are related to COVID-19 vaccination and were labeled by domain experts.
We have separately trained another model for classifying whether a tweet is related to COVID-19 vaccination or not. Please refer to covid-19-vaccination-tweet-relevance for more information.
Output Label Index
- LABEL_0: "neutral-or-unclear"
- LABEL_1: "in-favor"
- LABEL_2: "against"
Performance Metrics
The model's performance metrics on the test set are as follows:
- Accuracy: 0.7747
- Macro-average metrics (across "in-favor" and "against" classes):
  - F1-score: 0.8288
  - Recall: 0.8
  - Precision: 0.86135
- Macro-average metrics (across all 3 classes):
  - F1-score: 0.7408
  - Recall: 0.7568
  - Precision: 0.7369
- Class-wise metrics:
  - For class "in-favor":
    - F1-score: 0.8423
    - Precision: 0.9022
    - Recall: 0.7899
  - For class "against":
    - F1-score: 0.8153
    - Precision: 0.8205
    - Recall: 0.8101
  - For class "neutral-or-unclear":
    - F1-score: 0.5648
    - Precision: 0.488
    - Recall: 0.6703
These metrics are based on a test set with a total size of 506 samples.
Note: Because performance on the "neutral-or-unclear" class is substantially worse than on the other two classes, we recommend caution when interpreting a "neutral-or-unclear" prediction. If you are only interested in the "in-favor" or the "against" class, you can frame the task as binary classification by merging the "neutral-or-unclear" class into either the "in-favor" or the "against" class.
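One way to apply the binary framing described in the note is to post-process the three-way predictions. The sketch below is only an illustration: `collapse_to_binary` is a hypothetical helper, and which class should absorb "neutral-or-unclear" depends on the question being asked. For example, when tracking opposition, merging it into "in-favor" leaves "against" versus everything else.

```python
def collapse_to_binary(stance, merge_neutral_into="in-favor"):
    """Fold "neutral-or-unclear" into one of the two stance classes."""
    if merge_neutral_into not in ("in-favor", "against"):
        raise ValueError("merge_neutral_into must be 'in-favor' or 'against'")
    if stance == "neutral-or-unclear":
        return merge_neutral_into
    return stance


# Example three-way predictions (made up for illustration).
predictions = ["against", "neutral-or-unclear", "in-favor"]
binary = [collapse_to_binary(s, merge_neutral_into="in-favor") for s in predictions]
```

After merging, the group that absorbed "neutral-or-unclear" is better read as "not the other class" than as a clean stance of its own.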
Confusion Matrix
The confusion matrix of predictions on the test set is as follows:
| | Predicted: neutral-or-unclear | Predicted: in-favor | Predicted: against |
|---|---|---|---|
| True: neutral-or-unclear | 61 | 16 | 14 |
| True: in-favor | 40 | 203 | 14 |
| True: against | 24 | 6 | 128 |
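The reported metrics can be recomputed from this confusion matrix. The sketch below uses plain Python and the standard definitions of precision, recall, and F1. Variable and function names are illustrative, and this is not necessarily the script the authors used. Rounded to four decimals, its output agrees with the accuracy, class-wise values, and macro F1-scores listed under "Performance Metrics".

```python
LABELS = ["neutral-or-unclear", "in-favor", "against"]

# Rows are true classes, columns are predicted classes (same order as LABELS).
CONFUSION = [
    [61, 16, 14],
    [40, 203, 14],
    [24, 6, 128],
]


def per_class_metrics(matrix, k):
    """Return (precision, recall, F1) for class index k."""
    tp = matrix[k][k]
    predicted = sum(row[k] for row in matrix)  # column sum: predicted as k
    actual = sum(matrix[k])                    # row sum: truly k
    precision = tp / predicted
    recall = tp / actual
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


total = sum(sum(row) for row in CONFUSION)
accuracy = sum(CONFUSION[k][k] for k in range(3)) / total
metrics = {LABELS[k]: per_class_metrics(CONFUSION, k) for k in range(3)}

# Macro averages are unweighted means of the per-class scores.
macro_f1_all = sum(m[2] for m in metrics.values()) / 3
macro_f1_stance = (metrics["in-favor"][2] + metrics["against"][2]) / 2

print(f"samples={total} accuracy={accuracy:.4f}")
for name, (p, r, f1) in metrics.items():
    print(f"{name}: precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
print(f"macro F1 (3 classes)={macro_f1_all:.4f}")
print(f"macro F1 (in-favor, against)={macro_f1_stance:.4f}")
```

The first column of the matrix shows where the weak "neutral-or-unclear" precision comes from: 40 in-favor and 24 against tweets were predicted as neutral, against only 61 correct neutral predictions.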
Model Architecture
The model is fine-tuned from COVID-Twitter-BERT v2.
Contact
Sean Yun-Shiuan Chuang (yunshiuan.chuang@wisc.edu)