Pre-CoFactv3-Text-Classification
Model description
This is a text classification model for the AAAI 2024 workshop paper “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”.
It takes a claim and its evidence as input and outputs a predicted label, which is one of three categories: Support, Neutral, or Refute.
It is fine-tuned from the microsoft/deberta-v3-large model on the FACTIFY5WQA dataset.
For more details, see our paper or GitHub repository.
How to use?
- Load the model and tokenizer with Hugging Face transformers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
- Create a pipeline.
from transformers import pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
- Use the pipeline to predict the label.
label = classifier("Micah Richards spent an entire season at Aston Vila without playing a single game. [SEP] Despite speculation that Richards would leave Aston Villa before the transfer deadline for the 2018~19 season , he remained at the club , although he is not being considered for first team selection.")
print(label)
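The example above passes the claim and the evidence as a single string joined by `[SEP]`. A minimal sketch of wrapping that convention in helper functions follows. It assumes the pipeline returns the usual list of dictionaries with `label` and `score` keys; `build_input`, `predict_label`, and `stub_classifier` are hypothetical names, and the stub only stands in for the real pipeline so the snippet runs without downloading the model. The label string returned by the stub is illustrative and may not match the label names stored in the model config.

```python
from typing import Callable, Dict, List


def build_input(claim: str, evidence: str) -> str:
    """Join claim and evidence with the [SEP] marker, as in the example above."""
    return f"{claim.strip()} [SEP] {evidence.strip()}"


def predict_label(
    classifier: Callable[[str], List[Dict]], claim: str, evidence: str
) -> str:
    """Call a text-classification pipeline (or a compatible callable)
    and return the label with the highest score."""
    results = classifier(build_input(claim, evidence))
    best = max(results, key=lambda r: r["score"])
    return best["label"]


# Hypothetical stand-in for the real pipeline, used only to keep this sketch offline.
def stub_classifier(text: str) -> List[Dict]:
    return [{"label": "Support", "score": 0.9}]


print(predict_label(stub_classifier, "A claim.", "Some evidence."))
```

With the real model, the `classifier` object created above could be passed in place of `stub_classifier`.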
Dataset
We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.
This dataset is designed for fact verification: the task is to determine the veracity of a claim based on the given evidence. Each record contains the following fields:
- claim: the statement to be verified.
- evidence: the facts used to verify the claim.
- question: the questions generated from the claim by the 5W framework (who, what, when, where, and why).
- claim_answer: the answers derived from the claim.
- evidence_answer: the answers derived from the evidence.
- label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
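The fields listed above could be modelled as a small record type. This is a sketch only: the class name `Factify5WQARecord` is hypothetical, the use of lists for the question and answer fields is an assumption (the source does not state how they are stored), and the sample values are placeholders rather than real dataset entries.

```python
from dataclasses import dataclass
from typing import List

LABELS = ("Support", "Neutral", "Refute")


@dataclass
class Factify5WQARecord:
    claim: str                  # the statement to be verified
    evidence: str               # the facts used to verify the claim
    question: List[str]         # 5W questions generated from the claim (assumed to be a list)
    claim_answer: List[str]     # answers derived from the claim (assumed to be a list)
    evidence_answer: List[str]  # answers derived from the evidence (assumed to be a list)
    label: str                  # one of LABELS

    def __post_init__(self) -> None:
        # Reject labels outside the three documented categories.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


# Placeholder values for illustration only, not taken from the dataset.
record = Factify5WQARecord(
    claim="example claim",
    evidence="example evidence",
    question=["Who ...?"],
    claim_answer=["answer from claim"],
    evidence_answer=["answer from evidence"],
    label="Support",
)
print(record.label)
```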
Label | Training | Validation | Testing | Total |
---|---|---|---|---|
Support | 3500 | 750 | 750 | 5000 |
Neutral | 3500 | 750 | 750 | 5000 |
Refute | 3500 | 750 | 750 | 5000 |
Total | 10500 | 2250 | 2250 | 15000 |
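The counts in the table imply a balanced dataset with the same split proportions for every label. A short check of that arithmetic, using only the numbers from the table:

```python
# Per-label counts copied from the table above.
counts = {"Training": 3500, "Validation": 750, "Testing": 750}
num_labels = 3  # Support, Neutral, Refute

per_label_total = sum(counts.values())
fractions = {split: n / per_label_total for split, n in counts.items()}
dataset_total = per_label_total * num_labels

print(per_label_total)  # 5000
print(fractions)        # {'Training': 0.7, 'Validation': 0.15, 'Testing': 0.15}
print(dataset_total)    # 15000
```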
Fine-tuning
Fine-tuning is performed with the Hugging Face Trainer API on the text classification task.
Training hyperparameters
The following hyperparameters were used during training:
- Pre-trained language model: microsoft/deberta-v3-large
- Optimizer: Adam
- Learning rate: 0.00001
- Maximum input length: 650 tokens
- Batch size: 4
- Epochs: 12
- Device: NVIDIA RTX A5000
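As a rough sketch, the hyperparameters above can be collected in a plain dictionary whose keys mirror the corresponding `TrainingArguments` parameter names in transformers. The step counts below assume a single device and no gradient accumulation, neither of which is stated in the source, and the training set size is taken from the dataset table. `MAX_INPUT_TOKENS` is a hypothetical constant name; the length limit would be applied at tokenization time rather than through `TrainingArguments`.

```python
import math

# Values from the list above; key names follow TrainingArguments parameters.
config = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 4,
    "num_train_epochs": 12,
}
MAX_INPUT_TOKENS = 650   # hypothetical name for the tokenizer length limit
TRAIN_EXAMPLES = 10500   # training split size from the dataset table

# Assumes one device and no gradient accumulation.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / config["per_device_train_batch_size"])
total_steps = steps_per_epoch * config["num_train_epochs"]

print(steps_per_epoch)  # 2625
print(total_steps)      # 31500
```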
Testing
For the text classification task, accuracy is the evaluation metric.
Accuracy |
---|
0.8502 |
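Accuracy here is the fraction of examples whose predicted label matches the gold label. A minimal sketch of that computation follows; the function name `accuracy` and the toy labels are illustrative and are not the evaluation script or data used for the reported score.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions that equal their reference label."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy labels for illustration only.
preds = ["Support", "Refute", "Support", "Support"]
golds = ["Support", "Refute", "Neutral", "Support"]
print(accuracy(preds, golds))  # 0.75
```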
Other models
AndyChiang/Pre-CoFactv3-Question-Answering