Pre-CoFactv3-Text-Classification

Model description

This is a text classification model for the AAAI 2024 workshop paper “Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning”.

Its input is a claim and its evidence, and its output is the predicted label, which is one of three categories: Support, Neutral, or Refute.

It is fine-tuned on the FACTIFY5WQA dataset, starting from the microsoft/deberta-v3-large model.

For more details, see our paper or GitHub.

How to use?

  1. Download the model with Hugging Face Transformers.
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
tokenizer = AutoTokenizer.from_pretrained("AndyChiang/Pre-CoFactv3-Text-Classification")
  2. Create a pipeline.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
  3. Use the pipeline to predict the label. The claim and the evidence are joined with [SEP].
label = classifier("Micah Richards spent an entire season at Aston Vila without playing a single game. [SEP] Despite speculation that Richards would leave Aston Villa before the transfer deadline for the 2018~19 season , he remained at the club , although he is not being considered for first team selection.")
print(label)
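The example input above joins the claim and the evidence with a literal [SEP] marker, claim first. A minimal sketch of a helper that builds such a string is shown here; the name build_input and its whitespace handling are illustrative assumptions, not part of the released code.

```python
def build_input(claim: str, evidence: str, sep: str = "[SEP]") -> str:
    """Join a claim and its evidence into one string, claim first.

    Hypothetical helper: it only mirrors the "claim [SEP] evidence"
    layout of the usage example in this card.
    """
    # Collapse runs of whitespace so the separator is surrounded by single spaces.
    claim = " ".join(claim.split())
    evidence = " ".join(evidence.split())
    if not claim or not evidence:
        raise ValueError("both claim and evidence must be non-empty")
    return f"{claim} {sep} {evidence}"


text = build_input(
    "Micah Richards spent an entire season at Aston Vila without playing a single game.",
    "Despite speculation that Richards would leave Aston Villa before the transfer "
    "deadline for the 2018~19 season , he remained at the club , although he is not "
    "being considered for first team selection.",
)
print(text)
```

The resulting string can be passed to the classifier created in step 2.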

Dataset

We use the FACTIFY5WQA dataset provided by the AAAI-24 workshop Factify 3.0.

This dataset is designed for fact verification: the task is to determine the veracity of a claim based on the given evidence. Each example has the following fields:

  • claim: the statement to be verified.
  • evidence: the facts to verify the claim.
  • question: the questions generated from the claim by the 5W framework (who, what, when, where, and why).
  • claim_answer: the answers derived from the claim.
  • evidence_answer: the answers derived from the evidence.
  • label: the veracity of the claim based on the given evidence, which is one of three categories: Support, Neutral, or Refute.
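One way to picture a single record with the fields listed above is as a small data class. This is only a sketch: the field types (lists of strings for the 5W entries) and the example values are assumptions made for illustration and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List

# The three label categories named in this card.
LABELS = ("Support", "Neutral", "Refute")


@dataclass
class FactifyExample:
    """Hypothetical container for one FACTIFY5WQA record (types are assumed)."""

    claim: str
    evidence: str
    question: List[str]
    claim_answer: List[str]
    evidence_answer: List[str]
    label: str

    def __post_init__(self) -> None:
        # Reject labels outside the three documented categories.
        if self.label not in LABELS:
            raise ValueError(f"label must be one of {LABELS}, got {self.label!r}")


# Made-up values, for illustration only.
example = FactifyExample(
    claim="The bridge opened in 1990.",
    evidence="The bridge was opened to traffic in 1990.",
    question=["When did the bridge open?"],
    claim_answer=["1990"],
    evidence_answer=["1990"],
    label="Support",
)
print(example.label)
```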
Label     Training  Validation  Testing  Total
Support       3500         750      750   5000
Neutral       3500         750      750   5000
Refute        3500         750      750   5000
Total        10500        2250     2250  15000
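The split sizes can be checked with a few lines of arithmetic. The sketch below recomputes the column totals from the per-label counts in the table and derives the share of each split; the dictionary layout and key names are illustrative choices.

```python
# Per-label counts copied from the table above.
counts = {
    "Support": {"training": 3500, "validation": 750, "testing": 750},
    "Neutral": {"training": 3500, "validation": 750, "testing": 750},
    "Refute": {"training": 3500, "validation": 750, "testing": 750},
}

split_names = ("training", "validation", "testing")

# Column totals: number of examples in each split across all labels.
totals = {name: sum(row[name] for row in counts.values()) for name in split_names}
grand_total = sum(totals.values())

# Fraction of the whole dataset that falls into each split.
fractions = {name: totals[name] / grand_total for name in split_names}

for name in split_names:
    print(f"{name}: {totals[name]} examples ({fractions[name]:.0%})")
```

The classes are balanced, and the data is divided 70% / 15% / 15% across training, validation, and testing.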

Fine-tuning

Fine-tuning is performed with the Hugging Face Trainer API on the text classification task.

Training hyperparameters

The following hyperparameters were used during training:

  • Pre-trained language model: microsoft/deberta-v3-large
  • Optimizer: Adam
  • Learning rate: 0.00001
  • Maximum input length (tokens): 650
  • Batch size: 4
  • Epochs: 12
  • Device: NVIDIA RTX A5000
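To get a feel for the length of this training run, the number of optimizer steps can be estimated from the batch size, the epoch count, and the 10,500 training examples. This is a rough sketch that assumes the batch size of 4 is the effective batch size (one device, no gradient accumulation) and that the last partial batch is kept; the card does not state either point.

```python
import math

# Values taken from this card.
train_examples = 10500
batch_size = 4
epochs = 12

# Assumption: one optimizer step per batch, final partial batch not dropped.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * epochs

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps: {total_steps}")
```

Under these assumptions the run takes 2,625 steps per epoch and 31,500 steps in total.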

Testing

In the case of the Text Classification task, accuracy serves as the evaluation metric.

Accuracy: 0.8502
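Accuracy here means the fraction of examples whose predicted label matches the gold label. A minimal sketch of that computation follows; the function name and the toy labels are illustrative and are not the evaluation script used for the reported score.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the fraction of predictions that equal the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Toy labels, for illustration only: three of the four predictions are correct.
preds = ["Support", "Refute", "Neutral", "Support"]
golds = ["Support", "Refute", "Support", "Support"]
print(accuracy(preds, golds))  # prints 0.75
```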

Other models

AndyChiang/Pre-CoFactv3-Question-Answering

Citation
