Model Card for POLLCHECK/RoBERTa-classifier

Model Description

This RoBERTa model has been fine-tuned for binary classification: it labels each statement as 0 ("biased/fake") or 1 ("unbiased/real"). The model is based on RoBERTa, a robustly optimized BERT pretraining approach.

Intended Use

Primary Use: This model is intended for classifying textual statements into two categories: biased and unbiased. It is suitable for analyzing news articles, editorials, and opinion pieces.

Users: This model can be used by data scientists, journalists, content moderators, and social media platforms to detect bias in text.

Model Details

Architecture: The model uses the RoBERTa-base architecture.

Training Data: The model was trained on a curated dataset of news articles, editorials, and opinion pieces labeled as biased or unbiased by domain experts.

Performance Metrics: Per-class precision, recall, and F1-score are reported in the Results section.

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "POLLCHECK/RoBERTa-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

texts = [
    "Religious Extremists Threaten Our Way of Life.",
    "Public Health Officials are working."
]
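for text in texts:
    # Tokenize one statement at a time; long inputs are truncated to the model's maximum length
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    # Inference only, so gradient tracking is disabled
    with torch.no_grad():
        outputs = model(**inputs)
    # Convert logits to class probabilities; index 0 is "biased", index 1 is "unbiased"
    probabilities = torch.softmax(outputs.logits, dim=-1)
    predicted_label = "biased" if probabilities[0][0] > 0.5 else "unbiased"
    print(f"Text: {text}\nPredicted label: {predicted_label}")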
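
The decision rule used in the loop above can also be looked at in isolation. The sketch below is a plain-Python illustration, not part of the model's API: it applies a softmax to a pair of logits and maps the result to a label using the class indices from this card (0 = biased, 1 = unbiased). The names `ID2LABEL`, `softmax`, and `label_from_logits`, the `threshold` parameter, and the example logit values are assumptions made for illustration.

```python
import math

# Label indices as described in this model card.
ID2LABEL = {0: "biased", 1: "unbiased"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits, threshold=0.5):
    """Return (label, probability of the "biased" class) for one example.

    Hypothetical helper: `threshold` is the probability that class 0 must
    exceed for a text to be called "biased"; 0.5 mirrors the usage snippet.
    """
    probs = softmax(logits)
    p_biased = probs[0]
    label = ID2LABEL[0] if p_biased > threshold else ID2LABEL[1]
    return label, p_biased


# Made-up logits, for illustration only (not real model outputs).
label, p_biased = label_from_logits([2.0, -1.0])
print(label, round(p_biased, 3))
```

Raising `threshold` makes the rule more conservative about flagging a text as biased. With two classes and a threshold of 0.5, the rule amounts to picking the class with the higher probability.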

Results

The following table presents the evaluation metrics for each class along with macro averages:

Class	Precision	Recall	F1-Score
Biased/fake (0)	0.93	0.96	0.94
Unbiased/real (1)	0.96	0.92	0.94
Macro Avg	0.94	0.94	0.94
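
As a consistency check on the table, F1 is the harmonic mean of precision and recall, and a macro average is the unweighted mean of the per-class scores. The sketch below recomputes both from the rounded values in the table. Because the inputs are already rounded to two decimals, the results should only be expected to agree with the reported figures to within rounding. The variable and function names are illustrative.

```python
# Per-class precision and recall, copied from the table above.
metrics = {
    "biased (0)": {"precision": 0.93, "recall": 0.96},
    "unbiased (1)": {"precision": 0.96, "recall": 0.92},
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean across classes."""
    values = list(values)
    return sum(values) / len(values)


f1_per_class = {
    name: f1_score(m["precision"], m["recall"]) for name, m in metrics.items()
}
macro_precision = macro_average(m["precision"] for m in metrics.values())
macro_recall = macro_average(m["recall"] for m in metrics.values())
macro_f1 = macro_average(f1_per_class.values())

for name, f1 in f1_per_class.items():
    print(f"{name}: F1 = {f1:.3f}")
print(f"macro precision = {macro_precision:.3f}")
print(f"macro recall    = {macro_recall:.3f}")
print(f"macro F1        = {macro_f1:.3f}")
```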