---
license: apache-2.0
---

model base: https://huggingface.co/google-bert/bert-base-uncased

dataset: https://github.com/ramybaly/Article-Bias-Prediction

training parameters:
- batch_size: 100
- epochs: 5
- dropout: 0.05
- max_length: 512
- learning_rate: 3e-5
- warmup_steps: 100
- random_state: 239

training methodology:
- sanitize the dataset following a specific rule-set and use the random split provided with the dataset
- train on the train split and evaluate on the validation split after each epoch
- evaluate the test split only with the model that achieved the lowest validation loss

result summary:
- across the five training epochs, the second-epoch model achieved the lowest validation loss (0.3314)
- on the test split, the second-epoch model achieved an f1 score of 0.9041

usage:

```
from transformers import pipeline, AutoModelForSequenceClassification, AutoTokenizer


def main(repository: str):
    # load the fine-tuned classifier and its tokenizer from the hub
    model = AutoModelForSequenceClassification.from_pretrained(repository)
    tokenizer = AutoTokenizer.from_pretrained(repository)
    # wrap both in a text-classification pipeline and classify one sentence
    nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
    print(nlp("the masses are controlled by media."))


if __name__ == "__main__":
    main(repository="premsa/political-bias-prediction-allsides-BERT")
```
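
The checkpoint selection rule from the training methodology (pick the epoch with the lowest validation loss, then evaluate the test split only for that epoch) can be sketched roughly as below. This is a minimal illustration, not the original training code: `select_best_epoch`, `evaluate_best_on_test`, and the `evaluate_on_test` callback are hypothetical names, and the losses in the demo are made-up placeholders rather than values from the actual training log.

```python
from typing import Callable, Sequence, Tuple


def select_best_epoch(val_losses: Sequence[float]) -> int:
    """Return the 1-based epoch index with the lowest validation loss."""
    if not val_losses:
        raise ValueError("val_losses must contain at least one value")
    # min() over indices keeps the earliest epoch when two losses tie
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


def evaluate_best_on_test(
    val_losses: Sequence[float],
    evaluate_on_test: Callable[[int], float],
) -> Tuple[int, float]:
    """Run the test evaluation once, only for the best validation epoch.

    `evaluate_on_test` is a hypothetical callback: it is assumed to load the
    checkpoint saved after the given epoch and return its test metric (e.g. F1).
    """
    best_epoch = select_best_epoch(val_losses)
    return best_epoch, evaluate_on_test(best_epoch)


if __name__ == "__main__":
    # placeholder losses for illustration only, not the real training log
    example_losses = [0.50, 0.40, 0.45, 0.60, 0.70]
    print(select_best_epoch(example_losses))
```

Keeping the test split out of the per-epoch loop means the reported test metric is not used for model selection, which is the point of the third methodology bullet.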