---
license: apache-2.0
---

# political-bias-prediction-allsides-mBERT

Base model: https://huggingface.co/google-bert/bert-base-multilingual-uncased

Dataset: https://github.com/ramybaly/Article-Bias-Prediction

## Training parameters

- batch_size: 100
- epochs: 5
- dropout: 0.05
- max_length: 512
- learning_rate: 3e-5
- warmup_steps: 100
- random_state: 239

## Training methodology

- Sanitize the dataset following a specific rule set, using the random split provided with the dataset.
- Train on the train split and evaluate on the validation split after each epoch.
- Evaluate the test split only with the model that achieved the best validation loss.

## Result summary

- Across the five training epochs, the model from the second epoch achieved the lowest validation loss of 0.3003.
- On the test split, the second-epoch model achieved an F1 score of 0.8842.

## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("premsa/political-bias-prediction-allsides-mBERT")
tokenizer = AutoTokenizer.from_pretrained("premsa/political-bias-prediction-allsides-mBERT")
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
# German example input: "the masses are controlled by the media."
print(nlp("die massen werden von den medien kontrolliert."))
```
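The checkpoint-selection rule from the training methodology (evaluate on the validation split each epoch, run the test split only on the lowest-validation-loss checkpoint) can be sketched in plain Python. The per-epoch losses below are illustrative placeholders, not the actual training logs; only the second epoch's 0.3003 is reported above.

```python
def select_best_epoch(val_losses):
    """Return (best_epoch, best_loss) for a list of per-epoch
    validation losses; epochs are 1-indexed."""
    best_index = min(range(len(val_losses)), key=val_losses.__getitem__)
    return best_index + 1, val_losses[best_index]

# Illustrative losses for 5 epochs; epoch 2 matches the reported 0.3003.
val_losses = [0.35, 0.3003, 0.31, 0.33, 0.36]
best_epoch, best_loss = select_best_epoch(val_losses)
print(best_epoch, best_loss)  # -> 2 0.3003
# Only this checkpoint would then be evaluated on the test split.
```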