metadata

license: mit
datasets:
  - FpOliveira/TuPi-Portuguese-Hate-Speech-Dataset-Binary
language:
  - pt
metrics:
  - accuracy
  - precision
  - recall
  - f1
pipeline_tag: text-classification

Introduction

Tupi-BERT-Base is a fine-tuned BERT model for binary classification of hate speech in Portuguese. It is derived from BERTimbau base and belongs to the TuPi family of models dedicated to hate speech detection. For more details on the underlying model, please refer to the BERTimbau repository.

The performance of language models can vary considerably when the domain shifts between training and test data. To build a Portuguese language model specialized for hate speech classification, the original BERTimbau model was fine-tuned with a single "PreTraining" epoch on the TuPi Hate Speech Dataset, which was collected from several social networks.

Available models

Model	Arch.	#Layers	#Params
`FpOliveira/tupi-bert-base-portuguese-cased`	BERT-Base	12	110M
`FpOliveira/tupi-bert-large-portuguese-cased`	BERT-Large	24	335M
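
A minimal sketch of how one of the checkpoints listed above might be loaded for binary classification with the Hugging Face `transformers` library. It assumes that the checkpoint ships a sequence-classification head (as the `text-classification` pipeline tag suggests) and that `torch` and `transformers` are installed. The helper names `softmax` and `classify` are illustrative and not part of the model. The label names are read from the checkpoint's own configuration, since this card does not document them.

```python
import math

# Model ID taken from the table above; swap in another listed checkpoint if needed.
MODEL_ID = "FpOliveira/tupi-bert-base-portuguese-cased"


def softmax(logits):
    """Turn a list of raw logits into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(texts, model_id=MODEL_ID):
    """Return the top label and its probability for each input text.

    Assumption: the checkpoint loads with AutoModelForSequenceClassification.
    """
    # Imported here so that softmax() stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Pad to the longest text in the batch and truncate to the model's maximum length.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    results = []
    for row in logits.tolist():
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        # id2label comes from the checkpoint config; its names are not documented here.
        results.append({"label": model.config.id2label[best], "score": probs[best]})
    return results


# Example call (downloads the checkpoint on first use):
# print(classify(["Bom dia, tudo bem?"]))
```

Reading the label names from `model.config.id2label` avoids hard-coding which class index corresponds to hate speech, which this card does not state.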