knowhate/HateBERTimbau

Not-For-All-Audiences

gilramos committed on May 13, 2024

Commit fa1a732 · verified · 1 Parent(s): 1ec1f57

Update README.md

Files changed (1)

README.md +18 -6

@@ -31,22 +31,32 @@ HateBERTimbau is a transformer-based encoder model for identifying hate speech i
 - **Language:** Portuguese
 - **Finetuned from model:** [neuralmind/bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)
 ## Uses
 [More Information Needed]
-## Training Data
 229,103 tweets associated with offensive content were used to retrain the base model.
-## Training Hyperparameters
 - Batch Size: 4 samples
 - Epochs: 100
 - Learning Rate: 5e-5 with Adam optimizer
 - Maximum Sequence Length: 512 sentence pieces
-## Testing Data
 We used two different datasets for testing, one for YouTube comments [here](https://huggingface.co/datasets/knowhate/youtube-test) and another for Tweets [here](https://huggingface.co/datasets/knowhate/twitter-test).
@@ -58,13 +68,15 @@ Twitter Test Set:
 - Total nº of tweets: 805
 - % Hate Speech: 20.62%
-## Results
 | Dataset         | Precision  | Recall    | F1-score     |
-|:-----------------|:----------- |:-----------|:--------------|
 | **YouTube**     | 0.928      | 0.108     | **0.193**    |
 | **Twitter**     | 0.686      | 0.211     | **0.323**    |
 ## BibTeX Citation
 ``` latex
@@ -82,7 +94,7 @@ copyright = {embargoed-access},
 }
 ```
 ## Acknowledgements

 - **Language:** Portuguese
 - **Finetuned from model:** [neuralmind/bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)
+<br>
 ## Uses
 [More Information Needed]
+<br>
+## Training
+### Data
 229,103 tweets associated with offensive content were used to retrain the base model.
+### Training Hyperparameters
 - Batch Size: 4 samples
 - Epochs: 100
 - Learning Rate: 5e-5 with Adam optimizer
 - Maximum Sequence Length: 512 sentence pieces
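
Reviewer note: to get a feel for the scale these hyperparameters imply, the sketch below derives the number of optimizer steps from the figures listed in the card. `TrainingConfig` is a hypothetical container, not part of the model's code, and the step counts assume a single device, no gradient accumulation, and that the final partial batch of each epoch is kept. None of these assumptions is stated in the README.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values listed in the model card."""
    num_examples: int = 229_103   # tweets used to retrain the base model
    batch_size: int = 4           # samples per batch
    epochs: int = 100
    learning_rate: float = 5e-5   # used with the Adam optimizer
    max_seq_length: int = 512     # sentence pieces

    @property
    def steps_per_epoch(self) -> int:
        # Assumes the last, partially filled batch is not dropped.
        return math.ceil(self.num_examples / self.batch_size)

    @property
    def total_steps(self) -> int:
        # Assumes one optimizer step per batch (no gradient accumulation).
        return self.steps_per_epoch * self.epochs


config = TrainingConfig()
print(f"steps per epoch: {config.steps_per_epoch}")
print(f"total steps:     {config.total_steps}")
```

Under those assumptions a batch size of 4 over 100 epochs gives several million optimizer steps, which may be worth confirming with the authors.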
+<br>
+## Testing
+### Data
 We used two different datasets for testing, one for YouTube comments [here](https://huggingface.co/datasets/knowhate/youtube-test) and another for Tweets [here](https://huggingface.co/datasets/knowhate/twitter-test).
 - Total nº of tweets: 805
 - % Hate Speech: 20.62%
+### Results
 | Dataset         | Precision  | Recall    | F1-score     |
+|:----------------|:-----------|:----------|:-------------|
 | **YouTube**     | 0.928      | 0.108     | **0.193**    |
 | **Twitter**     | 0.686      | 0.211     | **0.323**    |
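
Reviewer note: the F1 column can be checked against the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes it from the reported values. The helper name `f1_score` is defined here for illustration and is not taken from the model's evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall exactly as reported in the Results table.
reported = {
    "YouTube": (0.928, 0.108),
    "Twitter": (0.686, 0.211),
}

# Round to three decimals to match the precision used in the table.
f1_by_dataset = {
    name: round(f1_score(p, r), 3) for name, (p, r) in reported.items()
}

for name, f1 in f1_by_dataset.items():
    print(f"{name}: F1 = {f1:.3f}")
```

Both recomputed values agree with the table (0.193 and 0.323), so the low F1 scores follow directly from the low recall rather than from a transcription error.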
+<br>
 ## BibTeX Citation
 ``` latex
 }
 ```
+<br>
 ## Acknowledgements