gilramos committed on
Commit 8761ba5 · verified · 1 Parent(s): 89a89a4

Update README.md

Files changed (1)
  1. README.md +13 -10
README.md CHANGED
@@ -25,11 +25,9 @@ HateBERTimbau is a transformer-based encoder model for identifying hate speech i
25
 
26
  ## Model Description
27
 
28
- <!-- Provide a longer summary of what this model is. -->
29
-
30
  - **Developed by:** [kNOwHATE: kNOwing online HATE speech: knowledge + awareness = TacklingHate](https://knowhate.eu)
31
  - **Funded by:** [European Union](https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/topic-details/cerv-2021-equal)
32
- - **Model type:** [More Information Needed]
33
  - **Language:** Portuguese
34
  - **Finetuned from model:** [neuralmind/bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)
35
 
@@ -39,11 +37,7 @@ HateBERTimbau is a transformer-based encoder model for identifying hate speech i
39
 
40
  ## Training Data
41
 
42
- 229,103 tweets associated with offensive content were used
43
-
44
- ## Training Procedure
45
-
46
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
47
 
48
  ## Training Hyperparameters
49
 
@@ -52,13 +46,22 @@ HateBERTimbau is a transformer-based encoder model for identifying hate speech i
52
  - Learning Rate: 5e-5 with Adam optimizer
53
  - Maximum Sequence Length: 512 sentence pieces
54
 
55
- ## Evaluation
56
-
57
  ## Testing Data
58
 
 
 
 
 
 
 
 
 
 
 
59
  ## Results
60
 
61
 
 
62
  ## BibTeX Citation
63
 
64
  [More Information Needed]
 
25
 
26
  ## Model Description
27
 
 
 
28
  - **Developed by:** [kNOwHATE: kNOwing online HATE speech: knowledge + awareness = TacklingHate](https://knowhate.eu)
29
  - **Funded by:** [European Union](https://ec.europa.eu/info/funding-tenders/opportunities/portal/screen/opportunities/topic-details/cerv-2021-equal)
30
+ - **Model type:** Transformer-based text classification model fine-tuned for hate speech detection in Portuguese social media text
31
  - **Language:** Portuguese
32
  - **Finetuned from model:** [neuralmind/bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased)
33
 
 
37
 
38
  ## Training Data
39
 
40
+ 229,103 tweets associated with offensive content were used to retrain the base model
 
 
 
 
41
 
42
  ## Training Hyperparameters
43
 
 
46
  - Learning Rate: 5e-5 with Adam optimizer
47
  - Maximum Sequence Length: 512 sentence pieces
48
 
 
 
49
  ## Testing Data
50
 
51
+ We used two datasets for testing: one of [YouTube comments](https://huggingface.co/datasets/knowhate/youtube-test) and one of [tweets](https://huggingface.co/datasets/knowhate/twitter-test).
52
+
53
+ YouTube Test Set:
54
+ - Total number of comments: 825
55
+ - % Hate Speech: 72.24%
56
+
57
+ Twitter Test Set:
58
+ - Total number of tweets: 805
59
+ - % Hate Speech: 20.62%
60
+
61
  ## Results
62
 
63
 
64
+
65
  ## BibTeX Citation
66
 
67
  [More Information Needed]
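
The Testing Data section added in this commit reports only totals and hate speech percentages. As a rough sanity check, the sketch below recovers the approximate per-class counts those figures imply. The counts are derived by rounding and are not stated in the README; the helper name `implied_count` and the label "other" for the non-hate class are assumptions made for illustration.

```python
def implied_count(total: int, percent: float) -> int:
    """Return the integer count whose share of `total` is closest to `percent`."""
    return round(total * percent / 100)


# Totals and hate speech percentages as reported in the Testing Data section.
test_sets = {
    "YouTube": (825, 72.24),
    "Twitter": (805, 20.62),
}

for name, (total, pct) in test_sets.items():
    hate = implied_count(total, pct)
    # Derived values only: the README does not list these counts.
    print(f"{name}: ~{hate} hate speech, ~{total - hate} other (of {total})")
```

Under these assumptions the two test sets have opposite class balance: hate speech is the majority class in the YouTube set and the minority class in the Twitter set, which is worth keeping in mind when comparing results across them.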