petraknovak committed on
Commit
6d6f8f2
1 Parent(s): ffe5433

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -11,6 +11,9 @@ license: mit

  A monolingual model for hate speech classification of social media content in English language. The model was trained on 103190 YouTube comments and tested on an independent test set of 20554 YouTube comments. It is based on English BERT base pre-trained language model.

+ ## Details on data acquisition and labeling including the Annotation guidelines
+ http://imsypp.ijs.si/wp-content/uploads/2021/12/IMSyPP_D2.2_multilingual-dataset.pdf
+
  ## Tokenizer

  During training the text was preprocessed using the original English BERT base tokenizer. We suggest the same tokenizer is used for inference.
@@ -21,4 +24,5 @@ The model classifies each input into one of four distinct classes:
  * 0 - acceptable
  * 1 - inappropriate
  * 2 - offensive
- * 3 - violent
+ * 3 - violent
+
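
Reviewer note: the four class ids listed in the README can be turned into readable labels with a small lookup. The following is a minimal sketch, not code from the repository. It assumes the model returns one score per class in index order (0 to 3); the helper name `label_from_scores` and the example scores are hypothetical.

```python
# Class ids and names exactly as listed in the README.
ID2LABEL = {
    0: "acceptable",
    1: "inappropriate",
    2: "offensive",
    3: "violent",
}


def label_from_scores(scores):
    """Return (class_id, label) for the highest of the four per-class scores.

    Assumption: `scores` holds one number per class, ordered by class id.
    """
    if len(scores) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} scores, got {len(scores)}")
    # Index of the largest score; ties resolve to the lowest class id.
    class_id = max(range(len(scores)), key=lambda i: scores[i])
    return class_id, ID2LABEL[class_id]


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(label_from_scores([0.1, 0.2, 2.5, -1.0]))  # (2, 'offensive')
```

In practice the scores would come from the classifier's output for a comment tokenized with the English BERT base tokenizer, as the README recommends.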