---
license: apache-2.0
language:
- en
---
# Toxicity_model

The Toxicity_model is used to differentiate polite from impolite responses.

The model was trained on a dataset composed of toxic and non-toxic responses.

## Details
- Size: 4,689,681 parameters
- Dataset: [Toxic Comment Classification Challenge Dataset](https://github.com/tianqwang/Toxic-Comment-Classification-Challenge)
- Language: English
- Number of Training Steps: 20
- Batch size: 16
- Optimizer: Adam
- Learning Rate: 0.001
- GPU: T4
- This repository links to the [source code](https://github.com/Nkluge-correa/teeny-tiny_castle/blob/master/ML%20Intro%20Course/15_toxicity_detection.ipynb) used to train this model (a rough sketch of the training setup follows this list).

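The full training script lives in the linked notebook. For orientation only, the sketch below shows a minimal Keras setup that uses the optimizer, learning rate, and batch size listed above; the architecture, embedding size, loss, and the `train_texts`/`train_labels` names are illustrative assumptions, not the released model definition.

```
import tensorflow as tf

# Hypothetical architecture for illustration; the real model is defined in the notebook.
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=20000, output_dim=128),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # probability of "not toxic"
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # Adam, lr 0.001 (from the card)
    loss='binary_crossentropy',
    metrics=['accuracy'],
)

# train_texts / train_labels are placeholders for the preprocessed
# Toxic Comment Classification Challenge data (the card reports 20 training steps):
# model.fit(vectorization_layer(train_texts), train_labels, batch_size=16, epochs=...)
```
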
## Usage

⚠️ THE EXAMPLES BELOW CONTAIN TOXIC/OFFENSIVE LANGUAGE ⚠️

```
import tensorflow as tf

# Load the trained classifier
toxicity_model = tf.keras.models.load_model('toxicity_model.keras')

# Load the vocabulary used during training (one token per line)
with open('toxic_vocabulary.txt', encoding='utf-8') as fp:
    vocabulary = [line.strip() for line in fp]

# Rebuild the text-vectorization layer with the same vocabulary and settings
vectorization_layer = tf.keras.layers.TextVectorization(max_tokens=20000,
                                                        output_mode="int",
                                                        output_sequence_length=100,
                                                        vocabulary=vocabulary)

strings = [
    'I think you should shut up your big mouth',
    'I do not agree with you'
]

# The model outputs the probability that a string is *not* toxic
preds = toxicity_model.predict(vectorization_layer(strings), verbose=0)

for i, string in enumerate(strings):
    print(f'{string}\n')
    print(f'Toxic 🤬 {round((1 - preds[i][0]) * 100, 2)}% | Not toxic 😊 {round(preds[i][0] * 100, 2)}\n')
    print("_" * 50)
```

This will output the following:
```
I think you should shut up your big mouth

Toxic 🤬 95.73% | Not toxic 😊 4.27
__________________________________________________
I do not agree with you

Toxic 🤬 0.99% | Not toxic 😊 99.01
__________________________________________________
```
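If you only need a binary decision, the score can be thresholded. The helper below is a small sketch that reuses `toxicity_model` and `vectorization_layer` from the snippet above; the function name `is_toxic` and the 0.5 threshold are illustrative choices, not part of the released model.

```
def is_toxic(text: str, threshold: float = 0.5) -> bool:
    """Return True when the toxicity score crosses the threshold."""
    # preds[0][0] is the "not toxic" probability, as in the example above
    not_toxic_prob = toxicity_model.predict(vectorization_layer([text]), verbose=0)[0][0]
    return (1 - not_toxic_prob) >= threshold

print(is_toxic('I think you should shut up your big mouth'))  # True for the example above
print(is_toxic('I do not agree with you'))                    # False for the example above
```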

## Cite as 🤗
```
@misc{teenytinycastle,
  doi = {10.5281/zenodo.7112065},
  url = {https://huggingface.co/AiresPucrs/toxicity_model},
  author = {Nicholas Kluge Corr{\^e}a},
  title = {Teeny-Tiny Castle},
  year = {2023},
  publisher = {HuggingFace},
  journal = {HuggingFace repository},
}
```

## License
The Toxicity_model is licensed under the Apache License, Version 2.0. See the LICENSE file for more details.