Commit d0f56a6 by TeetouchQQ (1 parent: 423eeb1)

Update README.md

  1. README.md +84 -34
README.md CHANGED
@@ -11,34 +11,86 @@ Typhoon Safety is a lightweight binary classifier designed to detect harmful con
 
  Train on mixed of Thai Sensitive topic dataset and Wildguard.
 
- ### Thai Sensitive Topics Distribution
- | Category | English Samples | Thai Samples |
- |----------|----------------|--------------|
- | The Monarchy | 1,380 | 352 |
- | Gambling | 1,075 | 264 |
- | Cannabis | 818 | 201 |
- | Drug Policies | 448 | 111 |
- | Thai-Burmese Border Issues | 442 | 119 |
- | Military and Coup d'États | 297 | 72 |
- | LGBTQ+ Rights | 275 | 75 |
- | Religion and Buddhism | 252 | 57 |
- | Political Corruption | 237 | 58 |
- | Freedom of Speech and Censorship | 218 | 56 |
- | National Identity and Immigration | 216 | 57 |
- | Southern Thailand Insurgency | 211 | 56 |
- | Sex Tourism and Prostitution | 198 | 55 |
- | Student Protests and Activism | 175 | 44 |
- | Cultural Appropriation | 171 | 42 |
- | Human Trafficking | 158 | 39 |
- | Political Divide | 156 | 43 |
- | Foreign Influence | 124 | 30 |
- | Vape | 127 | 24 |
- | COVID-19 Management | 105 | 27 |
- | Migrant Labor Issues | 79 | 23 |
- | Royal Projects and Policies | 55 | 17 |
- | Environmental Issues and Land Rights | 19 | 5 |
- | **Total** | **9,321** | **4,563** |
-
 
  ## Model Details
 
@@ -68,12 +120,10 @@ Train on mixed of Thai Sensitive topic dataset and Wildguard.
 
 
  - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
 
 
  ## How to Get Started with the Model
 
 
  Train on mixed of Thai Sensitive topic dataset and Wildguard.
 
+ This model is trained to predict safety labels for the categories below.
+
+ <div class="section-header">Thai Sensitive Topics</div>
+ <table align="center">
+ <tr>
+ <th colspan="3">Category</th>
+ </tr>
+ <tr>
+ <td>The Monarchy</td>
+ <td>Student Protests and Activism</td>
+ <td>Drug Policies</td>
+ </tr>
+ <tr>
+ <td>Gambling</td>
+ <td>Cultural Appropriation</td>
+ <td>Thai-Burmese Border Issues</td>
+ </tr>
+ <tr>
+ <td>Cannabis</td>
+ <td>Human Trafficking</td>
+ <td>Military and Coup d'États</td>
+ </tr>
+ <tr>
+ <td>LGBTQ+ Rights</td>
+ <td>Political Divide</td>
+ <td>Religion and Buddhism</td>
+ </tr>
+ <tr>
+ <td>Political Corruption</td>
+ <td>Foreign Influence</td>
+ <td>National Identity and Immigration</td>
+ </tr>
+ <tr>
+ <td>Freedom of Speech and Censorship</td>
+ <td>Vape</td>
+ <td>Southern Thailand Insurgency</td>
+ </tr>
+ <tr>
+ <td>Sex Tourism and Prostitution</td>
+ <td>COVID-19 Management</td>
+ <td>Royal Projects and Policies</td>
+ </tr>
+ <tr>
+ <td>Migrant Labor Issues</td>
+ <td>Environmental Issues and Land Rights</td>
+ <td></td>
+ </tr>
+ </table>
+
+ <div class="section-header">Wildguard Topics</div>
+ <table>
+ <tr>
+ <th colspan="3">Category</th>
+ </tr>
+ <tr>
+ <td>Others</td>
+ <td>Sensitive Information Organization</td>
+ <td>Mental Health Over-reliance Crisis</td>
+ </tr>
+ <tr>
+ <td>Social Stereotypes & Discrimination</td>
+ <td>Defamation & Unethical Actions</td>
+ <td>Cyberattack</td>
+ </tr>
+ <tr>
+ <td>Disseminating False Information</td>
+ <td>Private Information Individual</td>
+ <td>Copyright Violations</td>
+ </tr>
+ <tr>
+ <td>Toxic Language & Hate Speech</td>
+ <td>Fraud Assisting Illegal Activities</td>
+ <td>Causing Material Harm by Misinformation</td>
+ </tr>
+ <tr>
+ <td>Violence and Physical Harm</td>
+ <td>Sexual Content</td>
+ <td></td>
+ </tr>
+ </table>
 
  ## Model Details
 
 
 
 
  - **Developed by:** [More Information Needed]
+ - **Model type:** Transformer Encoder
+ - **Language(s) (NLP):** Thai 🇹🇭 and English 🇬🇧
+ - **License:** MIT
+ - **Finetuned from model [optional]:** mDeBERTa v3 base https://huggingface.co/microsoft/mdeberta-v3-base
 
 
  ## How to Get Started with the Model