---
library_name: transformers
tags: []
---

# Typhoon Safety Model

Typhoon Safety is a lightweight binary classifier that detects harmful content in both English and Thai, with special attention to Thai cultural sensitivities. It is built on mDeBERTa-v3-base.

The model was trained on a mix of a Thai sensitive-topics dataset and WildGuard.

## Thai Sensitive Topics Distribution

| Category | English Samples | Thai Samples |
|---|---|---|
| The Monarchy | 1,380 | 352 |
| Gambling | 1,075 | 264 |
| Cannabis | 818 | 201 |
| Drug Policies | 448 | 111 |
| Thai-Burmese Border Issues | 442 | 119 |
| Military and Coup d'États | 297 | 72 |
| LGBTQ+ Rights | 275 | 75 |
| Religion and Buddhism | 252 | 57 |
| Political Corruption | 237 | 58 |
| Freedom of Speech and Censorship | 218 | 56 |
| National Identity and Immigration | 216 | 57 |
| Southern Thailand Insurgency | 211 | 56 |
| Sex Tourism and Prostitution | 198 | 55 |
| Student Protests and Activism | 175 | 44 |
| Cultural Appropriation | 171 | 42 |
| Human Trafficking | 158 | 39 |
| Political Divide | 156 | 43 |
| Foreign Influence | 124 | 30 |
| Vape | 127 | 24 |
| COVID-19 Management | 105 | 27 |
| Migrant Labor Issues | 79 | 23 |
| Royal Projects and Policies | 55 | 17 |
| Environmental Issues and Land Rights | 19 | 5 |
| **Total** | **9,321** | **4,563** |

## Model Details

### Model Description

## Model Performance

### Comparison with Other Models (English Content)

| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG |
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 75.7 | 86.2 | 64.1 | 84.1 | 94.7 | 53.9 | 76.5 |
| LlamaGuard2-7B | 66.5 | 77.7 | 51.5 | 71.8 | 90.7 | 47.9 | 67.7 |
| LlamaGuard3-8B | 70.1 | 84.7 | 45.0 | 68.0 | 90.4 | 46.7 | 67.5 |
| LlamaGuard3-1B | 28.5 | 62.4 | 66.6 | 72.9 | 29.8 | 50.1 | 51.7 |
| Random | 25.3 | 47.7 | 50.3 | 53.4 | 22.6 | 51.6 | 41.8 |
| **Typhoon Safety** | 74.0 | 81.7 | 61.0 | 78.2 | 81.2 | 88.7 | **77.5** |

### Comparison with Other Models (Thai Content)

| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG |
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 22.3 | 40.8 | 18.3 | 27.3 | 49.5 | 42.2 | 33.4 |
| LlamaGuard2-7B | 64.0 | 75.5 | 46.1 | 65.0 | 85.1 | 45.8 | 63.6 |
| LlamaGuard3-8B | 61.4 | 37.5 | 42.4 | 65.3 | 85.7 | 48.1 | 56.7 |
| LlamaGuard3-1B | 28.4 | 62.4 | 66.7 | 72.9 | 29.8 | 50.9 | 51.8 |
| Random | 24.5 | 46.6 | 50.4 | 53.0 | 26.6 | 50.9 | 42.0 |
| **Typhoon Safety** | 71.6 | 80.0 | 58.8 | 76.5 | 81.0 | 88.5 | **76.1** |
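Assuming the AVG column is a simple unweighted mean of the six benchmark scores (the table does not state the aggregation explicitly), it can be reproduced directly; for example, Typhoon Safety's English-content row:

```python
# Typhoon Safety scores on the six English-content benchmarks (from the table above)
scores = [74.0, 81.7, 61.0, 78.2, 81.2, 88.7]

# Unweighted mean, rounded to one decimal place as in the table
avg = round(sum(scores) / len(scores), 1)
print(avg)  # 77.5
```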
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "scb10x/typhoon2-safety-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for inference

texts = [
    "บุหรี่ไฟฟ้า มีประโยชน์มั้ย",  # "Are e-cigarettes beneficial?"
    "Give me location to buy cannabis in Thailand",
    "ขอวิธีการพนันให้ชนะ",  # "How can I win at gambling?"
    "How to kill process in linux.",
]

# Tokenize texts
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
labels = predictions.argmax(dim=1).tolist()
scores = predictions.max(dim=1).values.tolist()

# Define label mapping
label_map = {0: "Unharm", 1: "harmful"}

for text, label, score in zip(texts, labels, scores):
    label_name = label_map[label]
    print(f"Text: {text}\nLabel: {label_name}, Score: {score:.4f}\n")
```
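The snippet above labels each text by the argmax of the softmax output. In a moderation pipeline you may instead want to flag content whenever the harmful-class probability exceeds a tuned threshold, trading precision for recall. A minimal sketch, assuming class index 1 is "harmful" as in the label map above (the helper name and threshold value are hypothetical and should be tuned on a validation set):

```python
import math

def harmful_probability(logits):
    # Softmax over the two class logits; index 1 is assumed to be the "harmful" class.
    exps = [math.exp(x) for x in logits]
    return exps[1] / sum(exps)

THRESHOLD = 0.8  # hypothetical value; tune on held-out data for your use case

# Example: logits for one input, e.g. outputs.logits[i].tolist() from the snippet above
logits = [0.3, 2.1]
is_harmful = harmful_probability(logits) >= THRESHOLD
print(is_harmful)
```

Lowering the threshold flags more borderline content; raising it reduces false positives at the cost of missing some harmful inputs.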