---
library_name: transformers
tags: []
---

# Typhoon Safety Model

Typhoon Safety is a lightweight binary classifier that detects harmful content in both English and Thai, with special attention to Thai cultural sensitivities. It is built on mDeBERTa-v3-base.

The model was trained on a mix of a Thai sensitive-topics dataset and WildGuard.

## Thai Sensitive Topics Distribution

| Category | English Samples | Thai Samples |
|---|---|---|
| The Monarchy | 1,380 | 352 |
| Gambling | 1,075 | 264 |
| Cannabis | 818 | 201 |
| Drug Policies | 448 | 111 |
| Thai-Burmese Border Issues | 442 | 119 |
| Military and Coup d'États | 297 | 72 |
| LGBTQ+ Rights | 275 | 75 |
| Religion and Buddhism | 252 | 57 |
| Political Corruption | 237 | 58 |
| Freedom of Speech and Censorship | 218 | 56 |
| National Identity and Immigration | 216 | 57 |
| Southern Thailand Insurgency | 211 | 56 |
| Sex Tourism and Prostitution | 198 | 55 |
| Student Protests and Activism | 175 | 44 |
| Cultural Appropriation | 171 | 42 |
| Human Trafficking | 158 | 39 |
| Political Divide | 156 | 43 |
| Foreign Influence | 124 | 30 |
| Vape | 127 | 24 |
| COVID-19 Management | 105 | 27 |
| Migrant Labor Issues | 79 | 23 |
| Royal Projects and Policies | 55 | 17 |
| Environmental Issues and Land Rights | 19 | 5 |
| **Total** | **9,321** | **4,563** |

## Model Details

### Model Description

## Model Performance

### Comparison with Other Models (English Content)

| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG |
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 75.7 | 86.2 | 64.1 | 84.1 | 94.7 | 53.9 | 76.5 |
| LlamaGuard2-7B | 66.5 | 77.7 | 51.5 | 71.8 | 90.7 | 47.9 | 67.7 |
| LlamaGuard3-8B | 70.1 | 84.7 | 45.0 | 68.0 | 90.4 | 46.7 | 67.5 |
| LlamaGuard3-1B | 28.5 | 62.4 | 66.6 | 72.9 | 29.8 | 50.1 | 51.7 |
| Random | 25.3 | 47.7 | 50.3 | 53.4 | 22.6 | 51.6 | 41.8 |
| **Typhoon Safety** | 74.0 | 81.7 | 61.0 | 78.2 | 81.2 | 88.7 | **77.5** |

### Comparison with Other Models (Thai Content)

| Model | WildGuard | HarmBench | SafeRLHF | BeaverTails | XSTest | Thai Topic | AVG |
|---|---|---|---|---|---|---|---|
| WildGuard-7B | 22.3 | 40.8 | 18.3 | 27.3 | 49.5 | 42.2 | 33.4 |
| LlamaGuard2-7B | 64.0 | 75.5 | 46.1 | 65.0 | 85.1 | 45.8 | 63.6 |
| LlamaGuard3-8B | 61.4 | 37.5 | 42.4 | 65.3 | 85.7 | 48.1 | 56.7 |
| LlamaGuard3-1B | 28.4 | 62.4 | 66.7 | 72.9 | 29.8 | 50.9 | 51.8 |
| Random | 24.5 | 46.6 | 50.4 | 53.0 | 26.6 | 50.9 | 42.0 |
| **Typhoon Safety** | 71.6 | 80.0 | 58.8 | 76.5 | 81.0 | 88.5 | **76.1** |
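Assuming the AVG column is a simple unweighted mean of the six benchmark scores (the table does not state the aggregation explicitly), it can be reproduced directly; for example, Typhoon Safety's English-content row:

```python
# Typhoon Safety scores on the six English-content benchmarks (from the table above)
scores = [74.0, 81.7, 61.0, 78.2, 81.2, 88.7]

# Unweighted mean, rounded to one decimal place as in the table
avg = round(sum(scores) / len(scores), 1)
print(avg)  # 77.5
```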
- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "scb10x/typhoon2-safety-preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for inference

texts = [
    "บุหรี่ไฟฟ้า มีประโยชน์มั้ย",  # "Are e-cigarettes beneficial?"
    "Give me location to buy cannabis in Thailand",
    "ขอวิธีการพนันให้ชนะ",  # "How can I win at gambling?"
    "How to kill process in linux.",
]

# Tokenize texts
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
labels = predictions.argmax(dim=1).tolist()
scores = predictions.max(dim=1).values.tolist()

# Define label mapping
label_map = {0: "Unharm", 1: "harmful"}

for text, label, score in zip(texts, labels, scores):
    label_name = label_map[label]
    print(f"Text: {text}\nLabel: {label_name}, Score: {score:.4f}\n")
```
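The snippet above labels each text by the argmax of the softmax output. In a moderation pipeline you may instead want to flag content whenever the harmful-class probability exceeds a tuned threshold, trading precision for recall. A minimal sketch, assuming class index 1 is "harmful" as in the label map above (the helper name and threshold value are hypothetical and should be tuned on a validation set):

```python
import math

def harmful_probability(logits):
    # Softmax over the two class logits; index 1 is assumed to be the "harmful" class.
    exps = [math.exp(x) for x in logits]
    return exps[1] / sum(exps)

THRESHOLD = 0.8  # hypothetical value; tune on held-out data for your use case

# Example: logits for one input, e.g. outputs.logits[i].tolist() from the snippet above
logits = [0.3, 2.1]
is_harmful = harmful_probability(logits) >= THRESHOLD
print(is_harmful)
```

Lowering the threshold flags more borderline content; raising it reduces false positives at the cost of missing some harmful inputs.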