
This model detects hate speech in Indonesian text. It is fine-tuned from the indobenchmark/indobert-base-p2 model (IndoBERT, released through Gojek's IndoNLU project).

Usage
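The snippet below assumes torch and transformers are installed, and that text holds the input string to classify. BertForMultiLabelClassification is not part of transformers; the import shown for it is an assumption and should point to the multi-label classification head from the IndoNLU codebase (or your own copy of it).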

import torch
import torch.nn.functional as F
from transformers import AutoConfig, BertTokenizer
# Custom multi-label head (see note above); adjust the import path to match
# where the IndoNLU BertForMultiLabelClassification class lives in your setup
from modules.multi_label_classification import BertForMultiLabelClassification

# One binary (true/false) head per hate-speech domain
HS_DOMAIN = ['hs', 'abusive', 'hs_individual', 'hs_group', 'hs_religion', 'hs_race', 'hs_physical', 'hs_gender', 'hs_other', 'hs_weak', 'hs_moderate', 'hs_strong']
LABEL2INDEX = {'false': 0, 'true': 1}
INDEX2LABEL = {0: 'false', 1: 'true'}

# Load tokenizer and config from the base IndoBERT model
tokenizer_model_id = "indobenchmark/indobert-base-p2"
tokenizer = BertTokenizer.from_pretrained(tokenizer_model_id)
config = AutoConfig.from_pretrained(tokenizer_model_id)
config.num_labels_list = [2] * len(HS_DOMAIN)  # one binary head per domain

# Load the fine-tuned hate-speech model
model_id = "keelezibel/id-hatespeech"
model = BertForMultiLabelClassification.from_pretrained(model_id, config=config)
model.eval()

# Encode the input text (here `text` is the string to classify)
subwords = tokenizer.encode(text)
subwords = torch.LongTensor(subwords).view(1, -1).to(model.device)

# The model returns one logit tensor per domain; take the argmax of each
with torch.no_grad():
    logits = model(subwords)[0]
labels = [torch.topk(logit, k=1, dim=-1)[-1].squeeze().item() for logit in logits]

# Map each head's prediction back to its domain name, together with the
# softmax probability (in percent) of the predicted label
res = dict()
for idx, label in enumerate(labels):
    pred = INDEX2LABEL[label]
    proba = float(F.softmax(logits[idx], dim=-1).squeeze()[label] * 100)
    res[HS_DOMAIN[idx]] = (pred, round(proba, 2))
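
To reproduce the JSON shown under Output, the result dict can be serialized directly; a minimal sketch using the standard-library json module (tuples serialize as JSON arrays):

import json

print(json.dumps(res, indent=4))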

Output
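
Each key is one of the domains from HS_DOMAIN, mapped to the predicted label ('true' or 'false') and the softmax confidence (in percent) of that prediction: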

{
    "hs": [
        "true",
        99.94
    ],
    "abusive": [
        "true",
        86.8
    ],
    "hs_individual": [
        "false",
        99.97
    ],
    "hs_group": [
        "true",
        99.96
    ],
    "hs_religion": [
        "false",
        99.86
    ],
    "hs_race": [
        "false",
        99.98
    ],
    "hs_physical": [
        "false",
        99.99
    ],
    "hs_gender": [
        "false",
        99.95
    ],
    "hs_other": [
        "true",
        99.7
    ],
    "hs_weak": [
        "false",
        99.98
    ],
    "hs_moderate": [
        "true",
        99.8
    ],
    "hs_strong": [
        "false",
        99.94
    ]
}