	---
	license: apache-2.0
	language:
	- en
	pipeline_tag: text-classification
	---

	# DeTexD-RoBERTa-base delicate text detection

	This is a baseline RoBERTa-base model for the delicate text detection task.

	* Paper: [DeTexD: A Benchmark Dataset for Delicate Text Detection](TODO)
	* [GitHub repository](https://github.com/grammarly/detexd)

	## Classification example code

	The following example uses the `transformers` and `torch` libraries to turn the model's multiclass output into a binary delicate / not delicate prediction:

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	tokenizer = AutoTokenizer.from_pretrained("grammarly/detexd-roberta-base")
	model = AutoModelForSequenceClassification.from_pretrained("grammarly/detexd-roberta-base")
	model.eval()

	def predict_binary_score(text: str, break_class_ix=3):
	    with torch.no_grad():
	        # get multiclass probability scores
	        logits = model(**tokenizer(text, return_tensors='pt'))[0]
	        probs = torch.nn.functional.softmax(logits, dim=-1)

	    # convert to a binary prediction by summing the probability scores
	    # for the higher-index classes, as defined by break_class_ix
	    bin_score = probs[..., break_class_ix:].sum(dim=-1)

	    return bin_score.item()

	def predict_delicate(text: str, threshold=0.72496545):
	    return predict_binary_score(text) > threshold

	print(predict_delicate("Time flies like an arrow. Fruit flies like a banana."))
	```

	Expected output:

	```
	False
	```
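
The score conversion used by `predict_binary_score` can be tried without downloading the model. The sketch below is illustrative only: it reimplements the softmax and the sum over classes at or above `break_class_ix` in plain Python, and the two six-element logit vectors are made-up values, not real model outputs. The number of classes is an assumption here; only `break_class_ix=3` and the threshold come from the example above.

```python
import math

# Threshold and class split copied from the example above.
THRESHOLD = 0.72496545
BREAK_CLASS_IX = 3

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def binary_score(logits, break_class_ix=BREAK_CLASS_IX):
    """Sum the probability mass of classes with index >= break_class_ix."""
    probs = softmax(logits)
    return sum(probs[break_class_ix:])

# Hypothetical logits (assumed six classes), chosen only for illustration.
low_risk_logits = [4.0, 2.0, 1.0, 0.0, -1.0, -2.0]   # mass on low-index classes
high_risk_logits = [-2.0, -1.0, 0.0, 1.0, 2.0, 4.0]  # mass on high-index classes

for name, logits in [("low", low_risk_logits), ("high", high_risk_logits)]:
    score = binary_score(logits)
    print(name, round(score, 4), score > THRESHOLD)
```

With these made-up inputs, the first vector scores below the threshold and the second scores above it, which mirrors how `predict_delicate` decides between `False` and `True`.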

	## BibTeX entry and citation info

	Please cite [our paper](TODO) if you use this model.

	```bibtex
	TODO
	```