nanelimon/bert-base-turkish-offensive

	---
	license: mit
	datasets:
	- nanelimon/insult-dataset
	language:
	- tr
	pipeline_tag: text-classification
	---

	# About the model
	This model detects offensive content in Turkish text. It assigns each input to one of five categories: INSULT, OTHER, PROFANITY, RACIST, or SEXIST.

	## Model Metrics

	| Metric | INSULT | OTHER | PROFANITY | RACIST | SEXIST |
	| ------ | ------ | ------ | ------ | ------ | ------ |
	| Precision | 0.901 | 0.924 | 0.978 | 1.000 | 0.980 |
	| Recall | 0.920 | 0.980 | 0.900 | 0.980 | 1.000 |
	| F1 Score | 0.910 | 0.951 | 0.937 | 0.989 | 0.990 |

	- F1 Score: 0.9559690799177005
	- Recall: 0.9559999999999998
	- Precision: 0.9570284225256961
	- Accuracy: 0.956
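
The card does not say how the four aggregate figures above were computed. One plausible reading, offered here only as an assumption, is that they are macro averages (unweighted means) of the per-class values. The sketch below recomputes macro precision and recall from the rounded table entries. Recall reproduces 0.956; precision comes out near 0.9566 rather than the reported 0.9570, a small gap that could come from the table being rounded to three decimals.

```python
# Per-class scores copied from the Model Metrics table (rounded to 3 decimals).
per_class = {
    "INSULT":    {"precision": 0.901, "recall": 0.920},
    "OTHER":     {"precision": 0.924, "recall": 0.980},
    "PROFANITY": {"precision": 0.978, "recall": 0.900},
    "RACIST":    {"precision": 1.000, "recall": 0.980},
    "SEXIST":    {"precision": 0.980, "recall": 1.000},
}


def macro_average(metric):
    """Unweighted mean of one metric over all classes (assumed aggregation)."""
    values = [scores[metric] for scores in per_class.values()]
    return sum(values) / len(values)


macro_precision = macro_average("precision")
macro_recall = macro_average("recall")

print(f"macro precision: {macro_precision:.4f}")
print(f"macro recall:    {macro_recall:.4f}")
```

Macro recall equals accuracy only when every class has the same number of test examples, so the matching 0.956 values suggest, but do not prove, a balanced test set.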

	## Training Information
	- Device: macOS 14.5 23F79 arm64 | GPU: Apple M2 Max | Memory: 5840MiB / 32768MiB
	- Training completed in 0:22:54 (hh:mm:ss)
	- Optimizer: AdamW
	- learning_rate: 2e-5
	- eps: 1e-8
	- epochs: 10
	- Batch size: 64
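
To show where `learning_rate` and `eps` enter the optimizer, here is a minimal plain-Python sketch of a single AdamW update on one scalar parameter. It is illustrative only and is not the training code used for this model. The card does not state `beta1`, `beta2`, or `weight_decay`; the values below are assumptions, and `adamw_step` is a hypothetical helper name.

```python
import math


def adamw_step(param, grad, m, v, t, lr=2e-5, eps=1e-8,
               beta1=0.9, beta2=0.999, weight_decay=0.01):
    """One AdamW update for a scalar parameter.

    lr and eps match the card; beta1, beta2 and weight_decay are assumed.
    t is the 1-based step count used for bias correction.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Decoupled weight decay, applied directly to the parameter.
    param = param - lr * weight_decay * param
    # Adam step; eps keeps the division stable when v_hat is close to zero.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step with weight decay switched off to isolate the Adam term.
new_param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1,
                             weight_decay=0.0)
print(1.0 - new_param)
```

On the first step the bias-corrected ratio is close to 1 for any nonzero gradient, so the parameter moves by roughly `learning_rate`, here about 2e-5.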

	## Dependencies
	```sh
	pip install torch torchvision torchaudio
	pip install tf-keras
	pip install transformers
	pip install tensorflow
	```
	## Example
	```python
	from transformers import AutoTokenizer, TFAutoModelForSequenceClassification, TextClassificationPipeline

	# Load the tokenizer and model
	model_name = "nanelimon/bert-base-turkish-offensive"
	tokenizer = AutoTokenizer.from_pretrained(model_name)
	model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

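	# Create the pipeline; top_k=2 returns the two highest-scoring labels per input.
	# The deprecated return_all_scores argument is dropped because top_k replaces it.
	pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, top_k=2)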

	# Test the pipeline
	print(pipe('Bu bir denemedir hadi sende dene!'))

	```
	Result:
	```python
	[[{'label': 'OTHER', 'score': 1.000}, {'label': 'INSULT', 'score': 0.000}]]
	```
	- `label`: the class the model assigns to the input Turkish text.
	- `score`: the model's confidence, between 0 and 1, that the text belongs to that label.
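
As a hedged sketch of how the pipeline output might be consumed, the snippet below takes the nested list shown in the "Result" block above and reduces each inner list to a single label. The `top_label` helper and its `threshold` parameter are hypothetical and not part of the model or of `transformers`; the 0.5 cutoff is an arbitrary assumption.

```python
# Output shape copied from the Result block: one inner list per input text,
# each holding the top-k {'label', 'score'} candidates.
result = [[{'label': 'OTHER', 'score': 1.000}, {'label': 'INSULT', 'score': 0.000}]]


def top_label(candidates, threshold=0.5):
    """Return the highest-scoring label, or None if it scores below threshold.

    Hypothetical helper; the threshold value is an assumption.
    """
    best = max(candidates, key=lambda c: c['score'])
    if best['score'] < threshold:
        return None
    return best['label']


labels = [top_label(candidates) for candidates in result]
print(labels)
```

Returning `None` for low-confidence predictions lets downstream code route uncertain inputs to manual review instead of forcing a class.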

	## Authors
	- Seyma SARIGIL: seymasargil@gmail.com

	## License

	gpl-3.0

	Free Software, Hell Yeah!