metadata

license: apache-2.0
language:
  - tr
pipeline_tag: text-classification
tags:
  - job advertisement
  - turkish bert
  - bert-based
  - StratifiedKFold

About the model

The model was trained on 15,451 real job advertisements labeled by isinolsun.com.

Classes:

Uygun İlan (suitable advertisement)
Is Ilani Degil (not a job advertisement)
Mustehcen (obscene)
Cift Pozisyon (double position)

The results obtained during training are as follows:

The model is based on Turkish BERT.
Validation used StratifiedKFold with 5 splits.
Per-fold results: [0.806858621805241, 0.8912621359223301, 0.9440129449838188, 0.9750809061488673, 0.9851132686084142]
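As a rough illustration of what a 5-fold stratified split does, the sketch below runs scikit-learn's `StratifiedKFold` on a small toy dataset. The texts, the class counts, and the `shuffle` and `random_state` settings are assumptions made for the example; the original training data and split settings are not given in this README.

```python
from collections import Counter

from sklearn.model_selection import StratifiedKFold

# Toy labels (hypothetical counts) using the four classes of this model.
labels = (
    ["Uygun İlan"] * 10
    + ["Is Ilani Degil"] * 5
    + ["Mustehcen"] * 5
    + ["Cift Pozisyon"] * 5
)
# Placeholder texts; only the labels matter for the split.
texts = [f"ad {i}" for i in range(len(labels))]

# shuffle and random_state are assumptions, not the authors' settings.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

fold_counts = []
for fold, (train_idx, val_idx) in enumerate(skf.split(texts, labels)):
    # Count how many samples of each class land in the validation fold.
    counts = Counter(labels[i] for i in val_idx)
    fold_counts.append(counts)
    print(fold, dict(counts))
```

Each validation fold keeps the class proportions of the full toy dataset, which is the property that makes stratified splitting useful when classes are imbalanced.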

Mean-Precision: 0.9204655754937342
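The mean above appears to be the arithmetic mean of the five per-fold results. A minimal check, using only the numbers listed in this README (the variable names are chosen for illustration):

```python
from statistics import fmean

# Per-fold results copied from the list above.
fold_results = [
    0.806858621805241,
    0.8912621359223301,
    0.9440129449838188,
    0.9750809061488673,
    0.9851132686084142,
]

# Arithmetic mean over the five folds.
mean_result = fmean(fold_results)
print(mean_result)
```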

Metric	Uygun İlan	Is Ilani Degil	Mustehcen	Cift Pozisyon
Precision	0.986	0.996	0.966	0.970
Recall	0.992	0.986	0.966	0.959
F1 Score	0.989	0.991	0.966	0.965

Accuracy: 0.975
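The F1 scores in the table can be related to precision and recall through the harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes F1 from the rounded table values; because the inputs are already rounded to three decimals, the recomputed value can differ from the reported one in the last digit. The helper name `f1_score` is chosen for illustration.

```python
# (precision, recall, reported F1) per class, copied from the table above.
reported = {
    "Uygun İlan": (0.986, 0.992, 0.989),
    "Is Ilani Degil": (0.996, 0.986, 0.991),
    "Mustehcen": (0.966, 0.966, 0.966),
    "Cift Pozisyon": (0.970, 0.959, 0.965),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every class from its precision and recall.
recomputed = {
    name: f1_score(precision, recall)
    for name, (precision, recall, _) in reported.items()
}

for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.4f}, reported {reported[name][2]:.3f}")
```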

Example

from transformers import AutoTokenizer, TextClassificationPipeline, TFBertForSequenceClassification

# Load the tokenizer and the model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("nanelimon/bert-base-turkish-job-advertisement")
# from_pt=True loads the PyTorch checkpoint into the TensorFlow model class.
model = TFBertForSequenceClassification.from_pretrained("nanelimon/bert-base-turkish-job-advertisement", from_pt=True)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer)

# Classify a Turkish sentence.
print(pipe('Bu bir denemedir hadi sende dene!'))

Result:

[{'label': 'Is Ilani Degil', 'score': 0.999987899677558}]

label: the class that the model assigns to the submitted Turkish text.
score: the model's confidence, between 0 and 1, that the text belongs to that class.
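A minimal sketch of how the `label` and `score` fields might be consumed downstream. The helper `accept_prediction` and the 0.9 threshold are hypothetical and not part of the model or the transformers library; the input is the result shown above.

```python
# Output copied from the example above.
result = [{'label': 'Is Ilani Degil', 'score': 0.999987899677558}]


def accept_prediction(prediction, threshold=0.9):
    """Return the label if the score reaches the threshold, else None.

    The threshold is a hypothetical value; tune it for your own use case.
    """
    if prediction["score"] >= threshold:
        return prediction["label"]
    return None


for prediction in result:
    print(accept_prediction(prediction))
```

Predictions that fall below the threshold return `None`, so they can be routed to manual review instead of being accepted automatically.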

Authors

Seyma SARIGIL: seymasargil@gmail.com
Murat KOKLU: mkoklu@selcuk.edu.tr

License

apache-2.0

Free Software, Hell Yeah!