metadata
license: apache-2.0
language:
- tr
pipeline_tag: text-classification
tags:
- job advertisement
- turkish bert
- bert-based
- StratifiedKFold
About the model
The model was trained on 15,451 real job advertisements labeled by isinolsun.com.
Classes:
- Uygun İlan (suitable advertisement)
- Is Ilani Degil (not a job advertisement)
- Mustehcen (obscene)
- Cift Pozisyon (double position)
Training setup and results:
- The model is based on Turkish BERT.
- Validation used StratifiedKFold(5).
- Results: [0.806858621805241, 0.8912621359223301, 0.9440129449838188, 0.9750809061488673, 0.9851132686084142]
Mean-Precision: 0.9204655754937342
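As a quick consistency check, the mean above can be reproduced from the five listed results. This is a minimal sketch that assumes the five values are one score per validation fold. The card does not say so explicitly, and the variable names are illustrative.

```python
from statistics import fmean

# The five results listed above, assumed here to be one score per fold.
fold_scores = [
    0.806858621805241,
    0.8912621359223301,
    0.9440129449838188,
    0.9750809061488673,
    0.9851132686084142,
]

# Arithmetic mean of the listed values; this should match Mean-Precision.
mean_score = fmean(fold_scores)

# Gap between the best and worst value, a rough view of fold-to-fold variation.
spread = max(fold_scores) - min(fold_scores)

print(f"mean over {len(fold_scores)} values: {mean_score:.4f}")
print(f"best minus worst: {spread:.4f}")
```

The gap between the lowest and highest value is fairly large, so the mean alone hides a good deal of variation between runs.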
| | Uygun İlan | Is Ilani Degil | Mustehcen | Cift Pozisyon |
|---|---|---|---|---|
| Precision | 0.986 | 0.996 | 0.966 | 0.970 |
| Recall | 0.992 | 0.986 | 0.966 | 0.959 |
| F1 Score | 0.989 | 0.991 | 0.966 | 0.965 |

Accuracy: 0.975
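The F1 scores in the table follow from precision and recall as their harmonic mean. The sketch below recomputes them from the rounded table values. Because the inputs are already rounded to three decimals, the recomputed figures may differ from the reported ones in the last digit. The function and variable names are illustrative and not part of the model's code.

```python
# Per-class metrics copied from the table above.
reported = {
    "Uygun İlan": {"precision": 0.986, "recall": 0.992, "f1": 0.989},
    "Is Ilani Degil": {"precision": 0.996, "recall": 0.986, "f1": 0.991},
    "Mustehcen": {"precision": 0.966, "recall": 0.966, "f1": 0.966},
    "Cift Pozisyon": {"precision": 0.970, "recall": 0.959, "f1": 0.965},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every class from its rounded precision and recall.
recomputed = {
    label: f1_score(m["precision"], m["recall"]) for label, m in reported.items()
}

for label, value in recomputed.items():
    print(f"{label}: recomputed {value:.4f}, reported {reported[label]['f1']:.3f}")
```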
Example
from transformers import AutoTokenizer, TextClassificationPipeline, TFBertForSequenceClassification
# Load the tokenizer and the classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("nanelimon/bert-base-turkish-job-advertisement")
# from_pt=True loads the PyTorch checkpoint into the TensorFlow model class.
model = TFBertForSequenceClassification.from_pretrained("nanelimon/bert-base-turkish-job-advertisement", from_pt=True)
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer)
# Classify a single Turkish sentence.
print(pipe('Bu bir denemedir hadi sende dene!'))
Result:
[{'label': 'Is Ilani Degil', 'score': 0.999987899677558}]
- label: the class the model assigns to the input Turkish text.
- score: the model's confidence that the input text belongs to that label.
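The result is a list with one dictionary per input text, so the label and score can be read directly. The sketch below works on the output shown above rather than calling the model. The `is_suitable` helper and its 0.9 threshold are hypothetical and only illustrate one way to act on a prediction. The exact label strings should be checked against the model's `config.id2label`.

```python
# Output copied from the Result line above: one dict per input text.
result = [{'label': 'Is Ilani Degil', 'score': 0.999987899677558}]

# Class name as spelled in this card; verify against model.config.id2label.
SUITABLE_LABEL = "Uygun İlan"


def is_suitable(prediction: dict, min_score: float = 0.9) -> bool:
    """Hypothetical helper: accept only confident 'Uygun İlan' predictions."""
    return prediction["label"] == SUITABLE_LABEL and prediction["score"] >= min_score


top = result[0]  # a single input text yields a single prediction
print(f"{top['label']} ({top['score']:.2%})")
print("accept ad:", is_suitable(top))
```

A threshold like this trades recall for precision, and a suitable value depends on how costly a wrongly accepted advertisement is.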
Authors
- Seyma SARIGIL: seymasargil@gmail.com
- Murat KOKLU: mkoklu@selcuk.edu.tr
License
apache-2.0
Free Software, Hell Yeah!