Aspect-Based Sentiment Analysis with Turkish 🇹🇷 Data

This model performs Aspect-Based Sentiment Analysis (ABSA) 🚀 for Turkish text. It predicts sentiment polarity (Positive, Neutral, Negative) towards specific aspects within a given sentence.


Model Details

Model Description

This model is fine-tuned from the dbmdz/bert-base-turkish-cased pretrained BERT model. It is trained on the Turkish-ABSA-Wsynthetic dataset, which contains Turkish restaurant reviews annotated with aspect-based sentiments. The model is capable of identifying the sentiment polarity for specific aspects (e.g., "servis," "fiyatlar") mentioned in Turkish sentences.

  • Developed by: Sengil
  • Language(s): Turkish 🇹🇷
  • License: Apache-2.0
  • Finetuned from model: dbmdz/bert-base-turkish-cased
  • Number of Labels: 3 (Negative, Neutral, Positive)


Uses

Direct Use

This model can be used directly for analyzing aspect-specific sentiment in Turkish text, especially in domains like restaurant reviews.

Downstream Use

It can be fine-tuned for similar tasks in different domains (e.g., e-commerce, hotel reviews, or customer feedback analysis).

Out-of-Scope Use

  • Not suitable for tasks other than sentiment analysis or for languages other than Turkish.
  • May not perform well on datasets with significantly different domain-specific vocabulary.

Limitations

  • May struggle with rare or ambiguous aspects not covered in the training data.
  • May exhibit biases present in the training dataset.

How to Get Started with the Model

!pip install -U transformers

Use the code below to get started with the model:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")
model = AutoModelForSequenceClassification.from_pretrained("Sengil/ABSA-Turkish-bert-based-small")

# Example inference
text = "Servis çok yavaştı ama yemekler lezzetliydi."
aspect = "servis"
formatted_text = f"[CLS] {text} [SEP] {aspect} [SEP]"

inputs = tokenizer(formatted_text, return_tensors="pt", padding="max_length", truncation=True, max_length=128)
outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=1).item()

# Map prediction to label
labels = {0: "Negative", 1: "Neutral", 2: "Positive"}
print(f"Sentiment for '{aspect}': {labels[predicted_class]}")
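
The snippet above scores one aspect at a time. When a sentence mentions several aspects, it can be convenient to score them in one batch and to report a probability alongside each label. The following is a minimal sketch under that assumption, not the author's implementation: the helper names (format_input, softmax, decode, predict_aspects) are hypothetical, and the input format and label order are copied from the example above.

```python
import math

# Label order copied from the example above.
LABELS = {0: "Negative", 1: "Neutral", 2: "Positive"}


def format_input(text, aspect):
    """Build the sentence/aspect string in the same format as the example above."""
    return f"[CLS] {text} [SEP] {aspect} [SEP]"


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[idx], probs[idx]


def predict_aspects(text, aspects, model_name="Sengil/ABSA-Turkish-bert-based-small"):
    """Score every aspect of one sentence in a single batch (hypothetical helper)."""
    # Heavy imports are kept local so the helpers above work without them.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    batch = [format_input(text, aspect) for aspect in aspects]
    inputs = tokenizer(batch, return_tensors="pt", padding="max_length",
                       truncation=True, max_length=128)
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits
    return {aspect: decode(row.tolist()) for aspect, row in zip(aspects, logits)}
```

Calling predict_aspects(text, ["servis", "yemekler"]) would return a dictionary mapping each aspect to a (label, probability) pair. The probability is a softmax score and should not be read as a calibrated confidence.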

Training Details

Training Data

The model was fine-tuned on the Turkish-ABSA-Wsynthetic.csv dataset, which contains semi-synthetic Turkish sentences annotated for aspect-based sentiment analysis.

Training Procedure

  • Optimizer: AdamW
  • Learning Rate: 2e-5
  • Batch Size: 16
  • Epochs: 5
  • Max Sequence Length: 128
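
The hyperparameters listed above could be expressed as a training configuration roughly as follows. This is a sketch under assumptions rather than the author's training script: the card does not say whether the Hugging Face Trainer was used, and the output directory name, the helper names, and the single-device, no-gradient-accumulation setup are all hypothetical.

```python
import math

# Values taken from the list above.
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "batch_size": 16,
    "epochs": 5,
    "max_length": 128,
}


def total_optimizer_steps(num_examples, batch_size=16, epochs=5):
    """Optimizer steps for a full run, assuming one device and no gradient accumulation."""
    return math.ceil(num_examples / batch_size) * epochs


def build_training_args(output_dir="absa-turkish-out"):
    """Map the hyperparameters onto TrainingArguments (output_dir is a placeholder)."""
    # Imported lazily so the step calculation above works without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        learning_rate=HYPERPARAMS["learning_rate"],
        per_device_train_batch_size=HYPERPARAMS["batch_size"],
        num_train_epochs=HYPERPARAMS["epochs"],
        optim="adamw_torch",  # AdamW, as listed above
    )
    # Note: max_length is not a TrainingArguments field; it applies at tokenization time.
```

For example, a hypothetical training set of 1,000 examples would give total_optimizer_steps(1000) = 315 steps, that is, 63 batches per epoch over 5 epochs. The actual dataset size is not stated in this card.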

Evaluation

The model achieved the following scores on the test set:

  • Accuracy: 95.48%
  • F1 Score (Weighted): 95.46%
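
For readers who want to reproduce these two metrics on their own predictions, here is a minimal pure-Python sketch of accuracy and support-weighted F1. It illustrates the metric definitions only. The card does not say which evaluation code produced the reported scores, and the function names are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * n / total
    return score


# Toy labels for illustration only (0 = Negative, 1 = Neutral, 2 = Positive).
y_true = [0, 0, 1, 2]
y_pred = [0, 1, 1, 2]
print(f"Accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"F1 (weighted): {weighted_f1(y_true, y_pred):.2f}")
```

Weighted F1 differs from macro F1 in that frequent classes contribute more to the average, which matters when the label distribution is imbalanced.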

Citation

@misc{absa_turkish_bert_based_small,
  title={Aspect-Based Sentiment Analysis for Turkish},
  author={Sengil},
  year={2024},
  url={https://huggingface.co/Sengil/ABSA_Turkish_BERT_Based_Small}
}

Model Card Contact

For any questions or issues, please open an issue in the repository or contact the author on LinkedIn.
