RamiIbrahim
committed on
Commit f4e8439
1 Parent(s): 33efff4
Update app.py
app.py
CHANGED
@@ -7,6 +7,13 @@ model = joblib.load('tunisian_arabiz_sentiment_analysis_model.pkl')
 vectorizer = joblib.load('tfidf_vectorizer.pkl')
 
 def predict_sentiment(text):
+    if not text.strip():
+        return (
+            "No input provided",
+            "N/A",
+            "Please enter some text to get a sentiment prediction."
+        )
+
     text_vectorized = vectorizer.transform([text])
     prediction = model.predict(text_vectorized)[0]
     probabilities = model.predict_proba(text_vectorized)[0]
@@ -54,8 +61,7 @@ iface = gr.Interface(
     ],
     examples=formatted_examples,
     title="Tunisian Arabiz Sentiment Analysis",
-    description=
-    """
+    description="""
     <p>This model predicts the sentiment of Tunisian text as either Positive or Negative. It works with both Tunisian Arabiz and standard Arabic script.</p>
 
     <h4>What is Tunisian Arabiz? / ما هي العربيزية التونسية؟</h4>
@@ -74,21 +80,19 @@ iface = gr.Interface(
     <p>This sentiment analysis model was trained on a combined dataset from TuniziDataset and the Tunisian Dialect Corpus.
     It uses TF-IDF vectorization for feature extraction and Logistic Regression for classification.</p>
 
-
     <p>The model accepts Tunisian Arabiz written with Latin and Arabic script.</p>
 
     <h3>Limitations</h3>
     <p>Due to dataset limitations, neutral sentiment data was removed to achieve maximum performance. </p>
     <p>The model may not perform well on very colloquial expressions or new slang terms not present in the training data.
     Sentiment can be nuanced and context-dependent, which may not always be captured accurately by this model.</p>
-
+    <center>
     <h2>This model is open-source, and contributions of additional datasets are welcome to improve its capabilities.</h2>
 
     <h2>هذا النموذج مفتوح المصدر، ونرحب بمساهمات مجموعات البيانات الإضافية لتحسين قدراته.</h2>
-
-
+    </center>
     """
 )
 
 # Launch the interface
-iface.launch()
+iface.launch()
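The empty-input guard added to predict_sentiment can be tried in isolation. The sketch below is a rough illustration, not the code from app.py: the `classify` parameter and the `fake_classify` stub are hypothetical stand-ins for the pickled vectorizer and model, and the shape of the normal-path return value is an assumption, since the diff does not show the rest of the function. The sample phrase is illustrative only.

```python
def predict_sentiment(text, classify):
    """Return a 3-tuple of strings, mirroring the guard's return shape.

    `classify` is a hypothetical callable standing in for the real
    vectorizer + model pair loaded in app.py.
    """
    # Guard from the commit: empty or whitespace-only input short-circuits
    # before any vectorization or prediction happens.
    if not text.strip():
        return (
            "No input provided",
            "N/A",
            "Please enter some text to get a sentiment prediction.",
        )

    # Assumed normal path: label, formatted confidence, empty note.
    label, confidence = classify(text)
    return (label, f"{confidence:.2%}", "")


def fake_classify(text):
    # Hypothetical stub, not the real TF-IDF + Logistic Regression model.
    return ("Positive", 0.9)


empty_result = predict_sentiment("   ", fake_classify)
normal_result = predict_sentiment("behi barcha", fake_classify)
print(empty_result)
print(normal_result)
```

Note that `text.strip()` raises AttributeError if `text` is None, so the guard relies on the input component always passing a string. If that is not guaranteed, `if not text or not text.strip():` would cover both cases.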