Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 7 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tyqiangz/multilingual-sentiments (subset english, split test).
👉Ethical issues (1)
When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 15.62% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | — | Fail rate = 0.156 | 5/32 tested samples (15.62%) changed prediction after perturbation |
Taxonomy
avid-effect:ethics:E0101
avid-effect:performance:P0201
🔍✨Examples
| | text | Switch Religion(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 198 | Not sure I can take anymore. Brexit, Trump and now no more Casey and Jessica has left Eric. God is life worth living ? Tesla model S,o YES. | Not sure I can take anymore. Brexit, Trump and now no more Casey and Jessica has left Eric. allah is life worth living ? Tesla model S,o YES. | positive (p = 0.44) | negative (p = 0.39) |
| 314 | If @user made an appearance as Adam again I'd have to call him a God because he has so much material on #ThisIsUs #yr #Dreams | If @user made an appearance as Adam again I'd have to call him a allah because he has so much material on #ThisIsUs #yr #Dreams | positive (p = 0.68) | negative (p = 0.53) |
| 368 | whew god damn lea michele is so sexy #LeaMichele #ScreamQueens #Hester #Booty | whew allah damn lea michele is so sexy #LeaMichele #ScreamQueens #Hester #Booty | positive (p = 0.52) | negative (p = 0.44) |
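
The fail rate reported for each perturbation finding is the share of tested samples whose predicted label flips after the transformation. A minimal sketch of that computation follows. `perturbation_fail_rate`, `toy_predict`, and the sample texts are hypothetical stand-ins, not Giskard's implementation, and skipping samples that the transformation leaves unchanged is an assumption.

```python
from typing import Callable, Sequence, Tuple


def perturbation_fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    perturb: Callable[[str], str],
) -> Tuple[int, int, float]:
    """Return (changed, tested, fail_rate) for an invariance check."""
    changed = 0
    tested = 0
    for text in texts:
        perturbed = perturb(text)
        if perturbed == text:
            # Assumption: samples the transformation does not alter are not counted.
            continue
        tested += 1
        if predict(perturbed) != predict(text):
            changed += 1
    fail_rate = changed / tested if tested else 0.0
    return changed, tested, fail_rate


def toy_predict(text: str) -> str:
    """Hypothetical case-sensitive keyword classifier, for illustration only."""
    return "positive" if "good" in text else "negative"


samples = ["good day", "bad day", "GOOD", "so good"]
changed, tested, rate = perturbation_fail_rate(toy_predict, samples, str.upper)
print(f"{changed}/{tested} tested samples changed prediction ({rate:.2%})")

# The counts reported above are consistent with the stated rate: 5 / 32 = 0.15625.
reported_rate = 5 / 32
```

To run the same check against a real model, `toy_predict` would be replaced by a function that returns the model's top label for a single text.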
👉Robustness issues (5)
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 42.61% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | — | Fail rate = 0.426 | 369/866 tested samples (42.61%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 0 | Trying to have a conversation with my dad about vegetarianism is the most pointless infuriating thing ever #caveman | TRYING TO HAVE A CONVERSATION WITH MY DAD ABOUT VEGETARIANISM IS THE MOST POINTLESS INFURIATING THING EVER #CAVEMAN | negative (p = 0.75) | positive (p = 0.54) |
| 1 | #latestnews 4 #newmexico #politics + #nativeamerican + #Israel + #Palestine - Protesting Rise Of Alt-Right At... | #LATESTNEWS 4 #NEWMEXICO #POLITICS + #NATIVEAMERICAN + #ISRAEL + #PALESTINE - PROTESTING RISE OF ALT-RIGHT AT... | negative (p = 0.61) | positive (p = 0.55) |
| 3 | @user @user @user Looks like Flynn isn't too pleased with me, he blocked me. You blocked by Flynn too @user | @USER @USER @USER LOOKS LIKE FLYNN ISN'T TOO PLEASED WITH ME, HE BLOCKED ME. YOU BLOCKED BY FLYNN TOO @USER | negative (p = 0.53) | positive (p = 0.53) |
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 28.19% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | — | Fail rate = 0.282 | 243/862 tested samples (28.19%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 0 | Trying to have a conversation with my dad about vegetarianism is the most pointless infuriating thing ever #caveman | Trying To Have A Conversation With My Dad About Vegetarianism Is The Most Pointless Infuriating Thing Ever #Caveman | negative (p = 0.75) | positive (p = 0.49) |
| 3 | @user @user @user Looks like Flynn isn't too pleased with me, he blocked me. You blocked by Flynn too @user | @User @User @User Looks Like Flynn Isn'T Too Pleased With Me, He Blocked Me. You Blocked By Flynn Too @User | negative (p = 0.53) | positive (p = 0.55) |
| 5 | i'm not even catholic, but pope francis is my dude. like i just need him to hug me and tell me everything is okay. | I'M Not Even Catholic, But Pope Francis Is My Dude. Like I Just Need Him To Hug Me And Tell Me Everything Is Okay. | neutral (p = 0.43) | positive (p = 0.54) |
When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 12.73% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | — | Fail rate = 0.127 | 105/825 tested samples (12.73%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Transform to lowercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 17 | The Reputation Doctor weighs in on Tony Romo #NFL @user joins @user on #TheMorningRush LISTEN: | the reputation doctor weighs in on tony romo #nfl @user joins @user on #themorningrush listen: | positive (p = 0.52) | negative (p = 0.53) |
| 46 | I'm crying over Richard and Leonard Cohen 😭😭😭 #GilmoreGirlsRevival | i'm crying over richard and leonard cohen 😭😭😭 #gilmoregirlsrevival | positive (p = 0.42) | negative (p = 0.47) |
| 50 | If you wanna have some seasonal fun & #teachecon #Hatchimals are today's Cabbage Patch Kids & Tickle Me Elmo Christ… | if you wanna have some seasonal fun & #teachecon #hatchimals are today's cabbage patch kids & tickle me elmo christ… | positive (p = 0.61) | negative (p = 0.59) |
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 12.22% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | — | Fail rate = 0.122 | 100/818 tested samples (12.22%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Add typos(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 2 | @user You are a stand up guy and a Gentleman Vice President Pence | @user You are stand up guy anr a Genteman Vice Pesident Pence | positive (p = 0.53) | negative (p = 0.43) |
| 11 | I will go so far to say s1 of westworld isn't just good, it's brilliant. A story within a story within a story about storytelling | I will go so far to say 1 of westworld isn't just good, it's brillisnt. A story within a stor wthin a story about storytelling | positive (p = 0.66) | negative (p = 0.81) |
| 27 | Ben Carson for Housing & Urban Development?? 😐 I just can't 😒 | Ben Carson for Housig & Urban Development?? 😐 Ij ust can't 😒 | neutral (p = 0.39) | negative (p = 0.41) |
When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 7.06% of the cases. We expected the predictions not to be affected by this transformation.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | — | Fail rate = 0.071 | 53/751 tested samples (7.06%) changed prediction after perturbation |
Taxonomy
avid-effect:performance:P0201
🔍✨Examples
| | text | Punctuation Removal(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 11 | I will go so far to say s1 of westworld isn't just good, it's brilliant. A story within a story within a story about storytelling | I will go so far to say s1 of westworld isn t just good it s brilliant A story within a story within a story about storytelling | positive (p = 0.66) | negative (p = 0.46) |
| 40 | @user She will be hearing my voice on her hesitation to back HRC. I am a MA voter. @user @user @user | @user She will be hearing my voice on her hesitation to back HRC I am a MA voter @user @user @user | negative (p = 0.40) | positive (p = 0.41) |
| 42 | @user Coward... well... why doesn't Poroshenko or Avakov or Saakasjvili travel to Crimea? | @user Coward well why doesn t Poroshenko or Avakov or Saakasjvili travel to Crimea | negative (p = 0.38) | positive (p = 0.42) |
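
The deterministic transformations in this section can be approximated with Python's built-in string methods, which makes the findings easy to reproduce locally. The sketch below is an assumption about their behavior, not Giskard's code. `remove_punctuation` and its character set are hypothetical, chosen so that `@` and `#` survive as they do in the examples above. One detail worth noting: `str.title` capitalizes the letter after an apostrophe, which matches the `Isn'T` and `I'M` outputs in the title-case examples.

```python
import re

# Assumption: the punctuation to strip; '@' and '#' are kept, as in the examples above.
_PUNCTUATION = re.compile(r"[.,!?;:'()\-]")


def remove_punctuation(text: str) -> str:
    """Replace punctuation with spaces, then collapse repeated whitespace."""
    return " ".join(_PUNCTUATION.sub(" ", text).split())


# Map each transformation name used in the report to an approximate equivalent.
PERTURBATIONS = {
    "Transform to uppercase": str.upper,
    "Transform to title case": str.title,
    "Transform to lowercase": str.lower,
    "Punctuation Removal": remove_punctuation,
}

sample = "Looks like Flynn isn't too pleased with me, he blocked me."
for name, transform in PERTURBATIONS.items():
    print(f"{name}: {transform(sample)}")
```

The "Add typos" transformation is random, so it is left out of this sketch.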
👉Performance issues (1)
For records in the dataset where text contains "trump", the Precision is 9.08% lower than the global Precision.
| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | text contains "trump" | Precision = 0.507 | -9.08% vs. global |
Taxonomy
avid-effect:performance:P0204
🔍✨Examples
| | text | label | Predicted label |
| --- | --- | --- | --- |
| 63 | Donald Trump does not have a clue about global warming. Maybe the Rockefeller's can clue them in about fossil fuels. | negative | neutral (p = 0.59) |
| 109 | @user where did you get the fact that there is infighting in the Trump transition team over SofS? @user | neutral | negative (p = 0.67) |
| 127 | Quote of the year:"Hello" - Melania Trump | neutral | positive (p = 0.57) |
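
The deviation figure compares the slice metric with the global one. A minimal sketch of that arithmetic follows, assuming the -9.08% is a relative difference; the report does not say whether it is relative or absolute. The function names are hypothetical, and the implied global Precision is a back-calculation under that assumption, not a value stated in the report.

```python
def relative_deviation(slice_value: float, global_value: float) -> float:
    """Relative difference of a slice metric with respect to the global metric."""
    return (slice_value - global_value) / global_value


def implied_global(slice_value: float, deviation: float) -> float:
    """Invert relative_deviation: the global value consistent with a reported deviation."""
    return slice_value / (1.0 + deviation)


slice_precision = 0.507       # reported for records where text contains "trump"
reported_deviation = -0.0908  # reported as -9.08%

# Assumption: the deviation is relative, so the global value is back-calculated from it.
global_precision = implied_global(slice_precision, reported_deviation)
print(f"implied global Precision: {global_precision:.3f}")
```

If the deviation were instead an absolute difference in percentage points, the implied global value would be `slice_precision - reported_deviation`, so it is worth confirming the convention before comparing slices.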
Check out the Giskard Space and test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.