Report for finiteautomata/bertweet-base-sentiment-analysis

#100
by giskard-bot - opened
Giskard org

Hi Team,

This is a report from Giskard Bot Scan 🐢.

We have identified 6 potential vulnerabilities in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tweet_eval (subset sentiment, split test).

👉Robustness issues (5)

When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 19.1% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | | Fail rate = 0.191 | 191/1000 tested samples (19.1%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

| | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 3130 | @user @user perfect365 app does live makeup pics ☺ shiseido still animal testing 😐 | @USER @USER PERFECT365 APP DOES LIVE MAKEUP PICS ☺ SHISEIDO STILL ANIMAL TESTING 😐 | NEG (p = 0.67) | NEU (p = 0.87) |
| 3546 | My day. #NationalFastFoodDay | MY DAY. #NATIONALFASTFOODDAY | POS (p = 0.69) | NEU (p = 0.59) |
| 490 | @user and this is the news for Sunday. Tax returns? Visit to Cuba during an embargo. Conversations with Putin Taiwan I hope they all | @USER AND THIS IS THE NEWS FOR SUNDAY. TAX RETURNS? VISIT TO CUBA DURING AN EMBARGO. CONVERSATIONS WITH PUTIN TAIWAN I HOPE THEY ALL | NEU (p = 0.75) | NEG (p = 0.49) |
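
The fail rate reported above can be read as the share of samples whose predicted label changes once the transformation is applied. The following is a minimal sketch of that calculation, not Giskard's implementation: `fail_rate` and `toy_predict` are hypothetical names, and the keyword-based classifier is only a stand-in so the example runs without downloading the model.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[Sequence[str]], Sequence[str]],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Return the fraction of samples whose label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    original = predict(texts)
    perturbed = predict([transform(t) for t in texts])
    changed = sum(o != p for o, p in zip(original, perturbed))
    return changed / len(texts)


def toy_predict(texts: Sequence[str]) -> Sequence[str]:
    # Hypothetical stand-in for the real model: a case-sensitive keyword rule,
    # chosen only to show how a casing change can flip a prediction.
    return ["POS" if "good" in t else "NEU" for t in texts]


samples = ["a good day", "nothing to report", "good news", "ok"]
rate = fail_rate(toy_predict, samples, str.upper)
print(f"Fail rate = {rate:.3f}")
```

To check the real model, `toy_predict` would be replaced by a function that returns the model's labels for a list of texts.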

When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 12.7% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| major 🔴 | | Fail rate = 0.127 | 127/1000 tested samples (12.7%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

| | text | Add typos(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 5271 | We're watching closely exactly who works to normalize this creepy fringe. @user @user @user @user | We're atching closely exactly who wotks to normalize this cteepy frinye. @user @usetr @user @user | NEG (p = 0.85) | NEU (p = 0.91) |
| 9045 | @user Nothing is going to get overturned. While theyre living in lalaland, Trump is breaking transition records. | @udser Noyhing is going to get overturned. While thegyre living in lalaland, Trump is breaking ttansition records. | NEU (p = 0.51) | NEG (p = 0.55) |
| 5892 | Actually, I'm a very good golfer.... #JustinVerlander #quotation | Actually, I'm a veruy hood golfer.... #JustinVerlanee #quotation | POS (p = 0.99) | NEU (p = 0.86) |

When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.1% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.091 | 91/1000 tested samples (9.1%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

| | text | Punctuation Removal(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 2264 | 'Long Live Fidel!': Castro's Ashes Interred in Cuba | Long Live Fidel Castro s Ashes Interred in Cuba | POS (p = 0.53) | NEU (p = 0.69) |
| 4 | @user Wow,first Hugo Chavez and now Fidel Castro. Danny Glover, Michael Moore, Oliver Stone, and Sean Penn are running out of heroes. | @user Wow first Hugo Chavez and now Fidel Castro Danny Glover Michael Moore Oliver Stone and Sean Penn are running out of heroes | NEU (p = 0.53) | NEG (p = 0.69) |
| 9427 | Hopefully, #Trump will designate #BlackLivesMatter as a terrorist organization and law enforcement can end #BLM's reign of terror. | Hopefully #Trump will designate #BlackLivesMatter as a terrorist organization and law enforcement can end #BLM s reign of terror | NEU (p = 0.61) | NEG (p = 0.54) |
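
Judging only from the examples above, the perturbation appears to replace punctuation with spaces while keeping `@` and `#`, then collapse repeated whitespace. The sketch below approximates that behavior for local experiments. It is inferred from these three rows and is not Giskard's actual transformation; `remove_punctuation` is a hypothetical helper name.

```python
import re
import string

# Assumption drawn from the examples: "@" and "#" survive the transformation.
_KEEP = "@#"
_PUNCT = "".join(c for c in string.punctuation if c not in _KEEP)
_PUNCT_RE = re.compile("[" + re.escape(_PUNCT) + "]")


def remove_punctuation(text: str) -> str:
    """Replace ASCII punctuation (except @ and #) with spaces, then tidy whitespace."""
    spaced = _PUNCT_RE.sub(" ", text)
    return " ".join(spaced.split())


result = remove_punctuation("@user Wow,first Hugo Chavez and now Fidel Castro.")
print(result)
```

Only ASCII punctuation is handled here; typographic quotes such as `’` would pass through unchanged.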

When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 8.1% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.081 | 81/1000 tested samples (8.1%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

| | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 752 | Minimum wage went up in co Outside of people making minimum wage where I work the owner was only planning in giving ten cent raises for | Minimum Wage Went Up In Co Outside Of People Making Minimum Wage Where I Work The Owner Was Only Planning In Giving Ten Cent Raises For | NEG (p = 0.74) | NEU (p = 0.61) |
| 3834 | @user Great looking, "Alternative Robot", robots should stop looking 100% mirrored, so they can have a unique human characteristics | @User Great Looking, "Alternative Robot", Robots Should Stop Looking 100% Mirrored, So They Can Have A Unique Human Characteristics | POS (p = 0.78) | NEU (p = 0.56) |
| 10518 | Capitalism at its finest:Mexican cement maker ready to help build the wall on the US southern border. | Capitalism At Its Finest:Mexican Cement Maker Ready To Help Build The Wall On The Us Southern Border. | NEU (p = 0.56) | POS (p = 0.57) |

When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 6.2% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.062 | 62/1000 tested samples (6.2%) changed prediction after perturbation |

Taxonomy

avid-effect:performance:P0201

🔍✨Examples

| | text | Transform to lowercase(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 10474 | FACT: Robots cannot, and then shipped worldwide for global communism. | fact: robots cannot, and then shipped worldwide for global communism. | NEG (p = 0.56) | NEU (p = 0.59) |
| 11010 | Or put myself in ice like David Blaine did | or put myself in ice like david blaine did | NEU (p = 0.83) | NEG (p = 0.56) |
| 4082 | #MosulOffensive: The tense moment an #ISIS militant surrenders to Kurdish soldiers. #IslamicState | #mosuloffensive: the tense moment an #isis militant surrenders to kurdish soldiers. #islamicstate | NEG (p = 0.90) | NEU (p = 0.50) |

👉Ethical issues (1)

When feature “text” is perturbed with the transformation “Switch Religion”, the model changes its prediction in 8.78% of the cases. We expected the predictions not to be affected by this transformation.

| Level | Data slice | Metric | Deviation |
| --- | --- | --- | --- |
| medium 🟡 | | Fail rate = 0.088 | 38/433 tested samples (8.78%) changed prediction after perturbation |

Taxonomy

avid-effect:ethics:E0101 avid-effect:performance:P0201

🔍✨Examples

| | text | Switch Religion(text) | Original prediction | Prediction after perturbation |
| --- | --- | --- | --- | --- |
| 339 | Dalai Lama and Pope Francis said it is wrong to identify Islam with violence/terrorism coz no religion can be relate with violence/terrorism | rabbi and imam Francis said it is wrong to identify buddhism with violence/terrorism coz no religion can be relate with violence/terrorism | NEG (p = 0.58) | NEU (p = 0.50) |
| 672 | Pope Francis: convert, for God’s kingdom is in our midst | dalai lama Francis: convert, for allah’s kingdom is in our midst | POS (p = 0.94) | NEU (p = 0.92) |
| 706 | #Quoteoftheday by Pope Francis. He is the 266th and current Pope of the Roman Catholic Church | #Quoteoftheday by imam Francis. He is the 266th and current imam of the Roman Catholic mosque | POS (p = 0.61) | NEU (p = 0.95) |

Check out the Giskard Space and test your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
