Report for cardiffnlp/twitter-roberta-base-irony
Hi Team,
This is a report from Giskard Bot Scan 🐢.
We have identified 3 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset tweet_eval (subset irony, split test).
👉Performance issues (3)
For records in the dataset where text contains "user", the Recall is 34.98% lower than the global Recall.
Level | Data slice | Metric | Deviation |
---|---|---|---|
major 🔴 | text contains "user" | Recall = 0.366 | -34.98% than global |
Taxonomy: avid-effect:performance:P0204
🔍✨ Examples
| | text | label | Predicted label |
---|---|---|---|
19 | @user Guess they didn't get the memo reg non-nuclear Baltic sea #sarcasm | irony | non_irony (p = 0.51) |
25 | @user hmm... let me think about that #sarcasm | irony | non_irony (p = 0.91) |
47 | @user 180 dead on 26/11 n more than 10k our ppl killed in terror attacks till date but not 1 paki show sympathy 2 them #irony | irony | non_irony (p = 0.71) |
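To make the slice metric above concrete, here is a minimal sketch of how a "text contains" slice and its recall deviation could be recomputed by hand. It is not Giskard's implementation. The helper names (`recall`, `text_contains`, `relative_deviation`) and the toy records are hypothetical, the positive class is assumed to be `irony`, the match is assumed to be case-insensitive (the report's "irony" slice includes a tweet tagged `#Irony`), and the deviation is assumed to be relative to the global value.

```python
def recall(records, positive="irony"):
    """Fraction of truly positive records that were also predicted positive."""
    positives = [r for r in records if r["label"] == positive]
    if not positives:
        return float("nan")
    hits = sum(r["predicted"] == positive for r in positives)
    return hits / len(positives)


def text_contains(records, keyword):
    """Keep records whose text contains the keyword (case-insensitive, an assumption)."""
    return [r for r in records if keyword.lower() in r["text"].lower()]


def relative_deviation(slice_value, global_value):
    """Deviation of the slice metric relative to the global metric (an assumption)."""
    return (slice_value - global_value) / global_value


# Hypothetical toy records, NOT taken from tweet_eval.
records = [
    {"text": "@user great, another Monday", "label": "irony", "predicted": "non_irony"},
    {"text": "@user love waiting in line", "label": "irony", "predicted": "irony"},
    {"text": "@user see you at noon", "label": "non_irony", "predicted": "non_irony"},
    {"text": "Oh joy, rain again", "label": "irony", "predicted": "irony"},
    {"text": "So glad my flight got cancelled", "label": "irony", "predicted": "irony"},
    {"text": "Lunch was good today", "label": "non_irony", "predicted": "non_irony"},
]

global_recall = recall(records)
slice_recall = recall(text_contains(records, "user"))
deviation = relative_deviation(slice_recall, global_recall)
print(f"global={global_recall:.3f} slice={slice_recall:.3f} deviation={deviation:+.2%}")
```

On real data, `records` would hold the model's predictions on the tweet_eval irony test split instead of the toy list.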
For records in the dataset where text contains "irony", the Accuracy is 27.73% lower than the global Accuracy.
Level | Data slice | Metric | Deviation |
---|---|---|---|
major 🔴 | text contains "irony" | Accuracy = 0.531 | -27.73% than global |
Taxonomy: avid-effect:performance:P0204
🔍✨ Examples
| | text | label | Predicted label |
---|---|---|---|
23 | Who told the #hipsters that #irony was a thing of the Clinton years? Do they not carry history books in used bookstores in #brooklyn ? | irony | non_irony (p = 0.65) |
47 | @user 180 dead on 26/11 n more than 10k our ppl killed in terror attacks till date but not 1 paki show sympathy 2 them #irony | irony | non_irony (p = 0.71) |
65 | #Irony RT @user If you're going to give someone a scathing, 1-Star review for poor grammar, FFS use proper grammar. | irony | non_irony (p = 0.71) |
For records in the dataset where text contains "sarcasm", the Accuracy is 12.15% lower than the global Accuracy.
Level | Data slice | Metric | Deviation |
---|---|---|---|
major 🔴 | text contains "sarcasm" | Accuracy = 0.645 | -12.15% than global |
Taxonomy: avid-effect:performance:P0204
🔍✨ Examples
| | text | label | Predicted label |
---|---|---|---|
4 | So much #sarcasm at work mate 10/10 #boring 100% #dead mate full on #shit absolutely #sleeping mate can't handle the #sarcasm | irony | non_irony (p = 0.93) |
6 | People complain about my backround pic and all I feel is like "hey don't blame me, Albert E might have spoken those words" #sarcasm #life | irony | non_irony (p = 0.73) |
19 | @user Guess they didn't get the memo reg non-nuclear Baltic sea #sarcasm | irony | non_irony (p = 0.51) |
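The report gives each slice metric and its deviation but not the global values themselves. As a rough sanity check, the sketch below back-computes the implied global metric for the three findings, assuming the deviation is relative, i.e. deviation = (slice - global) / global. That formula and the helper name `implied_global` are assumptions, not something the report states. Under that assumption the two accuracy findings imply nearly the same global accuracy (about 0.73), which is at least consistent with a relative reading.

```python
def implied_global(slice_value, deviation_pct):
    """Invert deviation = (slice - global) / global, with deviation given in percent."""
    return slice_value / (1 + deviation_pct / 100)


# (slice keyword, metric name, slice value, deviation in %) as reported above.
reported = [
    ("user", "Recall", 0.366, -34.98),
    ("irony", "Accuracy", 0.531, -27.73),
    ("sarcasm", "Accuracy", 0.645, -12.15),
]

for keyword, metric, value, deviation_pct in reported:
    estimate = implied_global(value, deviation_pct)
    print(f'text contains "{keyword}": implied global {metric} ~ {estimate:.3f}')
```

Because the reported figures are rounded, the back-computed values are only approximate.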
Check out the Giskard Space and test your model.
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.