Report for cardiffnlp/twitter-roberta-base-irony

#52
by giskard-bot - opened

Hey Team!
We're thrilled to share some amazing evaluation results that'll make your day!

We have identified 1 potential vulnerability in your model based on an automated scan.

This automated analysis evaluated the model on the dataset tweet_eval (subset irony, split train).

Performance issues (1)

| Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
|---|---|---|---|---|---|
| Performance | major | `text` contains "love" | Accuracy = 0.247 | - | -50.10% vs. global |
Examples

For records in the dataset where `text` contains "love", the Accuracy is 50.1% lower than the global Accuracy.

| | text | label | Predicted label |
|---|---|---|---|
| 32 | Love that I still have kids that still wake up early on Christmas #justkiddingIlovethem | non_irony | irony (p = 0.99) |
| 34 | Oh god I just so happens that i love really LOVE slow internet #slowinternet | non_irony | irony (p = 0.98) |
| 36 | isnt it the best when youre really tired then when you finally get in bed youre wide awake? I LOVE IT | non_irony | irony (p = 0.93) |
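
For anyone who wants to sanity-check a finding like this by hand, here is a minimal sketch of a slice-versus-global accuracy comparison. It is not Giskard's implementation. It assumes the reported deviation is relative, i.e. (slice accuracy - global accuracy) / global accuracy, and that the "contains" filter is case-insensitive (example 36 above only has "LOVE" in capitals). The helper names `accuracy` and `slice_report` and the four toy records are made up for illustration; they are not rows from tweet_eval.

```python
from typing import Callable, Dict, Sequence


def accuracy(labels: Sequence[str], preds: Sequence[str]) -> float:
    """Fraction of positions where the prediction equals the label."""
    if len(labels) != len(preds) or not labels:
        raise ValueError("labels and preds must be non-empty and of equal length")
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def slice_report(
    texts: Sequence[str],
    labels: Sequence[str],
    preds: Sequence[str],
    in_slice: Callable[[str], bool],
) -> Dict[str, float]:
    """Compare accuracy on a data slice with accuracy on the full set.

    Assumption: deviation is relative to the global accuracy.
    """
    idx = [i for i, t in enumerate(texts) if in_slice(t)]
    global_acc = accuracy(labels, preds)
    slice_acc = accuracy([labels[i] for i in idx], [preds[i] for i in idx])
    if global_acc == 0:
        raise ValueError("relative deviation is undefined when global accuracy is 0")
    return {
        "global_accuracy": global_acc,
        "slice_accuracy": slice_acc,
        "relative_deviation": (slice_acc - global_acc) / global_acc,
    }


# Toy records (hypothetical, not taken from the dataset).
texts = ["I love Mondays", "Nice weather today", "love this traffic", "See you at noon"]
labels = ["non_irony", "non_irony", "irony", "non_irony"]
preds = ["irony", "non_irony", "irony", "non_irony"]

# Case-insensitive "contains" filter, mirroring the slice in the report.
report = slice_report(texts, labels, preds, lambda t: "love" in t.lower())
print(f"global={report['global_accuracy']:.3f} "
      f"slice={report['slice_accuracy']:.3f} "
      f"deviation={report['relative_deviation']:+.1%}")
```

Under the same relative-deviation assumption, the figures in the table would imply a global accuracy of roughly 0.247 / (1 - 0.501) ≈ 0.495. The report itself does not state the global value, so treat that number as a back-of-the-envelope estimate only.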

Check out the Giskard Space and improve your model.

Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.

What's Next?

  • The Giskard community is always buzzing with ideas. What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! Together, we're building something extraordinary.

Big Thanks!

We're grateful to have you on this adventure with us. Here's to more breakthroughs, laughter, and code magic! Keep hugging that code and spreading the love! #Giskard #Huggingface #AISafety Your enthusiasm, feedback, and contributions are what we seek. Keep being awesome!
