Report for Sigma/financial-sentiment-analysis
Hey Team!🤗✨
We’re thrilled to share some amazing evaluation results that’ll make your day!🎉📊
We have identified 8 potential vulnerabilities in your model based on an automated scan.
This automated analysis evaluated the model on the dataset financial_phrasebank (subset sentences_50agree, split train).
👉Robustness issues (3)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.396 | Transform to title case | 396/1000 tested samples (39.6%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 39.6% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Transform to title case(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | These Moderate But Significant Changes Resulted In A Significant 24-32 % Reduction In The Estimated Cvd Risk . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
4662 | Cash flow after investments amounted to EUR45m , down from EUR46m . | Cash Flow After Investments Amounted To Eur45M , Down From Eur46M . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | The Stock Rose For A Second Day On Wednesday Bringing Its Two-Day Rise To Gbx12 .0 Or 2.0 % . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
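
The fail rate in the row above is the share of tested samples whose predicted label changes after the perturbation (396 of 1000, so 0.396). A minimal sketch of that computation follows. It assumes a hypothetical `predict` callable that maps a string to a label; `fail_rate` and `toy_predict` are illustrative names and not part of the scan tooling.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    """Hypothetical case-sensitive stand-in for the real model."""
    return "LABEL_2" if "rose" in text else "LABEL_1"


# Toy inputs, not taken from the scan.
samples = [
    "The stock rose for a second day .",
    "Cash flow after investments amounted to EUR45m .",
]

# Only the first sample flips, because "rose" becomes "Rose".
title_case_rate = fail_rate(toy_predict, samples, str.title)
```

Python's built-in `str.title` appears to reproduce the perturbed strings shown in the examples (for instance, `EUR45m` becomes `Eur45M`), although the scan may implement the transformation differently.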
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.392 | Transform to uppercase | 392/1000 tested samples (39.2%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 39.2% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Transform to uppercase(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
996 | These moderate but significant changes resulted in a significant 24-32 % reduction in the estimated CVD risk . | THESE MODERATE BUT SIGNIFICANT CHANGES RESULTED IN A SIGNIFICANT 24-32 % REDUCTION IN THE ESTIMATED CVD RISK . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
4662 | Cash flow after investments amounted to EUR45m , down from EUR46m . | CASH FLOW AFTER INVESTMENTS AMOUNTED TO EUR45M , DOWN FROM EUR46M . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
300 | The stock rose for a second day on Wednesday bringing its two-day rise to GBX12 .0 or 2.0 % . | THE STOCK ROSE FOR A SECOND DAY ON WEDNESDAY BRINGING ITS TWO-DAY RISE TO GBX12 .0 OR 2.0 % . | LABEL_2 (p = 1.00) | LABEL_1 (p = 1.00) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Robustness | major 🔴 | — | Fail rate = 0.109 | Add typos | 109/1000 tested samples (10.9%) changed prediction after perturbation |
🔍✨Examples
When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 10.9% of the cases. We expected the predictions not to be affected by this transformation.
Index | text | Add typos(text) | Original prediction | Prediction after perturbation |
---|---|---|---|---|
982 | The financial impact is estimated to be an annual improvement of EUR2 .0 m in the division 's results , as of fiscal year 2008 . | The fjinancial impafct is wstimated to be an annaul improvement of EUR2 .0 m in the ivision 's results , az of fisca year 2008 | LABEL_2 (p = 1.00) | LABEL_1 (p = 0.98) |
1289 | NASDAQ-listed Yahoo Inc has introduced a new service that enables Malaysians to take their favorite Internet content and services with them on their mobile phones . | NASDAQ-listed Yahoo Inc has intrlduced a new servjce that enablez Malaysians to take their favorite Internet content and serfices with them on their mobile phones . | LABEL_2 (p = 0.74) | LABEL_1 (p = 1.00) |
4561 | The Baltimore Police and Fire Pension , which has about $ 1.5 billion , lost about $ 3.5 million in Madoff Ponzi scheme . | The Baltimore Pokice and Fire Penxsion , which ha about $ 1.5 bioilon , olst about $ 3.5 million in Madoff Ponzi scheme . | LABEL_0 (p = 1.00) | LABEL_1 (p = 1.00) |
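
The perturbed examples above show four kinds of character-level noise: substitution ("az"), deletion ("ivision"), insertion ("fjinancial"), and transposition ("annaul"). The sketch below is one simple way to generate similar noise for local testing. It is not the scan's own implementation; the function name `add_typos`, the `rate` and `seed` parameters, and the uniform choice of replacement letters are all assumptions.

```python
import random
import string


def add_typos(text: str, rate: float = 0.05, seed: int = 0) -> str:
    """Apply random letter-level typos; non-letters are never touched."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch.isalpha() and rng.random() < rate:
            op = rng.choice(("substitute", "delete", "insert", "transpose"))
            if op == "substitute":
                out.append(rng.choice(string.ascii_lowercase))
            elif op == "insert":
                out.extend((ch, rng.choice(string.ascii_lowercase)))
            elif op == "transpose" and i + 1 < len(chars) and chars[i + 1].isalpha():
                out.extend((chars[i + 1], ch))
                i += 1  # the neighbouring letter has been consumed
            elif op == "transpose":
                out.append(ch)  # no letter to swap with, keep as is
            # "delete": emit nothing
        else:
            out.append(ch)
        i += 1
    return "".join(out)


original = "The financial impact is estimated to be EUR2 .0 m ."
noisy = add_typos(original, rate=0.2, seed=42)
```

Keeping digits and punctuation intact means amounts such as `EUR2 .0 m` survive, so any prediction change can be attributed to the misspelled words.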
👉Performance issues (5)
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | major 🔴 | avg_digits(text) < 0.038 | Balanced Accuracy = 0.731 | — | -10.48% than global |
🔍✨Examples
For records in the dataset where `avg_digits(text)` < 0.038, the Balanced Accuracy is 10.48% lower than the global Balanced Accuracy.
Index | text | avg_digits(text) | label | Predicted label |
---|---|---|---|---|
12 | A purchase agreement for 7,200 tons of gasoline with delivery at the Hamina terminal , Finland , was signed with Neste Oil OYj at the average Platts index for this September plus eight US dollars per month . | 0.0193237 | LABEL_2 | LABEL_1 (p = 1.00) |
21 | ( Filippova ) A trilateral agreement on investment in the construction of a technology park in St Petersburg was to have been signed in the course of the forum , Days of the Russian Economy , that opened in Helsinki today . | 0 | LABEL_2 | LABEL_1 (p = 0.99) |
42 | Nyrstar has also agreed to supply to Talvivaara up to 150,000 tonnes of sulphuric acid per annum for use in Talvivaara 's leaching process during the period of supply of the zinc in concentrate . | 0.0307692 | LABEL_2 | LABEL_1 (p = 0.99) |
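
The slice above can be reproduced in outline as follows. The sketch assumes that `avg_digits` is the share of characters that are digits, which matches the value 0.0307692 reported for row 42, and that Balanced Accuracy is the mean of per-class recall. The `rows` data is a toy set invented for illustration.

```python
from collections import defaultdict
from typing import Sequence


def avg_digits(text: str) -> float:
    """Share of characters that are digits (assumed definition)."""
    return sum(c.isdigit() for c in text) / len(text) if text else 0.0


def balanced_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Mean of per-class recall over the classes present in y_true."""
    if not y_true:
        raise ValueError("y_true must not be empty")
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += t == p
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Row 42 from the examples: 6 digits in 195 characters.
row_42 = (
    "Nyrstar has also agreed to supply to Talvivaara up to 150,000 tonnes of "
    "sulphuric acid per annum for use in Talvivaara 's leaching process during "
    "the period of supply of the zinc in concentrate ."
)
row_42_digits = avg_digits(row_42)

# Toy (text, label, prediction) triples, not taken from the scan.
rows = [
    ("Net sales rose 12 % to EUR 45.3 mn in 2008 .", "LABEL_2", "LABEL_2"),
    ("The agreement was signed today .", "LABEL_2", "LABEL_1"),
    ("The company has 150 employees and 3 plants .", "LABEL_1", "LABEL_1"),
    ("Talks will continue next week .", "LABEL_1", "LABEL_1"),
]
in_slice = [r for r in rows if avg_digits(r[0]) < 0.038]

global_ba = balanced_accuracy([r[1] for r in rows], [r[2] for r in rows])
slice_ba = balanced_accuracy([r[1] for r in in_slice], [r[2] for r in in_slice])
```

In this toy set the low-digit slice scores below the global figure, which is the pattern the scan flags.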
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_word_length(text) >= 4.597 AND avg_word_length(text) < 4.707 | Balanced Accuracy = 0.737 | — | -9.72% than global |
🔍✨Examples
For records in the dataset where `avg_word_length(text)` >= 4.597 AND `avg_word_length(text)` < 4.707, the Balanced Accuracy is 9.72% lower than the global Balanced Accuracy.
Index | text | avg_word_length(text) | label | Predicted label |
---|---|---|---|---|
42 | Nyrstar has also agreed to supply to Talvivaara up to 150,000 tonnes of sulphuric acid per annum for use in Talvivaara 's leaching process during the period of supply of the zinc in concentrate . | 4.6 | LABEL_2 | LABEL_1 (p = 0.99) |
79 | TELECOMWORLDWIRE-7 April 2006-TJ Group Plc sells stake in Morning Digital Design Oy Finnish IT company TJ Group Plc said on Friday 7 April that it had signed an agreement on selling its shares of Morning Digital Design Oy to Edita Oyj . | 4.64286 | LABEL_2 | LABEL_1 (p = 0.61) |
150 | According to Deputy MD Pekka Silvennoinen the aim is double turnover over the next three years . | 4.70588 | LABEL_2 | LABEL_1 (p = 0.77) |
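
The `avg_word_length` values in this table are consistent with the mean length of whitespace-separated tokens, with punctuation tokens such as `.` counted as words. That definition is an inference from the example rows, not something the report states. A short check against row 150:

```python
def avg_word_length(text: str) -> float:
    """Mean length of whitespace-separated tokens (assumed definition)."""
    tokens = text.split()
    return sum(map(len, tokens)) / len(tokens) if tokens else 0.0


# Row 150 from the examples: 80 characters spread over 17 tokens.
row_150 = (
    "According to Deputy MD Pekka Silvennoinen the aim is double turnover "
    "over the next three years ."
)
row_150_value = avg_word_length(row_150)

# Slice bounds as reported: lower bound inclusive, upper bound exclusive.
row_150_in_slice = 4.597 <= row_150_value < 4.707
```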
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_word_length(text) >= 4.707 AND avg_word_length(text) < 5.213 | Balanced Accuracy = 0.741 | — | -9.26% than global |
🔍✨Examples
For records in the dataset where `avg_word_length(text)` >= 4.707 AND `avg_word_length(text)` < 5.213, the Balanced Accuracy is 9.26% lower than the global Balanced Accuracy.
Index | text | avg_word_length(text) | label | Predicted label |
---|---|---|---|---|
62 | `` The new agreement is a continuation to theagreement signed earlier this year with the Lemminkainen Group , whereby Cramo acquired the entire construction machine fleet ofLemminkainen Talo Oy Ita - ja Pohjois Suomo , and signed asimilar agreement , '' said Tatu Hauhio , managing director ofCramo Finland . | 5.18 | LABEL_1 | LABEL_2 (p = 0.92) |
68 | The contract covers HDO platform , AC800 and CXE880 optical Fttb nodes designed to increase the forward and return path capacity of the transmission networks . | 5.15385 | LABEL_2 | LABEL_1 (p = 0.99) |
74 | Finnish real estate investor Sponda Plc said on Wednesday 12 March that it has signed agreements with Danske Bank A-S , Helsinki Branch for a 7-year EUR150m credit facility and with Ilmarinen Mutual Pension Insurance Company for a 7-year EUR50m credit facility . | 5.11628 | LABEL_1 | LABEL_2 (p = 1.00) |
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | avg_whitespace(text) < 0.162 AND avg_whitespace(text) >= 0.148 | Balanced Accuracy = 0.753 | — | -7.75% than global |
🔍✨Examples
For records in the dataset where `avg_whitespace(text)` < 0.162 AND `avg_whitespace(text)` >= 0.148, the Balanced Accuracy is 7.75% lower than the global Balanced Accuracy.
Index | text | avg_whitespace(text) | label | Predicted label |
---|---|---|---|---|
47 | The agreement was signed with Biohit Healthcare Ltd , the UK-based subsidiary of Biohit Oyj , a Finnish public company which develops , manufactures and markets liquid handling products and diagnostic test systems . | 0.153488 | LABEL_2 | LABEL_1 (p = 0.77) |
62 | `` The new agreement is a continuation to theagreement signed earlier this year with the Lemminkainen Group , whereby Cramo acquired the entire construction machine fleet ofLemminkainen Talo Oy Ita - ja Pohjois Suomo , and signed asimilar agreement , '' said Tatu Hauhio , managing director ofCramo Finland . | 0.159091 | LABEL_1 | LABEL_2 (p = 0.92) |
68 | The contract covers HDO platform , AC800 and CXE880 optical Fttb nodes designed to increase the forward and return path capacity of the transmission networks . | 0.157233 | LABEL_2 | LABEL_1 (p = 0.99) |
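
The `avg_whitespace` values here are consistent with the share of characters that are whitespace. As with the other slicing features, this definition is inferred from the example rows and is an assumption. A short check against row 68:

```python
def avg_whitespace(text: str) -> float:
    """Share of characters that are whitespace (assumed definition)."""
    return sum(c.isspace() for c in text) / len(text) if text else 0.0


# Row 68 from the examples: 25 spaces in 159 characters.
row_68 = (
    "The contract covers HDO platform , AC800 and CXE880 optical Fttb nodes "
    "designed to increase the forward and return path capacity of the "
    "transmission networks ."
)
row_68_value = avg_whitespace(row_68)

# Slice bounds as reported: lower bound inclusive, upper bound exclusive.
row_68_in_slice = 0.148 <= row_68_value < 0.162
```

Because tokens in this dataset are separated by single spaces, a lower whitespace share simply means longer tokens, so this slice overlaps with the `avg_word_length` slices (rows 62 and 68 appear in both).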
Vulnerability | Level | Data slice | Metric | Transformation | Deviation |
---|---|---|---|---|---|
Performance | medium 🟡 | text_length(text) >= 100.500 AND text_length(text) < 108.500 | Precision = 0.805 | — | -5.26% than global |
🔍✨Examples
For records in the dataset where `text_length(text)` >= 100.500 AND `text_length(text)` < 108.500, the Precision is 5.26% lower than the global Precision.
Index | text | text_length(text) | label | Predicted label |
---|---|---|---|---|
153 | After the takeover , Cramo will become the second largest rental services provider in the Latvian market . | 106 | LABEL_2 | LABEL_1 (p = 0.89) |
298 | The increase in capital stock has been registered in the Finnish Trade Register on 20 November 2006 . | 101 | LABEL_2 | LABEL_1 (p = 1.00) |
314 | YIT says the acquisition is a part of its strategy for expansion in Central and Eastern European markets . | 106 | LABEL_2 | LABEL_1 (p = 0.99) |
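
The report gives slice metrics and deviations but not the global metrics themselves. Assuming the deviation is a relative drop (slice = global × (1 − deviation)), the global values can be back-computed. This reading is an assumption, but under it the four Balanced Accuracy rows imply nearly the same global value, which an absolute-difference reading would not. The sketch also checks `text_length`, assumed here to be the character count, against row 298. The helper `implied_global` is a hypothetical name.

```python
def text_length(text: str) -> int:
    """Number of characters, spaces included (assumed definition)."""
    return len(text)


def implied_global(slice_metric: float, relative_drop_pct: float) -> float:
    """Global metric implied by a slice that is `relative_drop_pct` % lower."""
    return slice_metric / (1.0 - relative_drop_pct / 100.0)


# Row 298 from the examples; the table reports a length of 101.
row_298 = (
    "The increase in capital stock has been registered in the Finnish "
    "Trade Register on 20 November 2006 ."
)
row_298_length = text_length(row_298)
row_298_in_slice = 100.5 <= row_298_length < 108.5

# (slice metric, deviation in percent) as reported for each slice.
balanced_accuracy_rows = {
    "avg_digits < 0.038": (0.731, 10.48),
    "4.597 <= avg_word_length < 4.707": (0.737, 9.72),
    "4.707 <= avg_word_length < 5.213": (0.741, 9.26),
    "0.148 <= avg_whitespace < 0.162": (0.753, 7.75),
}
implied_ba = {k: implied_global(m, d) for k, (m, d) in balanced_accuracy_rows.items()}
implied_precision = implied_global(0.805, 5.26)

for name, value in implied_ba.items():
    print(f"{name}: implied global Balanced Accuracy = {value:.3f}")
print(f"implied global Precision = {implied_precision:.3f}")
```

If the implied values agree, as they do here to within rounding of the reported figures, the relative reading is plausible. The figures remain estimates until checked against the scan's own global metrics.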
Disclaimer: it's important to note that automated scans may produce false positives or miss certain vulnerabilities. We encourage you to review the findings and assess the impact accordingly.
💡 What's Next?
- Check out the Giskard Space and improve your model.
- The Giskard community is always buzzing with ideas. 🐢🤔 What do you want to see next? Your feedback is our favorite fuel, so drop your thoughts in the community forum! 🗣️💬 Together, we're building something extraordinary.
🙌 Big Thanks!
We're grateful to have you on this adventure with us. 🚀🌟 Here's to more breakthroughs, laughter, and code magic! 🥂✨ Keep hugging that code and spreading the love! 💻 #Giskard #Huggingface #AISafety 🌈👏 Your enthusiasm, feedback, and contributions are what we seek. 🌟 Keep being awesome!