Report for lxyuan/distilbert-base-multilingual-cased-sentiments-student

#3

by inoki-giskard - opened Dec 9, 2023

Overconfidence issues (1)

Vulnerability	Level	Data slice	Metric	Transformation	Deviation	Description
Overconfidence	medium	`avg_digits(text)` < 0.011	Overconfidence rate = 0.291	—	+18.82% vs. global	For records in the dataset where `avg_digits(text)` < 0.011, we found a significantly higher number of overconfident wrong predictions (183 samples, corresponding to 29.09% of the wrong predictions in the data slice).
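
To make the slice and the rate above concrete, here is a minimal sketch. It assumes `avg_digits(text)` means the share of characters in the text that are digits, which the report does not spell out. The count of 629 wrong predictions is not stated in the report either; it is inferred as the denominator that reproduces the reported share of about 29.09% from the 183 overconfident samples. Both function names are illustrative and are not Giskard's API.

```python
def avg_digits(text: str) -> float:
    """Share of characters that are digits (assumed definition of the slicing feature)."""
    if not text:
        return 0.0
    return sum(ch.isdigit() for ch in text) / len(text)


def overconfidence_rate(n_overconfident: int, n_wrong: int) -> float:
    """Overconfident wrong predictions as a fraction of all wrong predictions."""
    if n_wrong <= 0:
        raise ValueError("n_wrong must be positive")
    return n_overconfident / n_wrong


# A text without digits falls inside the slice `avg_digits(text)` < 0.011.
in_slice = avg_digits("Great film, would watch again") < 0.011

# 183 comes from the report; 629 is an inferred denominator, not a reported figure.
slice_rate = overconfidence_rate(183, 629)
print(in_slice, round(slice_rate, 3))
```

Under these assumptions the slice mostly captures texts with few or no digits, so the finding reads as "the model is more often confidently wrong on digit-free text" rather than as a statement about numbers themselves.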

Robustness issues (5)

Vulnerability	Level	Data slice	Metric	Transformation	Deviation	Description
Robustness	major	—	Fail rate = 0.393	Transform to uppercase	393/1000 tested samples (39.3%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Transform to uppercase”, the model changes its prediction in 39.3% of the cases. We expected the predictions not to be affected by this transformation.
Robustness	major	—	Fail rate = 0.307	Transform to title case	307/1000 tested samples (30.7%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Transform to title case”, the model changes its prediction in 30.7% of the cases. We expected the predictions not to be affected by this transformation.
Robustness	major	—	Fail rate = 0.153	Add typos	153/1000 tested samples (15.3%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Add typos”, the model changes its prediction in 15.3% of the cases. We expected the predictions not to be affected by this transformation.
Robustness	major	—	Fail rate = 0.144	Transform to lowercase	144/1000 tested samples (14.4%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Transform to lowercase”, the model changes its prediction in 14.4% of the cases. We expected the predictions not to be affected by this transformation.
Robustness	medium	—	Fail rate = 0.092	Punctuation Removal	92/1000 tested samples (9.2%) changed prediction after perturbation	When feature “text” is perturbed with the transformation “Punctuation Removal”, the model changes its prediction in 9.2% of the cases. We expected the predictions not to be affected by this transformation.
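
The fail rates in this table are the fraction of tested samples whose predicted label changes once the text is perturbed. The sketch below shows that computation in isolation. `toy_predict` is a hypothetical, deliberately case-sensitive stand-in for the real model, and `fail_rate` is an illustrative helper, not the scanner's own code.

```python
from typing import Callable, Sequence


def fail_rate(
    predict: Callable[[str], str],
    texts: Sequence[str],
    transform: Callable[[str], str],
) -> float:
    """Fraction of texts whose predicted label changes after `transform`."""
    if not texts:
        raise ValueError("texts must not be empty")
    changed = sum(predict(t) != predict(transform(t)) for t in texts)
    return changed / len(texts)


def toy_predict(text: str) -> str:
    # Hypothetical classifier: matches the lowercase keyword only,
    # so it is sensitive to casing by construction.
    return "positive" if "good" in text else "negative"


samples = ["good movie", "bad movie", "good food", "meh"]

upper_rate = fail_rate(toy_predict, samples, str.upper)
lower_rate = fail_rate(toy_predict, samples, str.lower)
print(upper_rate, lower_rate)
```

With the real model, the same loop over 1000 samples would give figures comparable to the ones reported, for example 393 changed predictions out of 1000 for the uppercase transformation.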

Performance issues (1)

Vulnerability	Level	Data slice	Metric	Transformation	Deviation	Description
Performance	medium	`text` contains "friday"	Precision = 0.432	—	-7.05% vs. global	For records in the dataset where `text` contains "friday", the precision is 7.05% lower than the global precision.
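
The deviation column is easier to read with the arithmetic written out. The sketch below assumes the deviation is relative, that is (slice - global) / global. The report does not state this explicitly, and the global precision it derives is an inference, not a reported figure. Function names are illustrative.

```python
def relative_deviation(slice_value: float, global_value: float) -> float:
    """Relative difference of a slice metric against the global metric."""
    return (slice_value - global_value) / global_value


def implied_global(slice_value: float, deviation: float) -> float:
    """Invert relative_deviation to recover the global metric."""
    return slice_value / (1.0 + deviation)


# 0.432 and -7.05% come from the report; the global value is inferred.
global_precision = implied_global(0.432, -0.0705)
print(round(global_precision, 3))  # about 0.465 under the relative-deviation assumption
```

If the deviation were instead an absolute difference in percentage points, the global precision would be 0.432 + 0.0705, so the interpretation should be checked against the scanner's documentation before the number is quoted.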

inoki-giskard changed discussion status to closed Dec 16, 2023
