Add option to ignore punctuation and case

#1
by divi212 - opened

For many applications, it is useful to ignore punctuation and/or case when evaluating word error rate. This PR adds checkboxes for ignoring punctuation and case, and applies the corresponding transforms to both the ground truth and the hypothesis text.
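
To illustrate the idea, here is a minimal standard-library sketch of this kind of normalization. It is not the code in this PR: the names `normalize` and `word_error_rate`, and the two boolean flags standing in for the checkboxes, are hypothetical. The WER here is a plain word-level edit distance divided by the reference length, written out so the example runs without any dependencies.

```python
import string

# Hypothetical helpers for illustration only; not the PR's actual implementation.
# Translation table that deletes every ASCII punctuation character.
# Note: this also strips apostrophes, so "don't" becomes "dont".
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str, ignore_punctuation: bool = False, ignore_case: bool = False) -> str:
    """Apply the optional transforms, then collapse runs of whitespace."""
    if ignore_punctuation:
        text = text.translate(_PUNCT_TABLE)
    if ignore_case:
        text = text.lower()
    return " ".join(text.split())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)


truth, hypothesis = "Hello, world!", "hello world"

# Without normalization, both words differ.
raw_wer = word_error_rate(truth, hypothesis)

# With both options enabled, the same transforms are applied to both texts.
clean_wer = word_error_rate(
    normalize(truth, ignore_punctuation=True, ignore_case=True),
    normalize(hypothesis, ignore_punctuation=True, ignore_case=True),
)
print(raw_wer, clean_wer)
```

The key design point is that the same transforms are applied to both the ground truth and the hypothesis, so the comparison stays symmetric.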

ignore-punctuation-and-case.png

I considered using jiwer's off-the-shelf transformations, documented at https://jitsi.github.io/jiwer/reference/transformations/, which can be passed directly into the process_words function. However, they did not have an option to remove punctuation.

Additionally, I added type hints to the Python file and formatted it.

divi212 changed pull request status to closed
divi212 changed pull request status to open