Generally improved OCR recognition of text, corrected postcode regex a748df6 seanpedrickcase committed on Sep 24
Improved allow list, handwriting/signature identification, logging 6ea0852 seanpedrickcase committed on Sep 19
Updated default AWS_FUNCTION value. Logs seconds values from outputs correctly. 7aa4d5f seanpedrickcase committed on Sep 17
Should now correctly extract and sum up total processing time f8700a5 seanpedrickcase committed on Sep 16
Enhanced logging of usage. Small buffer added to redaction rectangles, as they often seem to miss the tops of text. 34addbf seanpedrickcase committed on Sep 16
Can now select only specific pages in document to redact. Image-based redaction should work correctly now. bc4bdbd seanpedrickcase committed on Sep 3
Handles multiple runs with multiple files correctly now. Logging and feedback improvements. bbf818d seanpedrickcase committed on Aug 21
Decision process now saved as log files. Other log files and feedback added 8c33828 seanpedrickcase committed on Aug 20
Added logging, anonymising all Excel sheets, simple redaction tags, some Dockerfile optimisation 01c88c0 seanpedrickcase committed on Aug 15
Added possibility to do authentication with AWS Cognito on load. Other minor changes. bc22fc4 seanpedrickcase committed on Jul 15
Can now redact text or csv/xlsx files. Can redact multiple files. Embeds redactions as image-based file by default 7810536 seanpedrickcase committed on Jun 21
Better redaction output formatting. Custom output folders allowed. Upgraded Gradio version 12224f5 seanpedrickcase committed on Jun 6
Version 0.1. Adapted code for pyinstaller local executable conversion (Windows) 2a4b347 seanpedrickcase committed on May 22
Added TLDExtract cache files so that internet connection is not required dce6100 seanpedrickcase committed on May 20
Re-arranged image and text analysis to encourage text analysis (faster) 72a4f68 seanpedrickcase committed on May 16
Separated file preparation and file redaction functions. Hopefully STS endpoint access now works on AWS 0f18146 seanpedrickcase committed on May 15
Page conversion now uses page-by-page calls, hopefully avoiding FastAPI timeouts on AWS. gunicorn keep_alive parameter extended to 60 seconds in case that helps too. 43287c3 seanpedrickcase committed on May 13
Unpinned gradio and spacy in requirements, then reinstalled latest gradio afterwards in Dockerfile, all to try to avoid a typer conflict 619a281 seanpedrickcase committed on May 13
Removed spacy version specification (3.7.4), as it creates a conflict with the latest gradio version (4.31.0) 16dc1f9 seanpedrickcase committed on May 13
Changed boto3 package version in requirements to latest valid version (1.34.103) efd2dce seanpedrickcase committed on May 13
Updated gradio version to latest (4.31.0) in the hope of addressing AWS server timeout issues. Other tested package versions specified in requirements. 44647fa seanpedrickcase committed on May 13
Specify GRADIO_SERVER_NAME variable in Dockerfile as 0.0.0.0 85a7cbf seanpedrickcase committed on Apr 25
Modified Dockerfile to run with user 1000. Changed port to standard 7860 and removed server name specification. 71761cb seanpedrickcase committed on Apr 25
Added opencv installation to Dockerfile and reverted to slim-bookworm bffbd2b seanpedrickcase committed on Apr 25
Changed base python distribution to (hopefully) have access to tesseract-ocr package 5f91219 seanpedrickcase committed on Apr 25
Added -y to poppler-utils installation in Dockerfile. Added support for image files in image-based redaction. 37d982e seanpedrickcase committed on Apr 25