Added logging, anonymising all Excel sheets, simple redaction tags, some Dockerfile optimisation 01c88c0 seanpedrickcase committed on Aug 15, 2024
Can now redact text or csv/xlsx files. Can redact multiple files. Embeds redactions as an image-based file by default 7810536 seanpedrickcase committed on Jun 21, 2024
Better redaction output formatting. Custom output folders allowed. Upgraded Gradio version 12224f5 seanpedrickcase committed on Jun 6, 2024
Added TLDExtract cache files so that internet connection is not required dce6100 seanpedrickcase committed on May 20, 2024
Page conversion now uses page-by-page calls, hopefully avoiding FastAPI timeouts on AWS. gunicorn keep_alive parameter extended to 60 seconds in case that helps too. 43287c3 seanpedrickcase committed on May 13, 2024
Removed version pins for gradio and spacy in requirements, then reinstalled the latest gradio afterwards in the Dockerfile, all to try to avoid a typer conflict 619a281 seanpedrickcase committed on May 13, 2024
Specify GRADIO_SERVER_NAME variable in Dockerfile as 0.0.0.0 85a7cbf seanpedrickcase committed on Apr 25, 2024
Modified Dockerfile to run with user 1000. Changed port to standard 7860 and removed server name specification. 71761cb seanpedrickcase committed on Apr 25, 2024
Added opencv installation to Dockerfile and reverted to slim-bookworm bffbd2b seanpedrickcase committed on Apr 25, 2024
Changed base python distribution to (hopefully) have access to tesseract-ocr package 5f91219 seanpedrickcase committed on Apr 25, 2024
Added -y to tesseract-ocr installation in Dockerfile b723aad seanpedrickcase committed on Apr 25, 2024
Added -y to poppler-utils installation in Dockerfile. Added support for image files in image-based redaction. 37d982e seanpedrickcase committed on Apr 25, 2024