Commit History

Multithreaded file preparation. Can call Textract without signature detection
9504619

seanpedrickcase committed on

Can now specify the root path that the app will run on with an environment variable
b8e245f

seanpedrickcase committed on

Can now define queue size, max file size, and server port in environment variables
dc17f6e

seanpedrickcase committed on

Updated Dockerfile and entrypoint file to hopefully deal correctly with APP_MODE environment variable
7c7fd7c

seanpedrickcase committed on

Removed default custom header values so as not to cause errors
7f5a542

seanpedrickcase committed on

Moved chmod command to before user switch in Dockerfile
05c20d6

seanpedrickcase committed on

Ensure entrypoint.sh is copied
3dc1171

seanpedrickcase committed on

Modified Dockerfile hopefully to not need Lambda overrides. Looking into custom headers from CloudFront to try to get them to work
bf7bb79

seanpedrickcase committed on

Allowed for overwriting of default output folder in choose_and_run_redactor function.
68a91f4

seanpedrickcase committed on

Updated output file creation variables for Lambda direct redaction runs
e85b74e

seanpedrickcase committed on

Removed need to write result.stdout in lambda entrypoint
5d649ba

seanpedrickcase committed on

Added a little more debugging code to lambda_entrypoint
653bd2d

seanpedrickcase committed on

Created custom csvlogger to try to overcome AWS Lambda's incompatibility with multithread locks
34bd97b

seanpedrickcase committed on

Changed app_mode arg position in Dockerfile, changed default to gradio
d0b63c6

seanpedrickcase committed on

Moved entrypoint.sh creation to before user switch to avoid permission errors
7e8c1c9

seanpedrickcase committed on

Updated Dockerfile and requirements to include relevant Lambda packages
3f9e976

seanpedrickcase committed on

Moved gradio run code to outside of lambda_handler function in lambda_entrypoint.py
1cfa6e8

seanpedrickcase committed on

Switched start py file through Dockerfile to lambda_entrypoint. Added gradio links from this .py
6622361

seanpedrickcase committed on

Some more debugging. Added aws-lambda-adapter just in case that's useful in AWS Lambda
a3ba5e2

seanpedrickcase committed on

Added some debugging statements for entrypoint_router and lambda_entrypoint.py
18fb7ec

seanpedrickcase committed on

Removed test event from entrypoint_router.py
7f2dc0f

seanpedrickcase committed on

Added lambda_entrypoint.py to main folder
9337aae

seanpedrickcase committed on

Correctly called authenticate user function in entrypoint_router.py
35e6d45

seanpedrickcase committed on

Corrected references to max_queue_size and max_file_size
2bb3ff5

seanpedrickcase committed on

Added option for running redact function through CLI (i.e. not going through Gradio UI or API). Test functions for running this through AWS Lambda.
e5dfae7

seanpedrickcase committed on

Only shows AWS options when AWS functions enabled. Can now upload previous review files to continue review later. Some review debugging.
e2aae24

seanpedrickcase committed on

Submitting modified redactions will no longer overwrite default labels
e69ae00

seanpedrickcase committed on

Should now retain modified redactions on first use of zoom
face41c

seanpedrickcase committed on

Comprehend now uses custom spaCy recognisers on top of defaults. Added zoom functionality to annotator. Fixed some PDF mediabox issues and redacted image output issues.
ec98119

seanpedrickcase committed on

AWS Comprehend query numbers in logs should now add up correctly
c71d0c1

seanpedrickcase committed on

Returned file redaction timeout (before resending request) to 105 seconds default
f5b6c1b

seanpedrickcase committed on

Logs should now be updated only once per file run
2e71433

seanpedrickcase committed on

Improved time taken reporting and readme
04d80a1

seanpedrickcase committed on

Consolidated AWS Comprehend redaction calls to reduce total number
542c252

seanpedrickcase committed on

When on AWS, now loads in a default allow_list to exclude common words from redaction. Improved checks on AWS Comprehend calls.
390bef2

seanpedrickcase committed on

Changed default options for AWS.
056204b

seanpedrickcase committed on

Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output
f0f9378

seanpedrickcase committed on

Updated requirements for latest gradio-image-annotation version
aaf0acb

seanpedrickcase committed on

Allowed for time limits on redact to avoid timeouts. Improved review interface. Now accepts only one file at a time. Upgraded Gradio version
eea5c07

seanpedrickcase committed on

Added user guide and modified intro text
21d060c

seanpedrickcase committed on

App will now try to save modified redactions from user to JSON file.
4805b1c

seanpedrickcase committed on

Upgraded packages. Fixed some issues with review process. Better progress reporting for user.
5b4b5fb

seanpedrickcase committed on

Allowed for PIL to load truncated images to avoid some load errors
a680619

seanpedrickcase committed on

Added 'Review redactions' tab to the app. You can now visually inspect suggested redactions and modify/add with a point and click interface.
ebf9010

seanpedrickcase committed on

Adjusted outputs correctly for situations where the PDF mediabox size is different from the visible page size
15026f7

seanpedrickcase committed on

Updated requirements to include pymupdf
d8c98c8

seanpedrickcase committed on

Redaction tool can now export PDFs with selectable text retained - redacted text is deleted and covered with a black box. Licence change for pymupdf use.
339a165

seanpedrickcase committed on

General improvement in quick image matching and merging
84c83c0

seanpedrickcase committed on