Commit History

Input image creation during redaction should now respect input folders. Minor JSON output path change.
409bdc5

seanpedrickcase committed on

Fixed whole page redactions being incorrectly positioned and missing IDs. Fixed duplicate pages output issue. Minor changes to output redaction box format and related code.
5086da0

seanpedrickcase committed on

Further changes to fix duplicate tests
78403ba

seanpedrickcase committed on

Update to secure path file to fix duplicate pages test
204c034

seanpedrickcase committed on

Updated test suite to deal with missing file issues
5aec971

seanpedrickcase committed on

Added possibility to specify allowed hosts. Fixed some tests to return more reliably. Fixed some issues related to file path checks not working correctly. Redaction should now return review files correctly at redaction and apply changes stages.
b1f183d

seanpedrickcase committed on

Added health check route for FastAPI. Removed unnecessary path references from uvicorn command, and mount_gradio_app function
1ee0970

seanpedrickcase committed on

Removed config s3 load from entrypoint - will instead be defined directly in task definition
825cf38

seanpedrickcase committed on

Load in environment variables prior to uvicorn run in entrypoint.sh
add45f8

seanpedrickcase committed on

Updated entrypoint to match root path defined in main app
64e67bd

seanpedrickcase committed on

Moved from gunicorn to uvicorn for AWS deployment
799caf1

seanpedrickcase committed on

Added FASTAPI_ROOT_PATH environment variable. Revised save path issue.
43624ed

seanpedrickcase committed on

Added line to copy CLI binaries (e.g. gunicorn) from the build stage to the run stage in Dockerfile
e421fb3

seanpedrickcase committed on

Added gunicorn to requirements for when building Dockerfile based on FastAPI rather than Gradio directly. Fixed some minor file path issues. Set return review PDF as default.
b38d4b9

seanpedrickcase committed on

Allow for RUN_FASTAPI variable to be passed in to Dockerfile as part of the build spec
93fcae3

seanpedrickcase committed on

Fixed a couple more secure path join locations
151e26e

seanpedrickcase committed on
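
Several entries in this history mention "secure path join" fixes without showing the helper itself. As a rough illustration only, a guarded join in Python might look like the sketch below. The function name `secure_path_join` and its behaviour are assumptions made for this example, not the repository's actual code.

```python
from pathlib import Path


def secure_path_join(base_dir: str, *parts: str) -> Path:
    """Join parts onto base_dir, refusing results that escape base_dir.

    Hypothetical helper for illustration; not taken from the repository.
    """
    base = Path(base_dir).resolve()
    # An absolute part replaces the base entirely and ".." segments climb
    # out of it, so resolve the joined path before checking it.
    candidate = base.joinpath(*parts).resolve()
    try:
        # relative_to raises ValueError when candidate is not inside base
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Path escapes base directory: {candidate}") from None
    return candidate
```

Under these assumptions, `secure_path_join("output", "review", "file.json")` returns a path inside `output`, while `secure_path_join("output", "..", "secrets.txt")` raises `ValueError`.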

Corrected references to blocks objects in test scripts
fdb34a8

seanpedrickcase committed on

Changed default of RUN_FASTAPI in Dockerfile to 0
8773642

seanpedrickcase committed on

Added possibility to mount Gradio app in FastAPI and restrict allowed origins (for security). Fixed some mismatched config variable references. Updated Dockerfile and related files to allow for FastAPI/Uvicorn deployment.
09ae4e0

seanpedrickcase committed on

Added capability of loading in redaction annotations from PDF documents directly into the app. Minor function documentation improvements, GUI changes, package updates.
b61459d

seanpedrickcase committed on

Enabled export of both review pdfs and redacted pdfs from same redaction run. Added config variables for user guide url and showing redaction settings. Moved config variables around a bit. Minor GUI improvements
44d987c

seanpedrickcase committed on

Added the possibility of saving initial redacted pdfs with redaction comments directly attached. Fix for missing Textract pages. Better Textract forms element extraction and save.
f333cf5

seanpedrickcase committed on

OCR outputs now return confidence values
a159312

seanpedrickcase committed on

Return tabular data redaction logs as csv rather than txt. Minor path creation security fix for duplicate page identification.
c688ac3

seanpedrickcase committed on

Removed some extraneous test steps. Improved Example loading and feedback, and redaction feedback. Minor security updates. Fixed Adobe xfdf file parsing.
1cb1897

seanpedrickcase committed on

Made example display conditional on example file existence. Turned example display off by default. Removed (mostly) unnecessary multi-os-test workflow
96ac47b

seanpedrickcase committed on

Added examples to tops of various tabs to demonstrate basic functions (optional). Minor changes to example csv ocr output
bbf844d

seanpedrickcase committed on

Corrected a polynomial regex issue. Reformatted code.
6a6aac2

seanpedrickcase committed on

Excluded Windows from OS tests once again as tesseract cannot be installed silently
b730fdd

seanpedrickcase committed on

Updated Windows Tesseract install location for test
96b0e0e

seanpedrickcase committed on

Further fixes on uncontrolled path issue
5345e1f

seanpedrickcase committed on

Fixed duplicate page argument mismatch. Re-added Windows tests. Added refresh token options to cdk. Package updates.
ad8fef5

seanpedrickcase committed on

Fixes to test suites. Minor default package changes (paddleocr not required)
f5b280d

seanpedrickcase committed on

General code changes and reformatting to address code vulnerabilities highlighted by CodeQL scan, and black/ruff reapplied to code. Fixes/optimisation of GitHub Actions.
f957846

seanpedrickcase committed on

Fixes for deprecated GitHub workflow functions. Applied linter and formatter to code throughout. Added tests for GUI load.
bafcf39

seanpedrickcase committed on

Updated sync_to_hf github action yml with corrected Hugging Face space reference
6635049

seanpedrickcase committed on

Added a test suite based on the functions in cli_redact.py
084af54

seanpedrickcase committed on

Added further file limits to deduplication and file load functions
826ed50

seanpedrickcase committed on

Correction on AWS Textract page by page calls with custom analysis types
0e9dd2d

seanpedrickcase committed on

Added form, table, and layout extraction options to AWS Textract calls. Added options to config to bound document length, maximum table rows, etc.
d3e6a24

seanpedrickcase committed on

Added example data files. Greatly revised CLI redaction for redaction, deduplication, and AWS Textract batch calls. Various minor fixes and package updates.
d60759d

seanpedrickcase committed on

Merge pull request #59 from seanpedrick-case/tabular_duplicates
64ab318

Sean Pedrick-Case committed on

Fix to tabular redaction, added tabular deduplication. Updated CLI call capability for both.
aa5c211

seanpedrickcase committed on