Commit History
Fixed linting issues
150a8d9
Fixed whole-page redactions being incorrectly positioned and missing IDs. Fixed duplicate pages output issue. Minor changes to output redaction box format and related code.
5086da0
Further changes to fix duplicate tests
78403ba
Update to secure path file to fix duplicate pages test
204c034
Updated test suite to deal with missing file issues
5aec971
Added possibility to specify allowed hosts. Fixed some tests to return more reliably. Fixed some issues related to file path checks not working correctly. Redaction should now return review files correctly at redaction and apply changes stages.
b1f183d
Added health check route for FastAPI. Removed unnecessary path references from uvicorn command, and mount_gradio_app function
1ee0970
Removed config s3 load from entrypoint - will instead be defined directly in task definition
825cf38
Load in environment variables prior to uvicorn run in entrypoint.sh
add45f8
Updated entrypoint to match root path defined in main app
64e67bd
Moved from gunicorn to uvicorn for AWS deployment
799caf1
Added FASTAPI_ROOT_PATH environment variable. Revised save path issue.
43624ed
Added line to copy CLI binaries (e.g. gunicorn) across between build and run stages in Dockerfile
e421fb3
Added gunicorn to requirements for when building Dockerfile based on FastAPI rather than Gradio directly. Fixed some minor file path issues. Set return review PDF as default.
b38d4b9
Allow for RUN_FASTAPI variable to be passed in to Dockerfile as part of the build spec
93fcae3
Fixed a couple more secure path join locations
151e26e
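The "secure path join" commits above do not show the helper itself, so the following is only a minimal sketch of the general technique: join user-supplied path parts onto a base directory and reject any result that escapes it. The function name `secure_path_join` and its signature are hypothetical and are not taken from the repository.

```python
from pathlib import Path


def secure_path_join(base_dir: str, *parts: str) -> Path:
    """Join parts onto base_dir, refusing results outside base_dir.

    Hypothetical helper for illustration; the project's own function
    may differ in name, signature, and behaviour.
    """
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks, so the
    # containment check below runs against the real target path.
    candidate = base.joinpath(*parts).resolve()
    try:
        # relative_to raises ValueError when candidate is not under base.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Path escapes base directory: {candidate}") from None
    return candidate
```

An absolute part such as `/etc/passwd` replaces the base during the join, so it fails the same containment check as a `..` traversal.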
Corrected references to blocks objects in test scripts
fdb34a8
Changed default of RUN_FASTAPI in Dockerfile to 0
8773642
Added possibility to mount Gradio app in FastAPI and restrict allowed origins (for security). Fixed some mismatched config variable references. Updated Dockerfile and related files to allow for FastAPI/Uvicorn deployment.
09ae4e0
Added capability of loading in redaction annotations from PDF documents directly into the app. Minor function documentation improvements, GUI changes, package updates.
b61459d
Enabled export of both review pdfs and redacted pdfs from same redaction run. Added config variables for user guide url and showing redaction settings. Moved config variables around a bit. Minor GUI improvements
44d987c
Added the possibility of saving initial redacted pdfs with redaction comments directly attached. Fix for missing Textract pages. Better Textract forms element extraction and save.
f333cf5
OCR outputs now return confidence values
a159312
Return tabular data redaction logs as csv rather than txt. Minor path creation security fix for duplicate page identification.
c688ac3
Removed some extraneous test steps. Improved Example loading and feedback, and redaction feedback. Minor security updates. Fixed Adobe xfdf file parsing.
1cb1897
Made example display conditional on example file existence. Turned example display off by default. Removed (mostly) unnecessary multi-os-test workflow
96ac47b
Added examples to tops of various tabs to demonstrate basic functions (optional). Minor changes to example csv ocr output
bbf844d
Corrected a polynomial regex issue. Reformatted code.
6a6aac2
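The commit above does not say which pattern was affected. As a generic illustration of a polynomial regex issue, the sketch below uses a common case that static analysers flag: stripping trailing whitespace with `\s+$`, which can take quadratic time on long runs of whitespace that are not at the end of the string. Both function names are hypothetical.

```python
import re

# Pattern that can backtrack quadratically: for every start position in a
# long whitespace run, the engine scans to the end of the run before
# finding that "$" does not match there.
_TRAILING_WS = re.compile(r"\s+$")


def strip_trailing_regex(text: str) -> str:
    """Strip trailing whitespace using the backtracking-prone pattern."""
    return _TRAILING_WS.sub("", text)


def strip_trailing_linear(text: str) -> str:
    """Same result in linear time, with no regex involved."""
    return text.rstrip()
```

Where a regex is not needed at all, replacing it with a string method, as here, is usually the simplest fix.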
Excluded Windows from OS tests once again as tesseract cannot be installed silently
b730fdd
Windows test path update
dd08a8e
Updated Windows Tesseract install location for test
96b0e0e
Updated Windows test implementation
cb39e46
Further fixes on uncontrolled path issue
5345e1f
Fixed duplicate page argument mismatch. Re-added Windows tests. Added refresh token options to cdk. Package updates
ad8fef5
Removed duplicate test suite
fcd55d0
Minor fix to test imports
963130d
Fixes to test suites. Minor default package changes (paddleocr not required)
f5b280d
General code changes and reformatting to address code vulnerabilities highlighted by CodeQL scan, and black/ruff applied to code. Fixes/optimisation of GitHub Actions
f957846
Fixes for deprecated GitHub workflow functions. Applied linter and formatter to code throughout. Added tests for GUI load.
bafcf39
Fix to config file reference
3d18b9d
Updated sync_to_hf github action yml with corrected Hugging Face space reference
6635049
Removed temporary test file
09ce61a
Added a test suite based on the functions in cli_redact.py
084af54
Added further file limits to deduplication and file load functions
826ed50
Correction on AWS Textract page by page calls with custom analysis types
0e9dd2d
Added form, table, and layout extraction options to AWS Textract calls. Added options to config to bound document length, maximum table rows, etc.
d3e6a24
Added example data files. Greatly revised CLI redaction for redaction, deduplication, and AWS Textract batch calls. Various minor fixes and package updates.
d60759d
Merge pull request #59 from seanpedrick-case/tabular_duplicates
64ab318
Sean Pedrick-Case
committed on