Open
Description
As a Court, so that I can quickly search and scan within a document, I need all uploaded PDF files to be OCR'd and/or scanned for text.
We can currently scrape text from text-ready PDFs using PDF.js.
We do not currently have the ability to OCR PDF files that only consist of a flattened/image layer.
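As a rough sketch of how the gap above might be detected, the helper below flags pages whose scraped text is too short to be a real text layer, so that only those pages would be sent to OCR. It assumes the per-page text has already been extracted (for example by the existing PDF.js scraping). The function name, the `pageTexts` input shape, and the 20-character threshold are all hypothetical and not taken from the codebase.

```typescript
// Hypothetical helper, not part of the existing codebase.
// `pageTexts` is assumed to hold the text already scraped from each page,
// in page order (e.g. the output of the current PDF.js extraction).

// Assumed threshold: pages with fewer visible characters than this are
// treated as image-only. It would need tuning against real filings.
const DEFAULT_MIN_CHARS_PER_PAGE = 20;

function pagesNeedingOcr(
  pageTexts: string[],
  minCharsPerPage: number = DEFAULT_MIN_CHARS_PER_PAGE,
): number[] {
  const pages: number[] = [];
  pageTexts.forEach((text, index) => {
    // Count only non-whitespace characters so blank padding is ignored.
    const visibleChars = text.replace(/\s+/g, '').length;
    if (visibleChars < minCharsPerPage) {
      pages.push(index + 1); // report 1-based page numbers
    }
  });
  return pages;
}
```

An empty result would mean the document is already text-ready and can skip OCR entirely; a non-empty result identifies the flattened pages, which also covers mixed documents where only some pages are scanned images.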
Pre-Conditions
Acceptance Criteria
- Non-order/opinion document text should not be incorporated into search indices
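One way this criterion could be enforced is a filter applied just before indexing, sketched below. The `ExtractedDocument` shape and the `kind` field are hypothetical stand-ins; the real document entity and the way orders and opinions are identified will differ.

```typescript
// Hypothetical types: the real document entity likely looks different.
type DocumentKind = 'order' | 'opinion' | 'other';

interface ExtractedDocument {
  id: string;
  kind: DocumentKind;
  text: string; // scraped or OCR'd text
}

// Only these kinds may contribute text to the search index.
const SEARCHABLE_KINDS: ReadonlySet<DocumentKind> = new Set<DocumentKind>([
  'order',
  'opinion',
]);

function selectTextForSearchIndex(
  documents: ExtractedDocument[],
): ExtractedDocument[] {
  // Text for every other kind is still extracted, but is dropped here
  // so it never reaches the index.
  return documents.filter(doc => SEARCHABLE_KINDS.has(doc.kind));
}
```

Keeping the filter separate from extraction would let text be stored for every uploaded PDF while still limiting what becomes searchable.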
Notes
- Do not interfere with uploading mechanism - can/should do post-processing?
- Batch? On demand?
- Where will we store the data?
- Some pre-work has been done on this; see Search: Practitioner-filed Brief Documents #8745 (comment).
Tasks
Test Cases
Definition of Done (Updated 2026-01-28)
Product Owner
- Acceptance criteria have been met and validated on the Court's test environment.
- Associated test cases defined in TestRail have been updated if necessary.
- Successful test run is performed in TestRail.
- User guides are updated if necessary.
UX
- All new functionality has been verified to work with keyboard navigation and screen reader software.
- UI should be touch optimized and responsive for external users.
Engineering
- Automated test scripts have been written, including visual tests for newly added PDFs.
- Successful test run is performed in TestRail.
- New screens have been added to cypress accessibility axe.
- Interactors should validate entities before calling persistence methods.
- Types have been added to all added and updated functions.
- Code refactored for clarity and to remove any known technical debt.
- Acceptance criteria for the story have been met.
- If there are special deployment instructions, they have been added to the CHANGES.md file and the PR description.
- Code that resides in the shared folder that only runs on the API or browser has been moved to either /web-client or /web-api.