SPIKE: Perform OCR/Text Recognition on All Document Uploads #9744

As a Court, so that I can quickly search and scan within a document, I need all uploaded PDF files to be OCR'd and/or scanned for text.

We can currently extract text from PDFs that already contain a text layer using PDF.js.

We cannot currently OCR PDF files that consist only of a flattened image layer.
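
One way to tell the two kinds of PDF apart is to look at how much text the existing extraction step returns for each page. The sketch below is an assumption for discussion, not existing ef-cms code: `pagesNeedingOcr` and the `minChars` threshold are hypothetical, and the per-page strings are assumed to come from whatever text extraction is already in place.

```typescript
// Hypothetical helper: given the text already extracted for each page,
// return the zero-based indices of pages that look image-only.
// The default threshold of 20 visible characters is an assumed value to tune.
function pagesNeedingOcr(pageTexts: string[], minChars = 20): number[] {
  const flagged: number[] = [];
  pageTexts.forEach((text, index) => {
    // Ignore whitespace so a page of blank lines does not count as text.
    const visibleChars = text.replace(/\s+/g, '').length;
    if (visibleChars < minChars) flagged.push(index);
  });
  return flagged;
}

// Page 0 has a text layer; page 1 returned nothing and would be sent to OCR.
const flaggedPages = pagesNeedingOcr([
  'Order granting motion to dismiss the petition.',
  '',
]);
console.log(flaggedPages); // → [ 1 ]
```

A heuristic like this would also flag pages that are intentionally blank, so the spike may want to confirm how often that occurs before relying on it.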


## Pre-Conditions


## Acceptance Criteria
* Non-order/opinion document text should **not** be incorporated into search indices

## Notes
* Do not interfere with the upload mechanism. Can or should OCR run as a post-processing step?
* Should processing run in batches or on demand?
* Where will we store the extracted text?
* Some pre-work has been done on this; see https://github.com/ustaxcourt/ef-cms/issues/8745#issuecomment-3289639095
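
To make the questions above concrete, here is a minimal sketch of a post-processing step that runs separately from upload and records whether each result may be indexed. Every name in it is hypothetical (`UploadedDocument`, `RecognitionResult`, `processUploadedBatch`, the `kind` field, the in-memory `store`); none is taken from the codebase, and the OCR engine is passed in as a function because the spike has not chosen one.

```typescript
// All names below are hypothetical and exist only for illustration.
type DocumentKind = 'order' | 'opinion' | 'other';

interface UploadedDocument {
  docketEntryId: string;
  kind: DocumentKind;
}

interface RecognitionResult {
  docketEntryId: string;
  text: string;
  // Per the acceptance criteria, only order/opinion text may enter search indices.
  indexable: boolean;
}

// Stand-in for whichever OCR engine the spike selects.
type RecognizeText = (docketEntryId: string) => Promise<string>;

// Processes documents that have already been uploaded.
// The upload path never calls this, so uploads are not slowed down.
async function processUploadedBatch(
  batch: UploadedDocument[],
  recognizeText: RecognizeText,
  store: Record<string, RecognitionResult>,
): Promise<void> {
  for (const doc of batch) {
    // Skip documents already processed so a rerun is safe.
    if (doc.docketEntryId in store) continue;
    const text = await recognizeText(doc.docketEntryId);
    store[doc.docketEntryId] = {
      docketEntryId: doc.docketEntryId,
      text,
      indexable: doc.kind === 'order' || doc.kind === 'opinion',
    };
  }
}
```

The same function could serve a scheduled batch run or an on-demand request for a single document by passing a one-element batch. The plain-object `store` only marks where a real persistence layer would go; where the text actually lives is still the open question in the notes.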

## Tasks


## Test Cases


## Definition of Done (Updated 2026-01-28)

### Product Owner
 - [ ] Acceptance criteria have been met and validated on the Court's test environment.
 - [ ] Associated test cases defined in TestRail have been updated if necessary.
 - [ ] Successful test run is performed in TestRail.
 - [ ] User guides are updated if necessary.

### UX
 - [ ] All new functionality has been verified to work with keyboard navigation and screen reader software.
 - [ ] UI should be touch optimized and responsive for external users.

### Engineering
 - [ ] Automated test scripts have been written, including visual tests for newly added PDFs.
 - [ ] Successful test run is performed in TestRail.
 - [ ] New screens have been added to cypress accessibility axe.
 - [ ] Interactors should validate entities before calling persistence methods.
 - [ ] Types have been added to all added and updated functions.
 - [ ] Code refactored for clarity and to remove any known technical debt.
 - [ ] Acceptance criteria for the story have been met.
 - [ ] If there are special deployment instructions, they have been added to the `CHANGES.md` file and the PR description.
 - [ ] Code that resides in the shared folder that only runs on the API or browser has been moved to either /web-client or /web-api.

