(retriever) fix remote ocr and pe logic to match local behavior by edknv · Pull Request #1810 · NVIDIA/NeMo-Retriever

edknv · 2026-04-07T16:54:00Z

Description

Fixes 2 bugs in the remote (NIM/build endpoint) code paths that caused the results to diverge from the local model paths.

OCR merge levels: The OCR NIM endpoint defaults to paragraph-level text merging when merge_levels is not specified in the request. The local OCR path correctly uses word-level merging for tables (producing proper pseudo-markdown with individual cells) and paragraph-level for everything else. The remote path was not passing merge_levels at all.
Page-elements score filtering: The local inference path applies a per-class final score filter to remove low-confidence detections. The remote path was missing this step.
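
The per-crop merge-level rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names and the assumption that table crops carry the label "table" are hypothetical.

```python
from typing import List, Sequence

# Assumption: table crops are identified by the label "table".
# The real label names in the pipeline may differ.
TABLE_LABELS = {"table"}


def merge_level_for_label(label: str) -> str:
    """Return "word" for table crops and "paragraph" for everything else."""
    return "word" if label.lower() in TABLE_LABELS else "paragraph"


def build_crop_merge_levels(labels: Sequence[str]) -> List[str]:
    """Build one merge level per crop, in the same order as the crops."""
    return [merge_level_for_label(label) for label in labels]


if __name__ == "__main__":
    # Hypothetical crop labels for a single page.
    print(build_crop_merge_levels(["table", "chart", "infographic"]))
```

Keeping the list aligned with the crop order matters because the remote call sends one merge level per image.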

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.
If adjusting docker-compose.yaml environment variables, have you ensured those are mirrored in the Helm values.yaml file?

greptile-apps · 2026-04-07T16:56:55Z

Greptile Summary

This PR fixes two behavioral divergences between the remote (NIM) and local inference code paths. First, it wires merge_levels through invoke_image_inference_batches and populates it per-crop in the OCR path ("word" for tables, "paragraph" for all other elements), so the NIM endpoint no longer defaults to paragraph-level merging for table crops. Second, it applies _apply_final_score_filter in all three remote response format branches of _remote_response_to_detections, bringing remote page-element scoring in line with the local pipeline, which already applied this filter after WBF post-processing. Both fixes are minimal, correctly targeted, and safely handle the graceful-degradation cases (empty YOLOX_PAGE_V3_FINAL_SCORE, absent merge_levels).
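
For readers unfamiliar with per-class score filtering, here is a minimal sketch of the idea. It is not the repository's _apply_final_score_filter: the function name, the detection dict shape ("label", "score"), and the threshold values are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Mapping


def apply_final_score_filter(
    dets: List[Dict[str, Any]],
    thresholds: Mapping[str, float],
) -> List[Dict[str, Any]]:
    """Drop detections whose score is below the threshold for their class."""
    if not thresholds:
        # No thresholds configured: return the detections unchanged.
        return dets
    kept = []
    for det in dets:
        # Assumption: classes without a configured threshold are kept.
        min_score = thresholds.get(det["label"], 0.0)
        if det["score"] >= min_score:
            kept.append(det)
    return kept


if __name__ == "__main__":
    # Hypothetical thresholds and detections, not values from the repository.
    thresholds = {"table": 0.5, "chart": 0.3}
    dets = [
        {"label": "table", "score": 0.42},
        {"label": "chart", "score": 0.35},
        {"label": "title", "score": 0.10},
    ]
    print(apply_final_score_filter(dets, thresholds))
```

The early return mirrors the graceful-degradation case mentioned in the summary: with an empty threshold mapping, nothing is filtered.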

Confidence Score: 5/5

Safe to merge; changes are narrowly scoped bug fixes that align remote paths with already-validated local behavior.

No P0 or P1 issues found. The merge_levels length validation is correct and defensive, the per-batch slicing is correct for Sequence types, and _apply_final_score_filter short-circuits safely when YOLOX_PAGE_V3_FINAL_SCORE is empty. All three remote format branches are now consistently patched.
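
The length validation and per-batch slicing mentioned here might look roughly like the sketch below. It is an illustration under stated assumptions, not the body of invoke_image_inference_batches: the function name, the batch size, and the shape of each "input" item are hypothetical, and only the "input" and "merge_levels" keys come from the sequence diagram in this review.

```python
from typing import Any, Dict, Iterator, Optional, Sequence


def iter_batch_payloads(
    image_b64_list: Sequence[str],
    merge_levels: Optional[Sequence[str]] = None,
    batch_size: int = 8,
) -> Iterator[Dict[str, Any]]:
    """Yield one JSON-ready payload per batch of images."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    n = len(image_b64_list)
    # Validate up front so a mismatch fails before any request is sent.
    if merge_levels is not None and len(merge_levels) != n:
        raise ValueError(
            f"merge_levels has {len(merge_levels)} entries, expected {n}"
        )
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)
        payload: Dict[str, Any] = {
            # Assumption: each input item is a data-URL image entry.
            "input": [
                {"type": "image_url", "url": f"data:image/png;base64,{b64}"}
                for b64 in image_b64_list[start:end]
            ]
        }
        if merge_levels is not None:
            # Slice with the same bounds so levels stay aligned with images.
            payload["merge_levels"] = list(merge_levels[start:end])
        yield payload


if __name__ == "__main__":
    # Placeholder strings stand in for real base64-encoded crops.
    levels = ["word", "paragraph", "word"]
    for payload in iter_batch_payloads(["a", "b", "c"], levels, batch_size=2):
        print(len(payload["input"]), payload["merge_levels"])
```

When merge_levels is omitted, the key is left out of the payload entirely, so the endpoint falls back to its own default.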

No files require special attention.

Important Files Changed

Filename	Overview
nemo_retriever/src/nemo_retriever/nim/nim.py	Adds optional merge_levels param to invoke_image_inference_batches with length validation and correct per-batch slicing into the JSON payload
nemo_retriever/src/nemo_retriever/ocr/ocr.py	Computes per-crop merge_levels list (word for tables, paragraph otherwise) and passes it to the remote OCR call, matching local model behavior
nemo_retriever/src/nemo_retriever/page_elements/page_elements.py	Applies _apply_final_score_filter in all three remote response format branches, closing the gap with the local pipeline's post-WBF score filtering

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ocr_page_elements
    participant invoke_image_inference_batches
    participant NIM_OCR_Endpoint
    participant _remote_response_to_detections
    participant _apply_final_score_filter

    Caller->>ocr_page_elements: pages_df, invoke_url
    ocr_page_elements->>ocr_page_elements: build crop_b64s + crop_merge_levels
    note over ocr_page_elements: word for tables, paragraph for others
    ocr_page_elements->>invoke_image_inference_batches: image_b64_list, merge_levels
    invoke_image_inference_batches->>invoke_image_inference_batches: validate len(merge_levels)==n
    loop per batch
        invoke_image_inference_batches->>NIM_OCR_Endpoint: POST {input, merge_levels[start:end]}
        NIM_OCR_Endpoint-->>invoke_image_inference_batches: OCR response
    end
    invoke_image_inference_batches-->>ocr_page_elements: response_items

    Caller->>_remote_response_to_detections: response_json
    _remote_response_to_detections->>_remote_response_to_detections: parse response format
    _remote_response_to_detections->>_apply_final_score_filter: dets (post-WBF)
    note over _apply_final_score_filter: per-class YOLOX_PAGE_V3_FINAL_SCORE filter
    _apply_final_score_filter-->>_remote_response_to_detections: filtered dets
    _remote_response_to_detections-->>Caller: final detections

Reviews (1): Last reviewed commit: "(retriever) fix remote ocr and pe logic ..."

(retriever) fix remote ocr and pe logic to match local behavior

00958eb

edknv requested review from a team as code owners April 7, 2026 16:54

edknv requested a review from ChrisJar April 7, 2026 16:54

jioffe502 approved these changes Apr 7, 2026

edknv merged commit a7cd139 into NVIDIA:main Apr 7, 2026
7 of 8 checks passed

edknv merged 1 commit into NVIDIA:main from edknv:edwardk/retriever-nim-merge-level

2 participants