(retriever) fix remote ocr and pe logic to match local behavior #1810
edknv merged 1 commit into NVIDIA:main from
Greptile Summary
This PR fixes two behavioral divergences between the remote (NIM) and local inference code paths. First, it wires
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/nim/nim.py | Adds optional merge_levels param to invoke_image_inference_batches with length validation and correct per-batch slicing into the JSON payload |
| nemo_retriever/src/nemo_retriever/ocr/ocr.py | Computes per-crop merge_levels list (word for tables, paragraph otherwise) and passes it to the remote OCR call, matching local model behavior |
| nemo_retriever/src/nemo_retriever/page_elements/page_elements.py | Applies _apply_final_score_filter in all three remote response format branches, closing the gap with the local pipeline's post-WBF score filtering |
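The `merge_levels` handling described in the table can be illustrated with a minimal sketch. This is not the code from `nim.py` or `ocr.py`: the function names, the `"table"` label string, the default batch size, and the exact payload shape are assumptions made for illustration. It only shows the idea of choosing one merge level per crop, validating the list length, and slicing it per batch alongside the images.

```python
from typing import Dict, List, Optional, Sequence


def merge_level_for_label(label: str) -> str:
    """Pick the OCR merge level for one crop.

    Assumption: table crops carry the label "table". Tables use word-level
    merging; every other crop uses paragraph-level merging.
    """
    return "word" if label == "table" else "paragraph"


def build_batch_payloads(
    image_b64_list: Sequence[str],
    merge_levels: Optional[Sequence[str]] = None,
    batch_size: int = 8,  # hypothetical default, not taken from the PR
) -> List[Dict[str, object]]:
    """Split images (and optional per-image merge levels) into request payloads."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")

    n = len(image_b64_list)
    # Each image needs exactly one merge level, otherwise slices would misalign.
    if merge_levels is not None and len(merge_levels) != n:
        raise ValueError(
            f"merge_levels has {len(merge_levels)} entries, expected {n}"
        )

    payloads: List[Dict[str, object]] = []
    for start in range(0, n, batch_size):
        end = min(start + batch_size, n)
        # Assumed payload shape: images under "input", levels under "merge_levels".
        payload: Dict[str, object] = {"input": list(image_b64_list[start:end])}
        if merge_levels is not None:
            # Same [start:end] slice as the images keeps the two lists aligned.
            payload["merge_levels"] = list(merge_levels[start:end])
        payloads.append(payload)
    return payloads
```

Using the same `[start:end]` slice for both lists is what keeps each crop paired with its own merge level, and the up-front length check turns a silent misalignment into an immediate error.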
Sequence Diagram
sequenceDiagram
participant Caller
participant ocr_page_elements
participant invoke_image_inference_batches
participant NIM_OCR_Endpoint
participant _remote_response_to_detections
participant _apply_final_score_filter
Caller->>ocr_page_elements: pages_df, invoke_url
ocr_page_elements->>ocr_page_elements: build crop_b64s + crop_merge_levels
note over ocr_page_elements: word for tables, paragraph for others
ocr_page_elements->>invoke_image_inference_batches: image_b64_list, merge_levels
invoke_image_inference_batches->>invoke_image_inference_batches: validate len(merge_levels)==n
loop per batch
invoke_image_inference_batches->>NIM_OCR_Endpoint: POST {input, merge_levels[start:end]}
NIM_OCR_Endpoint-->>invoke_image_inference_batches: OCR response
end
invoke_image_inference_batches-->>ocr_page_elements: response_items
Caller->>_remote_response_to_detections: response_json
_remote_response_to_detections->>_remote_response_to_detections: parse response format
_remote_response_to_detections->>_apply_final_score_filter: dets (post-WBF)
note over _apply_final_score_filter: per-class YOLOX_PAGE_V3_FINAL_SCORE filter
_apply_final_score_filter-->>_remote_response_to_detections: filtered dets
_remote_response_to_detections-->>Caller: final detections
Reviews (1): Last reviewed commit: "(retriever) fix remote ocr and pe logic ..."
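The per-class score filter shown at the end of the sequence diagram can be sketched as follows. This is an illustration only, not the body of `_apply_final_score_filter`: the detection dictionary keys, the function name, and the threshold values are hypothetical, and the real `YOLOX_PAGE_V3_FINAL_SCORE` values are not reproduced here.

```python
from typing import Dict, List, Mapping

# Hypothetical per-class thresholds, used only to make the example runnable.
EXAMPLE_FINAL_SCORE: Dict[str, float] = {
    "table": 0.5,
    "chart": 0.3,
}


def filter_by_final_score(
    dets: List[Dict[str, object]],
    thresholds: Mapping[str, float],
    default_threshold: float = 0.0,
) -> List[Dict[str, object]]:
    """Keep detections whose score meets the threshold for their class.

    Assumption: each detection is a dict with "label" and "score" keys.
    Classes without an entry in `thresholds` fall back to `default_threshold`.
    """
    kept: List[Dict[str, object]] = []
    for det in dets:
        label = str(det["label"])
        score = float(det["score"])  # type: ignore[arg-type]
        if score >= thresholds.get(label, default_threshold):
            kept.append(det)
    return kept


detections = [
    {"label": "table", "score": 0.62},
    {"label": "table", "score": 0.41},  # below the example table threshold
    {"label": "chart", "score": 0.35},
    {"label": "title", "score": 0.05},  # no entry, so the default applies
]
filtered = filter_by_final_score(detections, EXAMPLE_FINAL_SCORE)
```

Running the same filter after every remote response format is parsed is what lets the remote path drop the same low-confidence boxes that the local pipeline drops after WBF.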
Description
Fixes two bugs in the remote (NIM/build endpoint) code paths that caused their results to diverge from those of the local model paths.
Checklist