Usage of PaddleOCR for DICOM images #15449

zixinnie-rti · 2025-05-27T16:58:33Z

Hello,

I have been using PaddleOCR v2.9.1 to recognize text in DICOM images. However, when I try to upgrade to PaddleOCR v3.0.0, the code gives me errors I do not understand.

My code converts DICOM images into pixel arrays (via the pydicom package) and then passes those pixel arrays to PaddleOCR. Older versions of PaddleOCR accept pixel arrays as input, but the newest version seems to have issues with them.

Has anyone been able to pass a pixel array into PaddleOCR?

Answered by liuhongen1234567

May 30, 2025

Hello, in PP-OCRv5 the np.array passed in must have shape [H, W, 3]. An array with shape [H, W] raises an error during the preprocessing normalization stage because shape[2] does not exist. You can use img = np.dstack([img]*3) to convert an image from [H, W] to [H, W, 3] before passing it to ocr.predict().


liuhongen1234567 · 2025-05-29T08:57:52Z

liuhongen1234567
May 29, 2025
Collaborator

Hello, may I ask what the specific issue is? I tested it on my end, and the new version does support passing np.array.

1 reply

zixinnie-rti May 29, 2025
Author

Here is a minimal example, along with the error message:

from pydicom import dcmread
from paddleocr import PaddleOCR

filepath = 'PATH/TO/INPUT/FOLDER/1_ORIGINAL.dcm'

# Read the DICOM file; force=True reads it even if the file meta header is missing
ds = dcmread(filepath, force=True)

# This should be a NumPy array of pixels
image = ds.pixel_array

# Set up the OCR tool
ocr = PaddleOCR(lang="en")

text_detections = ocr.predict(image)

Error message:

IndexError                                Traceback (most recent call last)
File 
     17 #set up the OCR tool
     18 ocr = PaddleOCR(lang="en")
---> 20 text_detections = ocr.predict(image)

File ~.venv\Lib\site-packages\paddleocr\_pipelines\ocr.py:191, in PaddleOCR.predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_textline_orientation, text_det_limit_side_len, text_det_limit_type, text_det_thresh, text_det_box_thresh, text_det_unclip_ratio, text_rec_score_thresh)
    177 def predict(
    178     self,
    179     input,
   (...)
    189     text_rec_score_thresh=None,
    190 ):
--> 191     return list(
    192         self.predict_iter(
    193             input,
    194             use_doc_orientation_classify=use_doc_orientation_classify,
    195             use_doc_unwarping=use_doc_unwarping,
    196             use_textline_orientation=use_textline_orientation,
    197             text_det_limit_side_len=text_det_limit_side_len,
    198             text_det_limit_type=text_det_limit_type,
    199             text_det_thresh=text_det_thresh,
    200             text_det_box_thresh=text_det_box_thresh,
    201             text_det_unclip_ratio=text_det_unclip_ratio,
    202             text_rec_score_thresh=text_rec_score_thresh,
    203         )
    204     )

File .venv\Lib\site-packages\paddlex\inference\pipelines\_parallel.py:129, in AutoParallelSimpleInferencePipeline.predict(self, input, *args, **kwargs)
    123     yield from self._executor.execute(
    124         input,
    125         *args,
    126         **kwargs,
    127     )
    128 else:
--> 129     yield from self._pipeline.predict(
    130         input,
    131         *args,
    132         **kwargs,
    133     )

File .venv\Lib\site-packages\paddlex\inference\pipelines\ocr\pipeline.py:336, in _OCRPipeline.predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_textline_orientation, text_det_limit_side_len, text_det_limit_type, text_det_max_side_limit, text_det_thresh, text_det_box_thresh, text_det_unclip_ratio, text_rec_score_thresh)
    333 image_arrays = self.img_reader(batch_data.instances)
    335 if model_settings["use_doc_preprocessor"]:
--> 336     doc_preprocessor_results = list(
    337         self.doc_preprocessor_pipeline(
    338             image_arrays,
    339             use_doc_orientation_classify=use_doc_orientation_classify,
    340             use_doc_unwarping=use_doc_unwarping,
    341         )
    342     )
    343 else:
    344     doc_preprocessor_results = [{"output_img": arr} for arr in image_arrays]

File .venv\Lib\site-packages\paddlex\inference\pipelines\_parallel.py:129, in AutoParallelSimpleInferencePipeline.predict(self, input, *args, **kwargs)
    123     yield from self._executor.execute(
    124         input,
    125         *args,
    126         **kwargs,
    127     )
    128 else:
--> 129     yield from self._pipeline.predict(
    130         input,
    131         *args,
    132         **kwargs,
    133     )

File .venv\Lib\site-packages\paddlex\inference\pipelines\doc_preprocessor\pipeline.py:173, in _DocPreprocessorPipeline.predict(self, input, use_doc_orientation_classify, use_doc_unwarping)
    170     rot_imgs = image_arrays
    172 if model_settings["use_doc_unwarping"]:
--> 173     output_imgs = [
    174         item["doctr_img"][:, :, ::-1]
    175         for item in self.doc_unwarping_model(rot_imgs)
    176     ]
    177 else:
    178     output_imgs = rot_imgs

File .venv\Lib\site-packages\paddlex\inference\pipelines\doc_preprocessor\pipeline.py:173, in <listcomp>(.0)
    170     rot_imgs = image_arrays
    172 if model_settings["use_doc_unwarping"]:
--> 173     output_imgs = [
    174         item["doctr_img"][:, :, ::-1]
    175         for item in self.doc_unwarping_model(rot_imgs)
    176     ]
    177 else:
    178     output_imgs = rot_imgs

File .venv\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py:211, in BasePredictor.__call__(self, input, batch_size, **kwargs)
    209     yield output[0]
    210 else:
--> 211     yield from self.apply(input, **kwargs)

File .venv\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py:267, in BasePredictor.apply(self, input, **kwargs)
    265     batches = self.batch_sampler(input)
    266 for batch_data in batches:
--> 267     prediction = self.process(batch_data, **kwargs)
    268     prediction = PredictionWrap(prediction, len(batch_data))
    269     for idx in range(len(batch_data)):

File .venv\Lib\site-packages\paddlex\inference\models\image_unwarping\predictor.py:86, in WarpPredictor.process(self, batch_data)
     76 """
     77 Process a batch of data through the preprocessing, inference, and postprocessing.
     78 
   (...)
     83     dict: A dictionary containing the input path, raw image, class IDs, scores, and label names for every instance of the batch. Keys include 'input_path', 'input_img', 'class_ids', 'scores', and 'label_names'.
     84 """
     85 batch_raw_imgs = self.preprocessors["Read"](imgs=batch_data.instances)
---> 86 batch_imgs = self.preprocessors["Normalize"](imgs=batch_raw_imgs)
     87 batch_imgs = self.preprocessors["ToCHW"](imgs=batch_imgs)
     88 x = self.preprocessors["ToBatch"](imgs=batch_imgs)

File .venv\Lib\site-packages\paddlex\inference\models\common\vision\processors.py:270, in Normalize.__call__(self, imgs)
    268 def __call__(self, imgs):
    269     """apply"""
--> 270     return [self.norm(img) for img in imgs]

File .venv\Lib\site-packages\paddlex\inference\models\common\vision\processors.py:270, in <listcomp>(.0)
    268 def __call__(self, imgs):
    269     """apply"""
--> 270     return [self.norm(img) for img in imgs]

File .venv\Lib\site-packages\paddlex\inference\models\common\vision\processors.py:260, in Normalize.norm(self, img)
    257 def norm(self, img):
    258     split_im = list(cv2.split(img))
--> 260     for c in range(img.shape[2]):
    261         split_im[c] = split_im[c].astype(np.float32)
    262         split_im[c] *= self.alpha[c]

IndexError: tuple index out of range

liuhongen1234567 · 2025-05-29T16:32:55Z

liuhongen1234567
May 29, 2025
Collaborator

Hello, it seems like there is an issue with the preprocessing of the document. Could you provide the shape of the array or the DCM file? Alternatively, you could try setting the parameter use_doc_unwarping=False in PaddleOCR.

3 replies

zixinnie-rti May 29, 2025
Author

image.shape
(825, 1036)

When I set use_doc_unwarping=False, I get this error:

ValueError                                Traceback (most recent call last)
File 
     17 #set up the OCR tool
     18 ocr = PaddleOCR(lang="en", use_doc_unwarping=False)
---> 20 text_detections = ocr.predict(image)

File .venv\Lib\site-packages\paddleocr\_pipelines\ocr.py:191, in PaddleOCR.predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_textline_orientation, text_det_limit_side_len, text_det_limit_type, text_det_thresh, text_det_box_thresh, text_det_unclip_ratio, text_rec_score_thresh)
    177 def predict(
    178     self,
    179     input,
   (...)
    189     text_rec_score_thresh=None,
    190 ):
--> 191     return list(
    192         self.predict_iter(
    193             input,
    194             use_doc_orientation_classify=use_doc_orientation_classify,
    195             use_doc_unwarping=use_doc_unwarping,
    196             use_textline_orientation=use_textline_orientation,
    197             text_det_limit_side_len=text_det_limit_side_len,
    198             text_det_limit_type=text_det_limit_type,
    199             text_det_thresh=text_det_thresh,
    200             text_det_box_thresh=text_det_box_thresh,
    201             text_det_unclip_ratio=text_det_unclip_ratio,
    202             text_rec_score_thresh=text_rec_score_thresh,
    203         )
    204     )

File .venv\Lib\site-packages\paddlex\inference\pipelines\_parallel.py:129, in AutoParallelSimpleInferencePipeline.predict(self, input, *args, **kwargs)
    123     yield from self._executor.execute(
    124         input,
    125         *args,
    126         **kwargs,
    127     )
    128 else:
--> 129     yield from self._pipeline.predict(
    130         input,
    131         *args,
    132         **kwargs,
    133     )

File .venv\Lib\site-packages\paddlex\inference\pipelines\ocr\pipeline.py:350, in _OCRPipeline.predict(self, input, use_doc_orientation_classify, use_doc_unwarping, use_textline_orientation, text_det_limit_side_len, text_det_limit_type, text_det_max_side_limit, text_det_thresh, text_det_box_thresh, text_det_unclip_ratio, text_rec_score_thresh)
    344     doc_preprocessor_results = [{"output_img": arr} for arr in image_arrays]
    346 doc_preprocessor_images = [
    347     item["output_img"] for item in doc_preprocessor_results
    348 ]
--> 350 det_results = list(
    351     self.text_det_model(doc_preprocessor_images, **text_det_params)
    352 )
    354 dt_polys_list = [item["dt_polys"] for item in det_results]
    356 dt_polys_list = [self._sort_boxes(item) for item in dt_polys_list]

File .venv\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py:211, in BasePredictor.__call__(self, input, batch_size, **kwargs)
    209     yield output[0]
    210 else:
--> 211     yield from self.apply(input, **kwargs)

File .venv\Lib\site-packages\paddlex\inference\models\base\predictor\base_predictor.py:267, in BasePredictor.apply(self, input, **kwargs)
    265     batches = self.batch_sampler(input)
    266 for batch_data in batches:
--> 267     prediction = self.process(batch_data, **kwargs)
    268     prediction = PredictionWrap(prediction, len(batch_data))
    269     for idx in range(len(batch_data)):

File .venv\Lib\site-packages\paddlex\inference\models\text_detection\predictor.py:94, in TextDetPredictor.process(self, batch_data, limit_side_len, limit_type, thresh, box_thresh, unclip_ratio, max_side_limit)
     82 def process(
     83     self,
     84     batch_data: List[Union[str, np.ndarray]],
   (...)
     90     max_side_limit: Union[int, None] = None,
     91 ):
     93     batch_raw_imgs = self.pre_tfs["Read"](imgs=batch_data.instances)
---> 94     batch_imgs, batch_shapes = self.pre_tfs["Resize"](
     95         imgs=batch_raw_imgs,
     96         limit_side_len=limit_side_len or self.limit_side_len,
     97         limit_type=limit_type or self.limit_type,
     98         max_side_limit=(
     99             max_side_limit if max_side_limit is not None else self.max_side_limit
    100         ),
    101     )
    102     batch_imgs = self.pre_tfs["Normalize"](imgs=batch_imgs)
    103     batch_imgs = self.pre_tfs["ToCHW"](imgs=batch_imgs)

File .venv\Lib\site-packages\paddlex\inference\models\text_detection\processors.py:71, in DetResizeForTest.__call__(self, imgs, limit_side_len, limit_type, max_side_limit)
     69 resize_imgs, img_shapes = [], []
     70 for ori_img in imgs:
---> 71     img, shape = self.resize(
     72         ori_img, limit_side_len, limit_type, max_side_limit
     73     )
     74     resize_imgs.append(img)
     75     img_shapes.append(shape)

File .venv\Lib\site-packages\paddlex\inference\models\text_detection\processors.py:85, in DetResizeForTest.resize(self, img, limit_side_len, limit_type, max_side_limit)
     78 def resize(
     79     self,
     80     img,
   (...)
     83     max_side_limit: Union[int, None] = None,
     84 ):
---> 85     src_h, src_w, _ = img.shape
     86     if sum([src_h, src_w]) < 64:
     87         img = self.image_padding(img)

ValueError: not enough values to unpack (expected 3, got 2)
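
Both tracebacks seem to come down to the same cause, and it can be reproduced without PaddleOCR. The sketch below uses a zero-filled stand-in array with the shape reported earlier in this reply (an assumption in place of the real DICOM pixels) and mirrors the two failing statements shown in `Normalize.norm` and `DetResizeForTest.resize`.

```python
import numpy as np

# Stand-in for ds.pixel_array: a 2-D grayscale array with the reported shape.
# The real pixel values do not matter here, only the number of dimensions.
image = np.zeros((825, 1036), dtype=np.uint8)

errors = []

# Mirrors `for c in range(img.shape[2])` from Normalize.norm.
try:
    image.shape[2]
except IndexError as exc:
    errors.append(f"IndexError: {exc}")

# Mirrors `src_h, src_w, _ = img.shape` from DetResizeForTest.resize.
try:
    src_h, src_w, _ = image.shape
except ValueError as exc:
    errors.append(f"ValueError: {exc}")

for message in errors:
    print(message)
```

A 2-D array has no third axis, so any preprocessing step that indexes or unpacks a channel dimension fails regardless of which stage runs first.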

liuhongen1234567 May 30, 2025
Collaborator

Hello, in PP-OCRv5 the np.array passed in must have shape [H, W, 3]. An array with shape [H, W] raises an error during the preprocessing normalization stage because shape[2] does not exist. You can use img = np.dstack([img]*3) to convert an image from [H, W] to [H, W, 3] before passing it to ocr.predict().
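
A minimal sketch of that conversion, using only NumPy. The helper name `to_three_channels` and the tiny test array are hypothetical, introduced here for illustration; only the `np.dstack([img]*3)` call comes from the answer itself.

```python
import numpy as np

def to_three_channels(img):
    """Return an [H, W, 3] array from a grayscale [H, W] or [H, W, 1] array.

    Hypothetical helper for illustration; arrays that already have three
    channels are returned unchanged.
    """
    img = np.asarray(img)
    if img.ndim == 2:
        # Same operation as suggested in the answer: copy the plane three times.
        return np.dstack([img] * 3)
    if img.ndim == 3 and img.shape[2] == 1:
        return np.repeat(img, 3, axis=2)
    if img.ndim == 3 and img.shape[2] == 3:
        return img
    raise ValueError(f"unsupported image shape: {img.shape}")

# Small stand-in for a grayscale pixel array.
gray = np.arange(6, dtype=np.uint8).reshape(2, 3)
rgb = to_three_channels(gray)
print(gray.shape, rgb.shape)
```

Every channel of the result holds the same grayscale values, so no image information is added or lost; the array only gains the third axis that the preprocessing code expects.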

Answer selected by zixinnie-rti

zixinnie-rti May 30, 2025
Author

Thank you so much! It appears that the combination of use_doc_unwarping=False and converting the image from a single channel to 3 channels worked.
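
Putting the pieces from this thread together, the working setup might look roughly like the sketch below. The function names `prepare_dicom_pixels` and `ocr_dicom` are hypothetical. The rescaling to uint8 is an extra assumption that the thread does not confirm is required; it is included because DICOM pixel data is often stored with more than 8 bits per pixel. The `dcmread`, `PaddleOCR(...)`, and `ocr.predict(...)` calls are the ones already used above.

```python
import numpy as np

def prepare_dicom_pixels(pixels):
    """Return a uint8 [H, W, 3] array from a DICOM pixel array.

    Assumption: min-max scaling to 0..255 is acceptable for OCR purposes.
    """
    pixels = np.asarray(pixels).astype(np.float32)
    lo, hi = float(pixels.min()), float(pixels.max())
    if hi > lo:
        pixels = (pixels - lo) / (hi - lo) * 255.0
    else:
        # Constant image: avoid dividing by zero.
        pixels = np.zeros_like(pixels)
    scaled = pixels.astype(np.uint8)
    if scaled.ndim == 2:
        # Single channel to three channels, as suggested in the answer.
        scaled = np.dstack([scaled] * 3)
    return scaled

def ocr_dicom(filepath):
    """Read a DICOM file and run OCR on its pixels (sketch, not tested here)."""
    # Imported lazily so prepare_dicom_pixels works without these packages.
    from pydicom import dcmread
    from paddleocr import PaddleOCR

    ds = dcmread(filepath, force=True)
    image = prepare_dicom_pixels(ds.pixel_array)
    ocr = PaddleOCR(lang="en", use_doc_unwarping=False)
    return ocr.predict(image)

# Quick check of the preprocessing step on a small 16-bit stand-in array.
sample = np.array([[0, 1000], [500, 2000]], dtype=np.uint16)
prepared = prepare_dicom_pixels(sample)
print(prepared.shape, prepared.dtype)
```

Keeping the array preparation separate from the OCR call makes it possible to check the shape and dtype of the input before any PaddleOCR model is loaded.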
