python_backend pytorch example as_numpy()  error

**Description**

I’m using triton python_backend to run the  [pytorch example](https://github.com/triton-inference-server/python_backend/tree/35a1c1fad5104c9c4149dd7fee69585d99bb6009/examples/pytorch) in the python_backend repo. I packaged the pytroch dependencies into a conda environment and can
load the model successfully. However when running the [client inference script](https://github.com/triton-inference-server/python_backend/blob/35a1c1fad5104c9c4149dd7fee69585d99bb6009/examples/pytorch/client.py) provided in the repo, I encounter the following error when trying to get the output from the httpclient response. It seems that the response is empty.

---------------------------------------------------------------------------
```
ValueError                                Traceback (most recent call last)
Cell In[20], line 33
     30 response = client.infer(model_name, inputs, request_id=str(1), outputs=outputs)
     32 result = response.get_response()
---> 33 output0_data = response.as_numpy("OUTPUT0")
     34 output1_data = response.as_numpy("OUTPUT1")
     36 print(
     37     "INPUT0 ({}) + INPUT1 ({}) = OUTPUT0 ({})".format(
     38         input0_data, input1_data, output0_data
     39     )
     40 )

File ~/SageMaker/custom-miniconda/miniconda/envs/custom_python310/lib/python3.10/site-packages/tritonclient/http/_infer_result.py:208, in InferResult.as_numpy(self, name)
    204             if not has_binary_data:
    205                 np_array = np.array(
    206                     output["data"], dtype=triton_to_np_dtype(datatype)
    207                 )
--> 208             np_array = np_array.reshape(output["shape"])
    209             return np_array
    210 return None

ValueError: cannot reshape array of size 0 into shape (4,)
```

I tried running the [add_sub ](https://github.com/triton-inference-server/python_backend/tree/35a1c1fad5104c9c4149dd7fee69585d99bb6009/examples/add_sub)example and the response.as_numpy("OUTPUT0") worked fine with the expected output.

**Triton Information**
What version of Triton are you using?

server_version   2.41.0   

Are you using the Triton container or did you build it yourself?

I’m using a sagemaker docker image for triton server: 763104351884.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tritonserver:23.12-py3

**To Reproduce**

The model.py

```
import json

# triton_python_backend_utils is available in every Triton Python model. You
# need to use this module to create inference requests and responses. It also
# contains some utility functions for extracting information from model_config
# and converting Triton input/output types to numpy types.
import triton_python_backend_utils as pb_utils
from torch import nn


class AddSubNet(nn.Module):
    """
    Simple AddSub network in PyTorch. This network outputs the sum and
    subtraction of the inputs.
    """

    def __init__(self):
        super(AddSubNet, self).__init__()

    def forward(self, input0, input1):
        return (input0 + input1), (input0 - input1)


class TritonPythonModel:
    """Your Python model must use the same class name. Every Python model
    that is created must have "TritonPythonModel" as the class name.
    """

    def initialize(self, args):
        """`initialize` is called only once when the model is being loaded.
        Implementing `initialize` function is optional. This function allows
        the model to initialize any state associated with this model.

        Parameters
        ----------
        args : dict
          Both keys and values are strings. The dictionary keys and values are:
          * model_config: A JSON string containing the model configuration
          * model_instance_kind: A string containing model instance kind
          * model_instance_device_id: A string containing model instance device ID
          * model_repository: Model repository path
          * model_version: Model version
          * model_name: Model name
        """

        # You must parse model_config. JSON string is not parsed here
        self.model_config = model_config = json.loads(args["model_config"])

        # Get OUTPUT0 configuration
        output0_config = pb_utils.get_output_config_by_name(model_config, "OUTPUT0")

        # Get OUTPUT1 configuration
        output1_config = pb_utils.get_output_config_by_name(model_config, "OUTPUT1")

        # Convert Triton types to numpy types
        self.output0_dtype = pb_utils.triton_string_to_numpy(
            output0_config["data_type"]
        )
        self.output1_dtype = pb_utils.triton_string_to_numpy(
            output1_config["data_type"]
        )

        # Instantiate the PyTorch model
        self.add_sub_model = AddSubNet()

    def execute(self, requests):
        """`execute` must be implemented in every Python model. `execute`
        function receives a list of pb_utils.InferenceRequest as the only
        argument. This function is called when an inference is requested
        for this model. Depending on the batching configuration (e.g. Dynamic
        Batching) used, `requests` may contain multiple requests. Every
        Python model, must create one pb_utils.InferenceResponse for every
        pb_utils.InferenceRequest in `requests`. If there is an error, you can
        set the error argument when creating a pb_utils.InferenceResponse.

        Parameters
        ----------
        requests : list
          A list of pb_utils.InferenceRequest

        Returns
        -------
        list
          A list of pb_utils.InferenceResponse. The length of this list must
          be the same as `requests`
        """

        output0_dtype = self.output0_dtype
        output1_dtype = self.output1_dtype

        responses = []

        # Every Python backend must iterate over everyone of the requests
        # and create a pb_utils.InferenceResponse for each of them.
        for request in requests:
            # Get INPUT0
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # Get INPUT1
            in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")

            out_0, out_1 = self.add_sub_model(in_0.as_numpy(), in_1.as_numpy())

            # Create output tensors. You need pb_utils.Tensor
            # objects to create pb_utils.InferenceResponse.
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", out_0.astype(output0_dtype))
            out_tensor_1 = pb_utils.Tensor("OUTPUT1", out_1.astype(output1_dtype))

            # Create InferenceResponse. You can set an error here in case
            # there was a problem with handling this inference request.
            # Below is an example of how you can set errors in inference
            # response:
            #
            # pb_utils.InferenceResponse(
            #    output_tensors=..., TritonError("An error occurred"))
            inference_response = pb_utils.InferenceResponse(
                output_tensors=[out_tensor_0, out_tensor_1]
            )
            responses.append(inference_response)

        # You should return a list of pb_utils.InferenceResponse. Length
        # of this list must match the length of `requests` list.
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is optional. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print("Cleaning up...")
```

config.pbtxt:

```
name: "add_sub"
backend: "python"

input [
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
input [
  {
    name: "INPUT1"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
output [
  {
    name: "OUTPUT1"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]

instance_group [{ kind: KIND_CPU }]
```

The client.py:

```
import sys

import numpy as np
import tritonclient.http as httpclient
from tritonclient.utils import *

model_name = "add_sub"
shape = [4]

with httpclient.InferenceServerClient("localhost:8000") as client:
    input0_data = np.random.rand(*shape).astype(np.float32)
    input1_data = np.random.rand(*shape).astype(np.float32)
    inputs = [
        httpclient.InferInput(
            "INPUT0", input0_data.shape, np_to_triton_dtype(input0_data.dtype)
        ),
        httpclient.InferInput(
            "INPUT1", input1_data.shape, np_to_triton_dtype(input1_data.dtype)
        ),
    ]

    inputs[0].set_data_from_numpy(input0_data)
    inputs[1].set_data_from_numpy(input1_data)

    outputs = [
        httpclient.InferRequestedOutput("OUTPUT0"),
        httpclient.InferRequestedOutput("OUTPUT1"),
    ]

    response = client.infer(model_name, inputs, request_id=str(1), outputs=outputs)

    result = response.get_response()
    output0_data = response.as_numpy("OUTPUT0")
    output1_data = response.as_numpy("OUTPUT1")

    print(
        "INPUT0 ({}) + INPUT1 ({}) = OUTPUT0 ({})".format(
            input0_data, input1_data, output0_data
        )
    )
    print(
        "INPUT0 ({}) - INPUT1 ({}) = OUTPUT1 ({})".format(
            input0_data, input1_data, output1_data
        )
    )

    if not np.allclose(input0_data + input1_data, output0_data):
        print("add_sub example error: incorrect sum")
        sys.exit(1)

    if not np.allclose(input0_data - input1_data, output1_data):
        print("add_sub example error: incorrect difference")
        sys.exit(1)

    print("PASS: add_sub")
    sys.exit(0)
```

**Expected behavior**
I expect that the client script
```
output0_data = response.as_numpy("OUTPUT0")
output1_data = response.as_numpy("OUTPUT1")
```
will convert the output tensors in the http response to numpy array.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

python_backend pytorch example as_numpy() error #7647

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

python_backend pytorch example as_numpy() error #7647

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions