
Error: unpack_from requires a buffer of at least ... bytes for unpacking ... bytes at offset 4 (actual buffer size is ...) #7391

@adisabolic

Description

I get the following error when using a TYPE_STRING input in a Triton model with the Python backend:

{'error': "Failed to process the request(s) for model instance 'string_test_0', message: error: unpack_from requires a buffer of at least 50529031 bytes for unpacking 50529027 bytes at offset 4 (actual buffer size is 7)\n\nAt:\n  /opt/tritonserver/backends/python/triton_python_backend_utils.py(117): deserialize_bytes_tensor\n"}

Looking at /opt/tritonserver/backends/python/triton_python_backend_utils.py, line 117 is:

sb = struct.unpack_from("<{}s".format(l), val_buf, offset)[0]
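For context, each element of a BYTES/TYPE_STRING tensor is serialized as a 4-byte little-endian length prefix followed by the raw bytes, and the line above unpacks the payload that follows the prefix. The snippet below is my own sketch of that layout (not the actual Triton source); it shows that a well-formed buffer for "Hi!" is exactly 7 bytes, and that the numbers in the error message line up with a length prefix of 0x03030303 instead of 0x00000003:

import struct

# Sketch (not the actual Triton source) of the BYTES tensor layout:
# each element is a 4-byte little-endian length prefix plus the payload.
def deserialize_bytes_sketch(val_buf):
    out, offset = [], 0
    while offset < len(val_buf):
        l = struct.unpack_from("<I", val_buf, offset)[0]   # length prefix
        sb = struct.unpack_from("<{}s".format(l), val_buf, offset + 4)[0]
        out.append(sb)
        offset += 4 + l
    return out

# A well-formed single-element buffer for "Hi!" is 7 bytes total:
buf = struct.pack("<I", 3) + b"Hi!"     # b'\x03\x00\x00\x00Hi!'
print(deserialize_bytes_sketch(buf))    # [b'Hi!']

# The failing request also had a 7-byte buffer, but the length prefix was
# apparently read as 0x03030303 == 50529027; 50529027 + 4 == 50529031,
# matching the error message exactly.
print(hex(50529027), 50529027 + 4)      # 0x3030303 50529031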

Triton Information
What version of Triton are you using?

I am using the Docker base image nvcr.io/nvidia/tritonserver:24.04-py3 with CUDA 12.4 on my Ubuntu 22.04.4 LTS (Jammy Jellyfish) machine. The Python version is 3.10.12.

To Reproduce

I was able to create a minimal example that reproduces the error.

config.pbtxt of the model

name: "string_test"
backend: "python"

input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]

model.py of the model

import sys
import json

sys.path.append('../../')
import triton_python_backend_utils as pb_utils
import numpy as np


class TritonPythonModel:
    """A model that always returns the input it receives."""

    def initialize(self, args):
        self.model_config = json.loads(args['model_config'])

    def execute(self, requests):
        """Called once per batch of inference requests."""
        responses = []
        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            # Echo the input tensor back as the output.
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", in_0.as_numpy().astype(np.object_))
            print(f"INPUT VALUE: {in_0.as_numpy()[0].decode()}")
            responses.append(pb_utils.InferenceResponse([out_tensor_0]))
        return responses

example client script

import requests

URL = "http://localhost:8120/v2/models/string_test/infer"


def main():
    data = {
        "name": "string_test",
        "inputs": [
            {
                "name": "INPUT0",
                "shape": [1],
                "datatype": "BYTES",
                "data": ["Hi!"]
            }
        ]
    }
    res = requests.post(URL, json=data)
    print(res.json())
    return


if __name__ == "__main__":
    main()
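For comparison, the same request can be sent with the tritonclient library, which by default transmits BYTES inputs via the binary tensor data extension instead of raw JSON. A minimal sketch, assuming tritonclient[http] is installed and the server is reachable on the same port:

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8120")

# BYTES inputs are passed as numpy object arrays of strings.
inp = httpclient.InferInput("INPUT0", [1], "BYTES")
inp.set_data_from_numpy(np.array(["Hi!"], dtype=np.object_))

result = client.infer(model_name="string_test", inputs=[inp])
print(result.as_numpy("OUTPUT0"))

If this path succeeds while the raw-JSON request fails, that would point at the JSON string handling rather than at the model itself.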

I have tried various things: changing CUDA and Triton versions, changing NVIDIA package versions, etc. None of them worked.
