Description
I get the following error when using a TYPE_STRING input field in a Triton model with the Python backend:

```
{'error': "Failed to process the request(s) for model instance 'string_test_0', message: error: unpack_from requires a buffer of at least 50529031 bytes for unpacking 50529027 bytes at offset 4 (actual buffer size is 7)\n\nAt:\n  /opt/tritonserver/backends/python/triton_python_backend_utils.py(117): deserialize_bytes_tensor\n"}
```
Looking at the /opt/tritonserver/backends/python/triton_python_backend_utils.py file, line 117 is:

```python
sb = struct.unpack_from("<{}s".format(l), val_buf, offset)[0]
```
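For context (my own summary, not part of the original report): Triton serializes a BYTES/TYPE_STRING tensor element as a 4-byte little-endian length prefix followed by the raw bytes, which is what the `unpack_from` call above is reading. A minimal round-trip sketch of that wire format (the helper names `serialize_elements`/`deserialize_elements` are mine, for illustration):

```python
import struct

def serialize_elements(values):
    # Per element: 4-byte little-endian length prefix, then the raw bytes.
    return b"".join(struct.pack("<I", len(v)) + v for v in values)

def deserialize_elements(buf):
    # Mirrors deserialize_bytes_tensor: read a length, then that many bytes.
    offset, items = 0, []
    while offset < len(buf):
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        sb = struct.unpack_from("<{}s".format(length), buf, offset)[0]
        offset += length
        items.append(sb)
    return items

buf = serialize_elements([b"Hi!"])
# 4-byte prefix + 3 payload bytes = 7 bytes total, which matches the
# "actual buffer size is 7" part of the error message.
print(len(buf), deserialize_elements(buf))
```

So the buffer itself is the expected size for "Hi!"; it is the length prefix that is being read as a huge bogus value.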
Triton Information
What version of Triton are you using?
I am using the docker base image nvcr.io/nvidia/tritonserver:24.04-py3
with CUDA 12.4 installed on my Ubuntu machine 22.04.4 LTS (Jammy Jellyfish). Python version is 3.10.12.
To Reproduce
I was able to make a minimal example for which I get the error.
config.pbtxt of the model:
```
name: "string_test"
backend: "python"
input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
```
model.py of the model:
```python
import sys
import json

sys.path.append('../../')
import triton_python_backend_utils as pb_utils
import numpy as np


class TritonPythonModel:
    """This model always returns the input that it has received."""

    def initialize(self, args):
        self.model_config = json.loads(args['model_config'])

    def execute(self, requests):
        """This function is called on inference request."""
        responses = []
        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            out_tensor_0 = pb_utils.Tensor("OUTPUT0", in_0.as_numpy().astype(np.object_))
            print(f"INPUT VALUE: {in_0.as_numpy()[0].decode()}")
            responses.append(pb_utils.InferenceResponse([out_tensor_0]))
        return responses
```
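As a side note on what the model code expects (my assumption from the Python backend conventions, not something I could verify given the error): for a TYPE_STRING input, `as_numpy()` should yield an object-dtype array whose elements are Python bytes, which is what the `astype(np.object_)` pass-through and the `.decode()` call rely on. A numpy-only sketch of that expectation:

```python
import numpy as np

# The shape a TYPE_STRING input is expected to take after as_numpy():
# an object-dtype array whose elements are Python bytes.
arr = np.array([b"Hi!"], dtype=np.object_)
out = arr.astype(np.object_)  # the no-op conversion done in execute()
print(out[0].decode())        # "Hi!"
```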
Example client script:
```python
import requests

URL = "http://localhost:8120/v2/models/string_test/infer"


def main():
    data = {
        "name": "string_test",
        "inputs": [
            {
                "name": "INPUT0",
                "shape": [1],
                "datatype": "BYTES",
                "data": ["Hi!"]
            }
        ]
    }
    res = requests.post(URL, json=data)
    print(res.json())


if __name__ == "__main__":
    main()
```
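One observation that may help with triage (my own decoding of the error value, not a diagnosis): the bogus length 50529027 is exactly the byte 0x03 repeated four times, and 3 happens to be the byte length of the test string "Hi!":

```python
import struct

bad_len = 50529027  # the length reported in the error message
print(struct.pack("<I", bad_len))  # the four prefix bytes it was read from
print(len(b"Hi!"))                 # byte length of the test payload
```

This looks as if each of the four length-prefix bytes was filled with the element length 3 rather than the prefix being a single little-endian 32-bit value.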
I have tried various things, from changing CUDA and Triton versions to changing NVIDIA package versions. None of them works.
lucidyan