
Conversation

@yinggeh
Contributor

@yinggeh yinggeh commented Jul 9, 2024

What does the PR do?

Add a client-side input size check to make sure the byte size implied by the input shape matches the byte size of the input data.
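The check described above can be sketched as follows. This is a minimal, illustrative sketch of the idea, not the actual tritonclient implementation; the function names are hypothetical.

```python
import numpy as np

def expected_byte_size(shape, dtype):
    # Byte size implied by a shape and a fixed-width datatype.
    return int(np.prod(shape)) * np.dtype(dtype).itemsize

def validate_input(shape, dtype, data: bytes):
    # Reject the request client-side if the declared shape does not
    # account for the number of bytes actually attached to the input.
    expected = expected_byte_size(shape, dtype)
    if expected != len(data):
        raise ValueError(
            f"input byte size mismatch: shape {shape} with dtype "
            f"{np.dtype(dtype).name} implies {expected} bytes, "
            f"but {len(data)} bytes were provided"
        )

data = np.zeros((2, 8), dtype=np.float32).tobytes()  # 64 bytes
validate_input([2, 8], np.float32, data)             # passes
try:
    validate_input([2, 9], np.float32, data)         # 72 != 64, rejected
except ValueError as e:
    print("rejected:", e)
```

This only works for fixed-width datatypes, where shape alone determines the byte size; variable-length BYTES inputs need different handling (see the caveat below on shared memory string inputs).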

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

Check the conventional commit type
box here and add the label to the github PR.

  • test

Related PRs:

triton-inference-server/client#742

Where should the reviewer start?

Should look at triton-inference-server/client#742 first.

Test plan:

n/a

  • CI Pipeline ID:
    17202351

Caveats:

Shared memory byte size checks for string inputs are not implemented.
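The caveat above exists because a BYTES (string) tensor's byte size depends on its contents, not just its shape, so a shape-based check cannot validate it. A sketch of the length-prefixed layout commonly used for such tensors (this is an illustration of the serialization idea, not code from this PR):

```python
import struct

def serialized_bytes_size(elements):
    # Each BYTES element is serialized as a 4-byte length prefix
    # followed by its payload, so total size varies with content.
    return sum(4 + len(e) for e in elements)

def serialize_bytes_tensor(elements):
    out = b""
    for e in elements:
        out += struct.pack("<I", len(e)) + e
    return out

elems = [b"hello", b"", b"\x00\x01\x02\x03"]
buf = serialize_bytes_tensor(elems)
# (4+5) + (4+0) + (4+4) = 21 bytes for three elements
assert len(buf) == serialized_bytes_size(elems)
```

Two tensors with the same shape can therefore have different byte sizes, which is why the fixed-width check cannot simply be extended to string inputs in shared memory.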

Background

Stop malformed input requests at the client side before they are sent to the server.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Relates to #7171

kthui and others added 30 commits October 13, 2023 13:56
* Fix gRPC test failure and refactor

* Add gRPC AsyncIO cancellation tests

* Better check if a request is cancelled

* Use f-string
* Fixing torch version for vllm
* Switch Jetson model TensorRT models generation to container

* Adding missed file

* Fix typo

* Fix typos

* Remove extra spaces

* Fix typo
* Ensure notify_state_ gets properly destructed

* Fix inflight state tracking to properly erase states

* Prevent removing the notify_state from being erased

* Wrap notify_state_ object within unique_ptr
* TRTLLM backend post release

* TRTLLM backend post release

* Update submodule url for permission issue

* Update submodule url

* Fix up

* Not using postbuild function to workaround submodule url permission issue
* Minor fix for L0_model_config
* Test with different sizes of CUDA memory pool

* Check the server log for error message

* Improve debugging

* Fix syntax
Co-authored-by: dyastremsky <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
* Update README and versions for 23.10 branch (#6399)

* Cherry-picking vLLM backend changes (#6404)

* Update build.py to build vLLM backend (#6394)

* Add Python backend when vLLM backend built (#6397)

---------

Co-authored-by: dyastremsky <[email protected]>

* Add documentation on request cancellation (#6403) (#6407)

* Add documentation on request cancellation

* Include python backend

* Update docs/user_guide/request_cancellation.md

* Update docs/user_guide/request_cancellation.md

* Update docs/README.md

* Update docs/user_guide/request_cancellation.md

* Remove inflight term from the main documentation

* Address review comments

* Fix

* Update docs/user_guide/request_cancellation.md

* Fix

---------

Co-authored-by: Iman Tabrizian <[email protected]>
Co-authored-by: Neelay Shah <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Jacky <[email protected]>

* Fixes in request cancellation doc (#6409) (#6410)

* TRT-LLM backend build changes (#6406) (#6430)

* Update url

* Debugging

* Debugging

* Update url

* Fix build for TRT-LLM backend

* Remove TRTLLM TRT and CUDA versions

* Fix up unused var

* Fix up dir name

* Fix cmake patch

* Remove previous TRT version

* Install required packages for example models

* Remove packages that are only needed for testing

* Fixing vllm build (#6433) (#6437)

* Fixing torch version for vllm

Co-authored-by: Olga Andreeva <[email protected]>

* Update TRT-LLM backend url (#6455) (#6460)

* TRTLLM backend post release

* TRTLLM backend post release

* Update submodule url for permission issue

* Update submodule url

* Fix up

* Not using postbuild function to workaround submodule url permission issue

* remove redundant lines

* Revert "remove redundant lines"

This reverts commit 86be7ad.

* restore missed lines

* Update build.py

Co-authored-by: Olga Andreeva <[email protected]>

* Update build.py

Co-authored-by: Olga Andreeva <[email protected]>

---------

Co-authored-by: Tanmay Verma <[email protected]>
Co-authored-by: dyastremsky <[email protected]>
Co-authored-by: Iman Tabrizian <[email protected]>
Co-authored-by: Neelay Shah <[email protected]>
Co-authored-by: Ryan McCormick <[email protected]>
Co-authored-by: Jacky <[email protected]>
Co-authored-by: Kris Hung <[email protected]>
Co-authored-by: Katherine Yang <[email protected]>
Co-authored-by: Olga Andreeva <[email protected]>
…ycle) (#6490)

* Test torch allocator gpu memory usage directly rather than global gpu memory for more consistency
* Add testing backend and test

* Add test to build / CI. Minor fix on L0_http

* Format. Update backend documentation

* Fix up

* Address comment

* Add negative testing

* Fix up
* Use postbuild function

* Remove updating submodule url
* Added testing for python_backend autocomplete: optional input and model_transaction_policy
* Fixing L0_io
* Bumped vllm version

* Add python-based backends testing

* Add python-based backends CI

* Fix errors

* Add vllm backend

* Fix pre-commit

* Modify test.sh

* Remove vllm_opt qa model

* Remove vLLM backend tests

* Resolve review comments

* Fix pre-commit errors

* Update qa/L0_backend_python/python_based_backends/python_based_backends_test.py

Co-authored-by: Tanmay Verma <[email protected]>

* Remove collect_artifacts_from_subdir function call

---------

Co-authored-by: oandreeva-nv <[email protected]>
Co-authored-by: Tanmay Verma <[email protected]>
Contributor

@rmccorm4 rmccorm4 left a comment


Left a couple questions

@yinggeh yinggeh force-pushed the yinggeh-DLIS-6657-client-input-byte-size-check branch from f432d41 to 3863c39 Compare July 31, 2024 02:22
@yinggeh yinggeh requested a review from rmccorm4 July 31, 2024 02:23
@yinggeh yinggeh force-pushed the yinggeh-DLIS-6657-client-input-byte-size-check branch from 17053e9 to 482409e Compare August 4, 2024 22:56
@rmccorm4 rmccorm4 changed the title feat: Client input byte size checks test: Client-side input shape/element validation Aug 5, 2024
}
EOL

cp -r $DATADIR/qa_model_repository/graphdef_object_int32_int32 models/.
Contributor


Why removed?

Contributor Author


Use model "simple_identity" instead to test string inputs.

Comment on lines 206 to 222
if client_type == "http":
    triton_client = tritonhttpclient.InferenceServerClient("localhost:8000")
else:
    triton_client = tritongrpcclient.InferenceServerClient("localhost:8001")

# Example using BYTES input tensor with utf-8 encoded string that
# has an embedded null character.
null_chars_array = np.array(
    ["he\x00llo".encode("utf-8") for i in range(16)], dtype=np.object_
)
null_char_data = null_chars_array.reshape([1, 16])
identity_inference(triton_client, null_char_data, True)  # Using binary data
identity_inference(triton_client, null_char_data, False)  # Using JSON data

# Example using BYTES input tensor with 16 elements, where each
# element is a 4-byte binary blob with value 0x00010203. Can use
# dtype=np.bytes_ in this case.
bytes_data = [b"\x00\x01\x02\x03" for i in range(16)]
np_bytes_data = np.array(bytes_data, dtype=np.bytes_)
np_bytes_data = np_bytes_data.reshape([1, 16])
identity_inference(triton_client, np_bytes_data, True)  # Using binary data
identity_inference(triton_client, np_bytes_data, False)  # Using JSON data
Contributor


What is this testing?

Contributor Author


Copied from client/src/python/examples/simple_http_string_infer_client.py. The example demonstrated two ways of preparing string input data; I'll remove one of them.

@yinggeh yinggeh requested review from GuanLuo and rmccorm4 August 6, 2024 17:24
inputs[0].set_shape([2, 8])
inputs[1].set_shape([2, 8])

with self.assertRaises(InferenceServerException) as e:
Contributor


Suggested change (add a clarifying comment above the assertion):

# If the number of elements (volume) is correct but the shape is wrong, the core will return an error.
with self.assertRaises(InferenceServerException) as e:
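The suggested comment points at the case this test exercises: when the declared shape has the right element count but the wrong dimensions, the byte sizes match, so a client-side byte-size check cannot reject the request and the server core must. A small illustrative sketch of why:

```python
import numpy as np

# Data prepared as [1, 16]: 16 float32 elements, 64 bytes.
data = np.zeros((1, 16), dtype=np.float32)

# Declared shape [2, 8] also has 16 elements, hence the same byte size.
declared_shape = [2, 8]
byte_size_matches = (
    int(np.prod(declared_shape)) * data.dtype.itemsize == data.nbytes
)
assert byte_size_matches  # client byte-size check passes here

# The shapes still disagree, so only the server (core) can reject this.
shape_matches = list(data.shape) == declared_shape
assert not shape_matches
```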

@yinggeh yinggeh requested a review from rmccorm4 August 6, 2024 17:48
@yinggeh yinggeh added PR: test Adding missing tests or correcting existing test and removed PR: feat A new feature labels Aug 7, 2024
@yinggeh yinggeh marked this pull request as draft September 18, 2024 18:38
@pvijayakrish pvijayakrish force-pushed the yinggeh-DLIS-6657-client-input-byte-size-check branch from dff25f4 to 48c9b25 Compare January 15, 2025 17:13

Labels

PR: test Adding missing tests or correcting existing test
