Deprecate ModelOpt custom docker and directly use TRT-LLM / PyTorch / TRT docker #346
Open: kevalmorabia97 wants to merge 4 commits into main from kmorabia/remove-dockerfile
+140 −333
Commits (changes shown are from 3 of the 4 commits):
- 07c158f  Deprecate ModelOpt custom docker and directly use TRT-LLM docker (kevalmorabia97)
- 7832b15  Deprecate onnx ptq docker image as well (kevalmorabia97)
- 3392955  Minor Doc updates (kevalmorabia97)
- 54b6bcf  Fix LD_LIBRARY_PATH for onnx docker (kevalmorabia97)
Three files were deleted in this diff (contents not shown).
Diff of the environment setup docs:

@@ -30,39 +30,29 @@ Environment setup

     .. tab:: Docker image (Recommended)

-        **Using ModelOpt's docker image**
-
-        To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using our provided docker image
-        which is based on the `TensorRT-LLM <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_
-        docker image with additional dependencies installed.
-
-        After installing the `NVIDIA Container Toolkit <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html>`_,
-        please run the following commands to build the Model Optimizer docker container which has all the base
-        dependencies pre-installed. You may need to install additional dependencies from the examples' `requirements.txt` file.
-
-        .. code-block:: shell
-
-            # Clone the ModelOpt repository
-            git clone git@github.com:NVIDIA/TensorRT-Model-Optimizer.git
-            cd TensorRT-Model-Optimizer
-
-            # Build the docker (will be tagged `docker.io/library/modelopt_examples:latest`)
-            # You may customize `docker/Dockerfile` to include or exclude certain dependencies you may or may not need.
-            bash docker/build.sh
-
-            # Run the docker image
-            docker run --gpus all -it --shm-size 20g --rm docker.io/library/modelopt_examples:latest bash
-
-            # Check installation (inside the docker container)
-            python -c "import modelopt; print(modelopt.__version__)"
-
-        **Using alternative NVIDIA docker images**
+        To use Model Optimizer with full dependencies (e.g. TensorRT/TensorRT-LLM deployment), we recommend using the
+        `TensorRT-LLM docker image <https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags>`_,
+        e.g., ``nvcr.io/nvidia/tensorrt-llm/release:<version>``.
+        Make sure to upgrade Model Optimizer to the latest version using ``pip`` as described in the next section.
+        You would also need to set up the appropriate environment variables for the TensorRT binaries as follows:
+
+        .. code-block:: shell
+
+            export LD_LIBRARY_PATH="/usr/include:/usr/lib/x86_64-linux-gnu:/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"
+            export PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/bin:${PATH}"
+
+        You may need to install additional dependencies from the respective examples' `requirements.txt` file.
+
+        **Alternative NVIDIA docker images**

         For PyTorch, you can also use the `NVIDIA NGC PyTorch container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch/tags>`_
         and for the NVIDIA NeMo framework, you can use the `NeMo container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo/tags>`_.
         Both of these containers come with Model Optimizer pre-installed. Make sure to update Model Optimizer to the latest version if not already.

-        For ONNX PTQ, you can use the optimized docker image from the `onnx_ptq Dockerfile <https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/onnx_ptq/docker>`_.
+        For ONNX / TensorRT use cases, you can also use the `TensorRT container <https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorrt/tags>`_
+        which provides superior performance to the PyTorch container.

     .. tab:: Local environment (PIP / Conda)

@@ -86,9 +76,8 @@ Environment setup

     If you wish to use ModelOpt in conjunction with other NVIDIA libraries (e.g. TensorRT, TensorRT-LLM, NeMo, Triton, etc.),
     please make sure to check the ease of installation of these libraries in a local environment. If you face any
-    issues, we recommend using a docker image for a seamless experience. For example, the `TensorRT-LLM documentation <https://nvidia.github.io/TensorRT-LLM/>`_
-    requires installing in a docker image. You may still choose to use other ModelOpt features locally, for example,
-    quantizing a HuggingFace model and then using a docker image for deployment.
+    issues, we recommend using a docker image for a seamless experience. You may still choose to use other ModelOpt
+    features locally, for example, quantizing a HuggingFace model and then using a docker image for deployment.

 Install Model Optimizer
 =======================
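The workflow the updated docs recommend can be sketched end to end as follows. This is a sketch, not part of the PR: the ``<version>`` tag is a placeholder to be filled from the NGC catalog, the ``docker run`` flags mirror the line in the removed docs, and the PyPI package name ``nvidia-modelopt`` is an assumption to verify against the install section.

```shell
# Run the TensorRT-LLM release container (replace <version> with a real tag from NGC).
# Flags (--gpus all -it --shm-size 20g --rm) are taken from the removed ModelOpt docker instructions.
docker run --gpus all -it --shm-size 20g --rm \
    nvcr.io/nvidia/tensorrt-llm/release:<version> bash

# Inside the container: upgrade Model Optimizer to the latest release.
# Assumes the PyPI package is named `nvidia-modelopt`.
pip install -U nvidia-modelopt

# Check the installation, as the old docs did
python -c "import modelopt; print(modelopt.__version__)"
```

The main design change in the PR is visible here: instead of cloning the repo and building a custom ``modelopt_examples`` image, users start from an upstream NVIDIA container and only upgrade the ModelOpt wheel inside it.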