
Commit a78a055

jayrodge and pap-openai authored
Improve NVIDIA TensorRT-LLM guide formatting and add Brev integration (#1989)
Co-authored-by: pap-openai <[email protected]>
1 parent 06a6b30 commit a78a055


2 files changed: 33 additions and 44 deletions


articles/run-nvidia.ipynb

Lines changed: 32 additions & 42 deletions
Original file line number | Diff line number | Diff line change
@@ -18,7 +18,21 @@
1818
"- `gpt-oss-20b`\n",
1919
"- `gpt-oss-120b`\n",
2020
"\n",
21-
"In this guide, we will run `gpt-oss-20b`, if you want to try the larger model or want more customization refer to [this](https://github.com/NVIDIA/TensorRT-LLM/tree/main/docs/source/blogs/tech_blog) deployment guide."
21+
"In this guide, we will run `gpt-oss-20b`. If you want to try the larger model or need more customization, refer to [this deployment guide](https://github.com/NVIDIA/TensorRT-LLM/blob/main/docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md).\n",
22+
"\n",
23+
"Note: Your input prompts should use the [harmony response](http://cookbook.openai.com/articles/openai-harmony) format for the model to work properly, though this guide does not require it."
24+
]
25+
},
26+
{
27+
"cell_type": "markdown",
28+
"metadata": {},
29+
"source": [
30+
"#### Launch on NVIDIA Brev\n",
31+
"You can simplify the environment setup by using [NVIDIA Brev](https://developer.nvidia.com/brev). Click the button below to launch this project on a Brev instance with the necessary dependencies pre-configured.\n",
32+
"\n",
33+
"Once deployed, click the \"Open Notebook\" button to get started with this guide.\n",
34+
"\n",
35+
"[![Launch on Brev](https://brev-assets.s3.us-west-1.amazonaws.com/nv-lb-dark.svg)](https://brev.nvidia.com/launchable/deploy?launchableID=env-30i1YjHsRWT109HL6eYxLUeHIwF)"
2236
]
2337
},
2438
{
@@ -33,69 +47,45 @@
3347
"metadata": {},
3448
"source": [
3549
"### Hardware\n",
36-
"To run the 20B model and the TensorRT-LLM build process, you will need an NVIDIA GPU with at least 20 GB of VRAM.\n",
50+
"To run the gpt-oss-20b model, you will need an NVIDIA GPU with at least 20 GB of VRAM.\n",
3751
"\n",
38-
"> Recommended GPUs: NVIDIA RTX 50 Series (e.g.RTX 5090), NVIDIA H100, or L40S.\n",
52+
"Recommended GPUs: NVIDIA Hopper (e.g., H100, H200), NVIDIA Blackwell (e.g., B100, B200), NVIDIA RTX PRO, NVIDIA RTX 50 Series (e.g., RTX 5090).\n",
3953
"\n",
4054
"### Software\n",
4155
"- CUDA Toolkit 12.8 or later\n",
42-
"- Python 3.12 or later\n",
43-
"- Access to the Orangina model checkpoint from Hugging Face"
56+
"- Python 3.12 or later"
4457
]
4558
},
4659
{
4760
"cell_type": "markdown",
4861
"metadata": {},
4962
"source": [
50-
"## Installling TensorRT-LLM"
63+
"## Installing TensorRT-LLM\n",
64+
"\n",
65+
"There are multiple ways to install TensorRT-LLM. In this guide, we'll cover using a pre-built Docker container from NVIDIA NGC as well as building from source.\n",
66+
"\n",
67+
"If you're using NVIDIA Brev, you can skip this section."
5168
]
5269
},
5370
{
5471
"cell_type": "markdown",
5572
"metadata": {},
5673
"source": [
57-
"## Using NGC\n",
74+
"## Using NVIDIA NGC\n",
5875
"\n",
59-
"Pull the pre-built TensorRT-LLM container for GPT-OSS from NVIDIA NGC.\n",
76+
"Pull the pre-built [TensorRT-LLM container](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tensorrt-llm/containers/release/tags) for GPT-OSS from [NVIDIA NGC](https://www.nvidia.com/en-us/gpu-cloud/).\n",
6077
"This is the easiest way to get started and ensures all dependencies are included.\n",
6178
"\n",
62-
"`docker pull nvcr.io/nvidia/tensorrt-llm/release:gpt-oss-dev`\n",
63-
"`docker run --gpus all -it --rm -v $(pwd):/workspace nvcr.io/nvidia/tensorrt-llm/release:gpt-oss-dev`\n",
79+
"```bash\n",
80+
"docker pull nvcr.io/nvidia/tensorrt-llm/release:gpt-oss-dev\n",
81+
"docker run --gpus all -it --rm -v $(pwd):/workspace nvcr.io/nvidia/tensorrt-llm/release:gpt-oss-dev\n",
82+
"```\n",
6483
"\n",
65-
"## Using Docker (build from source)\n",
84+
"## Using Docker (Build from Source)\n",
6685
"\n",
6786
"Alternatively, you can build the TensorRT-LLM container from source.\n",
68-
"This is useful if you want to modify the source code or use a custom branch.\n",
69-
"See the official instructions here: https://github.com/NVIDIA/TensorRT-LLM/tree/feat/gpt-oss/docker\n",
70-
"\n",
71-
"The following commands will install required dependencies, clone the repository,\n",
72-
"check out the GPT-OSS feature branch, and build the Docker container:\n",
73-
" ```\n",
74-
"#Update package lists and install required system packages\n",
75-
"sudo apt-get update && sudo apt-get -y install git git-lfs build-essential cmake\n",
76-
"\n",
77-
"# Initialize Git LFS (Large File Storage) for handling large model files\n",
78-
"git lfs install\n",
79-
"\n",
80-
"# Clone the TensorRT-LLM repository\n",
81-
"git clone https://github.com/NVIDIA/TensorRT-LLM.git\n",
82-
"cd TensorRT-LLM\n",
83-
"\n",
84-
"# Check out the branch with GPT-OSS support\n",
85-
"git checkout feat/gpt-oss\n",
86-
"\n",
87-
"# Initialize and update submodules (required for build)\n",
88-
"git submodule update --init --recursive\n",
89-
"\n",
90-
"# Pull large files (e.g., model weights) managed by Git LFS\n",
91-
"git lfs pull\n",
92-
"\n",
93-
"# Build the release Docker image\n",
94-
"make -C docker release_build\n",
95-
"\n",
96-
"# Run the built Docker container\n",
97-
"make -C docker release_run \n",
98-
"```"
87+
"This approach is useful if you want to modify the source code or use a custom branch.\n",
88+
"For detailed instructions, see the [official documentation](https://github.com/NVIDIA/TensorRT-LLM/tree/feat/gpt-oss/docker)."
9989
]
10090
},
10191
{
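
Reviewer note: the prerequisites stated in the updated notebook (an NVIDIA GPU with at least 20 GB of VRAM and Python 3.12 or later) could be checked before installation with something like the sketch below. This is not part of the commit. The helper names are hypothetical, and reading "20 GB" as 20 GiB (20 × 1024 MiB, the unit `nvidia-smi` reports) is an assumption.

```python
import subprocess
import sys

MIN_PYTHON = (3, 12)        # "Python 3.12 or later" from the guide
MIN_VRAM_MIB = 20 * 1024    # assumption: "20 GB" interpreted as 20 GiB


def python_ok(version_info=sys.version_info):
    """Return True if the interpreter version meets the guide's minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def parse_vram_mib(smi_output):
    """Parse one integer MiB value per line, skipping non-numeric lines."""
    values = []
    for line in smi_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def vram_ok(vram_mib):
    """Return True if at least one GPU reports enough total memory."""
    return any(v >= MIN_VRAM_MIB for v in vram_mib)


def query_vram_mib():
    """Ask nvidia-smi for total memory per GPU; return [] if unavailable."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.total",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_vram_mib(result.stdout)


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("GPU VRAM OK:", vram_ok(query_vram_mib()))
```

The parsing and threshold logic is kept separate from the `nvidia-smi` call so it can be exercised on machines without a GPU. The CUDA Toolkit 12.8 requirement is not checked here.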

registry.yaml

Lines changed: 1 addition & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -4,8 +4,7 @@
44
# should build pages for, and indicates metadata such as tags, creation date and
55
# authors for each page.
66

7-
8-
- title: Using NVIDIA TensorRT-LLM to run the 20B model
7+
- title: Using NVIDIA TensorRT-LLM to run gpt-oss-20b
98
path: articles/run-nvidia.ipynb
109
date: 2025-08-05
1110
authors:
