1 parent b6fb1b4 commit b43774a
README.md
@@ -19,6 +19,10 @@ state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
 <div align="left">
 
 ## Tech Blogs
+
+* [10/13] Scaling Expert Parallelism in TensorRT LLM (Part 3: Pushing the Performance Boundary)
+✨ [➡️ link](./docs/source/blogs/tech_blog/blog14_Scaling_Expert_Parallelism_in_TensorRT-LLM_part3.md)
+
 * [09/26] Inference Time Compute Implementation in TensorRT LLM
 ✨ [➡️ link](./docs/source/blogs/tech_blog/blog13_Inference_Time_Compute_Implementation_in_TensorRT-LLM.md)