23 Oct 23:29

7c7c23c

TRTorch v0.1.0

Direct PyTorch integration via backend API, support for Ampere, support for simple branch and loop cases

This is the first "beta" release of TRTorch, introducing direct integration into PyTorch via the new Backend API. This release also contains an NGC-based Dockerfile for users who want to run TRTorch on Ampere GPUs using NGC's patched version of PyTorch. Note that programs compiled with older versions of TRTorch are not compatible with the TRTorch 0.1.0 runtime because of an ABI change. The documentation now includes example Jupyter notebooks that demonstrate various features of the compiler.
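As a rough illustration of the ABI note above, a deployment script could compare the TRTorch version a program was compiled with against the 0.1.0 boundary before loading it. This is only a sketch: the helper names `parse_version` and `needs_recompile` are hypothetical and not part of TRTorch, and pre-release suffixes such as `a0` are ignored as a simplification.

```python
import re

# First runtime version that uses the new program format, per the release notes.
ABI_BREAK_VERSION = (0, 1, 0)


def parse_version(version: str) -> tuple:
    """Parse the leading major.minor.patch of strings like 'v0.0.3' or '0.1.0a0'.

    Pre-release suffixes (e.g. 'a0') are ignored; this is a simplification.
    """
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def needs_recompile(compiled_with: str) -> bool:
    """Return True if a program predates the 0.1.0 ABI change.

    Such a program would have to be recompiled before it can run on a
    0.1.0 or newer runtime.
    """
    return parse_version(compiled_with) < ABI_BREAK_VERSION
```

For example, `needs_recompile("v0.0.3")` flags a program built with the previous alpha, while `needs_recompile("0.1.0")` does not.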

New Ops:

prelu
lstm_cell
power
conv3d
narrow

Dependencies:

Bazel 3.4.1
Libtorch 1.6.0
CUDA 10.2 (by default, CUDA 11 supported with compatible PyTorch build)
cuDNN 7.6.5 (by default, cuDNN 8 supported with compatible PyTorch build)
TensorRT 7.0.0 (by default, TensorRT 7.1 supported with compatible PyTorch build)

Changelog

v0.1.0 (2020-10-23)

Bug Fixes

added some fixes, trt/jit output still mismatches (723ac1d)
added test cases to explicitly check hidden/cell state outputs (d7c3164)
cleaned up logic, added case where bias doesn't exist for LSTM cell converter (a3e1093)
//core/conversion/evaluator: Custom to IValue that handles int[] (68c934a)
//docker: Workaround only shared libraries being available in (50c7eda)
//py: Fix long description section of setup.py (efd2099)
//tests: Add stride to complete tensors (af5d28e)
//tests/accuracy: Fix int8 accuracy test for new PTQ api (a53bea7)
//tests/core/converters/activations: Complete tensors in prelu test (0e90f78)
docsrc: Update docsrc container for bazel 3.4.1 (4eb53b5)
fix(Windows)!: Fix dependency resolution for local builds (858d8c3)
chore!: Update dependencies to PyTorch 1.6.0 (8eda27d)
chore!: Bumping version numbers to 0.1.0 (b84c90b)
refactor(//core)!: Introducing a binding convention that will address (5a105c6)
refactor!: Renaming extra info to compile spec to be more consistent (b8fa228)

Features

//core/conversion/converters: LSTMCell converter (8c61248)
//core/conversion/var: created ITensorOrFreeze() method, to replace functionality of Var::ITensor() (2ccf8d0)
//core/converters: Add power layer conversion support and minor README edits (a801506)
//core/lowering: Add functionalization pass to replace implace (90a9ed6), closes #30
//docker: Adding CUDA11 based container for Ampere support (970d775)
started working on lstm_cell converter (546d790)
//py: Initial compiliant implementation of the to_backend api for (59113cf)
//third_party/tensorrt: Add back TensorRT static lib in a cross (d3c2e7e)
aten::prelu: Basic prelu support (8bc4369)
aten::prelu: Implement the multi-channel version of prelu and (c066581)
finished logic for LSTM cell, now to test (a88cfaf)

BREAKING CHANGES

Users on Windows trying to use cuDNN 8 must manually configure third_party/cudnn/local/BUILD to use cuDNN 8.

Signed-off-by: Naren Dasan naren@narendasan.com
Signed-off-by: Naren Dasan narens@nvidia.com

Support for Python 3.5 is dropped with this update.

The version is bumped to 0.1.0a0 to target PyTorch 1.6.0.

This changes the "ABI" of compiled TRTorch programs and the runtime, and breaks backwards compatibility between the 0.1.0+ runtime and programs compiled before 0.1.0.

This changes the top-level API for setting the compilation specification; a simple find and replace should allow users to port forward.
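The find and replace mentioned above could be scripted along the following lines. This is a sketch under assumptions: the release notes only say that "extra info" was renamed to "compile spec", so the exact spellings in `RENAMES` (`ExtraInfo`, `extra_info` and their replacements) are guesses that should be checked against the 0.1.0 API documentation, and `port_source` is a hypothetical helper.

```python
import re

# Assumed old -> new spellings; verify against the 0.1.0 API docs before use.
RENAMES = {
    "ExtraInfo": "CompileSpec",
    "extra_info": "compile_spec",
}

# Match whole identifiers only, so names that merely contain an old
# spelling (e.g. "my_extra_info_v2") are left untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def port_source(text: str) -> str:
    """Return text with every assumed old identifier replaced by its new name."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], text)
```

Running `port_source` over the contents of each source file and reviewing the resulting diff before committing keeps the change easy to audit.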

18 Jul 05:42

narendasan

v0.0.3

fe06d09

TRTorch v0.0.3 (Pre-release)

aarch64 toolchain, Revised PTQ API, PyTorch 1.5.1, support for cuDNN 8.0, TensorRT 7.1 (with compatible PyTorch build)

This is the third alpha release of TRTorch. It bumps the target PyTorch version to 1.5.1 and introduces support for cuDNN 8.0 and TensorRT 7.1; however, these are only supported when PyTorch has been compiled with the same cuDNN version. This release also introduces formal support for aarch64, but pre-compiled binaries will not be available until we can deliver Python packages for aarch64 for all supported versions of Python. Note some idiosyncrasies of working with PyTorch on aarch64: if you are using PyTorch compiled by NVIDIA for aarch64, the ABI version is CXX11 instead of the pre-CXX11 ABI found in PyTorch on x86_64. When compiling the Python API for TRTorch, add the --use-cxx11-abi flag to the command, and do not use the --config=pre-cxx11-abi flag when building the C++ library (the documentation has more instructions on native aarch64 compilation). This release also introduces a breaking change to the C++ API: to use the logging or PTQ APIs, a separate header file must now be included. See the implementation of trtorchc or ptq for example usage.
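The ABI flag rules described above can be summarized in a small helper. This is a sketch, not an official build script: only the two flags `--use-cxx11-abi` and `--config=pre-cxx11-abi` come from the release notes, while the base commands (`python3 setup.py install`, `bazel build //:libtrtorch`) and the function name `build_commands` are assumptions made for illustration.

```python
def build_commands(torch_uses_cxx11_abi: bool) -> dict:
    """Pick build flags that match the ABI of the installed PyTorch.

    The base commands are illustrative assumptions; only the two ABI
    flags are taken from the release notes.
    """
    python_cmd = ["python3", "setup.py", "install"]
    cpp_cmd = ["bazel", "build", "//:libtrtorch"]
    if torch_uses_cxx11_abi:
        # e.g. NVIDIA's aarch64 PyTorch builds: the Python API needs the flag,
        # and the C++ library is built without the pre-CXX11 config.
        python_cmd.append("--use-cxx11-abi")
    else:
        # e.g. PyTorch on x86_64 with the pre-CXX11 ABI.
        cpp_cmd.append("--config=pre-cxx11-abi")
    return {"python": python_cmd, "cpp": cpp_cmd}
```

If PyTorch is importable in the build environment, `torch.compiled_with_cxx11_abi()` is one way to obtain the boolean argument.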

Dependencies:

Bazel 3.3.1
Libtorch 1.5.1
CUDA 10.2
cuDNN 7.6.5 (by default, cuDNN 8 supported with compatible PyTorch build)
TensorRT 7.0.0 (by default, TensorRT 7.1 supported with compatible PyTorch build)

Changelog

feat!: Lock bazel version (25f4371)
refactor(//cpp/api)!: Refactoring ptq to use includes but seperate from (d2f8a59)

Bug Fixes

//core: Do not compile hidden methods (6bd1a3f)
//core/conversion: Check for calibrator before setting int8 mode (3afd209)
//core/conversion: Supress unnecessary debug messages (2b23874)
//core/conversion/conversionctx: Check both tensor and eval maps (2d65ece)
//core/conversion/conversionctx: In the case of strict types and (3611778)
//core/conversion/converters: Fix plugin implementation for TRT 7 (94d6a0f)
//core/conversion/converters/impl: 1d case not working (f42562b)
//core/conversion/converters/impl: code works for interpolate2d/3d, doesn't work for 1d yet (e4cb117)
//core/conversion/converters/impl: Fix interpolate.cpp (b6942a2)
//core/conversion/converters/impl/element_wise: Fix broadcast (a9f33e4)
//core/conversion/evaluators: A couple fixes for evaluators (07ba980)
//core/lowering: Conv2D -> _convolution pass was triggering conv (ca2b5f9)
//cpp: Remove deprecated script namespace (d70760f)
//cpp/api: Better inital condition for the dataloader iterator to (8d22bdd)
//cpp/api: Remove unecessary destructor in ptq class (fc70267)
//cpp/api: set a default for calibrator (825be69)
//cpp/benchmark: reorder benchmark so FP16 bn issue in JIT doesnt (98527d2)
//cpp/ptq: Default version of the app should not resize images (de3cbc4)
//cpp/ptq: Enable FP16 kernels for INT8 applications (26709cc)
//cpp/ptq: Enable FP16 kernels for INT8 applications (e1c5416)
//cpp/ptq: remove some logging from ptq app (b989c7f)
//cpp/ptq: Tracing model in eval mode wrecks accuracy in Libtorch (54a24b3)
//cpp/trtorchc: Refactor trtorchc to use new C++ API (789e1be), closes #132
//cpp/trtorchc: Support building trtorchc with the pre_cxx11_abi (172d4d5)
//docs: add nojekyll file (2a02cd5)
//docs: fix version links (11555f7)
//notebooks: Fix WORKSPACE template file to reflect new build system layout (c8ea9b7)
//py: Build system issues (c1de126)
//py: Ignore generated version file (9e37dc1)
//py: Lib path incorrect (ff2b13c)
//tests: Duplicated tensorrt dep (5cd697e)
//third_party/tensorrt: Fix include dir for library headers (22ed5cf)
//third_party/tensorrt: Fix TensorRT paths for local x86 builds (73d804b)
aarch64: fixes and issues for aarch64 toolchain (9a6cccd)
aten::_convolution: out channels was passed in incorrectly for (ee727f8)
aten::_convolution: Pass dummy bias when there is no bias (b20671c)
aten::batch_norm: A new batch norm implementation that hopefully (6461872)
aten::batchnorm|aten::view: Fix converter implementation for (bf651dd)
aten::contiguous: Blacklist aten::contiguous from conversion (b718121)
aten::flatten: Fixes dynamic shape for flatten (4eb20bb)
fixed FP16 bug, fixed README, addressed some other PR comments (d9c0e84)
aten::neg: Fix a index bug in neg (1b2cde4)
aten::size, other aten evaluators: Removes aten::size converter in (c83447e)
BUILD: modified BUILD (a0d8586)
trying to resolve interpolate plugin problems (f0fefaa)
core/conversion/converters/impl: fix error message in interpolate (5ddab8b)
Address issues in PR (cd24f26)
bypass jeykll, also add PR template (a41c400)
first commit (4f1a9df)
Fix pre CXX11 ABI python builds and regen docs (42013ab)
fixed interpolate_plugin to handle dynamically sized inputs for adaptive_pool2d (7794c78)
need to fix gather converter (024a6b2)
plugin: trying to fix bug in plugin (cafcced)
pooling: fix the tests and the 1D pooling cases (a90e6db)
RunGraphEngineDynamic fixed to work with dynamically sized input tensors (6308190)

Features

//:libtrtorch: Ship trtorchc with the tarball (d647447)
//core/compiler: Multiple outputs supported now via tuple (f9af574)
//core/conversion: Adds the ability to evaluate loops (dcb1474)
//core/conversion: Compiler can now create graphs (9d1946e)
//core/conversion: Evaluation of static conditionals works now (6421f3d)
//core/conversion/conversionctx: Make op precision available at (78a1c61)
//core/conversion/converters: Throw a warning if a converter is (6cce381)
//core/conversion/converters/impl: added support for aten::stack (415378e)
//core/conversion/converters/impl: added support for linear1d and bilinear2d ops (4416d1f)
//core/conversion/converters/impl: added support for trilinear3d op (bb46e70)
//core/conversion/converters/impl: all function schemas for upsample_nearest (1b50484)
//core/conversion/converters/impl: logic implemented ([7f12160](https://github.com/...

17 May 02:00

narendasan

v0.0.2

3f57189

TRTorch v0.0.2 (Pre-release)

Python API & PyTorch 1.5.0 Support

This is the second alpha release of TRTorch. It bumps the supported PyTorch version to 1.5.0 and introduces a Python distribution for TRTorch.
It also now includes full documentation at https://nvidia.github.io/TRTorch
It adds support for Post Training Quantization in C++.

Dependencies

Libtorch 1.5.0
CUDA 10.2
cuDNN 7.6.5
TensorRT 7.0.0

Changelog

Bug Fixes

//core/conversion: Check for calibrator before setting int8 mode (3afd209)
//core/conversion/conversionctx: Check both tensor and eval maps (2d65ece)
//core/conversion/converters/impl/element_wise: Fix broadcast (a9f33e4)
//cpp: Remove deprecated script namespace (d70760f)
//cpp/api: Better inital condition for the dataloader iterator to (8d22bdd)
//cpp/api: Remove unecessary destructor in ptq class (fc70267)
//cpp/api: set a default for calibrator (825be69)
//cpp/ptq: remove some logging from ptq app (b989c7f)
Address issues in PR (cd24f26)
//cpp/ptq: Tracing model in eval mode wrecks accuracy in Libtorch (54a24b3)
//docs: add nojekyll file (2a02cd5)
//docs: fix version links (11555f7)
//py: Build system issues (c1de126)
//py: Ignore generated version file (9e37dc1)
bypass jeykll, also add PR template (a41c400)

Features

//core/conversion/conversionctx: Make op precision available at (78a1c61)
//core/conversion/converters/impl/shuffle: Implement aten::resize (353f2d2)
//core/execution: Type checking for the executor, now is the (2dd1ba3)
//core/lowering: New freeze model pass and new exception (4acc3fd)
//core/quantization: skeleton of INT8 PTQ calibrator (dd443a6)
//core/util: New logging level for Graph Dumping (90c44b9)
//cpp/api: Adding max batch size setting (1b25542)
//cpp/api: Functional Dataloader based PTQ (f022dfe)
//cpp/api: Remove the extra includes in the API header (2f86f84)
//cpp/ptq: Add a feature to the dataset to use less than the full (5f36f47)
//cpp/ptq/training: Training recipe for VGG16 Classifier on (676bf56)
//lowering: centralize lowering and try to use PyTorch Conv2DBN folding (fad4a10)
//py: API now produces valid engines that are consumable by (72bc1f7)
//py: Inital introduction of the Python API (7088245)
//py: Manylinux container and build system for multiple python (639c2a3)
//py: Working portable package (482ef2c)
//tests: New optional accuracy tests to check INT8 and FP16 (df74136)
//cpp/api: Working INT8 Calibrator, also resolves #41 (5c0d737)
aten::flatten: Adds a converter for aten flatten since MM is the (d945eb9)
aten::matmul|aten::addmm: Adds support for aten::matmul and (c5b6202)
Support non cxx11-abi builds for use in python api (83e0ed6)
aten::size [static]: Implement a aten::size converter for static input size (0548540)
conv2d_to_convolution: A pass to map aten::conv2d to _convolution (2c5c0d5)

08 Apr 01:41

narendasan

v0.0.1

4b58d3b

TRTorch v0.0.1 (Pre-release)

Initial Release

This is the initial alpha release of TRTorch. It supports basic compilation of TorchScript Modules for networks similar to ResNet50, Mobilenet, and simple feed-forward networks.
C++ Based API
- Can save converted models to PLAN file for use in TensorRT Apps
- Compile module and continue running with JIT interpreter accelerated by TensorRT
Supports FP32 and FP16 execution
Sample application to show how to use the compiler

Dependencies

Libtorch 1.4.0
CUDA 10.1
cuDNN 7.6
TensorRT 6.0.1

Releases: pytorch/TensorRT