
Commit c9dcd5e

Merge pull request #399 from jaybdub/docs: Docs

2 parents f5fb752 + 62afaf1

23 files changed, +566 -0 lines

.gitignore

Lines changed: 1 addition & 0 deletions
```diff
@@ -15,3 +15,4 @@ __pycache__/
 *.pyc
 *.ipynb_checkpoints
 *.pth
+docs/converters.md
```

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
# Changes

CONTRIBUTING.md

Lines changed: 104 additions & 0 deletions
# Contributing
## Forms of contribution

### Submit an Issue

torch2trt is use case driven. We originally created it to solve use cases related to NVIDIA Jetson, but the layer support has grown substantially since its release, and we've found that it has helped many other developers as well.

The growth of torch2trt has been largely driven by issues submitted on [GitHub](https://github.com/NVIDIA-AI-IOT/torch2trt/issues). We learn a lot from the reported issues. Submitting an issue is one of the best ways to begin contributing to torch2trt.

The reported issues are typically one of the following:

* A bug or unexpected result
* A model with unsupported layers

If you report an issue, we typically find the following information helpful:

* PyTorch version
* TensorRT version
* Platform (e.g., Jetson Nano)
* The PyTorch module you're attempting to convert
* The steps taken to convert the PyTorch module

If you're not sure how to provide any of these pieces of information, don't worry. Just open the issue and we're happy to discuss and help work out the details.
### Ask a Question

Another great way to contribute is to ask a question on [GitHub](https://github.com/NVIDIA-AI-IOT/torch2trt/issues). There are often other developers who share your question, and they may find the discussion helpful. This also helps us gauge feature interest and identify gaps in documentation.
### Submit a Pull Request

torch2trt is use case driven and has limited maintenance; for this reason, we value community contributions greatly. Another great way to contribute is by submitting a pull request. The pull requests most likely to be accepted are:

* A new converter
* A test case
* A bug fix

If you add a new converter, it is best to include a few test cases that cross validate the converter against the original PyTorch operation. We provide a utility function to do this, as described in the [Custom Converter](usage/custom_converter.md) usage guide and sketched below.
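As a rough illustration, here is a minimal converter plus test case modeled on the ReLU converter that ships with torch2trt. Treat it as a sketch: check the exact import paths and `ctx` attributes against the Custom Converter guide for your torch2trt version.

```python
import torch
import tensorrt as trt
from torch2trt import tensorrt_converter
from torch2trt.module_test import add_module_test


@tensorrt_converter('torch.nn.ReLU.forward')
def convert_relu(ctx):
    # For a Module.forward call, method_args[0] is the module itself
    # and method_args[1] is the input tensor.
    input = ctx.method_args[1]
    output = ctx.method_return
    # Add the equivalent TensorRT layer and bind its output to the
    # tensor PyTorch returned, so downstream converters can use it.
    layer = ctx.network.add_activation(input=input._trt,
                                       type=trt.ActivationType.RELU)
    output._trt = layer.get_output(0)


# Registers a cross-validation test case: the test harness builds the
# module, converts it, and compares outputs for the given dtype,
# device, and input shapes.
@add_module_test(torch.float32, torch.device('cuda'), [(1, 3, 4, 4)])
def test_relu_basic():
    return torch.nn.ReLU()
```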
Ideally, pull requests solve one thing at a time. This makes it easy to evaluate the impact that the changes have on the project step by step. The more confident we are that the changes will not adversely impact the experience of other developers, the more likely we are to accept them.
## Running module test cases

Before any change is accepted, we run the test cases on at least one platform. This performs a large number of cross-validation checks against PyTorch. To do this:

```bash
python3 -m torch2trt.test --name=converters --tolerance=1e-2
```
This will not hard-fail, but it will highlight any build errors or max-error check failures. It is helpful if you include the status of this command in any pull request, as well as system information like:

* PyTorch version
* TensorRT version
* Platform (e.g., Jetson Nano)
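If your change touches a single converter, it can be convenient to narrow the run. This assumes ``--name`` acts as a filter over registered test names, as it does for ``converters`` above; verify the behavior against your torch2trt version:

```bash
python3 -m torch2trt.test --name=relu --tolerance=1e-2
```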
## Testing documentation

If you have a change that modifies the documentation, it is relatively straightforward to test. We use ``mkdocs-material`` for documentation, which parses markdown files in the ``docs`` folder.

To view the docs, simply call

```bash
./scripts/test_docs.sh
```

and then navigate to ``https://<ip_address>:8000``.
Please note, this will not include dynamically generated documentation pages like the converters page. These contain cross-reference links to the GitHub source code. If you want to test these, you can call

```bash
./scripts/build_docs.sh <github url> <tag>
```

pointing to the public reflection of your local repository. For example, if we're working off the upstream master branch, we would call

```bash
./scripts/build_docs.sh https://github.com/NVIDIA-AI-IOT/torch2trt master
```

If your changes are pushed to your fork, you would do

```bash
./scripts/build_docs.sh https://github.com/<user>/torch2trt my_branch
```

docs/CHANGELOG.md

Lines changed: 1 addition & 0 deletions
../CHANGELOG.md

docs/CONTRIBUTING.md

Lines changed: 1 addition & 0 deletions
../CONTRIBUTING.md

docs/benchmarks/jetson_nano.md

Lines changed: 22 additions & 0 deletions
# Jetson Nano

Throughput is reported in frames per second and latency in milliseconds.

| Name | Data Type | Input Shapes | torch2trt kwargs | Max Error | Throughput (PyTorch) | Throughput (TensorRT) | Latency (PyTorch) | Latency (TensorRT) |
|------|-----------|--------------|------------------|-----------|----------------------|-----------------------|-------------------|--------------------|
| torchvision.models.alexnet.alexnet | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.29E-05 | 46.4 | 69.9 | 22.1 | 14.7 |
| torchvision.models.squeezenet.squeezenet1_0 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.20E-02 | 44 | 137 | 24.2 | 7.6 |
| torchvision.models.squeezenet.squeezenet1_1 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 9.77E-04 | 76.6 | 248 | 14 | 4.34 |
| torchvision.models.resnet.resnet18 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 5.86E-03 | 29.4 | 90.2 | 34.7 | 11.4 |
| torchvision.models.resnet.resnet34 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.56E-01 | 15.5 | 50.7 | 64.8 | 20.2 |
| torchvision.models.resnet.resnet50 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 6.45E-02 | 12.4 | 34.2 | 81.7 | 29.8 |
| torchvision.models.resnet.resnet101 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.01E+03 | 7.18 | 19.9 | 141 | 51.1 |
| torchvision.models.resnet.resnet152 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 4.96 | 14.1 | 204 | 72.3 |
| torchvision.models.densenet.densenet121 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.42E-03 | 11.5 | 41.9 | 84.5 | 24.8 |
| torchvision.models.densenet.densenet169 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 5.86E-03 | 8.25 | 33.2 | 118 | 31.2 |
| torchvision.models.densenet.densenet201 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.42E-03 | 6.84 | 25.4 | 141 | 40.8 |
| torchvision.models.densenet.densenet161 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 4.15E-03 | 4.71 | 15.6 | 247 | 65.8 |
| torchvision.models.vgg.vgg11 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.51E-04 | 8.9 | 18.3 | 114 | 55.1 |
| torchvision.models.vgg.vgg13 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.07E-04 | 6.53 | 14.7 | 156 | 68.7 |
| torchvision.models.vgg.vgg16 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 4.58E-04 | 5.09 | 11.9 | 201 | 85.1 |
| torchvision.models.vgg.vgg11_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.81E-04 | 8.74 | 18.4 | 117 | 54.8 |
| torchvision.models.vgg.vgg13_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 5.19E-04 | 6.31 | 14.8 | 162 | 68.5 |
| torchvision.models.vgg.vgg16_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 9.77E-04 | 4.96 | 12 | 207 | 84.3 |
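These rows are produced by the module test harness described in CONTRIBUTING.md. As a minimal timing sketch of what one row measures, assuming torchvision is available (`benchmark_fps` below is a hypothetical helper, not part of torch2trt; the official harness may warm up and time differently):

```python
import time

import torch
import torchvision
from torch2trt import torch2trt

# Model and input matching this table's 'Data Type', 'Input Shapes',
# and 'torch2trt kwargs' columns for the resnet18 row.
model = torchvision.models.resnet18(pretrained=True).cuda().half().eval()
x = torch.randn((1, 3, 224, 224)).cuda().half()
model_trt = torch2trt(model, [x], fp16_mode=True)

def benchmark_fps(module, x, iters=50):
    # Warm up, then time `iters` forward passes to estimate throughput.
    with torch.no_grad():
        for _ in range(10):
            module(x)
        torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(iters):
            module(x)
        torch.cuda.synchronize()
    return iters / (time.time() - t0)

# 'Max Error' is the worst absolute difference between the two outputs.
print('max error: %e' % float(torch.max(torch.abs(model(x) - model_trt(x)))))
print('PyTorch FPS:  %.1f' % benchmark_fps(model, x))
print('TensorRT FPS: %.1f' % benchmark_fps(model_trt, x))
```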

docs/benchmarks/jetson_xavier.md

Lines changed: 33 additions & 0 deletions
# Jetson Xavier

Throughput is reported in frames per second and latency in milliseconds.

| Name | Data Type | Input Shapes | torch2trt kwargs | Max Error | Throughput (PyTorch) | Throughput (TensorRT) | Latency (PyTorch) | Latency (TensorRT) |
|------|-----------|--------------|------------------|-----------|----------------------|-----------------------|-------------------|--------------------|
| torch2trt.tests.torchvision.classification.alexnet | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 7.63E-05 | 251 | 565 | 4.96 | 2.02 |
| torch2trt.tests.torchvision.classification.squeezenet1_0 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 9.77E-04 | 121 | 834 | 8.04 | 1.49 |
| torch2trt.tests.torchvision.classification.squeezenet1_1 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 9.77E-04 | 125 | 1.29e+03 | 8.01 | 1.02 |
| torch2trt.tests.torchvision.classification.resnet18 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 9.77E-03 | 136 | 722 | 7.33 | 1.64 |
| torch2trt.tests.torchvision.classification.resnet34 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.50E-01 | 77.8 | 396 | 12.9 | 2.79 |
| torch2trt.tests.torchvision.classification.resnet50 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.09E-01 | 55.8 | 326 | 17.9 | 3.37 |
| torch2trt.tests.torchvision.classification.resnet101 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 28.3 | 175 | 35.1 | 6.04 |
| torch2trt.tests.torchvision.classification.resnet152 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 18.8 | 122 | 53.2 | 8.57 |
| torch2trt.tests.torchvision.classification.densenet121 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 7.81E-03 | 20.9 | 76.6 | 47.5 | 13 |
| torch2trt.tests.torchvision.classification.densenet169 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.91E-03 | 14.8 | 41.7 | 66.7 | 23.7 |
| torch2trt.tests.torchvision.classification.densenet201 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 4.88E-03 | 12.6 | 30.2 | 79.1 | 33 |
| torch2trt.tests.torchvision.classification.densenet161 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 4.88E-03 | 16.1 | 43.7 | 62.1 | 23 |
| torch2trt.tests.torchvision.classification.vgg11 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.56E-03 | 84.8 | 201 | 12.1 | 5.24 |
| torch2trt.tests.torchvision.classification.vgg13 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.24E-03 | 71.1 | 165 | 14.3 | 6.34 |
| torch2trt.tests.torchvision.classification.vgg16 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 3.78E-03 | 61.5 | 139 | 16.5 | 7.46 |
| torch2trt.tests.torchvision.classification.vgg19 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.81E-03 | 54.1 | 120 | 18.7 | 8.61 |
| torch2trt.tests.torchvision.classification.vgg11_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.20E-03 | 81.5 | 200 | 12.5 | 5.27 |
| torch2trt.tests.torchvision.classification.vgg13_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.71E-03 | 67.5 | 165 | 15.1 | 6.33 |
| torch2trt.tests.torchvision.classification.vgg16_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.87E-03 | 58.3 | 139 | 17.4 | 7.48 |
| torch2trt.tests.torchvision.classification.vgg19_bn | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.44E-03 | 51.4 | 120 | 19.7 | 8.61 |
| torch2trt.tests.torchvision.classification.mobilenet_v2 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 64.8 | 723 | 15.4 | 1.67 |
| torch2trt.tests.torchvision.classification.shufflenet_v2_x0_5 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.53E-05 | 51.2 | 463 | 19.4 | 2.17 |
| torch2trt.tests.torchvision.classification.shufflenet_v2_x1_0 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.53E-05 | 49.4 | 419 | 20.4 | 2.43 |
| torch2trt.tests.torchvision.classification.shufflenet_v2_x1_5 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.53E-05 | 51.4 | 426 | 19.6 | 2.37 |
| torch2trt.tests.torchvision.classification.shufflenet_v2_x2_0 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 1.53E-05 | 48.2 | 419 | 20.8 | 2.48 |
| torch2trt.tests.torchvision.classification.mnasnet0_5 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 2.03E-06 | 67.8 | 883 | 14.9 | 1.4 |
| torch2trt.tests.torchvision.classification.mnasnet0_75 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 67.6 | 751 | 14.8 | 1.6 |
| torch2trt.tests.torchvision.classification.mnasnet1_0 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 65.7 | 667 | 15.2 | 1.77 |
| torch2trt.tests.torchvision.classification.mnasnet1_3 | float16 | [(1, 3, 224, 224)] | {'fp16_mode': True} | 0.00E+00 | 67.4 | 573 | 15 | 2.02 |

docs/css/version-select.css

Lines changed: 5 additions & 0 deletions
```css
@media only screen and (max-width: 76.1875em) {
  #version-selector {
    padding: .6rem .8rem;
  }
}
```

docs/getting_started.md

Lines changed: 32 additions & 0 deletions
# Getting Started

Follow these steps to get started using torch2trt.

!!! note

    torch2trt depends on the TensorRT Python API. On Jetson, this is included with the latest JetPack. For desktop, please follow the [TensorRT Installation Guide](https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html). You may also try installing torch2trt inside one of the NGC PyTorch docker containers for [Desktop](https://ngc.nvidia.com/catalog/containers/nvidia:pytorch) or [Jetson](https://ngc.nvidia.com/catalog/containers/nvidia:l4t-pytorch).

### Install without plugins

To install without compiling plugins, call the following:

```bash
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
python setup.py install
```

### Install with plugins

To install with plugins, which support some PyTorch operations that are not natively supported by TensorRT, call the following:

!!! note

    Please note, this currently only includes the interpolate plugin. This plugin requires PyTorch 1.3+ for serialization.

```bash
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
sudo python setup.py install --plugins
```
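Once installed, basic usage follows the pattern from the torch2trt README. A minimal sketch, assuming torchvision is installed and a CUDA device is available:

```python
import torch
from torch2trt import torch2trt
from torchvision.models.alexnet import alexnet

# Create a regular PyTorch model on the GPU in eval mode.
model = alexnet(pretrained=True).eval().cuda()

# Example data is used to trace the model's execution during conversion.
x = torch.ones((1, 3, 224, 224)).cuda()

# Convert to TensorRT; the result is a TRTModule.
model_trt = torch2trt(model, [x])

# Execute it like the original module.
y_trt = model_trt(x)
```

The returned ``TRTModule`` can be called, saved, and loaded like a regular PyTorch module.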

docs/images/chart.svg

Lines changed: 1 addition & 0 deletions
(SVG image diff not rendered)
