# TRTorch Partitioning

The TRTorch partitioning phase supports the `automatic fallback` feature in TRTorch. This phase won't run by
default; it only runs when the automatic fallback feature is enabled.

At a high level, the TRTorch partitioning phase does the following:
- `Segmentation`. Go through the set of operators in order and verify whether there is a converter for each operator.
Then roughly separate the graph into parts that TRTorch can support and parts it cannot (see the sketch after this list).
- `Dependency Analysis`. For every operator to be compiled there is a "complete dependency graph", which means that
every input can be traced back to an input that is a Tensor or TensorList. Go through all segments after segmentation and
do dependency analysis to ensure that there are only Tensor/TensorList inputs and outputs for TensorRT segments.
- `Shape Analysis`. For each segment, figure out the input and output shapes starting from the input shapes provided
by the user. Shapes can be calculated by running the graphs with JIT.
- `Conversion`. Every TensorRT segment is converted to a TensorRT engine. This part is done in compiler.cpp, but
it's still a phase in our partitioning process.
- `Stitching`. Stitch all TensorRT engines together with the PyTorch nodes into the final graph.
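
To make the segmentation step concrete, here is a minimal sketch of the idea. The helper `isNodeSupported` and the
`Segment` struct are hypothetical stand-ins, not the actual `partitioning.cpp`/`SegmentedBlock.h` interfaces:

```c++
#include <vector>

#include "torch/csrc/jit/ir/ir.h"

// Walk the nodes in order and group consecutive nodes by whether a TensorRT
// converter is registered for them. The real implementation also applies
// min_block_size and forced fallback ops (see Automatic Fallback below).
bool isNodeSupported(torch::jit::Node* node); // assumed converter lookup

struct Segment {
  bool is_tensorrt;                     // target: TensorRT engine or PyTorch
  std::vector<torch::jit::Node*> nodes; // nodes belonging to this segment
};

std::vector<Segment> segmentGraph(torch::jit::Block* block) {
  std::vector<Segment> segments;
  for (auto node : block->nodes()) {
    bool supported = isNodeSupported(node);
    // Start a new segment whenever converter support flips
    if (segments.empty() || segments.back().is_tensorrt != supported) {
      segments.push_back({supported, {}});
    }
    segments.back().nodes.push_back(node);
  }
  return segments;
}
```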

Test cases for each of these components can be found [here](https://github.com/NVIDIA/TRTorch/tree/master/tests/core/partitioning).

Here is a brief description of the functionality of each file:
- `PartitionInfo.h/cpp`: The automatic fallback APIs that are used for partitioning.
- `SegmentedBlock.h/cpp`: The main data structures used to maintain information for each segment after segmentation.
- `shape_analysis.h/cpp`: Code implementation to get the shapes for each segment by running them in JIT (see the sketch below).
- `partitioning.h/cpp`: APIs and main code implementation for the partitioning phase.
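
As a rough illustration of how segment shapes can be derived by running a segment with JIT, here is a minimal
sketch. The function `inferOutputShapes` is hypothetical and simplified; the real logic lives in `shape_analysis.cpp`:

```c++
#include <memory>
#include <vector>

#include "torch/csrc/jit/runtime/graph_executor.h"
#include "torch/script.h"

// Run a segment's graph with dummy tensors of the user-provided input shapes
// and read the output shapes off the results.
std::vector<std::vector<int64_t>> inferOutputShapes(
    std::shared_ptr<torch::jit::Graph> graph,
    const std::vector<std::vector<int64_t>>& input_shapes) {
  torch::jit::Stack stack;
  for (const auto& shape : input_shapes) {
    stack.emplace_back(torch::randn(shape, torch::kCUDA)); // dummy input
  }
  torch::jit::GraphExecutor executor(graph, "segment");
  executor.run(stack); // execute the segment with the JIT interpreter
  // After run(), the stack holds the segment's outputs
  std::vector<std::vector<int64_t>> output_shapes;
  for (const auto& out : stack) {
    output_shapes.push_back(out.toTensor().sizes().vec());
  }
  return output_shapes;
}
```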

### Automatic Fallback

To enable the automatic fallback feature, you can set the following attributes in Python:
```python
import torch
import trtorch

...
model = MyModel()
ts_model = torch.jit.script(model)
trt_model = trtorch.compile(ts_model, {
    ...
    "torch_fallback" : {
        "enabled" : True,
        "min_block_size" : 3,
        "forced_fallback_ops": ["aten::add"],
    }
})
```
- `enabled`: Automatic fallback is off by default. It is enabled by setting this to True.
- `min_block_size`: The minimum number of consecutive supported operations for a segment to be converted to TensorRT.
For example, if it's set to 3, there must be at least 3 consecutive supported operators for that segment to be converted.
- `forced_fallback_ops`: A list of strings naming the operations that the user explicitly wants to run as PyTorch nodes.

To enable the automatic fallback feature in C++, the following APIs can be used:

```c++
#include "torch/script.h"
#include "trtorch/trtorch.h"

...
auto in = torch::randn({1, 3, 224, 224}, {torch::kCUDA});

// Load a TorchScript module and build a compile spec from the input sizes
auto mod = torch::jit::load("trt_ts_module.ts");
auto input_sizes = std::vector<trtorch::CompileSpec::InputRange>{{in.sizes()}};
trtorch::CompileSpec cfg(input_sizes);
// Enable automatic fallback and configure it
cfg.torch_fallback = trtorch::CompileSpec::TorchFallback(true);
cfg.torch_fallback.min_block_size = 2;
cfg.torch_fallback.forced_fallback_ops.push_back("aten::relu");
auto trt_mod = trtorch::CompileGraph(mod, cfg);
auto out = trt_mod.forward({in});
```
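
The returned `trt_mod` is a standard TorchScript module in which supported segments have been replaced by TensorRT
engines, so it can be saved with `trt_mod.save(...)` and executed like any other TorchScript module (running a saved
module requires the TRTorch runtime to be linked).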