# TRTorch Partitioning

The TRTorch partitioning phase implements the automatic fallback feature in TRTorch. This phase does not run by
default; it only runs when the automatic fallback feature is enabled.
| 5 | + |
At a high level, the TRTorch partitioning phase does the following:
- Segmentation. Go through the set of operators in order and verify whether there is a converter for each operator.
Then, roughly separate the graph into parts that TRTorch can support and parts that it cannot.
- Dependency Analysis. Every operator to be compiled needs a "complete dependency graph", meaning that every input
can be traced back to an input that is a Tensor or TensorList. Go through all segments after segmentation and do
dependency analysis to ensure that TensorRT segments have only Tensor/TensorList inputs and outputs.
- Shape Analysis. For each segment, figure out the input and output shapes, starting from the input shapes provided
by the user. Shapes can be calculated by running the graphs with JIT.
- Conversion. Every TensorRT segment is converted to a TensorRT engine. This part is done in compiler.cpp, but
it's still a phase in our partitioning process.
- Stitching. Stitch all TensorRT engines together with the PyTorch nodes.

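The segmentation step above can be sketched in a few lines of Python. This is a minimal illustration only, not TRTorch's actual implementation; the `segment` function, the op names, and the `supported` set are hypothetical stand-ins for the real converter registry.

```python
# Hypothetical sketch of segmentation: walk the ordered op list and group
# consecutive ops by whether a converter exists for them.
def segment(ops, supported):
    """Split `ops` into alternating ("tensorrt"/"pytorch", [ops...]) segments."""
    segments = []
    for op in ops:
        target = "tensorrt" if op in supported else "pytorch"
        if segments and segments[-1][0] == target:
            segments[-1][1].append(op)  # extend the current segment
        else:
            segments.append((target, [op]))  # start a new segment
    return segments

ops = ["aten::conv2d", "aten::relu", "aten::foo", "aten::linear"]
supported = {"aten::conv2d", "aten::relu", "aten::linear"}
print(segment(ops, supported))
# [('tensorrt', ['aten::conv2d', 'aten::relu']),
#  ('pytorch', ['aten::foo']),
#  ('tensorrt', ['aten::linear'])]
```

The unsupported `aten::foo` op breaks the graph into two TensorRT segments with a PyTorch segment in between, which is what the later stitching phase reassembles.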
Here is a brief description of the functionality of each file:
- PartitionInfo.h/cpp: The automatic fallback APIs that are used for partitioning.
- SegmentedBlock.h/cpp: The main data structures that are used to maintain information for each segment after segmentation.
- shape_analysis.h/cpp: Implementation that gets the shapes for each segment by running them in JIT.
- partitioning.h/cpp: APIs and main implementation of the partitioning phase.

### Automatic Fallback
To enable the automatic fallback feature, you can set the following attributes in Python:
```python
import torch
import trtorch

...
model = MyModel()
ts_model = torch.jit.script(model)
trt_model = trtorch.compile(ts_model, {
    ...
    "torch_fallback": {
        "enabled": True,
        "min_block_size": 1,
        "forced_fallback_ops": ["aten::foo"],
    }
})
```
- enabled: Automatic fallback is off by default; setting this to True enables it.
- min_block_size: The minimum number of consecutive supported operations required for a segment to be converted to
TensorRT. For example, if it's set to 3, there must be 3 consecutive supported operators for that segment to be converted.
- forced_fallback_ops: A list of names of operations that the user explicitly wants to run in PyTorch nodes.

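The effect of `min_block_size` can be illustrated with a small sketch. This is a hypothetical helper, not TRTorch's real code: given a per-op supported/unsupported flag list, only runs of consecutive supported ops at least `min_block_size` long would be converted.

```python
# Hypothetical illustration of min_block_size: runs of consecutive supported
# ops shorter than min_block_size fall back to PyTorch instead of TensorRT.
def runs_to_convert(supported_flags, min_block_size):
    """Return (start, length) for each supported run long enough to convert."""
    runs, start = [], None
    for i, ok in enumerate(supported_flags + [False]):  # sentinel closes the last run
        if ok and start is None:
            start = i  # a supported run begins
        elif not ok and start is not None:
            length = i - start
            if length >= min_block_size:
                runs.append((start, length))
            start = None  # the run ended
    return runs

flags = [True, True, False, True, True, True]
print(runs_to_convert(flags, 3))  # [(3, 3)] -- the 2-op run is too short
```

With `min_block_size = 3`, the first run of two supported ops stays in PyTorch, avoiding the overhead of a tiny TensorRT engine.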
To enable the automatic fallback feature in C++, the following APIs can be used:

```c++
#include "torch/script.h"
#include "trtorch/trtorch.h"

...
auto in = torch::randn({1, 3, 224, 224}, {torch::kCUDA});

auto mod = torch::jit::load("trt_ts_module.ts");
auto input_sizes = std::vector<trtorch::CompileSpec::InputRange>{{in.sizes()}};
trtorch::CompileSpec cfg(input_sizes);
cfg.torch_fallback = trtorch::CompileSpec::TorchFallback(true);
cfg.torch_fallback.min_block_size = 1;
cfg.torch_fallback.forced_fallback_ops.push_back("aten::foo");
auto trt_mod = trtorch::CompileGraph(mod, cfg);
auto out = trt_mod.forward({in});
```