Releases: TexasInstruments/edgeai-tidl-tools
11_02_04_00
New in this Release
| Description | Notes |
|---|---|
| Module safety for TIDL-RT Inference for AM69A/J784S4 | |
| Support for several new operators: ArgMin, Expand, Min, ReduceMean, ReduceSum, Swish | |
| Support for 5x5s4 and 3x3s2 deconvolution | |
| Increased the maximum number of inputs and outputs per model to 32 inputs and 32 outputs per core | |
| Benchmarking API for ONNX Runtime C++ interface | |
| Publicly document build instructions for onnxruntime wheels and libraries | Refer |
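Many entries in these notes reference TIDL user compilation options such as `tensor_bits` and `advanced_options:*`. As a hedged illustration of how these options are typically collected in the OSRT Python flow — option names are taken verbatim from entries in these notes, while the paths and layer names below are hypothetical placeholders:

```python
# Hypothetical sketch of the user compilation options referenced throughout
# these release notes (tensor_bits, debug_level, advanced_options:*).
# The paths and layer names are placeholders, not from any real model.
compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",    # placeholder path
    "artifacts_folder": "./model-artifacts",     # placeholder path
    "debug_level": 0,
    "tensor_bits": 8,
    "advanced_options:quantization_scale_type": 1,
    "advanced_options:add_data_convert_ops": 3,
    "advanced_options:output_feature_16bit_names_list": "conv_a,conv_b",  # placeholder layer names
}
```

In the repository's OSRT examples, a dictionary like this is handed to the TIDL compilation provider; consult the edgeai-tidl-tools documentation for the authoritative option list and defaults.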
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 |
| TIDL-6856 | Models with Convolution operator having 3x1 kernel_shape and single input & output channel fail in compilation | All except AM62 |
| TIDL-6866 | Models compiled with option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 |
| TIDL-7032 | Models with Convolution operator where padding is >= kernel size results in hang on device | All except AM62 |
| TIDL-7249 | Quantization prototxt bias values changing based on calibration data | All except AM62 |
| TIDL-7298 | Models with Gather operator having constant indices results in incorrect quantized values | All except AM62 |
| TIDL-7306 | Models with ScatterElements operator where the indices or updates tensor is a constant initializer produce incorrect results | All except AM62 |
| TIDL-7418 | QDQ layers where weight is shared by multiple layers results in model compilation failure | All except AM62 |
| TIDL-7448 | Convolution operator gives wrong output shape when kernel shape is 2x2, strides are 2x2 and (PadH, PadW) exceeds (1, 1) | All except AM62 |
| TIDL-7450 | Models with Pad operator where constant value is non-zero and a floating-point number, produce incorrect results | All except AM62 |
| TIDL-7518 | Models with Squeeze operator fail with: "ERROR: Requested constant tensor 1 in is not found" when axes input is not provided | All except AM62 |
| TIDL-7523 | Models having batch size > 1 give inconsistent performance estimates | All except AM62 |
| TIDL-7548 | Performance estimation artifacts incorrectly report the shape of Data layers | All except AM62 |
| TIDL-7565 | Trigonometric activation functions produce incorrect outputs in the asymptotic regions | All except AM62 |
| TIDL-7572 | Convolution operator gives incorrect output shape when auto_pad attribute is SAME_UPPER or SAME_LOWER | All except AM62 |
| TIDL-7573 | Models with ScatterND operator having negative indices result in a crash on device | All except AM62 |
| TIDL-7574 | Models with ScatterElements operator where all input indices are negative, results in a segmentation fault in inference | All except AM62 |
| TIDL-7576 | Models with InstanceNormalization operator where number of input dimensions is not equal to 4, produce incorrect results | All except AM62 |
| TIDL-7577 | Models with ScatterND operator where indices tensor width is 1, produce incorrect results | All except AM62 |
| TIDL-7581 | Models with LeakyRelu operator where alpha != 1 produce unstable results | All except AM62 |
| TIDL-7588 | Models with ScatterND operator with >4D input shape produce incorrect results in 8-bit host emulation | All except AM62 |
| TIDL-7842 | Interrupt Signal (SIGINT) is not handled in the TIDL-RT test application, causing the C7x to enter a bad state | All except AM62 |
| TIDL-7869 | ONNXRUNTIME does not have proper check for dynamic library loading failures | All except AM62 |
| TIDL-7927 | Transpose layer consumed by multiple MatMul operators fails in import saying "Network Optimization failed" | All except AM62 |
| TIDL-7944 | Div operator with input B containing near-zero values results in poor accuracy in 16bit | All except AM62 |
| TIDL-8006 | Performance estimates for Gather and GridSample layer are incorrect | All except AM62 |
| TIDL-8009 | Models with Pad operator where pads tensor values are negative throws Segmentation fault during model inference | All except AM62 |
| TIDL-8010 | Models with Pad operator with >4D input shape produce incorrect results in 8-bit host emulation | All except AM62 |
| TIDL-8024 | Models with ScatterND operator with >4D input shape produce incorrect results in 8-bit host emulation | All except AM62 |
| TIDL-8028 | Models with Convolution operator where kernel size is (NxN), strides are (NxN) and input height and/or width dimension are not evenly divisible by the kernel size | All except AM62 |
| TIDL-8493 | Models compiled with quantization_style=4 and with a layernorm operator whose outputs are all negative and the same value result in functionally incorrect outputs | All except AM62 |
| TIDL-8571 | Gather operator with scalar indices results in incorrect output shape in compiled network | All except AM62 |
| TIDL-8829 | Pool operator when auto_pad is SAME_UPPER or SAME_LOWER results in incorrect output shape in compiled network | All except AM62 |
| TIDL-8874 | Deformable convolution expressed as a sequence of nodes does not get correctly optimized to a single deformable convolution block during model compilation | All except AM62 |
| TIDL-8830 | Deformable Convolution pattern where the offset and mask are generated from a single convolution layer is not fused into a Deformable Convolution layer | All except AM62 |
| TIDL-8872 | Deformable Convolution results in wrong output if any dimension above channel is non-singleton | All except AM62 |
| TIDL-8876 | ConvTranspose operator triggers a segmentation fault in 16-bit model compilation when the total output size exceeds int32 max value | All except AM62 |
| TIDL-8903 | Div operator is incorrectly mapped to BatchNorm layer in compiled model when the constant input in Div is not along channel dimension, resulting in incorrect results | All except AM62 |
| TIDL-12101 | Models with Slice operator having int32 inputs result in a hang on device | All except AM62 |
| TIDL-12404 | Models with Transpose operator having >3D input shape produce incorrect results in 16bit on device | All except AM62 |
| TIDL-12441 | BiasScale gets incorrectly populated for non-linear activations and displays random values in SVG/Artifacts | All except AM62 |
| TIDL-12472 | BatchNorm operator with input shape [N, C] is incorrectly removed from the network causing a crash during model compilation | All except AM62 |
| TIDL-12473 | Models with patch embedding convolution (NxNsN) with multiple output consumers results in functionally incorrect outputs | All except AM62 |
| TIDL-12496 | Models with Softmax operator having dimension larger than 1024 along axis produce incorrect results when compiled with the option 'advanced_options:quantization_scale_type' set to 1 | All except AM62 |
| TIDL-12507 | Model with Split/Slice layer which has at least one output not being consumed might result in a failure during model compilation | All except AM62 |
| TIDL-12510 | edgeai-tidl-tools OOB model cl-ort-resnet18-v1_4batch does not compile on REL.TIDL.11.01.08.00 | All except AM62 |
| TIDL-12527 | Models with "Max" operator with a constant input whose size is <= channel dimension of the variable input produce incorrect results | All except AM62 |
| TIDL-12569 | Models with Transpose operator when compiled in 16bit produce incorrect results on device in 11.02.00.01 | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
| TIDL-3780 | Prototxt-based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
| TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
| TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
11_01_07_00
New in this Release
| Description | Notes |
|---|---|
| Improved SVG viewer for TIDL Compiled Models | Compiled artifacts will now include an HTML file with the generated SVG embedded, replacing the plain SVG. This interactive viewer provides advanced features, such as clicking on nodes, exploring input/output connections, and seamless navigation. The older SVG can still be downloaded from the HTML |
| Added support for [M,N] output shape Gemm operator | |
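The Gemm entry above follows the ONNX Gemm definition, Y = alpha * op(A) @ op(B) + beta * C, with an [M, N] output. A minimal numpy sketch of those semantics (shapes chosen purely for illustration):

```python
import numpy as np

# ONNX Gemm semantics: Y = alpha * op(A) @ op(B) + beta * C,
# where op() optionally transposes via transA/transB.
def gemm(A, B, C=0.0, alpha=1.0, beta=1.0, transA=False, transB=False):
    A = A.T if transA else A
    B = B.T if transB else B
    return alpha * (A @ B) + beta * C

A = np.ones((2, 3))   # [M, K]
B = np.ones((3, 4))   # [K, N]
C = np.zeros((2, 4))  # broadcastable to [M, N]
Y = gemm(A, B, C)     # output shape [M, N] = [2, 4]
```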
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-7560 | Matmul gives error "Failed to run calibration pass" in compilation when input B has larger dimensions than input A | All except AM62 |
| TIDL-7818 | Gridsample with bilinear mode has sub-optimal performance | All except AM62 |
| TIDL-8880 | Matmul with quantization_style=4 and a constant tensor as input results in a functional issue on device when dimensions above channel are > 1 | All except AM62 |
| TIDL-12403 | Softmax gives wrong result when the input is 255 in 8 bit | All except AM62 |
| TIDL-12413 | Models compiled with quantization_style=4 and which have a matmul layer with bias fused into it, where the volume of the sum of inputs > 224KB results in incorrect outputs on device | All except AM62 |
| TIDL-12427 | Models with partial QDQ were incorrectly not starting calibration during model compilation | All except AM62 |
| TIDL-12453 | Matmul with a broadcasted dimension above the channel dimension (3rd dimension) results in a functional mismatch on EVM | All except AM62 |
| TIDL-12464 | Incorrect output element type set for TIDL Data Convert layer when a model is imported with "advanced_options:prequantized_model" and "advanced_options:add_data_convert_ops > 1" and input layer doesn't have any QDQ layers | All except AM62 |
| TIDL-12497 | Inference gives different results between host emulation and target due to incorrect classification of MatMul layers into asymmetric flow | All except AM62 |
| TIDL-12499 | Pad layer results in incorrect outputs on device when only right pad = 1 and all other pad values are 0 | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
| TIDL-3780 | Prototxt-based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
| TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
| TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 | Rare | None |
| TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
| TIDL-4024 | QDQ models with self-attention blocks error out during model compilation with "RUNTIME_EXCEPTION : Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: CHECK failed: (index) < (current_size_)") | All except AM62 | Rare | None |
| TIDL-4625 | TIDL QDQ model import fails with "[PARSER] ERROR: Unable to merge Quantize" error when the model has unsupported nodes | All except AM62 | Rare | None |
| TIDL-4699 | Eltwise Mul in ONNX QDQ models results in poorer accuracy compared to ONNX's QLinear implementation of the same | All except AM62 | Rare | None |
| TIDL-6866 | Using option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 | Rare | None |
| TIDL-7108 | Model compilation fails for OD networks which have conditional subgraph blocks | All except AM62 | Rare | None |
| TIDL-7298 | [Gather] Mismatched typecasting for const layer indices during quantization | All except AM62 | Rare | None |
| TIDL-7418 | QDQ layers where weight is shared by multiple layers results in model compilation failure | All except AM62 | Rare | None |
| TIDL-7423 | Broadcast mul has an inconsistent jump in latency with increase in dimensions (When dimensions are not factorizable) | All except AM62 | Rare | None |
| TIDL-7593 | Inference runs on the device once after a reboot but hangs on subsequent runs | All except AM62 | Rare | None |
| TIDL-7842 | Incorrect cleanup after a ctrl-C on EVM causes a crash | All except AM62 | Rare | None |
| TIDL-8021 | TIDL import fails when quant params prototxt is used in combination with a onnx QDQ model | All except AM62 | Rare | None |
| TIDL-8902 | Convolution with non standard pad values results in EVM Hang issues | All except AM62 | Rare | None |
| TIDL-12101 | Slice with INT32 inputs results in a hang on target | All except AM62 | Rare | None |
| TIDL-12412 | Accuracy collapses for large number of calibration frames during model compilation PTQ (With default accuracy_level) | All except AM62 | Rare | None |
| TIDL-12510 | edgeai-tidl-tools OOB model cl-ort-resnet18-v1_4batch does not compile | AM69A | Always | None |
11_01_06_00
New in this Release
| Description | Notes |
|---|---|
| Module safety for TIDL-RT Inference for AM68A/J721S2 | |
| Support for new operators: ScatterElements, SiLU | |
| Added support for FastVit and Deformable DETR architectures | |
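SiLU, one of the operators added above, is defined as silu(x) = x * sigmoid(x) (equivalent to Swish with beta = 1). A small numpy sketch of the reference math:

```python
import numpy as np

# SiLU (a.k.a. Swish with beta = 1): silu(x) = x * sigmoid(x)
def silu(x):
    return x * (1.0 / (1.0 + np.exp(-x)))

out = silu(np.array([0.0, 1.0, -1.0]))
```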
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 |
| TIDL-3411 | ONNX models which have the same input and output names result in a segmentation fault | All except AM62 |
| TIDL-7308 | TIDL optimizer fuses MatMul output side transpose into MatMul even though MatMul has more than one consumer in the network | All except AM62 |
| TIDL-7342 | Slice Layer with Dim1 or Dim2 greater than 1 results in a hang on EVM | All except AM62 |
| TIDL-7366 | Pooling layer pads are not correctly getting parsed in TIDL | All except AM62 |
| TIDL-7412 | [C7x-MULTI-CORE] Multi core Low latency mode may have functionally wrong result for networks having maxpooling layer with ceilmode set | All except AM62 |
| TIDL-7434 | Gemm layer gives seg fault in compilation when input A and B are variable and input C is constant | All except AM62 |
| TIDL-7527 | [ConvTranspose] ConvTranspose layer gives wrong output shape when dilation > 1 | All except AM62 |
| TIDL-7531 | TIDL-RT inserts padding between channels, which the application user must remove from the final output data of the network | All except AM62 |
| TIDL-7535 | [ConvTranspose] ConvTranspose gives functionally wrong output when group > 1 | All except AM62 |
| TIDL-7539 | [ConvTranspose] ConvTranspose layer gives wrong output shape when auto_pad is SAME_UPPER or SAME_LOWER | All except AM62 |
| TIDL-7541 | [ConvTranspose] ConvTranspose gives functionally wrong output when kernel shape is 3x3 with stride 2x2 and dilation 1x1 | All except AM62 |
| TIDL-7559 | Transpose layer is wrongly fused into MatMul, causing incorrect layer output shape | All except AM62 |
| TIDL-7561 | Supported ReduceMax/Min (axis along height) is incorrectly denied in TIDL | All except AM62 |
| TIDL-7568 | Model optimizer fails with "Error in topologically sorting the network" when imported model has Flatten layer with producer Reshape Layer having multiple consumers | All except AM62 |
| TIDL-7585 | [ConvTranspose] For Tflite models, Deconvolution layer is offloaded to ARM with message "Layer type not supported by TIDL" | All except AM62 |
| TIDL-7867 | TIDL-RT inference on target may abruptly call TIDL_deactivate when a network has a reduce layer with low spatial volume (< 256 KB for AM62A, J722S and < 448 KB for other devices) | All except AM62 |
| TIDL-7891 | GridSample with 16-bit inputs produces different results between Host emulation and target run | All except AM62 |
| TIDL-7902 | Functional mismatch when outputFeature16bitNamesList and params16bitNamesList is set on board | All except AM62 |
| TIDL-7909 | Deconv/ConvTranspose layers with very high plane size (greater than 64KB) can result in mismatch between host and target execution | All except AM62 |
| TIDL-7924 | Convolution (which is pushed to 16-bit via mixed precision) with weight size greater than 2MB results in a hang on EVM | All except AM62 |
| TIDL-7929 | TopK import gives error during calibration due to incorrect data types for added slice layer | All except AM62 |
| TIDL-7940 | Gather with indices coming from TopK layer does not import | All except AM62 |
| TIDL-7941 | Update GridSample documentation for various limitations | All except AM62 |
| TIDL-7995 | Fixed point output of Sub operator has a constant shift (When one of the inputs is a constant) compared to floating point output | All except AM62 |
| TIDL-7999 | Missing ranges for newly added layers during TIDL import of an ONNX QDQ model resulting in poor accuracy during inference | All except AM62 |
| TIDL-8008 | Max Op 16bit mismatch on device for SDK 10.1/11.01.02 backport | All except AM62 |
| TIDL-8017 | Valid Convolution/Pooling with Batch dimension results in hang on EVM when inferenceMode = Low_Latency | All except AM62 |
| TIDL-8022 | TIDL outputs zeros for indices output from TopK | All except AM62 |
Known Issues
| ID | Description ...
11_01_05_00
New in this Release
| Description | Notes |
|---|---|
| Module safety for TIDL-RT Inference for AM68A/J721S2 | |
| Support for new operators: ScatterElements, SiLU | |
| Added support for FastVit and Deformable DETR architectures | |
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 |
| TIDL-3411 | ONNX models which have the same input and output names result in a segmentation fault | All except AM62 |
| TIDL-7308 | TIDL optimizer fuses MatMul output side transpose into MatMul even though MatMul has more than one consumer in the network | All except AM62 |
| TIDL-7342 | Slice Layer with Dim1 or Dim2 greater than 1 results in a hang on EVM | All except AM62 |
| TIDL-7366 | Pooling layer pads are not correctly getting parsed in TIDL | All except AM62 |
| TIDL-7412 | [C7x-MULTI-CORE] Multi core Low latency mode may have functionally wrong result for networks having maxpooling layer with ceilmode set | All except AM62 |
| TIDL-7434 | Gemm layer gives seg fault in compilation when input A and B are variable and input C is constant | All except AM62 |
| TIDL-7527 | [ConvTranspose] ConvTranspose layer gives wrong output shape when dilation > 1 | All except AM62 |
| TIDL-7531 | TIDL-RT inserts padding between channels, which the application user must remove from the final output data of the network | All except AM62 |
| TIDL-7535 | [ConvTranspose] ConvTranspose gives functionally wrong output when group > 1 | All except AM62 |
| TIDL-7539 | [ConvTranspose] ConvTranspose layer gives wrong output shape when auto_pad is SAME_UPPER or SAME_LOWER | All except AM62 |
| TIDL-7541 | [ConvTranspose] ConvTranspose gives functionally wrong output when kernel shape is 3x3 with stride 2x2 and dilation 1x1 | All except AM62 |
| TIDL-7559 | Transpose layer is wrongly fused into MatMul, causing incorrect layer output shape | All except AM62 |
| TIDL-7561 | Supported ReduceMax/Min (axis along height) is incorrectly denied in TIDL | All except AM62 |
| TIDL-7568 | Model optimizer fails with "Error in topologically sorting the network" when imported model has Flatten layer with producer Reshape Layer having multiple consumers | All except AM62 |
| TIDL-7585 | [ConvTranspose] For Tflite models, Deconvolution layer is offloaded to ARM with message "Layer type not supported by TIDL" | All except AM62 |
| TIDL-7867 | TIDL-RT inference on target may abruptly call TIDL_deactivate when a network has a reduce layer with low spatial volume (< 256 KB for AM62A, J722S and < 448 KB for other devices) | All except AM62 |
| TIDL-7891 | GridSample with 16-bit inputs produces different results between Host emulation and target run | All except AM62 |
| TIDL-7902 | Functional mismatch when outputFeature16bitNamesList and params16bitNamesList is set on board | All except AM62 |
| TIDL-7909 | Deconv/ConvTranspose layers with very high plane size (greater than 64KB) can result in mismatch between host and target execution | All except AM62 |
| TIDL-7924 | Convolution (which is pushed to 16-bit via mixed precision) with weight size greater than 2MB results in a hang on EVM | All except AM62 |
| TIDL-7929 | TopK import gives error during calibration due to incorrect data types for added slice layer | All except AM62 |
| TIDL-7940 | Gather with indices coming from TopK layer does not import | All except AM62 |
| TIDL-7941 | Update GridSample documentation for various limitations | All except AM62 |
| TIDL-7995 | Fixed point output of Sub operator has a constant shift (When one of the inputs is a constant) compared to floating point output | All except AM62 |
| TIDL-7999 | Missing ranges for newly added layers during TIDL import of an ONNX QDQ model resulting in poor accuracy during inference | All except AM62 |
| TIDL-8008 | Max Op 16bit mismatch on device for SDK 10.1/11.01.02 backport | All except AM62 |
| TIDL-8017 | Valid Convolution/Pooling with Batch dimension results in hang on EVM when inferenceMode = Low_Latency | All except AM62 |
| TIDL-8022 | TIDL outputs zeros for indices output from TopK | All except AM62 |
Known Issues
| ID | Description ...
11_00_08_00
New in this Release
| Description | Notes |
|---|---|
| Enhanced support for Gridsample | |
| Several bug fixes (Noted below) | |
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-7821 | GridSample with large dimensions results in a compilation failure | All except AM62 |
| TIDL-7597 | Models with large number of inputs & with debug_level=4 in OSRT flow result in TIOVX graph formation failure | All except AM62 |
| TIDL-7595 | Output of pad layer on target is incorrect if width > 65535 | All except AM62 |
| TIDL-7592 | Incorrect documentation of depth-wise separable convolution support | All except AM62 |
| TIDL-7558 | Gridsample results in incorrect output in 8-bit when mode is nearest and input width/height is >128 | All except AM62 |
| TIDL-7547 | Model import fails to detect LayerNorm pattern if the model only has that pattern | All except AM62 |
| TIDL-7536 | Model compilation errors out with message "Layer type not supported by TIDL" if the imported network contains Max layer | All except AM62 |
| TIDL-7517 | Slice layer throws Segmentation fault during model compilation when input Ends > InputDims[axes] | All except AM62 |
| TIDL-7455 | AveragePool layer throws: Floating point exception (core dumped) when Ceil mode =1 | All except AM62 |
| TIDL-7095 | TIDL does not respect selectLastIndex attribute of ArgMax operator | All except AM62 |
| TIDL-3639 | Convolution with small number of output channels and small coefficient width may fail on target | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 | Rare | None |
| TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
| TIDL-3780 | Prototxt-based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
| TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
| TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 | Rare | None |
| TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
| TIDL-4024 | QDQ models with self-attention blocks error out during model compilation with "RUNTIME_EXCEPTION : Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: CHECK failed: (index) < (current_size_)") | All except AM62 | Rare | None |
| TIDL-7108 | Model compilation fails for OD networks which have conditional subgraph blocks | All except AM62 | Rare | None |
| TIDL-7131 | Denial reason for maxpool layer is not consistent with layer's configuration | All except AM62 | Rare | None |
| TIDL-7366 | Pooling layer pads are not correctly getting parsed in TIDL | All except AM62 | Rare | None |
| TIDL-7418 | QDQ layers where weight is shared by multiple layers results in model compilation failure | All except AM62 | Rare | None |
DISCLAIMER
- Please refer to the Version Compatibility Table for TI SDK releases compatible with this version
- Please follow the setup steps for downloading & setting up required dependencies for this version (11_00_08_00)
11_00_07_00
New in this Release
| Description | Notes |
|---|---|
| Enhanced support for Gridsample | |
| Several bug fixes (Noted below) | |
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-7821 | GridSample with large dimensions results in a compilation failure | All except AM62 |
| TIDL-7597 | Models with large number of inputs & with debug_level=4 in OSRT flow result in TIOVX graph formation failure | All except AM62 |
| TIDL-7595 | Output of pad layer on target is incorrect if width > 65535 | All except AM62 |
| TIDL-7592 | Incorrect documentation of depth-wise separable convolution support | All except AM62 |
| TIDL-7558 | Gridsample results in incorrect output in 8-bit when mode is nearest and input width/height is >128 | All except AM62 |
| TIDL-7547 | Model import fails to detect LayerNorm pattern if the model only has that pattern | All except AM62 |
| TIDL-7536 | Model compilation errors out with message "Layer type not supported by TIDL" if the imported network contains Max layer | All except AM62 |
| TIDL-7517 | Slice layer throws Segmentation fault during model compilation when input Ends > InputDims[axes] | All except AM62 |
| TIDL-7455 | AveragePool layer throws: Floating point exception (core dumped) when Ceil mode =1 | All except AM62 |
| TIDL-7095 | TIDL does not respect selectLastIndex attribute of ArgMax operator | All except AM62 |
| TIDL-3639 | Convolution with small number of output channels and small coefficient width may fail on target | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-2990 | PReLU layer does not correctly parse the slope parameter and produces incorrect outputs | All except AM62 | Rare | None |
| TIDL-3622 | Quantization prototxt does not correctly fill information for tflite const layers | All except AM62 | Rare | None |
| TIDL-3780 | Prototxt-based scale input may result in slight degradation in quantized output | All except AM62 | Rare | None |
| TIDL-3845 | Running model compilation and inference back to back in the same python script results in a segfault | All except AM62 | Rare | None |
| TIDL-3886 | Maxpool 2x2 with stride 1x1 is considered supported but is incorrectly denied from being offloaded to C7x | All except AM62 | Rare | None |
| TIDL-3905 | TFLite Prequantized models with "add_dataconvert_ops": 3 fails with error "Unable to split bias" | All except AM62 | Rare | None |
| TIDL-4024 | QDQ models with self-attention blocks error out during model compilation with "RUNTIME_EXCEPTION : Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: CHECK failed: (index) < (current_size_)") | All except AM62 | Rare | None |
| TIDL-7108 | Model compilation fails for OD networks which have conditional subgraph blocks | All except AM62 | Rare | None |
| TIDL-7131 | Denial reason for maxpool layer is not consistent with layer's configuration | All except AM62 | Rare | None |
| TIDL-7366 | Pooling layer pads are not correctly getting parsed in TIDL | All except AM62 | Rare | None |
| TIDL-7418 | QDQ layers where weight is shared by multiple layers results in model compilation failure | All except AM62 | Rare | None |
11_00_06_00
New in this Release
| Description | Notes |
|---|---|
| Enhanced support for transpose (Up to 6D Permute) | |
| Support for several new operators: Unsqueeze, Acos, Atan, Cos, CosH, ELU, Neg, Tan, TanH | |
| Enhanced broadcast capabilities for MatMul, Add & Mul operators | |
| Support for OD Meta-Arch for MobileNetv2-SSD (Torchvision) & FastBEV | |
| TIDL Host emulation supports runtime redirection of temporary buffers to a specified path (Instead of /dev/shm) | |
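The enhanced broadcast support listed above follows standard numpy-style broadcasting: MatMul broadcasts the dimensions above the last two, while Add and Mul broadcast elementwise. An illustrative numpy sketch (shapes are arbitrary examples, not TIDL limits):

```python
import numpy as np

# MatMul: batch dims (4, 1) and (1, 5) broadcast to (4, 5);
# the last two dims multiply as (2, 3) @ (3, 6) -> (2, 6).
A = np.ones((4, 1, 2, 3))
B = np.ones((1, 5, 3, 6))
Y = np.matmul(A, B)       # shape (4, 5, 2, 6)

# Add/Mul: plain elementwise broadcasting.
x = np.ones((2, 3, 4))
b = np.ones((1, 1, 4))    # broadcast across the leading dims
s = x + b                 # shape (2, 3, 4)
```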
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-3906 | TIDLRT_Create call fails for multiple threads/processes in parallel | All except AM62 |
| TIDL-4329 | Network with DepthToSpace layer without a preceding convolution is preserved in the TIDL graph and causes a Floating point exception during model compilation flow | All except AM62 |
| TIDL-7026 | Model inference gives wrong results when the imported network has a convolution layer with stride > 1, kernel size = 1x1 and padding > 0 | All except AM62 |
| TIDL-7027 | Compiled subgraph has incorrect layer shapes when the imported network has batches (N in [N,D1,D2,C,H,W]) and compiled with batchMode = 1 and "advanced_options:add_data_convert_ops" : 3 options | All except AM62 |
| TIDL-7067 | FCOS3D Model inference gives wrong results on target | All except AM62 |
| TIDL-7070 | Sigmoid layer gives different results on host emulation and target (AM62A) | All except AM62 |
| TIDL-7073 | Running inference on a network with option "advanced_options:inference_mode" = 2 sequentially followed by a network with "advanced_options:inference_mode" = 0 on c7x_2 or greater results in hang on device | All except AM62 |
| TIDL-7106 | Models compiled with option "advanced_options:inference_mode" = 2 and having a layer running in MULTICORE mode with more than one input and one of the input is constant data (onnx initializer) can result in functionally wrong output | All except AM62 |
| TIDL-7107 | Models compiled with option "advanced_options:inference_mode" = 2 and having ConcatLayer with axis along height, will result in functionally wrong output on host emulation and target | All except AM62 |
| TIDL-7112 | Model inference hangs on device when imported network has a 7x7 depthwise separable convolution layer and running using mixed precision | All except AM62 |
| TIDL-7113 | Models compiled with option "advanced_options:inference_mode" = 2 and having a Concat layer along width axis and running in MULTICORE, will result in functionally wrong output on target | All except AM62 |
| TIDL-7114 | Models compiled with option "advanced_options:inference_mode" = 2 and having a MULTICORE layer followed by a MatMul/Gemm layer, will result in functionally wrong output on target | All except AM62 |
| TIDL-7115 | Models compiled with option "advanced_options:inference_mode" = 2 and having a SoftMax layer, will result in functionally wrong output on target | All except AM62 |
| TIDL-7162 | ScatterND layer (with "add" reduction attribute) gives different results on host emulation and target | All except AM62 |
| TIDL-7166 | Memory leak when running multiple Ort::Session consecutively with TIDL execution provider | All except AM62 |
| TIDL-7190 | Setting user option "advanced_options:partial_init_during_compile" = 1 during model compilation results in functionally incorrect output for tflite pre-quantized models | All except AM62 |
| TIDL-7191 | Model inference gives wrong results when the network is compiled through OSRT and input has higher dimensions (D1 & D2 in [N,D1,D2,C,H,W]) | All except AM62 |
| TIDL-7196 | Model compilation may hang or result in a network which generates incorrect outputs when imported network has number of batches (N in [N,C,H,W]) > 1 for elementwise operations | All except AM62 |
| TIDL-7202 | Compiled model has wrong shape for Unsqueeze if the imported model has axis attribute set for Unsqueeze | All except AM62 |
| TIDL-7243 | Matmul (ONNX) may fail with "Dataflow for tensor x with high volume is not supported" when the input resolution is high | All except AM62 |
| TIDL-7291 | Model inference gives wrong results in host emulation when network has a Resize layer whose input has multiple consumers | All except AM62 |
| TIDL-7292 | GridSample layer gives different results when run in host emulation and target | All except AM62 |
| TIDL-7294 | Gather layer performance (Latency) is bad when shape value of (axis -2) is equal to [64 for J721E, J721S2, J784S4, J742S2] or [32 for AM62A, J722S] | All except AM62 |
| TIDL-7300 | Concat layer gives wrong result when pad is changing from its input to output | All except AM62 |
| TIDL-7305 | Convolution layer with NxN kernel, NxN stride (N >=3) and pads>0 gives functionally wrong results | All except AM62 |
| TIDL-7325 | TIDL model inference can result in a hang when imported network has an output of size 1 and "advanced_options:add_data_convert_ops" option is set to 2/3 during model compilation | All except AM62 |
| TIDL-7328 | The functional behavior of the ArgMax layer changes when D1 or D2 dimensions exceed 1 in the shape format (N, D1, D2, C, H, W) | All except AM62 |
| TIDL-7332 | TIDL model import fails with message "Non-zero status code returned while running TIDL_0 node. Name:'TIDLExecutionProvider_TIDL_0_0' Status Message: TIDL Compute Import Failed" when network has MaxPool layer with kernel shape 1x1 and stride of 1x1 | All except AM62 |
| TIDL-7333 | TopK operator causes a segmentation fault during compilation | All except AM62 |
10_01_04_00
New in this Release
| Description | Notes |
|---|---|
| Improved fixed point implementations (8-bit) for LayerNorm, Softmax, Concat & Add for transformer based architectures | This requires an updated version of C7x/MMA firmware (10_01_04_00) and needs to have advanced_options:c7x_firmware set to 10_01_04_00 |
| Optimized GeLU Pattern matching to fuse the 0.5 factor as part of GeLU for lower latency | This requires an updated version of C7x/MMA firmware (10_01_04_00) and needs to have advanced_options:c7x_firmware set to 10_01_04_00 |
| Enhanced accuracy for vision transformer backbones: SWIN, DEIT, LEVIT | This requires an updated version of C7x/MMA firmware (10_01_04_00) and needs to have advanced_options:c7x_firmware set to 10_01_04_00 |
| Improved support for Deformable Convolution, GridSample operators | This requires an updated version of C7x/MMA firmware (10_01_04_00) and needs to have advanced_options:c7x_firmware set to 10_01_04_00 |
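The firmware pairing called out in the Notes column can be expressed in the OSRT compile options. Below is a minimal sketch, assuming the usual dict-of-options interface; the paths are placeholders and the `firmware_matches` helper is illustrative, not part of the tool's API.

```python
# Minimal sketch of TIDL compile options for this release (OSRT flow).
# The option keys come from the release notes above; the paths below are
# placeholders, and firmware_matches() is a hypothetical helper.
compile_options = {
    "tidl_tools_path": "/path/to/tidl_tools",   # placeholder path
    "artifacts_folder": "/path/to/artifacts",   # placeholder path
    "tensor_bits": 8,
    # Required for the improved 8-bit transformer kernels and GeLU fusion:
    "advanced_options:c7x_firmware": "10_01_04_00",
}

def firmware_matches(options: dict, expected: str = "10_01_04_00") -> bool:
    """Check that the c7x firmware option matches the expected release."""
    return options.get("advanced_options:c7x_firmware") == expected

print(firmware_matches(compile_options))  # → True
```

In the actual flow, a dict like this is passed as provider options when creating the ONNX Runtime session with the TIDL compilation provider; the point here is only that the `advanced_options:c7x_firmware` value must match the deployed C7x/MMA firmware version.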
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-4413 | Add/Sub/Mul/Div layer with input tensor dimensions of 1x1x1xCxHxW and 1x1xNxCxHxW performs suboptimally | All except AM62 |
| TIDL-4667 | Model compilation fails if CWD of the compilation script does not have write permissions | All except AM62 |
| TIDL-6462 | TFL SSD networks fail during compilation | All except AM62 |
| TIDL-6465 | Convolution with Fr=Fc=3 and dilation>8 (for AM62A/J722S) or dilation>16 (for J721S2) gives wrong output on Host Emulation | All except AM62 |
| TIDL-7073 | Running inference on a network with option "advanced_options:inference_mode" = 2 sequentially followed by a network with "advanced_options:inference_mode" = 0 on c7x_2 or greater results in hang on TDA4VH | All except AM62 |
| TIDL-7085 | Misleading "Unable to find intializer" prints issued during model compilation | All except AM62 |
| TIDL-7090 | Models compiled using enableHighResOptimization=1 option, containing Resize layer may give segmentation fault during inference | All except AM62 |
| TIDL-7098 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a constant layer followed by a layer running in TIDL_MULTI_CORE mode may result in functionally incorrect output in host emulation/target | All except AM62 |
| TIDL-7099 | Models compiled with option "advanced_options:inference_mode" = 2 and containing all layers running in TIDL_NOT_MULTI_CORE mode may result in segmentation fault | All except AM62 |
| TIDL-7112 | Model compilation hangs on device for 7x7 depthwise separable convolution if input element type is 8 bit and output element type is 16 bit | All except AM62 |
| TIDL-7125 | Convolution layer with filter coefficient width greater than one, followed by a transpose layer, gives wrong output on target | All except AM62 |
| TIDL-7137 | Incorrect conversion of Gather layer to reshape | All except AM62 |
| TIDL-7139 | Slice layer slicing along the channel axis, where any dimension beyond the channel dimension is greater than 1, results in functionally incorrect output | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-4731 | Fusion of batch norm layer into convolution layer when batchnorm is before convolution can give incorrect results when convolution input has pad | All except AM62 | Rare | None |
| TIDL-6469 | partial_init_during_compile fails in host emulation mode | All except AM62 | Rare | None |
| TIDL-6856 | 3x1 convolution with single input and output channel fails in model compilation | All except AM62 | Rare | None |
| TIDL-6866 | Using option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 | Rare | None |
| TIDL-7108 | Model compilation fails for OD networks which have conditional subgraph blocks present | All except AM62 | Rare | None |
| TIDL-7133 | Elementwise operation with number of channels processed in one block not a factor of total number of channels, produces wrong output on target | All except AM62 | Rare | None |
10_01_03_00
New in this Release
| Description | Notes |
|---|---|
| Improved fixed point implementations (8-bit) for LayerNorm, Softmax, Concat & Add for transformer based architectures | |
| Optimized GeLU Pattern matching to fuse the 0.5 factor as part of GeLU for lower latency | |
| Enhanced accuracy for vision transformer backbones: SWIN, DEIT, LEVIT | |
| Improved support for Deformable Convolution & GridSample operators | |
| Support for Unsqueeze operator | |
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-4413 | Add/Sub/Mul/Div layer with input tensor dimensions of 1x1x1xCxHxW and 1x1xNxCxHxW performs suboptimally | All except AM62 |
| TIDL-4667 | Model compilation fails if CWD of the compilation script does not have write permissions | All except AM62 |
| TIDL-6462 | TFL SSD networks fail during compilation | All except AM62 |
| TIDL-6465 | Convolution with Fr=Fc=3 and dilation>8 (for AM62A/J722S) or dilation>16 (for J721S2) gives wrong output on Host Emulation | All except AM62 |
| TIDL-7073 | Running inference on a network with option "advanced_options:inference_mode" = 2 sequentially followed by a network with "advanced_options:inference_mode" = 0 on c7x_2 or greater results in hang on TDA4VH | All except AM62 |
| TIDL-7085 | Misleading "Unable to find intializer" prints issued during model compilation | All except AM62 |
| TIDL-7090 | Models compiled using enableHighResOptimization=1 option, containing Resize layer may give segmentation fault during inference | All except AM62 |
| TIDL-7098 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a constant layer followed by a layer running in TIDL_MULTI_CORE mode may result in functionally incorrect output in host emulation/target | All except AM62 |
| TIDL-7099 | Models compiled with option "advanced_options:inference_mode" = 2 and containing all layers running in TIDL_NOT_MULTI_CORE mode may result in segmentation fault | All except AM62 |
| TIDL-7112 | Model compilation hangs on device for 7x7 depthwise separable convolution if input element type is 8 bit and output element type is 16 bit | All except AM62 |
| TIDL-7125 | Convolution layer with filter coefficient width greater than one, followed by a transpose layer, gives wrong output on target | All except AM62 |
| TIDL-7137 | Incorrect conversion of Gather layer to reshape | All except AM62 |
| TIDL-7139 | Slice layer slicing along the channel axis, where any dimension beyond the channel dimension is greater than 1, results in functionally incorrect output | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-4731 | Fusion of batch norm layer into convolution layer when batchnorm is before convolution can give incorrect results when convolution input has pad | All except AM62 | Rare | None |
| TIDL-6469 | partial_init_during_compile fails in host emulation mode | All except AM62 | Rare | None |
| TIDL-6856 | 3x1 convolution with single input and output channel fails in model compilation | All except AM62 | Rare | None |
| TIDL-6866 | Using option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 | Rare | None |
| TIDL-7108 | Model compilation fails for OD networks which have conditional subgraph blocks present | All except AM62 | Rare | None |
| TIDL-7133 | Elementwise operation with number of channels processed in one block not a factor of total number of channels, produces wrong output on target | All except AM62 | Rare | None |
10_01_00_02
New in this Release
| Description | Notes |
|---|---|
| Support for ONNXRUNTIME 1.15.0 | |
| Support for several new operators: TopK, Sqrt, Sin, Pow, Mish, Log, Instance Normalization, HSWISH, Floor, Exp, ERF, AsinH, Asin & Abs | |
| Improved support for networks with a large number of operators (>2K) | |
| Support for improved latency & weight sparsity | Specific to J722S/AM67A/TDA4AEN platforms |
Fixed in this Release
| ID | Description | Affected Platforms |
|---|---|---|
| TIDL-6871 | Softmax (with output type float) gives incorrect results when axis is set to width and width < 16 | All except AM62 |
| TIDL-6865 | Elementwise layers with dimension N1xC1xH1xW1 and N2xC2xH2xW2, gives functionally incorrect output on target, if H1 or H2 is 1 and H1 != H2 and C1 == C2 > 1 | All except AM62 |
| TIDL-6485 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a Constant Data layer with H > 1 will result in functionally incorrect output | All except AM62 |
| TIDL-6473 | Models compiled with option "advanced_options:inference_mode" = 2 and containing a layer running in TIDL_NOT_MULTI_CORE mode followed by Slice layer running in TIDL_MULTI_CORE mode may result in functionally incorrect output in host emulation/target | All except AM62 |
| TIDL-6461 | Using "advanced_options:inference_mode" = 2 and "debug_level" >=3 may result in error for debug stitching script for some networks | All except AM62 |
| TIDL-6418 | Models compiled with "advanced_options:inference_mode" = 2 compilation option may result in functionally incorrect outputs if the model has Slice/Reshape layers | All except AM62 |
| TIDL-5169 | Dataconvert layer with layout conversion from NCHW->NHWC at the output of network returns TIDLRT_create time error if number of output channels for this layer is equal to one | All except AM62 |
| TIDL-5167 | Layers with multiple inputs may result in functional issues if the inputs have different padding in the buffer | All except AM62 |
| TIDL-5166 | Matmul layer with A matrix broadcast in channel axis results in crash on target/EVM | All except AM62 |
| TIDL-5162 | Memory planning fails for models having batches with broadcast | All except AM62 |
| TIDL-4868 | Reshape layer is incorrectly denied with message: "Input volume should be equal to output volume" | All except AM62 |
| TIDL-4855 | ONNX Runtime does not report correct copy cycles from get_TI_benchmark_data | All except AM62 |
| TIDL-4833 | Networks erroring out with message "tidlReadPerChannelMeanStatistics : Unable to read Per Channel Mean statistics" | All except AM62 |
| TIDL-4832 | Networks with GEMM are not correctly getting denied, with the following error towards the end "Gemm layer is not supported in TIDL when bias size != output width" | All except AM62 |
| TIDL-4714 | Networks with >1536 operators in a single graph fail to compile | All except AM62 |
| TIDL-4460 | Model compilation fails for networks with Transpose layers with following error message : "Failed to Allocate memory record 7 @ space = 17 and size = xxxxxx !!!" | |
| TIDL-4367 | Networks with multiple branches where the first layer in any branch is a Reshape layer give functionally wrong output | All except AM62 |
| TIDL-3928 | Sub operator with variable input gets incorrectly offloaded to C7x and results in an init failure during inference | All except AM62 |
| TIDL-3902 | Model compiled with enableHighResOptimization=1 option, with any convolution layer's weights volume plus 192 * number of input channels greater than 224KB (for AM62A/J722S) or 448KB (for all other devices), may result in a hang on target | All except AM62 |
| TIDL-2947 | Convolution with pad greater than the input width results in incorrect outputs | All except AM62 |
Known Issues
| ID | Description | Affected Platforms | Occurrence | Workaround in this release |
|---|---|---|---|---|
| TIDL-7073 | Running inference on a network with option "advanced_options:inference_mode" = 2 sequentially followed by a network with "advanced_options:inference_mode" = 0 on c7x_2 or greater results in hang on target | All except AM62 | Rare | None |
| TIDL-6866 | Using option "advanced_options:output_feature_16bit_names_list" along with "high_resolution_optimization" = 1 and "tensor_bits = 8" results in functionally incorrect output on host emulation/target | All except AM62 | Rare | None |
| TIDL-6856 | 3x1 convolution with single input and output channel fails in model compilation | All except AM62 | Rare | None |
| TIDL-6469 | partial_init_during_compile fails in host emulation mode | All except AM62 | Frequent | None |
| TIDL-6465 | Convolution with Fr=Fc=3 and dilation>8 (for AM62A/J722S) dilation>16 (for other devices) gives wrong output on Host Emulation | All except AM62 | Rare | None |
| TIDL-4731 | Fusion of batch norm layer into convolution layer when batchnorm is before convolution can give incorrect results when convolution input has pad | All except AM62 | Rare | None |
| TIDL-3865 | Elementwise layers with broadcast along width or height or both and number of channels > 1 produces incorrect outputs on device | All except AM62 | Rare | None |