Commit b9abeae

Author: Jenkins
arm_compute v18.11
1 parent 52ba29e commit b9abeae

12,114 files changed, +714002 -427236 lines changed

README.md

Lines changed: 7 additions & 13 deletions
@@ -1,30 +1,22 @@
-
+Release repository: https://github.com/arm-software/ComputeLibrary
+Development repository: https://review.mlplatform.org/#/admin/projects/ml/ComputeLibrary
 Please report issues here: https://github.com/ARM-software/ComputeLibrary/issues
 **Make sure you are using the latest version of the library before opening an issue. Thanks**
 
 News:
 
-- We're hiring: Staff Machine Learning C++ Software Engineer in Cambridge (UK)
-- Required skills:
-- Proficient in C++11.
-- Preferred skills:
-- Some SIMD (Preferably NEON and/or OpenCL) experience
-- Some machine learning / computer vision knowledge
-- Familiarity in developing compute-intensive applications and ideally industry experience of product development
-- Experience programming in assembly language.
-
-Interested ? Contact us: [email protected]
 - [Gian Marco's talk on optimizing CNNs with Winograd algorithms at the EVS](https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2018-embedded-vision-summit-iodice)
+- [Gian Marco's talk on using SGEMM and FFTs to Accelerate Deep Learning](https://www.embedded-vision.com/platinum-members/arm/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit-iodice)
 
 Related projects:
 
 - [Arm NN SDK](https://github.com/arm-software/armnn)
-- [Caffe on Compute Library](https://github.com/OAID/Caffe-HRT)
 - [Tutorial: Cartoonifying Images on Raspberry Pi with the Compute Library](https://community.arm.com/graphics/b/blog/posts/cartoonifying-images-on-raspberry-pi-with-the-compute-library)
 - [Tutorial: Running AlexNet on Raspberry Pi with Compute Library](https://community.arm.com/processors/b/blog/posts/running-alexnet-on-raspberry-pi-with-compute-library)
 
 Documentation available here:
 
+- [v18.11](https://arm-software.github.io/ComputeLibrary/v18.11/)
 - [v18.08](https://arm-software.github.io/ComputeLibrary/v18.08/)
 - [v18.05](https://arm-software.github.io/ComputeLibrary/v18.05/)
 - [v18.03](https://arm-software.github.io/ComputeLibrary/v18.03/)
@@ -40,6 +32,8 @@ Documentation available here:
 
 Binaries available here:
 
+- [v18.11-linux](https://github.com/ARM-software/ComputeLibrary/releases/download/v18.08/arm_compute-v18.11-bin-linux.tar.gz)
+- [v18.11-android](https://github.com/ARM-software/ComputeLibrary/releases/download/v18.08/arm_compute-v18.11-bin-android.tar.gz)
 - [v18.08-linux](https://github.com/ARM-software/ComputeLibrary/releases/download/v18.08/arm_compute-v18.08-bin-linux.tar.gz)
 - [v18.08-android](https://github.com/ARM-software/ComputeLibrary/releases/download/v18.08/arm_compute-v18.08-bin-android.tar.gz)
 - [v18.05-linux](https://github.com/ARM-software/ComputeLibrary/releases/download/v18.05/arm_compute-v18.05-bin-linux.tar.gz)
@@ -57,6 +51,6 @@ Binaries available here:
 - [v17.04](https://github.com/ARM-software/ComputeLibrary/releases/download/v17.04/arm_compute-v17.04-bin.tar.gz)
 - [v17.03.1](https://github.com/ARM-software/ComputeLibrary/releases/download/v17.03.1/arm_compute-v17.03.1-bin.tar.gz)
 
-
+
 
 License & Contributions: The software is provided under MIT license. Contributions to this project are accepted under the same license.

SConscript

Lines changed: 6 additions & 2 deletions
@@ -24,11 +24,12 @@ import os.path
 import re
 import subprocess
 
-VERSION = "v18.08"
-SONAME_VERSION="12.0.0"
+VERSION = "v18.11"
+SONAME_VERSION="13.0.0"
 
 Import('env')
 Import('vars')
+Import('install_lib')
 
 def build_library(name, sources, static=False, libs=[]):
     if static:
@@ -53,6 +54,7 @@ def build_library(name, sources, static=False, libs=[]):
         else:
             obj = arm_compute_env.SharedLibrary(name, source=sources, LIBS = arm_compute_env["LIBS"] + libs)
 
+    obj = install_lib(obj)
     Default(obj)
     return obj
 
@@ -208,6 +210,8 @@ if env['neon']:
 
     if "arm64-v8" in env['arch']:
         core_files += Glob('src/core/NEON/kernels/arm_gemm/kernels/a64_*/*.cpp')
+    if "sve" in env['arch']:
+        core_files += Glob('src/core/NEON/kernels/arm_gemm/kernels/sve_*/*.cpp')
 
     runtime_files += Glob('src/runtime/NEON/*.cpp')
     runtime_files += Glob('src/runtime/NEON/functions/*.cpp')

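The `arch` checks in the last SConscript hunk are substring tests, so a single arch string can pull in more than one kernel directory. The following is a minimal sketch of that selection as a pure function, assuming the two `if` statements are independent siblings (indentation is not preserved in this view). `arm_gemm_kernel_globs` is a hypothetical helper for illustration, not part of the build scripts, and it covers only the two patterns visible in the hunk.

```python
def arm_gemm_kernel_globs(arch):
    """Return the arm_gemm kernel glob patterns selected for an arch string.

    Hypothetical helper: mirrors the two substring checks shown in the
    SConscript hunk, without calling SCons' Glob().
    """
    patterns = []
    # Any 64-bit Armv8 target (arm64-v8a, arm64-v8.2-a, arm64-v8.2-a-sve)
    if "arm64-v8" in arch:
        patterns.append("src/core/NEON/kernels/arm_gemm/kernels/a64_*/*.cpp")
    # SVE kernels are added on top when the arch name mentions sve
    if "sve" in arch:
        patterns.append("src/core/NEON/kernels/arm_gemm/kernels/sve_*/*.cpp")
    return patterns


if __name__ == "__main__":
    for arch in ("armv7a", "arm64-v8a", "arm64-v8.2-a", "arm64-v8.2-a-sve"):
        print(arch, arm_gemm_kernel_globs(arch))
```

Among the `arch` values SConstruct allows in this commit, only `arm64-v8.2-a-sve` contains `sve`, and it also contains `arm64-v8`, so the result is the same whether or not the second check is nested inside the first.
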
SConstruct

Lines changed: 58 additions & 17 deletions
@@ -40,7 +40,7 @@ vars.AddVariables(
     BoolVariable("debug", "Debug", False),
     BoolVariable("asserts", "Enable asserts (this flag is forced to 1 for debug=1)", False),
     BoolVariable("logging", "Logging (this flag is forced to 1 for debug=1)", False),
-    EnumVariable("arch", "Target Architecture", "armv7a", allowed_values=("armv7a", "arm64-v8a", "arm64-v8.2-a", "x86_32", "x86_64")),
+    EnumVariable("arch", "Target Architecture", "armv7a", allowed_values=("armv7a", "arm64-v8a", "arm64-v8.2-a", "arm64-v8.2-a-sve", "x86_32", "x86_64")),
     EnumVariable("os", "Target OS", "linux", allowed_values=("linux", "android", "bare_metal")),
     EnumVariable("build", "Build type", "cross_compile", allowed_values=("native", "cross_compile", "embed_only")),
     BoolVariable("examples", "Build example programs", True),
@@ -54,21 +54,52 @@ vars.AddVariables(
     BoolVariable("openmp", "Enable OpenMP backend", False),
     BoolVariable("cppthreads", "Enable C++11 threads backend", True),
     PathVariable("build_dir", "Specify sub-folder for the build", ".", PathVariable.PathAccept),
+    PathVariable("install_dir", "Specify sub-folder for the install", "", PathVariable.PathAccept),
     ("extra_cxx_flags", "Extra CXX flags to be appended to the build command", ""),
+    ("extra_link_flags", "Extra LD flags to be appended to the build command", ""),
     ("compiler_cache", "Command to prefix to the C and C++ compiler (e.g ccache)", "")
 )
 
 env = Environment(platform="posix", variables=vars, ENV = os.environ)
-env.Append(LIBPATH = ["#build/%s" % env['build_dir']])
+build_path = env['build_dir']
+# If build_dir is a relative path then add a #build/ prefix:
+if not env['build_dir'].startswith('/'):
+    SConsignFile('build/%s/.scons' % build_path)
+    build_path = "#build/%s" % build_path
+else:
+    SConsignFile('%s/.scons' % build_path)
+
+install_path = env['install_dir']
+#If the install_dir is a relative path then assume it's from inside build_dir
+if not env['install_dir'].startswith('/') and install_path != "":
+    install_path = "%s/%s" % (build_path, install_path)
+
+env.Append(LIBPATH = [build_path])
 Export('env')
 Export('vars')
 
-SConsignFile('build/.%s' % env['build_dir'])
+def install_lib( lib ):
+    # If there is no install folder, then there is nothing to do:
+    if install_path == "":
+        return lib
+    return env.Install( "%s/lib/" % install_path, lib)
+def install_bin( bin ):
+    # If there is no install folder, then there is nothing to do:
+    if install_path == "":
+        return bin
+    return env.Install( "%s/bin/" % install_path, bin)
+def install_include( inc ):
+    if install_path == "":
+        return inc
+    return env.Install( "%s/include/" % install_path, inc)
+
+Export('install_lib')
+Export('install_bin')
 
 Help(vars.GenerateHelpText(env))
 
 if env['build'] == "embed_only":
-    SConscript('./SConscript', variant_dir='#build/%s' % env['build_dir'], duplicate=0)
+    SConscript('./SConscript', variant_dir=build_path, duplicate=0)
     Return()
 
 if env['neon'] and 'x86' in env['arch']:
@@ -142,17 +173,23 @@ elif env['arch'] == 'arm64-v8a':
         prefix = "aarch64-linux-android-"
     if 'clang++' in cpp_compiler:
         env.Append(CXXFLAGS = ['-no-integrated-as'])
-elif env['arch'] == 'arm64-v8.2-a':
-    env.Append(CXXFLAGS = ['-march=armv8.2-a+fp16']) # explicitly enable fp16 extension otherwise __ARM_FEATURE_FP16_VECTOR_ARITHMETIC is undefined
+elif 'arm64-v8.2-a' in env['arch']:
+    if env['arch'] == 'arm64-v8.2-a-sve':
+        if env['os'] != 'bare_metal':
+            print("Only bare metal SVE is supported at the moment")
+            Exit(1)
+        env.Append(CXXFLAGS = ['-march=armv8.2-a+sve+fp16+dotprod'])
+    else:
+        env.Append(CXXFLAGS = ['-march=armv8.2-a+fp16']) # explicitly enable fp16 extension otherwise __ARM_FEATURE_FP16_VECTOR_ARITHMETIC is undefined
+    if env['os'] == 'linux':
+        prefix = "aarch64-linux-gnu-"
+    elif env['os'] == 'bare_metal':
+        prefix = "aarch64-elf-"
+    elif env['os'] == 'android':
+        prefix = "aarch64-linux-android-"
     env.Append(CPPDEFINES = ['ARM_COMPUTE_AARCH64_V8_2','NO_DOT_IN_TOOLCHAIN'])
     if 'clang++' in cpp_compiler:
         env.Append(CXXFLAGS = ['-no-integrated-as'])
-    if env['os'] == 'linux':
-        prefix = "aarch64-linux-gnu-"
-    elif env['os'] == 'bare_metal':
-        prefix = "aarch64-elf-"
-    elif env['os'] == 'android':
-        prefix = "aarch64-linux-android-"
 elif env['arch'] == 'x86_32':
     env.Append(CCFLAGS = ['-m32'])
     env.Append(LINKFLAGS = ['-m32'])
@@ -242,20 +279,24 @@ if env['logging']:
 
 env.Append(CPPPATH = ['#/include', "#"])
 env.Append(CXXFLAGS = env['extra_cxx_flags'])
+env.Append(LINKFLAGS = env['extra_link_flags'])
+
+Default( install_include("arm_compute"))
+Default( install_include("support"))
 
 Export('version_at_least')
 
 if env['opencl']:
-    SConscript("./opencl-1.2-stubs/SConscript", variant_dir="build/%s/opencl-1.2-stubs" % env['build_dir'], duplicate=0)
+    SConscript("./opencl-1.2-stubs/SConscript", variant_dir="%s/opencl-1.2-stubs" % build_path, duplicate=0)
 
 if env['gles_compute'] and env['os'] != 'android':
     env.Append(CPPPATH = ['#/include/linux'])
-    SConscript("./opengles-3.1-stubs/SConscript", variant_dir="build/%s/opengles-3.1-stubs" % env['build_dir'], duplicate=0)
+    SConscript("./opengles-3.1-stubs/SConscript", variant_dir="%s/opengles-3.1-stubs" % build_path, duplicate=0)
 
-SConscript('./SConscript', variant_dir='#build/%s' % env['build_dir'], duplicate=0)
+SConscript('./SConscript', variant_dir=build_path, duplicate=0)
 
 if env['examples'] and env['os'] != 'bare_metal':
-    SConscript('./examples/SConscript', variant_dir='#build/%s/examples' % env['build_dir'], duplicate=0)
+    SConscript('./examples/SConscript', variant_dir='%s/examples' % build_path, duplicate=0)
 
 if env['os'] != 'bare_metal':
-    SConscript('./tests/SConscript', variant_dir='#build/%s/tests' % env['build_dir'], duplicate=0)
+    SConscript('./tests/SConscript', variant_dir='%s/tests' % build_path, duplicate=0)
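
The new `build_dir` / `install_dir` handling in SConstruct can be read as a pure function of the two option values. The sketch below restates the rules from the diff without SCons: a relative `build_dir` gets a `#build/` prefix, an absolute one is used as-is, and a non-empty relative `install_dir` is placed inside the build path. `resolve_paths` is a hypothetical name used for illustration only. The real script calls `SConsignFile` directly rather than returning the signature-file path.

```python
def resolve_paths(build_dir, install_dir=""):
    """Return (build_path, sconsign_path, install_path) for the given options.

    Hypothetical helper that mirrors the string handling in the diff.
    """
    if not build_dir.startswith('/'):
        # Relative build_dir: signatures live under build/, and the
        # variant dir is anchored at the source root with '#'.
        sconsign_path = 'build/%s/.scons' % build_dir
        build_path = '#build/%s' % build_dir
    else:
        # Absolute build_dir: use it unchanged.
        sconsign_path = '%s/.scons' % build_dir
        build_path = build_dir

    install_path = install_dir
    # An empty install_dir means "do not install anything".
    if install_dir != "" and not install_dir.startswith('/'):
        install_path = '%s/%s' % (build_path, install_dir)
    return build_path, sconsign_path, install_path


if __name__ == "__main__":
    print(resolve_paths("."))
    print(resolve_paths("release", "install"))
    print(resolve_paths("/tmp/acl", "/opt/acl"))
```

With the default options (`build_dir="."`, `install_dir=""`), the install path stays empty, which is the case where `install_lib`, `install_bin` and `install_include` return their argument untouched.
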

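The reworked `arm64-v8.2-a` branch in SConstruct picks a `-march` flag from the arch name and a toolchain prefix from the target OS. The following is a minimal sketch of that decision table, assuming the prefix selection applies to both the SVE and non-SVE variants (indentation is not preserved in this view). `v8_2_settings` and `TOOLCHAIN_PREFIXES` are hypothetical names, and the sketch raises `ValueError` where the real script prints a message and calls `Exit(1)`.

```python
# Toolchain prefixes as they appear in the diff, keyed by the 'os' option.
TOOLCHAIN_PREFIXES = {
    "linux": "aarch64-linux-gnu-",
    "bare_metal": "aarch64-elf-",
    "android": "aarch64-linux-android-",
}


def v8_2_settings(arch, target_os):
    """Return (march_flag, toolchain_prefix) for an arm64-v8.2-a* arch.

    Hypothetical helper: restates the branch from the diff as a function.
    """
    if "arm64-v8.2-a" not in arch:
        raise ValueError("not an arm64-v8.2-a arch: %s" % arch)
    if arch == "arm64-v8.2-a-sve":
        # The script rejects SVE builds for anything but bare metal.
        if target_os != "bare_metal":
            raise ValueError("Only bare metal SVE is supported at the moment")
        march = "-march=armv8.2-a+sve+fp16+dotprod"
    else:
        # fp16 is enabled explicitly in the non-SVE case.
        march = "-march=armv8.2-a+fp16"
    return march, TOOLCHAIN_PREFIXES[target_os]


if __name__ == "__main__":
    print(v8_2_settings("arm64-v8.2-a", "linux"))
    print(v8_2_settings("arm64-v8.2-a-sve", "bare_metal"))
```
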
arm_compute/core/CL/CLHelpers.h

Lines changed: 8 additions & 0 deletions
@@ -47,6 +47,14 @@ static constexpr unsigned int max_cl_vector_width = 16;
  */
 std::string get_cl_type_from_data_type(const DataType &dt);
 
+/** Translates a tensor data type to the appropriate OpenCL select type.
+ *
+ * @param[in] dt @ref DataType to be translated to OpenCL select type.
+ *
+ * @return The string specifying the OpenCL select type to be used.
+ */
+std::string get_cl_select_type_from_data_type(const DataType &dt);
+
 /** Get the size of a data type in number of bits.
  *
  * @param[in] dt @ref DataType.

arm_compute/core/CL/CLKernels.h

Lines changed: 17 additions & 0 deletions
@@ -32,10 +32,12 @@
 #include "arm_compute/core/CL/kernels/CLArithmeticDivisionKernel.h"
 #include "arm_compute/core/CL/kernels/CLArithmeticSubtractionKernel.h"
 #include "arm_compute/core/CL/kernels/CLBatchNormalizationLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLBatchToSpaceLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLBitwiseAndKernel.h"
 #include "arm_compute/core/CL/kernels/CLBitwiseNotKernel.h"
 #include "arm_compute/core/CL/kernels/CLBitwiseOrKernel.h"
 #include "arm_compute/core/CL/kernels/CLBitwiseXorKernel.h"
+#include "arm_compute/core/CL/kernels/CLBoundingBoxTransformKernel.h"
 #include "arm_compute/core/CL/kernels/CLBox3x3Kernel.h"
 #include "arm_compute/core/CL/kernels/CLCannyEdgeKernel.h"
 #include "arm_compute/core/CL/kernels/CLChannelCombineKernel.h"
@@ -64,10 +66,13 @@
 #include "arm_compute/core/CL/kernels/CLFillBorderKernel.h"
 #include "arm_compute/core/CL/kernels/CLFlattenLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLFloorKernel.h"
+#include "arm_compute/core/CL/kernels/CLFuseBatchNormalizationKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMInterleave4x4Kernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMLowpMatrixMultiplyKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionKernel.h"
+#include "arm_compute/core/CL/kernels/CLGEMMLowpOffsetContributionOutputStageKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFixedPointKernel.h"
+#include "arm_compute/core/CL/kernels/CLGEMMLowpQuantizeDownInt32ToUint8ScaleByFloatKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMLowpQuantizeDownInt32ToUint8ScaleKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMLowpReductionKernel.h"
 #include "arm_compute/core/CL/kernels/CLGEMMMatrixAccumulateBiasesKernel.h"
@@ -78,6 +83,7 @@
 #include "arm_compute/core/CL/kernels/CLGaussian3x3Kernel.h"
 #include "arm_compute/core/CL/kernels/CLGaussian5x5Kernel.h"
 #include "arm_compute/core/CL/kernels/CLGaussianPyramidKernel.h"
+#include "arm_compute/core/CL/kernels/CLGenerateProposalsLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLHOGDescriptorKernel.h"
 #include "arm_compute/core/CL/kernels/CLHOGDetectorKernel.h"
 #include "arm_compute/core/CL/kernels/CLHarrisCornersKernel.h"
@@ -90,35 +96,46 @@
 #include "arm_compute/core/CL/kernels/CLMagnitudePhaseKernel.h"
 #include "arm_compute/core/CL/kernels/CLMeanStdDevKernel.h"
 #include "arm_compute/core/CL/kernels/CLMedian3x3Kernel.h"
+#include "arm_compute/core/CL/kernels/CLMemsetKernel.h"
 #include "arm_compute/core/CL/kernels/CLMinMaxLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLMinMaxLocationKernel.h"
 #include "arm_compute/core/CL/kernels/CLNonLinearFilterKernel.h"
 #include "arm_compute/core/CL/kernels/CLNonMaximaSuppression3x3Kernel.h"
 #include "arm_compute/core/CL/kernels/CLNormalizationLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLNormalizePlanarYUVLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLPermuteKernel.h"
 #include "arm_compute/core/CL/kernels/CLPixelWiseMultiplicationKernel.h"
 #include "arm_compute/core/CL/kernels/CLPoolingLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLPriorBoxLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLQuantizationLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLROIAlignLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLROIPoolingLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLReductionOperationKernel.h"
 #include "arm_compute/core/CL/kernels/CLRemapKernel.h"
+#include "arm_compute/core/CL/kernels/CLReorgLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLReshapeLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLScaleKernel.h"
 #include "arm_compute/core/CL/kernels/CLScharr3x3Kernel.h"
 #include "arm_compute/core/CL/kernels/CLSobel3x3Kernel.h"
 #include "arm_compute/core/CL/kernels/CLSobel5x5Kernel.h"
 #include "arm_compute/core/CL/kernels/CLSobel7x7Kernel.h"
 #include "arm_compute/core/CL/kernels/CLSoftmaxLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLSpaceToBatchLayerKernel.h"
+#include "arm_compute/core/CL/kernels/CLStridedSliceKernel.h"
 #include "arm_compute/core/CL/kernels/CLTableLookupKernel.h"
 #include "arm_compute/core/CL/kernels/CLThresholdKernel.h"
 #include "arm_compute/core/CL/kernels/CLTransposeKernel.h"
+#include "arm_compute/core/CL/kernels/CLUpsampleLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLWarpAffineKernel.h"
 #include "arm_compute/core/CL/kernels/CLWarpPerspectiveKernel.h"
 #include "arm_compute/core/CL/kernels/CLWeightsReshapeKernel.h"
+#include "arm_compute/core/CL/kernels/CLWidthConcatenate2TensorsKernel.h"
+#include "arm_compute/core/CL/kernels/CLWidthConcatenate4TensorsKernel.h"
 #include "arm_compute/core/CL/kernels/CLWidthConcatenateLayerKernel.h"
 #include "arm_compute/core/CL/kernels/CLWinogradFilterTransformKernel.h"
 #include "arm_compute/core/CL/kernels/CLWinogradInputTransformKernel.h"
 #include "arm_compute/core/CL/kernels/CLWinogradOutputTransformKernel.h"
+#include "arm_compute/core/CL/kernels/CLYOLOLayerKernel.h"
 #include "arm_compute/core/CL/kernels/ICLDepthwiseConvolutionLayer3x3Kernel.h"
 
 #endif /* __ARM_COMPUTE_CLKERNELS_H__ */

arm_compute/core/CL/OpenCL.h

Lines changed: 4 additions & 0 deletions
@@ -37,6 +37,9 @@
 #pragma GCC diagnostic push
 #pragma GCC diagnostic ignored "-Weffc++"
 #pragma GCC diagnostic ignored "-Wignored-qualifiers"
+#if defined(__GNUG__) && __GNUG__ >= 8
+#pragma GCC diagnostic ignored "-Wcatch-value"
+#endif // defined(__GNUG__) && __GNUG__ >= 8
 #include <CL/cl2.hpp>
 #pragma GCC diagnostic pop
 
@@ -114,6 +117,7 @@ class CLSymbols final
     DECLARE_FUNCTION_PTR(clReleaseMemObject);
     DECLARE_FUNCTION_PTR(clGetDeviceInfo);
     DECLARE_FUNCTION_PTR(clGetDeviceIDs);
+    DECLARE_FUNCTION_PTR(clGetMemObjectInfo);
     DECLARE_FUNCTION_PTR(clRetainEvent);
     DECLARE_FUNCTION_PTR(clGetPlatformIDs);
     DECLARE_FUNCTION_PTR(clGetKernelWorkGroupInfo);

arm_compute/core/CL/kernels/CLArithmeticAdditionKernel.h

Lines changed: 1 addition & 1 deletion
@@ -51,7 +51,7 @@ class CLArithmeticAdditionKernel : public ICLKernel
     CLArithmeticAdditionKernel &operator=(CLArithmeticAdditionKernel &&) = default;
     /** Default destructor */
     ~CLArithmeticAdditionKernel() = default;
-    /** Initialise the kernel's inputs, output and convertion policy.
+    /** Initialise the kernel's inputs, output and conversion policy.
      *
      * @param[in] input1 First tensor input. Data types supported: U8/QASYMM8/S16/F16/F32.
      * @param[in] input2 Second tensor input. Data types supported: U8, QASYMM8 (only if @p input1 is QASYMM8), S16/F16/F32.

arm_compute/core/CL/kernels/CLArithmeticSubtractionKernel.h

Lines changed: 8 additions & 7 deletions
@@ -53,19 +53,19 @@ class CLArithmeticSubtractionKernel : public ICLKernel
     /** Default destructor */
     ~CLArithmeticSubtractionKernel() = default;
 
-    /** Initialise the kernel's inputs, output and convertion policy.
+    /** Initialise the kernel's inputs, output and conversion policy.
      *
-     * @param[in] input1 First tensor input. Data types supported: U8/S16/F16/F32.
-     * @param[in] input2 Second tensor input. Data types supported: U8/S16/F16/F32.
-     * @param[out] output Output tensor. Data types supported: U8 (Only if both inputs are U8), S16/F16/F32.
+     * @param[in] input1 First tensor input. Data types supported: U8/QASYMM8/S16/F16/F32.
+     * @param[in] input2 Second tensor input. Data types supported: U8/QASYMM8/S16/F16/F32.
+     * @param[out] output Output tensor. Data types supported: U8 (Only if both inputs are U8), QASYMM8/S16/F16/F32.
      * @param[in] policy Policy to use to handle overflow.
      */
     void configure(const ICLTensor *input1, const ICLTensor *input2, ICLTensor *output, ConvertPolicy policy);
     /** Static function to check if given info will lead to a valid configuration of @ref CLArithmeticSubtractionKernel
      *
-     * @param[in] input1 First tensor input info. Data types supported: U8/S16/F16/F32.
-     * @param[in] input2 Second tensor input info. Data types supported: U8/S16/F16/F32.
-     * @param[in] output Output tensor info. Data types supported: U8 (Only if both inputs are U8), S16/F16/F32.
+     * @param[in] input1 First tensor input info. Data types supported: U8/QASYMM8/S16/F16/F32.
+     * @param[in] input2 Second tensor input info. Data types supported: U8/QASYMM8/S16/F16/F32.
+     * @param[in] output Output tensor info. Data types supported: U8 (Only if both inputs are U8), QASYMM8/S16/F16/F32.
      * @param[in] policy Policy to use to handle overflow.
      *
      * @return a status
@@ -74,6 +74,7 @@ class CLArithmeticSubtractionKernel : public ICLKernel
 
     // Inherited methods overridden:
     void run(const Window &window, cl::CommandQueue &queue) override;
+    BorderSize border_size() const override;
 
 private:
     const ICLTensor *_input1; /**< Source tensor 1 */
