Description
🐛 Describe the bug
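Running XNNPACK-delegated models on an Android device fails for most of the models listed below: the runner aborts, and two fp32 models terminate with an uncaught std::out_of_range exception.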
We use benchmark.py from #11039.
@SS-JIA @kimishpatel @mergennachin
Commands:
# pte generation
python3 -m examples.xnnpack.aot_compiler --model_name dl3 --delegate --quantize -o xnnpack_pte/
# xnnpack executor build
#!/bin/bash
if [[ -z "$ANDROID_NDK_ROOT" ]]; then
  echo "Please export ANDROID_NDK_ROOT=/path/to/ndk"
  exit 1
fi
CLEAN_BUILD="false"
BUILD_FOLDER="build-xnnpack"
BUILD_TYPE="release"
while [[ "$#" -gt 0 ]]; do
  case "$1" in
    -c|--clean_build) CLEAN_BUILD="true";;
    -d|--debug) BUILD_TYPE="Debug";;
    *) echo "unknown arg passed: $1"; exit 1;;
  esac
  # Consume each argument exactly once (shifting in the case arms as well would skip every other flag).
  shift
done
if [ "$CLEAN_BUILD" = true ]; then
  rm -rf "$BUILD_FOLDER"
fi
cmake \
  -DCMAKE_INSTALL_PREFIX="$BUILD_FOLDER" \
  -DCMAKE_BUILD_TYPE="$BUILD_TYPE" \
  -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK_ROOT/build/cmake/android.toolchain.cmake" \
  -DANDROID_ABI='arm64-v8a' \
  -DEXECUTORCH_BUILD_EXTENSION_DATA_LOADER=ON \
  -DEXECUTORCH_BUILD_EXTENSION_MODULE=ON \
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON \
  -DEXECUTORCH_BUILD_XNNPACK=ON \
  -DEXECUTORCH_ENABLE_LOGGING=ON \
  -DPYTHON_EXECUTABLE=python \
  -B"$BUILD_FOLDER" .
cmake --build "$BUILD_FOLDER" -j9 --target install --config "$BUILD_TYPE"
# benchmark
python3 benchmark.py -p xnnpack_pte/dl3_xnnpack_q8.pte -s <ADB_SERIAL_NUM> -b xnn
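To reproduce the whole matrix rather than a single model, the export and benchmark commands above can be generated per model. The following is only a sketch: the helper names (`export_cmd`, `pte_path`, `benchmark_cmd`) are made up for illustration, the model list is taken from the summary in this report, and the `<model>_xnnpack_<fp32|q8>.pte` naming is inferred from the file names shown here rather than from the exporter's documentation. It only builds the command lines and does not run anything.

```python
import shlex

# Model names as they appear in the summary of this report.
MODELS = [
    "dl3", "edsr", "emformer_transcribe", "ic3", "ic4", "mobilebert",
    "mv3", "resnet18", "resnet50", "vit", "w2l",
]


def export_cmd(model, quantize, out_dir="xnnpack_pte/"):
    """Build the aot_compiler command used in this report for one model."""
    cmd = ["python3", "-m", "examples.xnnpack.aot_compiler",
           "--model_name", model, "--delegate"]
    if quantize:
        cmd.append("--quantize")
    return cmd + ["-o", out_dir]


def pte_path(model, quantize, out_dir="xnnpack_pte/"):
    """Assumed output name, inferred from the file names in this report."""
    suffix = "q8" if quantize else "fp32"
    return f"{out_dir}{model}_xnnpack_{suffix}.pte"


def benchmark_cmd(pte, serial):
    """Build the benchmark.py command used in this report for one .pte file."""
    return ["python3", "benchmark.py", "-p", pte, "-s", serial, "-b", "xnn"]


if __name__ == "__main__":
    serial = "<ADB_SERIAL_NUM>"  # placeholder, replace with the device serial
    for model in MODELS:
        for quantize in (False, True):
            print(shlex.join(export_cmd(model, quantize)))
            print(shlex.join(benchmark_cmd(pte_path(model, quantize), serial)))
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without a device attached; the lines can be piped into a shell once the serial is filled in.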
output summary:

| name | load | 1st | avg | peak_mem | avg_mem | note |
| --- | --- | --- | --- | --- | --- | --- |
| xnnpack_pte/dl3_xnnpack_fp32.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/dl3_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/edsr_xnnpack_fp32.pte | 14.256 | 612.801 | 468.82162 | 323398 | 323236.459 | success |
| xnnpack_pte/edsr_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/emformer_transcribe_xnnpack_fp32.pte | 0 | 0 | 0 | 0 | 0 | libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found |
| xnnpack_pte/emformer_transcribe_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/ic3_xnnpack_fp32.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/ic3_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/ic4_xnnpack_fp32.pte | 0 | 0 | 0 | 3234 | 3234.0 | Aborted |
| xnnpack_pte/ic4_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/mobilebert_xnnpack_fp32.pte | 0 | 0 | 0 | 10530 | 10530.0 | libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found |
| xnnpack_pte/mv3_xnnpack_fp32.pte | 11.439 | 64.515 | 2.68764 | 0 | 0 | awk: division by zero |
| xnnpack_pte/mv3_xnnpack_q8.pte | 0 | 0 | 0 | 1814 | 1814.0 | Aborted |
| xnnpack_pte/resnet18_xnnpack_fp32.pte | 95.514 | 86.227 | 13.44775 | 66779 | 66768.308 | success |
| xnnpack_pte/resnet18_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/resnet50_xnnpack_fp32.pte | 108.011 | 181.618 | 31.16065 | 171762 | 128501.375 | Unable to read dmabuf info for 29909 |
| xnnpack_pte/resnet50_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/vit_xnnpack_fp32.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/vit_xnnpack_q8.pte | 0 | 0 | 0 | 0 | 0 | Aborted |
| xnnpack_pte/w2l_xnnpack_fp32.pte | 111.403 | 122.821 | 18.16669 | 224629 | 143850.55 | success |

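
For triage it may help to group the summary by precision and failure kind. The sketch below copies the (name, note) pairs from the summary above; the category labels (`ok`, `aborted`, `uncaught_exception`, `measurement_issue`) are illustrative names chosen here and are not something benchmark.py emits.

```python
import re
from collections import defaultdict

# (name, note) pairs copied from the summary in this report.
ROWS = [
    ("xnnpack_pte/dl3_xnnpack_fp32.pte", "Aborted"),
    ("xnnpack_pte/dl3_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/edsr_xnnpack_fp32.pte", "success"),
    ("xnnpack_pte/edsr_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/emformer_transcribe_xnnpack_fp32.pte",
     "libc++abi: terminating due to uncaught exception of type "
     "std::out_of_range: unordered_map::at: key not found"),
    ("xnnpack_pte/emformer_transcribe_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/ic3_xnnpack_fp32.pte", "Aborted"),
    ("xnnpack_pte/ic3_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/ic4_xnnpack_fp32.pte", "Aborted"),
    ("xnnpack_pte/ic4_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/mobilebert_xnnpack_fp32.pte",
     "libc++abi: terminating due to uncaught exception of type "
     "std::out_of_range: unordered_map::at: key not found"),
    ("xnnpack_pte/mv3_xnnpack_fp32.pte", "awk: division by zero"),
    ("xnnpack_pte/mv3_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/resnet18_xnnpack_fp32.pte", "success"),
    ("xnnpack_pte/resnet18_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/resnet50_xnnpack_fp32.pte",
     "Unable to read dmabuf info for 29909"),
    ("xnnpack_pte/resnet50_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/vit_xnnpack_fp32.pte", "Aborted"),
    ("xnnpack_pte/vit_xnnpack_q8.pte", "Aborted"),
    ("xnnpack_pte/w2l_xnnpack_fp32.pte", "success"),
]

# File naming as observed in this report: <model>_xnnpack_<fp32|q8>.pte
NAME_RE = re.compile(r"(?P<model>.+)_xnnpack_(?P<precision>fp32|q8)\.pte")


def classify(note):
    """Map a summary note to an illustrative category label."""
    if note == "success":
        return "ok"
    if "std::out_of_range" in note:
        return "uncaught_exception"
    if note == "Aborted":
        return "aborted"
    # Timings were reported, but the memory sampling complained.
    return "measurement_issue"


def group_rows(rows):
    """Group model names by (precision, category)."""
    groups = defaultdict(list)
    for name, note in rows:
        match = NAME_RE.fullmatch(name.rsplit("/", 1)[-1])
        if match is None:
            continue  # skip names that do not follow the observed pattern
        groups[(match["precision"], classify(note))].append(match["model"])
    return dict(groups)


if __name__ == "__main__":
    for key, models in sorted(group_rows(ROWS).items()):
        print(key, models)
```

Grouped this way, every q8 row in the summary falls under `aborted`, while the fp32 rows split across all four categories.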
outputs
command: python3 benchmark.py -p xnnpack_pte/dl3_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/dl3_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/edsr_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
load: 14.256000
1st: 612.801000
avg: 468.821620
peak_mem: 323398
avg_mem: 323236.459
command: python3 benchmark.py -p xnnpack_pte/edsr_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/emformer_transcribe_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/emformer_transcribe_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/ic3_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
Aborted
Invalid arguments - only one [PID] argument is allowed
Usage: dmabuf_dump [-abh] [PID] [-o <raw|csv>]
-a show all dma buffers (ion) in big table, [buffer x process] grid
-b show DMA-BUF per-buffer, per-exporter and per-device statistics
-o [raw][csv] print output in the specified format.
-h show this help
If PID is supplied, the dmabuf information for that process is shown.
Per-buffer DMA-BUF stats do not take an argument.
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/ic3_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/ic4_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
Aborted
Invalid arguments - only one [PID] argument is allowed
Usage: dmabuf_dump [-abh] [PID] [-o <raw|csv>]
-a show all dma buffers (ion) in big table, [buffer x process] grid
-b show DMA-BUF per-buffer, per-exporter and per-device statistics
-o [raw][csv] print output in the specified format.
-h show this help
If PID is supplied, the dmabuf information for that process is shown.
Per-buffer DMA-BUF stats do not take an argument.
peak_mem: 3234
avg_mem: 3234.000
command: python3 benchmark.py -p xnnpack_pte/ic4_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/mobilebert_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
libc++abi: terminating due to uncaught exception of type std::out_of_range: unordered_map::at: key not found
Aborted
peak_mem: 10530
avg_mem: 10530.000
command: python3 benchmark.py -p xnnpack_pte/mv3_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
awk: division by zero
source line number 1
load: 11.439000
1st: 64.515000
avg: 2.687640
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/mv3_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
Invalid arguments - only one [PID] argument is allowed
Usage: dmabuf_dump [-abh] [PID] [-o <raw|csv>]
-a show all dma buffers (ion) in big table, [buffer x process] grid
-b show DMA-BUF per-buffer, per-exporter and per-device statistics
-o [raw][csv] print output in the specified format.
-h show this help
If PID is supplied, the dmabuf information for that process is shown.
Per-buffer DMA-BUF stats do not take an argument.
peak_mem: 1814
avg_mem: 1814.000
command: python3 benchmark.py -p xnnpack_pte/resnet18_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
load: 95.514000
1st: 86.227000
avg: 13.447750
peak_mem: 66779
avg_mem: 66768.308
command: python3 benchmark.py -p xnnpack_pte/resnet18_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/resnet50_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
Unable to read dmabuf info for 29909
load: 108.011000
1st: 181.618000
avg: 31.160650
peak_mem: 171762
avg_mem: 128501.375
command: python3 benchmark.py -p xnnpack_pte/resnet50_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/vit_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/vit_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
Aborted
awk: division by zero
source line number 1
peak_mem: 0
avg_mem:
command: python3 benchmark.py -p xnnpack_pte/w2l_xnnpack_fp32.pte -s 172.17.32.1:80 -b xnn
load: 111.403000
1st: 122.821000
avg: 18.166690
peak_mem: 224629
avg_mem: 143850.550
command: python3 benchmark.py -p xnnpack_pte/w2l_xnnpack_q8.pte -s 172.17.32.1:80 -b xnn
load: 126.316000
1st: 153.128000
avg: 18.250880
peak_mem: 224564
avg_mem: 144070.632
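A side note on the logs above: the `awk: division by zero` lines are always followed by `peak_mem: 0` and an empty `avg_mem:`, which would be consistent with an average taken over zero memory samples (for example because the process exited before the first sample). How benchmark.py actually computes these values is not shown in this report, so the following is only a sketch of a guarded computation under that assumption; `summarize_memory` and `format_summary` are hypothetical names.

```python
def summarize_memory(samples):
    """Return (peak, average) for a list of memory samples.

    With no samples, the average is reported as None instead of
    dividing by zero.
    """
    if not samples:
        return 0, None
    return max(samples), sum(samples) / len(samples)


def format_summary(samples):
    """Render the two lines in the same shape as the logs in this report."""
    peak, avg = summarize_memory(samples)
    avg_text = "n/a (no samples)" if avg is None else f"{avg:.3f}"
    return f"peak_mem: {peak}\navg_mem: {avg_text}"


if __name__ == "__main__":
    print(format_summary([]))       # process produced no memory samples
    print(format_summary([3234]))   # a single sample
```

Reporting the empty case explicitly would make it easier to tell a crash of the runner apart from a failure of the memory sampling itself, as in the mv3 fp32 run, where timings are present but the memory average is not.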
Versions
Collecting environment information...
PyTorch version: 2.7.0+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.31.6
Libc version: glibc-2.35
Python version: 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Vendor ID: GenuineIntel
Model name: 13th Gen Intel(R) Core(TM) i7-1360P
CPU family: 6
Model: 186
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
Stepping: 2
BogoMIPS: 5222.41
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves avx_vnni umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize flush_l1d arch_capabilities
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
L1d cache: 384 KiB (8 instances)
L1i cache: 256 KiB (8 instances)
L2 cache: 10 MiB (8 instances)
L3 cache: 18 MiB (1 instance)
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Reg file data sampling: Vulnerable: No microcode
Vulnerability Retbleed: Mitigation; Enhanced IBRS
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced / Automatic IBRS; IBPB conditional; RSB filling; PBRSB-eIBRS SW sequence; BHI BHI_DIS_S
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Versions of relevant libraries:
[pip3] executorch==0.6.0a0+2d5c84f
[pip3] numpy==2.2.6
[pip3] torch==2.7.0+cpu
[pip3] torchao==0.10.0+git8b264ce1
[pip3] torchaudio==2.7.0+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.22.0+cpu
[conda] executorch 0.6.0a0+2d5c84f pypi_0 pypi
[conda] numpy 2.2.6 pypi_0 pypi
[conda] torch 2.7.0+cpu pypi_0 pypi
[conda] torchao 0.10.0+git8b264ce1 pypi_0 pypi
[conda] torchaudio 2.7.0+cpu pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchvision 0.22.0+cpu pypi_0 pypi