proj_dis_perf_improvements_1

Tsukasa OI edited this page Sep 24, 2022 · 33 revisions

Project: Disassembler Performance Improvements (Optimization 1)

Benchmarking System

The benchmarks were performed on:

  • Ubuntu 22.04 LTS
  • AMD Ryzen 5 PRO 5650G processor

For the parallel runs, I used 6 parallel jobs (-j6), one per physical core. The processor has 12 hardware threads, but -j12 only slowed the benchmark down.
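As a rough illustration of such a parallel run, the sketch below times a batch of commands with a fixed number of concurrent jobs. It is not the harness used for these measurements: the source does not say which tool received -j6, and the helper names `run_quiet` and `time_batch` are hypothetical.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_quiet(cmd):
    """Run one command, discard its output, and return its exit code."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode


def time_batch(commands, jobs=6):
    """Run all commands with at most `jobs` in flight.

    Returns (elapsed wall-clock seconds, list of exit codes).
    jobs=6 mirrors the -j6 setting described above (assumption: one
    job per physical core).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        codes = list(pool.map(run_quiet, commands))
    return time.perf_counter() - start, codes
```

A batch could then be timed with something like `time_batch([["objdump", "-d", p] for p in paths], jobs=6)`, once with the baseline objdump and once with the patched one.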

Aggregate Performance Improvements

| Type | Expected Improvements (in general) |
|---|---|
| Binary Files | 40-120% |
| RISC-V ELF Programs | 20-40% |
| RISC-V ELF Libraries | 70-500% |

These figures are relative to the latest master (commit 8e037eae6823) and were measured on 2022-09-23.

Columns

objdump -d (ELF)

| Program | Prerequisites | HTable/Caching | Mapping Syms | Notes |
|---|---|---|---|---|
| Busybox 1.35.1 (RV64GC) | 2.5-2.8% | 30.1-33.8% | 32.5-33.2% | |
| OpenSBI 1.1 (generic fw_*.elf) | 3.1-3.2% | 44.4-44.8% | 48.0-48.4% | |
| Linux kernel 5.19 (vmlinux) | (-1.2)-(-0.2)% | 23.0-23.3% | 36.3-44.2% | |
| Linux kernel 5.19 (vmlinux.o) | (-0.0)-2.1% | 2.0-12.6% | 106.8-256.5% | Not finally linked |
| glibc (libc.so.6) | 1.6-2.6% | 26.4-30.9% | 30.1-35.0% | |

objdump -d (ELF-based archive)

| Program | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|
| glibc (libc.a) | 1.5-2.2% | 7.6-8.1% | 510.8-548.0% |
| newlib (libc.a) | 2.4-3.4% | 6.9-12.4% | 131.4-146.5% |

objdump -D (binary)

| Program | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|
| Linux kernel 5.19 (vmlinux) | (-0.8)-(-0.6)% | 79.2-112.8% | 81.5-110.9% |
| Random files (/dev/urandom) | (-2.8)-(-1.3)% | 103.2-130.4% | 102.7-129.4% |
| 1M (1048576) CSR instructions | 27.1% | 400.6% | 397.5% |
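For readers who want a comparable input for the "1M CSR instructions" row, the sketch below builds a raw binary of 1048576 RISC-V CSR read instructions. How the original test file was produced is not stated here, so the choice of `csrrs a0, <csr>, zero`, the cycling through all 4096 CSR numbers, and the helper names are assumptions made for illustration only.

```python
import struct


def encode_csrrs(rd, csr, rs1):
    """Encode `csrrs rd, csr, rs1` as a 32-bit instruction word.

    Layout: csr[31:20] | rs1[19:15] | funct3=0b010 | rd[11:7] | opcode=0x73.
    """
    assert 0 <= rd < 32 and 0 <= rs1 < 32 and 0 <= csr < 4096
    return (csr << 20) | (rs1 << 15) | (0b010 << 12) | (rd << 7) | 0x73


def csr_blob(count=1048576):
    """Return `count` little-endian CSR reads into a0 (x10).

    Assumption: CSR numbers simply cycle through 0..4095; the original
    benchmark input may have used a different mix.
    """
    words = (encode_csrrs(10, i % 4096, 0) for i in range(count))
    return b"".join(struct.pack("<I", word) for word in words)
```

Writing `csr_blob()` to a file gives a 4 MiB input that can be disassembled as a raw binary, using objdump's `-D` together with its `-b binary` and `-m` options to select the RISC-V architecture.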

gdb: disas of nearly the entire code region

| Program | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|
| Linux kernel 5.19 (vmlinux) with debug info | 31.8% | 34.8% | 35.2% |
| Linux kernel 5.19 (vmlinux) without debug info | 83.2% | 95.4% | 94.6% |
| OpenSBI 1.1 (generic fw_*.elf) | 60.2-60.7% | 70.3-70.7% | 70.5-70.7% |
| 1M (1048576) CSR instructions (ELF) | 78.8% | 205.6% | 206.8% |

Batch: objdump -d on Linux distribution

Serial Run: All ELF Files Under the Directory

| System | Path | N | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|---|---|
| Ubuntu 22.04 LTS (image for HiFive Unmatched) | /usr/bin | 563 | 2.5% | 26.6% | 26.7% |
| Debian unstable (as of 2022-07-20) | /usr/bin | 269 | 2.3% | 26.5% | 26.7% |
| Ubuntu 22.04 LTS (image for HiFive Unmatched) | /usr/lib | 6797 | 2.2% | 14.4% | 90.2% |
| Debian unstable (as of 2022-07-20) | /usr/lib | 548 | 3.0% | 12.2% | 195.2% |
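Collecting "all ELF files under the directory" can be done by checking the 4-byte ELF magic number at the start of each file. The sketch below is one possible way to build such a file list and is not necessarily how the counts (N) above were obtained; skipping symbolic links so that a file is not counted twice is an assumption, and the function names are hypothetical.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def is_elf(path):
    """Return True if the file at `path` starts with the ELF magic number."""
    try:
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        # Unreadable files are simply skipped.
        return False


def find_elf_files(root):
    """Return a sorted list of regular (non-symlink) ELF files under `root`."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if is_elf(path):
                found.append(path)
    return sorted(found)
```

For example, `find_elf_files("/usr/bin")` on an extracted distribution image would yield the list of inputs for a serial `objdump -d` run.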

Parallel Run: All (including data-only ELFs)

| System | N | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|---|
| Ubuntu 22.04 LTS (image for HiFive Unmatched) | 7666 | 0.9% | 12.4% | 64.1% |
| Debian unstable (as of 2022-07-20) | 946 | 4.8% | 6.8% | 133.0% |

Batch: objdump -D (as binary) on Linux distribution

Serial Run: All ELF Files Under the Directory

| System | Path | N | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|---|---|
| Ubuntu 22.04 LTS (image for HiFive Unmatched) | /usr/bin | 563 | (-1.4)% | 54.9% | 54.9% |
| Debian unstable (as of 2022-07-20) | /usr/bin | 269 | (-1.5)% | 46.2% | 47.0% |

Parallel Run: All (including data-only ELFs)

| System | N | Prerequisites | HTable/Caching | Mapping Syms |
|---|---|---|---|---|
| Ubuntu 22.04 LTS (image for HiFive Unmatched) | 7666 | (-1.3)% | 63.6% | 63.3% |
| Debian unstable (as of 2022-07-20) | 946 | (-1.0)% | 48.9% | 49.0% |

objdump -d (ELF): Extreme Examples

| Program | N | Prerequisites | HTable/Caching | Mapping Syms | Mapping Syms (as a speedup factor) |
|---|---|---|---|---|---|
| OpenSSL (libcrypto.so) | 2 | 1.9-2.6% | 4.8-6.1% | 1131.0-1151.8% | x12.310-12.518 |
| LLVM (libLLVM.so) | 1 | 0.8% | 0.8% | 9708.4% | x98.084 |
  • OpenSSL
    1. Ubuntu 22.04 LTS (image for HiFive Unmatched) : /usr/lib/riscv64-linux-gnu/libcrypto.so.3
    2. Debian unstable (as of 2022-07-20) : /usr/lib/riscv64-linux-gnu/libcrypto.so.3
  • LLVM
    1. Ubuntu 22.04 LTS (Package libllvm14 Version 1:14.0.0-1ubuntu1) : /usr/lib/riscv64-linux-gnu/libLLVM-14.so.1

For large libraries with many symbols, the effect of the mapping symbol optimization is huge. I expected some improvement, but not this much.

This optimization also benefits the Arm architecture (but not AArch64, which handles mapping symbols differently).
