glibc memcpy shows up as the top hotspot when profiling EEMBC network 2.0, specifically in 2 sub-tests. Profile for ip_reassembly:
# perf stat gcc/bin/ip_reassembly.exe -autogo >/tmp/x
Performance counter stats for 'gcc/bin/ip_reassembly.exe -autogo':
1137.96 msec task-clock # 0.965 CPUs utilized
229 context-switches # 0.201 K/sec
0 cpu-migrations # 0.000 K/sec
733 page-faults # 0.644 K/sec
1,137,637,340 cycles # 1.000 GHz
347,444,844 instructions # 0.31 insn per cycle
30,389,001 branches # 26.705 M/sec
5,042,494 branch-misses # 16.59% of all branches
1.179703860 seconds time elapsed
# perf record -c 10000 gcc/bin/ip_reassembly.exe -autogo >/tmp/x
#
# Samples: 117K of event 'cycles'
# Event count (approx.): 1176170000
#
# Overhead Samples Command Shared Object Symbol
# ........ ............ ............... ................. .................. ..................
61.29% 72084 ip_reassembly.e libc-2.32.so [.] _wordcopy_fwd_aligned
15.95% 18762 ip_reassembly.e ip_reassembly.exe [.] ip_input
9.82% 11551 ip_reassembly.e ip_reassembly.exe [.] ip_reass
2.47% 2905 ip_reassembly.e ip_reassembly.exe [.] m_cat
1.86% 2185 ip_reassembly.e libc-2.32.so [.] memmove
The ARC glibc port uses the generic implementations of memcpy/memset (the _wordcopy_fwd_aligned hotspot above is part of that generic code). These are already decent but can be optimized for ARC with:
- unaligned access
- double load/store
- any other arch-specific helpers such as clz, etc.
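As a rough illustration of the first two items above, here is a minimal C sketch of a copy loop that moves 8 bytes per iteration without first aligning either pointer. This is not the glibc code and not a proposed patch: `copy_unaligned64` is a hypothetical name, and the assumption is that the target permits unaligned access so the compiler can lower each fixed-size `memcpy` to a single wide load or store (ideally one double load/store where the core supports it). A real implementation would most likely be written in assembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only (not part of glibc).
   Copies n bytes from src to dst, 8 bytes at a time, with no
   alignment fix-up on either pointer. Regions must not overlap. */
void *copy_unaligned64(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Bulk loop: one 64-bit load and one 64-bit store per iteration.
       The fixed-size memcpy calls are the portable way to express an
       unaligned access; compilers typically inline them when the
       target allows unaligned loads/stores. */
    while (n >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s, sizeof w);
        memcpy(d, &w, sizeof w);
        d += sizeof w;
        s += sizeof w;
        n -= sizeof w;
    }

    /* Tail: remaining 0..7 bytes, copied one byte at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }

    return dst;
}
```

Compared with a word-at-a-time loop that requires aligned pointers, this skips the alignment prologue and halves the number of load/store instructions on a 32-bit core, which may matter for the short, oddly aligned copies typical of packet reassembly. Whether it is actually faster on a given ARC core would need to be measured.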