cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
The rstat_cpu and rstat_css_list fields of the cgroup structure are
read-mostly. However, they may share a cacheline with the subsequent
rstat_flush_next and *bstat fields, which can be updated frequently.
That slows down cgroup_rstat_cpu(), which is called frequently in the
rstat code. Add a CACHELINE_PADDING() line between them to avoid false
cacheline sharing.
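The layout effect can be illustrated with a minimal userspace sketch. It is not the kernel change itself: the struct names, the field types, and the 64-byte line size are assumptions made for illustration, and C11 alignas stands in for CACHELINE_PADDING(). It only checks which cacheline-sized slot each field's offset falls into.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines, as is typical on x86-64. */
#define DEMO_CACHELINE 64

/* Hypothetical stand-in for the unpatched layout: the read-mostly
 * fields are immediately followed by the frequently written ones. */
struct demo_unpadded {
	void *rstat_cpu;            /* read-mostly */
	void *rstat_css_list[2];    /* read-mostly */
	void *rstat_flush_next;     /* frequently written */
	unsigned long bstat[4];     /* frequently written */
};

/* Same fields, but the first frequently written field is forced to
 * start on a new cacheline boundary. */
struct demo_padded {
	void *rstat_cpu;
	void *rstat_css_list[2];
	alignas(DEMO_CACHELINE) void *rstat_flush_next;
	unsigned long bstat[4];
};

/* Index of the cacheline-sized slot a member's offset falls into,
 * relative to the start of a cacheline-aligned object. */
#define DEMO_LINE_OF(type, member) \
	(offsetof(type, member) / DEMO_CACHELINE)

/* Compile-time checks of the two layouts. */
static_assert(DEMO_LINE_OF(struct demo_unpadded, rstat_cpu) ==
	      DEMO_LINE_OF(struct demo_unpadded, rstat_flush_next),
	      "unpadded: read-mostly and hot fields share a slot");
static_assert(DEMO_LINE_OF(struct demo_padded, rstat_cpu) !=
	      DEMO_LINE_OF(struct demo_padded, rstat_flush_next),
	      "padded: read-mostly and hot fields are in different slots");

/* Returns 1 if rstat_cpu and rstat_flush_next share a slot. */
int demo_shares_line_unpadded(void)
{
	return DEMO_LINE_OF(struct demo_unpadded, rstat_cpu) ==
	       DEMO_LINE_OF(struct demo_unpadded, rstat_flush_next);
}

int demo_shares_line_padded(void)
{
	return DEMO_LINE_OF(struct demo_padded, rstat_cpu) ==
	       DEMO_LINE_OF(struct demo_padded, rstat_flush_next);
}
```

The slot indices only correspond to real cachelines when the object itself starts on a cacheline boundary. The alignas member raises the alignment of struct demo_padded to 64 bytes, which covers static and automatic objects, but a general-purpose allocator may not honor that alignment.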
A parallel kernel build on a 2-socket x86-64 server is used as the
benchmarking tool for measuring the lock hold time. Below is the lock
hold time frequency distribution before and after the patch:
  Run time   Before patch   After patch
  --------   ------------   -----------
  0-01 us       9,928,562     9,820,428
  01-05 us        110,151        50,935
  05-10 us            270            93
  10-15 us            273           146
  15-20 us            135            76
  20-25 us              0             2
  25-30 us              1             0
It can be seen that the patch further pushes the lock hold time towards
the lower end.
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>