You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
locking/lockref: Use try_cmpxchg64 in CMPXCHG_LOOP macro
Use try_cmpxchg64 instead of cmpxchg64 in CMPXCHG_LOOP macro.
x86 CMPXCHG instruction returns success in ZF flag, so this
change saves a compare after cmpxchg (and related move instruction
in front of cmpxchg). The main loop of lockref_get improves from:
13: 48 89 c1 mov %rax,%rcx
16: 48 c1 f9 20 sar $0x20,%rcx
1a: 83 c1 01 add $0x1,%ecx
1d: 48 89 ce mov %rcx,%rsi
20: 89 c1 mov %eax,%ecx
22: 48 89 d0 mov %rdx,%rax
25: 48 c1 e6 20 shl $0x20,%rsi
29: 48 09 f1 or %rsi,%rcx
2c: f0 48 0f b1 4d 00 lock cmpxchg %rcx,0x0(%rbp)
32: 48 39 d0 cmp %rdx,%rax
35: 75 17 jne 4e <lockref_get+0x4e>
to:
13: 48 89 ca mov %rcx,%rdx
16: 48 c1 fa 20 sar $0x20,%rdx
1a: 83 c2 01 add $0x1,%edx
1d: 48 89 d6 mov %rdx,%rsi
20: 89 ca mov %ecx,%edx
22: 48 c1 e6 20 shl $0x20,%rsi
26: 48 09 f2 or %rsi,%rdx
29: f0 48 0f b1 55 00 lock cmpxchg %rdx,0x0(%rbp)
2f: 75 02 jne 33 <lockref_get+0x33>
[ Michael Ellerman and Mark Rutland confirm that code generation on
powerpc and arm64 respectively is also ok, even though they do not
have a native arch_try_cmpxchg() implementation, and rely on the
default fallback case - Linus ]
Signed-off-by: Uros Bizjak <[email protected]>
Tested-by: Michael Ellerman <[email protected]>
Tested-by: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
0 commit comments