Commit 94af3a0

ubizjak authored and Ingo Molnar committed
locking/qspinlock/x86: Micro-optimize virt_spin_lock()
Optimize virt_spin_lock() to use the simpler and faster:

    atomic_try_cmpxchg(*ptr, &val, new)

instead of:

    atomic_cmpxchg(*ptr, val, new) == val

The x86 CMPXCHG instruction returns success in the ZF flag, so this
change saves a compare after the CMPXCHG.

Also optimize the retry loop a bit. atomic_try_cmpxchg() fails iff
&lock->val != 0, so there is no need to load and compare the lock
value again - cpu_relax() can be called unconditionally in this case.

This allows us to generate optimized:

    1f:   ba 01 00 00 00          mov    $0x1,%edx
    24:   8b 03                   mov    (%rbx),%eax
    26:   85 c0                   test   %eax,%eax
    28:   75 63                   jne    8d <...>
    2a:   f0 0f b1 13             lock cmpxchg %edx,(%rbx)
    2e:   75 5d                   jne    8d <...>
    ...
    8d:   f3 90                   pause
    8f:   eb 93                   jmp    24 <...>

instead of:

    1f:   ba 01 00 00 00          mov    $0x1,%edx
    24:   8b 03                   mov    (%rbx),%eax
    26:   85 c0                   test   %eax,%eax
    28:   75 13                   jne    3d <...>
    2a:   f0 0f b1 13             lock cmpxchg %edx,(%rbx)
    2e:   85 c0                   test   %eax,%eax
    30:   75 f2                   jne    24 <...>
    ...
    3d:   f3 90                   pause
    3f:   eb e3                   jmp    24 <...>

Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
1 parent 33eb8ab commit 94af3a0

File tree

1 file changed: +9 −4 lines changed


arch/x86/include/asm/qspinlock.h

Lines changed: 9 additions & 4 deletions
@@ -85,6 +85,8 @@ DECLARE_STATIC_KEY_TRUE(virt_spin_lock_key);
 
 #define virt_spin_lock virt_spin_lock
 static inline bool virt_spin_lock(struct qspinlock *lock)
 {
+	int val;
+
 	if (!static_branch_likely(&virt_spin_lock_key))
 		return false;
 
@@ -94,10 +96,13 @@ static inline bool virt_spin_lock(struct qspinlock *lock)
 	 * horrible lock 'holder' preemption issues.
 	 */
 
-	do {
-		while (atomic_read(&lock->val) != 0)
-			cpu_relax();
-	} while (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) != 0);
+ __retry:
+	val = atomic_read(&lock->val);
+
+	if (val || !atomic_try_cmpxchg(&lock->val, &val, _Q_LOCKED_VAL)) {
+		cpu_relax();
+		goto __retry;
+	}
 
 	return true;
 }
