Commit 65c6520

cmov: add asm! optimized masknz32 for ARM32 (#1336)
In #1332 we ran into LLVM inserting branches in this routine for `thumbv6m-none-eabi` targets. It was "fixed" by fiddling around with `black_box`, but that seems brittle.

In #1334 we attempted a simple portable `asm!` optimization barrier approach, but it did not work as expected.

This commit instead implements one of the fiddliest bits, mask generation, in ARM assembly. The resulting assembly is actually more efficient than what rustc/LLVM outputs and avoids touching the stack pointer.

It's a simple enough function to implement in assembly on other platforms with stable `asm!` too, but this is a start.
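For readers unfamiliar with the term, "mask generation" here means turning a zero/non-zero condition into an all-zeros/all-ones word that can then drive a branch-free select. The following is a minimal portable sketch of that idea, not the crate's actual code: the names `masknz32_portable` and `select_u32` are hypothetical, and, as the message above notes, nothing stops the compiler from lowering portable code like this to a branch on some targets.

```rust
/// Hypothetical portable mask generator: `u32::MAX` if `x` is non-zero, else 0.
fn masknz32_portable(x: u32) -> u32 {
    // `x | -x` has its top bit set exactly when `x` is non-zero.
    let nz = (x | x.wrapping_neg()) >> 31; // 0 or 1
    // Negating 0 gives 0; negating 1 gives 0xFFFF_FFFF.
    nz.wrapping_neg()
}

/// Hypothetical mask-based select: returns `b` when `condition` is non-zero, else `a`.
fn select_u32(condition: u32, a: u32, b: u32) -> u32 {
    let mask = masknz32_portable(condition);
    // With mask == 0 this is `a`; with mask == all-ones it is `a ^ a ^ b == b`.
    a ^ (mask & (a ^ b))
}
```

With these definitions, `select_u32(0, 10, 20)` evaluates to `10` and `select_u32(1, 10, 20)` to `20`, with no data-dependent branch in the source.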
1 parent b87f17c commit 65c6520

2 files changed: 24 additions and 1 deletion

.github/workflows/cmov.yml

Lines changed: 0 additions & 1 deletion
@@ -135,7 +135,6 @@ jobs:
     strategy:
       matrix:
         target:
-          - armv7-unknown-linux-gnueabi
           - powerpc-unknown-linux-gnu
           - s390x-unknown-linux-gnu
           - x86_64-unknown-linux-gnu

cmov/src/portable.rs

Lines changed: 24 additions & 0 deletions
@@ -125,15 +125,39 @@ fn testnz64(mut x: u64) -> u64 {
 }
 
 /// Return a [`u32::MAX`] mask if `condition` is non-zero, otherwise return zero for a zero input.
+#[cfg(not(target_arch = "arm"))]
 fn masknz32(condition: Condition) -> u32 {
     testnz32(condition.into()).wrapping_neg()
 }
 
 /// Return a [`u64::MAX`] mask if `condition` is non-zero, otherwise return zero for a zero input.
+#[cfg(not(target_arch = "arm"))]
 fn masknz64(condition: Condition) -> u64 {
     testnz64(condition.into()).wrapping_neg()
 }
 
+/// Optimized mask generation for ARM32 targets.
+#[cfg(target_arch = "arm")]
+fn masknz32(condition: u8) -> u32 {
+    let mut out = condition as u32;
+    unsafe {
+        core::arch::asm!(
+            "rsbs {0}, {0}, #0", // Reverse subtract
+            "sbcs {0}, {0}, {0}", // Subtract with carry, setting flags
+            inout(reg) out,
+            options(nostack, nomem),
+        );
+    }
+    out
+}
+
+/// 64-bit wrapper for targets that implement 32-bit mask generation in assembly.
+#[cfg(target_arch = "arm")]
+fn masknz64(condition: u8) -> u64 {
+    let mask = masknz32(condition) as u64;
+    mask | mask << 32
+}
+
 #[cfg(test)]
 mod tests {
     #[test]
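
To see why the two-instruction sequence in the diff yields an all-ones mask, it can help to model the ARM carry flag in portable Rust. The sketch below is an illustration only, not part of the commit: `masknz32_model` and `masknz64_model` are hypothetical names, and the model assumes the usual ARM convention that a subtraction sets the carry flag when no borrow occurs.

```rust
/// Hypothetical software model of `rsbs r, r, #0` followed by `sbcs r, r, r`.
fn masknz32_model(condition: u8) -> u32 {
    let r = condition as u32;

    // rsbs r, r, #0  =>  r = 0 - r. On ARM, carry is set when there is NO borrow,
    // which for `0 - r` happens only when r == 0.
    let (r, borrow) = 0u32.overflowing_sub(r);
    let carry = !borrow;

    // sbcs r, r, r  =>  r = r - r - (1 - carry).
    // carry set (input was zero):       0 - 0 = 0
    // carry clear (input was non-zero): 0 - 1 = 0xFFFF_FFFF
    r.wrapping_sub(r).wrapping_sub(u32::from(!carry))
}

/// Hypothetical model of the 64-bit wrapper: duplicate the 32-bit mask into both halves.
fn masknz64_model(condition: u8) -> u64 {
    let mask = masknz32_model(condition) as u64;
    mask | mask << 32
}
```

Under this model, a zero input produces `0` and any non-zero `u8` produces `u32::MAX` (or `u64::MAX` for the widened variant), matching the documented contract of the portable `masknz32` and `masknz64`.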
