Skip to content

Conversation

@Kmeakin
Copy link
Contributor

@Kmeakin Kmeakin commented Oct 2, 2025

Optimize checked_ilog and pow when the base is a power of two

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 2, 2025
@rustbot
Copy link
Collaborator

rustbot commented Oct 2, 2025

r? @scottmcm

rustbot has assigned @scottmcm.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 8f19ce6 to e5ca8cd Compare October 2, 2025 01:31
@rust-log-analyzer

This comment has been minimized.

@workingjubilee
Copy link
Member

workingjubilee commented Oct 2, 2025

does this affect the codegen in practical circumstances? I would expect even a fairly weak optimizer to perform this optimization, making us hand-coding it irrelevant and potentially even harmful because we introduce more conditional logic to chew through.

@Kmeakin Kmeakin changed the title Optimize checked_ilog when base is a power of two Optimize checked_ilog and pow when base is a power of two Oct 2, 2025
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Oct 2, 2025

does this affect the codegen in practical circumstances?

It replaces a loop by some bit manipulations. Whether anyone actually calls either function with a compile-time known, power of 2 base is another question. But it can't hurt, since it is guarded by is_statically_known.

I would expect even a fairly weak optimizer to perform this optimization, making us hand-coding it irrelevant

No, in either function, LLVM is not able to see that the loop is performing a log/pow and apply the identities: https://godbolt.org/z/6vMsxc9Kh

and potentially even harmful because we introduce more conditional logic to chew through.

They're guarded by is_statically_known so will be ignored if the base is not known at compile-time

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch 2 times, most recently from 2faa397 to abb7f32 Compare October 2, 2025 02:38
@workingjubilee
Copy link
Member

No, in either function, LLVM is not able to see that the loop is performing a log/pow and apply the identities: https://godbolt.org/z/6vMsxc9Kh

...then I'm kinda surprised! Nice catch.

@scottmcm
Copy link
Member

scottmcm commented Oct 2, 2025

does this affect the codegen in practical circumstances?

Would be good to have codegen tests to demonstrate what this is doing -- especially since that way there's a way for people to try removing the special cases later if they think that LLVM no longer needs them.

@scottmcm scottmcm added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 2, 2025
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from abb7f32 to 9ab0f63 Compare October 2, 2025 23:13
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 9ab0f63 to 1d0ac82 Compare October 3, 2025 21:16
@workingjubilee
Copy link
Member

thanks for the codegen tests!

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 1d0ac82 to c84b99e Compare October 5, 2025 18:29
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Oct 5, 2025

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 5, 2025
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Oct 10, 2025

@scottmcm ping?

#[no_mangle]
pub fn checked_ilog16(val: u32) -> Option<u32> {
// CHECK: %[[ICMP:.+]] = icmp ne i32 %val, 0
// CHECK: %[[CTZ:.+]] = tail call range(i32 0, 33) i32 @llvm.ctlz.i32(i32 %val, i1 true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really this PR's problem, but consider filing an LLVM bug (assuming it's also true in trunk) that this range is wider than necessary -- we're passing true for is_zero_poison https://llvm.org/docs/LangRef.html#llvm-ctlz-intrinsic so the range here should be range(i32 0, 32).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The range attribute is too large, but it seems LLVM is still able to propagate the knowledge:

https://godbolt.org/z/dzxfGnhMf

Copy link
Member

@scottmcm scottmcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code here is looking good, but can you make sure there's normal runtime behaviour tests for it too? Notably, after the base conversation (that's fixed in the code) it made me think that that's not currently covered by any tests, so we should have something -- maybe just add to the # Examples that it returns None for base zero or one? (They seem like perfectly reasonable and helpful examples, in addition to giving coverage for this stuff.)

And maybe add some should_panic tests to ensure that the overflow checking is still correct for pow when overflow checks are enabled?

View changes since this review

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 14, 2025
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from c84b99e to 946ce96 Compare October 15, 2025 00:51
@rustbot

This comment has been minimized.

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 946ce96 to e481662 Compare October 15, 2025 01:31
@rust-log-analyzer

This comment has been minimized.

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from e481662 to 264cefa Compare October 16, 2025 21:59
@Kmeakin Kmeakin requested a review from scottmcm October 16, 2025 21:59
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 16, 2025
@rust-log-analyzer

This comment has been minimized.

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 264cefa to 53449c9 Compare October 17, 2025 11:46
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Nov 10, 2025

@scottmcm ping?

Copy link
Member

@scottmcm scottmcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still have concerns about the correctness of this; see above.

View changes since this review

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 19, 2025
@rustbot
Copy link
Collaborator

rustbot commented Nov 19, 2025

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 53449c9 to a3a4b22 Compare November 22, 2025 01:30
@rustbot
Copy link
Collaborator

rustbot commented Nov 22, 2025

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@rust-log-analyzer

This comment has been minimized.

if base == 2 ** k, then
log(base, n) == log(2, n) / k
Increase test coverage to check all interesting edge cases and all
variants.
`strict_pow` can be implemented in terms of `checked_pow`, and
`wrapping_pow` can be implemented in terms of `overflowing_pow`.
Copy the optimization that unrolls the loop from `pow` to `checked_pow`
and `overflowing_pow`.
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch 2 times, most recently from be3ec71 to 5251791 Compare November 22, 2025 03:22
@rust-log-analyzer

This comment has been minimized.

@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from 5251791 to e8a603e Compare November 23, 2025 17:28
if base == 2 ** k, then
   (2 ** k) ** n
== 2 ** (k * n)
== 1 << (k * n)
@Kmeakin Kmeakin force-pushed the km/optimize-ilog-base-power-of-two branch from e8a603e to c2f36c5 Compare November 23, 2025 17:31
@rust-log-analyzer
Copy link
Collaborator

The job aarch64-gnu-llvm-20-2 failed! Check out the build log: (web) (plain enhanced) (plain)

Click to see the possible cause of the failure (guessed by this bot)
    num::u8::panicking_pow

test result: FAILED. 2253 passed; 5 failed; 6 ignored; 0 measured; 0 filtered out; finished in 2.99s

error: test failed, to rerun pass `-p coretests --test coretests`
env -u RUSTC_WRAPPER CARGO_ENCODED_RUSTDOCFLAGS="-Csymbol-mangling-version=v0\u{1f}-Zannotate-moves\u{1f}-Zrandomize-layout\u{1f}-Zunstable-options\u{1f}--check-cfg=cfg(bootstrap)\u{1f}-Dwarnings\u{1f}-Wrustdoc::invalid_codeblock_attributes\u{1f}--crate-version\u{1f}1.93.0-nightly\t(739c484aa\t2025-11-23)" CARGO_ENCODED_RUSTFLAGS="-Csymbol-mangling-version=v0\u{1f}-Zannotate-moves\u{1f}-Zrandomize-layout\u{1f}-Zunstable-options\u{1f}--check-cfg=cfg(bootstrap)\u{1f}-Zmacro-backtrace\u{1f}-Csplit-debuginfo=off\u{1f}-Clink-arg=-L/usr/lib/llvm-20/lib\u{1f}-Cllvm-args=-import-instr-limit=10\u{1f}-Clink-args=-Wl,-z,origin\u{1f}-Clink-args=-Wl,-rpath,$ORIGIN/../lib\u{1f}-Alinker-messages\u{1f}--cap-lints=allow\u{1f}--cfg\u{1f}randomized_layouts" RUSTC="/checkout/obj/build/aarch64-unknown-linux-gnu/stage2-codegen/cg_clif/dist/rustc-clif" RUSTDOC="/checkout/obj/build/aarch64-unknown-linux-gnu/stage2-codegen/cg_clif/dist/rustdoc-clif" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/bin/cargo" "test" "--manifest-path" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2-codegen/cg_clif/build/sysroot_tests/Cargo.toml" "--target-dir" "/checkout/obj/build/aarch64-unknown-linux-gnu/stage2-codegen/cg_clif/build/sysroot_tests_target" "--locked" "--target" "aarch64-unknown-linux-gnu" "-p" "coretests" "-p" "alloctests" "--tests" "--" "-q" exited with status ExitStatus(unix_wait_status(25856))
Command `/checkout/obj/build/aarch64-unknown-linux-gnu/stage0/bin/cargo run --target aarch64-unknown-linux-gnu -Zbinary-dep-depinfo -j 4 -Zroot-dir=/checkout --locked --color always --release --manifest-path /checkout/compiler/rustc_codegen_cranelift/build_system/Cargo.toml -- test --download-dir /checkout/obj/build/cg_clif_download --out-dir /checkout/obj/build/aarch64-unknown-linux-gnu/stage2-codegen/cg_clif --no-unstable-features --use-backend cranelift --sysroot llvm --skip-test testsuite.extended_sysroot [workdir=/checkout/compiler/rustc_codegen_cranelift]` failed with exit code 1
Created at: src/bootstrap/src/core/build_steps/test.rs:3723:25
Executed at: src/bootstrap/src/core/build_steps/test.rs:3768:26

Command has failed. Rerun with -v to see more details.
Bootstrap failed while executing `--stage 2 test --skip tests --skip coverage-map --skip coverage-run --skip library --skip tidyselftest`
Build completed unsuccessfully in 0:22:50
  local time: Sun Nov 23 18:01:23 UTC 2025
  network time: Sun, 23 Nov 2025 18:01:23 GMT
##[error]Process completed with exit code 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants