x64: report an error instead of panicking on stack frames larger than i32::MAX by soumik15630m · Pull Request #13783 · bytecodealliance/wasmtime

soumik15630m · 2026-07-01T02:13:55Z

Summary

x64 codegen panics with TryFromIntError when a function's stack frame
is large enough that offsets relative to it no longer fit in a 32-bit
immediate. This affects three places: the inline stack-probe loop
(guard_size * probe_count), clobber save/restore (RSP-relative
offsets), and incoming-argument access (RBP-relative offsets via
SyntheticAmode::IncomingArg::finalize).

This was reported downstream in
rust-lang/rustc_codegen_cranelift#1656, where a ~2GB stack allocation
([0; 536870787_usize], a [i32; N] array) triggers an ICE:
thread 'optimize module ...' panicked at cranelift/codegen/src/isa/x64/inst/emit.rs:373:18:
guard_size * probe_count is too large to fit in a 32-bit immediate: TryFromIntError(PosOverflow)

What this does

Adds ABIMachineSpec::validate_frame_layout, a new trait method with a
no-op default, overridden for x64. It's called from VCode::emit right
after compute_frame_layout, before any offset encoding takes place,
and checks:

On overflow, this returns CodegenError::Unsupported instead of
panicking. Only x64 overrides the new trait method; other backends are
unaffected (aarch64/riscv64/s390x/pulley needed a one-line ?
each to propagate emit's now-fallible return type, no behavior
change).

Testing

Added unit tests in isa/x64/abi.rs covering: a normal frame
(accepted), an oversized fixed_frame_storage_size (rejected), an
oversized tail_args_size via the RBP-relative path (rejected), and
a frame that's under i32::MAX on its own but pushed over by the
guard-page rounding margin (rejected) — this last case specifically
reproduces the original bug.

Known follow-up (not included here)

bjorn3's comment mentions "all stack load/store instruction
lowerings" more broadly. I traced and covered the two offset-encoding
paths that are actually reachable from the repro (gen_clobber_save/
gen_clobber_restore and SyntheticAmode::IncomingArg::finalize), but
I have not exhaustively audited every stack-relative offset computation
in the x64 backend for the same class of narrowing. Happy to widen this
if there's a specific path you'd like checked.

Separately, rustc_codegen_cranelift currently panics when it receives
any Err from Context::compile at
src/base.rs:224
rather than converting it to a rustc diagnostic. I have open a PR rust-lang/rustc_codegen_cranelift/pull/1669 there to handle CodegenError::Unsupported the same way
the existing CodegenError::Verifier arm already does.

bjorn3 · 2026-07-01T06:09:06Z

+        if rsp_relative.saturating_add(probe_margin) > i32::MAX as u64
+            || rbp_relative > i32::MAX as u64
+        {
+            return Err(crate::CodegenError::Unsupported(


Should probably use ImplLimitExceeded instead.

Good call, thanks. ImplLimitExceeded doesn't carry a message though ...so is it okay?Happy to switch either way; want to make sure I'm not throwing away information you'd want to keep.

bjorn3 · 2026-07-01T08:31:11Z

        insts
    }

+    fn validate_frame_layout(frame_layout: &FrameLayout, guard_size: u32) -> CodegenResult<()> {


This validation should probably be done in generic code (maybe the caller of compute_frame_layout?). The FrameLayout type uses 32bit ints, so the 4GB stack frame size applies to all targets.

Sure ... will push a commit soon

Pushed a commit adding a generic check in compute_frame_layout.

I kept the x64-specific validate_frame_layout check as well, since it's
catching a different, tighter condition: x64 encodes these offsets as
signed 32-bit immediates (i32::MAX, not u32::MAX), and the inline
probestack loop rounds the frame size up to the next guard-page boundary
before encoding it - so a frame just under i32::MAX can get pushed
over it by that rounding. The original repro (~2.147GB) is under
u32::MAX (~4.295GB), so the generic check alone doesn't catch it -
only the x64-specific one does.

Happy to remove the x64-specific check if you'd rather keep this simpler
and generic-only, but wanted to flag that it currently exists to catch
a real case the generic check can't - unless other backends share x64's
signed-immediate constraint, in which case maybe it'd make sense to
generalize the tighter check too rather than keep it x64-only. Let me
know which direction you'd prefer.

alexcrichton · 2026-07-01T18:55:29Z

Thanks for the PR, but personally I'm pretty wary of adding a parallel path that calculates information about a frame that duplicates what's already in each backend. Frame handling/layout is pretty precise and requires extremely careful handling, and I wouldn't be confident that it could be accurately mirrored elsewhere in the system.

What I think should happen here is either (a) to propagate a result upwards at the time of a try_into or a checked_add happens (e.g. the try_into in this case, or (b) the case in question is updated. Given the line number it looks like this is failing to generate a stack probe, but we could enhance that logic to better support the stack probe loop with larger frame sizes.

soumik15630m · 2026-07-02T00:52:43Z

@alexcrichton
Thanks for the detailed feedback.
Dug into this in detail. There are actually two different kinds of
32-bit narrowing happening here, and I think they need different
treatment:

1. RSP arithmetic (the subq_mi/addq_mi adjustments in the probe
loop and clobber save/restore) - this is genuinely fixable by widening.
r11/r15 are already reserved as scratch registers in this ABI, and
movabsq_oi already exists to materialize a 64-bit value into one. So
for these sites, we could fall back to "materialize into a scratch reg,
then register-to-register add/sub" when the value doesn't fit in i32.

2. Stack-relative memory addressing (Amode::imm_reg(offset, rsp),
used for touching probe pages, clobber load/store, incoming-arg access)

this hits an actual x86-64 ISA limit, not a Cranelift shortcut: the
addressing mode's displacement field is a hard 32-bit signed value at
the instruction-encoding level. The fix is computing the absolute
effective address into a register first, then addressing off that with
a zero displacement.

One thing worth flagging on option (a): MachInstEmit::emit returns
() for all 5 backends, with no error path anywhere in MachBuffer.
So propagating a Result from the actual try_from/.expect() sites
in emit.rs would mean changing that trait's signature - which is
called once per instruction, everywhere, across every backend. That's
a much bigger surface than I first thought, and probably bigger than
(b), which stays contained to x64 frame setup/teardown.

Given that, (b) seems like the more proportionate direction, and I'm
happy to take it on. Let me know if that matches your thinking or if
I'm missing a narrower way to do (a).

alexcrichton · 2026-07-02T19:41:39Z

I'd like to confirm that you're aware of the Bytecode Alliance's AI tool usage policy and would ask that if you haven't seen that that you read over it.

My main concern here is still that I would want to make sure that there's not duplication in calculating or interpreting frame layout. I believe there should be a "mostly central" location that each backend handles all this and ideally a limit, if any, would fall out of that. It's true that returning an error during emission would require changing signatures and it would also affect other situations such as stack offsets.

I'll bring this up at next week's Cranelift meeting because I'm not sure how best to handle this. One option would be to just say "all backends must support up to N byte stack frames" and return a hard error for anything over N for all backends. That would notably be a Cranelift/cross-platform limitation rather than just the x64 backend. Alternatively if N is high enough (e.g. 4GiB) then it would require changes to the x64 backend (and possibly others). I think it'd be good to loop in others on this discussion though.

soumik15630m · 2026-07-02T19:50:36Z

Sure I will go through the ai tool usuage policy ...

and yeah this issue needs a deeper discussion cause to me both the approach looks equally reasonable ..so i'm happy to wait for that discussion, or start on whichever direction comes out of it. Let me know how I can help move it forward.

cfallin · 2026-07-02T19:56:44Z

(Ghost-of-cfallin popping in from PTO briefly to offer thoughts as original-person-who-inflicted-this-ABI-code-upon-us)

I think that while this is a real issue that should indeed be surfaced into ImplLimitExceeded, I agree with Alex that there is too much backend-specific logic here. Actually the whole thing seems overly complex. I'd prefer a maximum_frame_size trait method per backend, then validate that once in the platform-independent code. And we should set it to something reasonable like 1GiB rather than try to reason carefully about 2GiB-minus-epsilon and squeeze every little bit out of the layout; the latter is error-prone and brittle.

x64: report an error instead of panicking on oversized stack frames

92b0111

soumik15630m requested review from a team as code owners July 1, 2026 02:13

soumik15630m requested review from alexcrichton and removed request for a team July 1, 2026 02:13

soumik15630m mentioned this pull request Jul 1, 2026

Report CodegenError::Unsupported as a rustc diagnostic instead of panicking rust-lang/rustc_codegen_cranelift#1669

Closed

bjorn3 reviewed Jul 1, 2026

View reviewed changes

Adding a generic check for allocation in compute_frame_layout

b94de39

soumik15630m force-pushed the fix-x64-frame-size-overflow branch from 48d5999 to b94de39 Compare July 1, 2026 12:31

soumik15630m requested a review from bjorn3 July 1, 2026 12:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

x64: report an error instead of panicking on stack frames larger than i32::MAX#13783

x64: report an error instead of panicking on stack frames larger than i32::MAX#13783
soumik15630m wants to merge 2 commits into
bytecodealliance:mainfrom
soumik15630m:fix-x64-frame-size-overflow

soumik15630m commented Jul 1, 2026 •

edited

Loading

Uh oh!

bjorn3 Jul 1, 2026

Uh oh!

soumik15630m Jul 1, 2026

Uh oh!

bjorn3 Jul 1, 2026

Uh oh!

soumik15630m Jul 1, 2026

Uh oh!

soumik15630m Jul 1, 2026

Uh oh!

alexcrichton commented Jul 1, 2026

Uh oh!

soumik15630m commented Jul 2, 2026 •

edited

Loading

Uh oh!

alexcrichton commented Jul 2, 2026

Uh oh!

soumik15630m commented Jul 2, 2026

Uh oh!

cfallin commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Uh oh!

Conversation

soumik15630m commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What this does

Testing

Known follow-up (not included here)

Uh oh!

bjorn3 Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

soumik15630m Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

bjorn3 Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

soumik15630m Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

soumik15630m Jul 1, 2026

Choose a reason for hiding this comment

Uh oh!

alexcrichton commented Jul 1, 2026

Uh oh!

soumik15630m commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexcrichton commented Jul 2, 2026

Uh oh!

soumik15630m commented Jul 2, 2026

Uh oh!

cfallin commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

soumik15630m commented Jul 1, 2026 •

edited

Loading

soumik15630m commented Jul 2, 2026 •

edited

Loading