Skip to content

Commit 75ae976

Browse files
authored
egraphs: fix accidental remat of call. (#5726)
In the provided test case in #5716, the result of a call was then added to 0. We have a rewrite rule that sets the remat-bit on any add of a value and a constant, because these frequently appear (e.g. from address offset calculations) and this can frequently reduce register pressure (one long-lived base vs. many long-lived base+offset values). Separately, we have an algebraic rule that `x+0` rewrites to `x`. The result of this was that we had an eclass with the remat bit set on the add, but the add was also union'd into the call. We pick the latter during extraction, because it's cheaper not to do the add at all; but we still get the remat bit, and try to remat a call (!), which blows up later. This PR fixes the logic to look up the "best value" for a value (i.e., whatever extraction determined), and look up the remat bit on *that* node, not the canonical node. (Why did the canonical node become the iadd and not the call? Because the former had a lower value-number, as an accident of IR construction; we don't impose any requirements on the input CLIF's value-number ordering, and I don't think this breaks any of the important acyclic properties, even though there is technically a dependence from a lower-numbered to a higher-numbered node. In essence one can think of them as having "virtual numbers" in any true topologically-sorted order, and the only place the actual integer indices matter should be in choosing the "canonical ID", which is just used for dedup'ing, modulo this bug.) Fixes #5716.
1 parent 16afefd commit 75ae976

File tree

2 files changed

+51
-10
lines changed

2 files changed

+51
-10
lines changed

cranelift/codegen/src/egraph/elaborate.rs

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -278,6 +278,14 @@ impl<'a> Elaborator<'a> {
278278
before
279279
);
280280

281+
// Get the best option; we use `value` (latest
282+
// value) here so we have a full view of the
283+
// eclass.
284+
trace!("looking up best value for {}", value);
285+
let (_, best_value) = self.value_to_best_value[value];
286+
debug_assert_ne!(best_value, Value::reserved_value());
287+
trace!("elaborate: value {} -> best {}", value, best_value);
288+
281289
let remat = if let Some(elab_val) =
282290
self.value_to_elaborated_value.get(&canonical_value)
283291
{
@@ -287,7 +295,7 @@ impl<'a> Elaborator<'a> {
287295
// from another block. If so, ignore the hit
288296
// and recompute below.
289297
let remat = elab_val.in_block != self.cur_block
290-
&& self.remat_values.contains(&canonical_value);
298+
&& self.remat_values.contains(&best_value);
291299
if !remat {
292300
trace!("elaborate: value {} -> {:?}", value, elab_val);
293301
self.stats.elaborate_memoize_hit += 1;
@@ -304,20 +312,12 @@ impl<'a> Elaborator<'a> {
304312
// Value not available; but still look up
305313
// whether it's been flagged for remat because
306314
// this affects placement.
307-
let remat = self.remat_values.contains(&canonical_value);
315+
let remat = self.remat_values.contains(&best_value);
308316
trace!(" -> not present in map; remat = {}", remat);
309317
remat
310318
};
311319
self.stats.elaborate_memoize_miss += 1;
312320

313-
// Get the best option; we use `value` (latest
314-
// value) here so we have a full view of the
315-
// eclass.
316-
trace!("looking up best value for {}", value);
317-
let (_, best_value) = self.value_to_best_value[value];
318-
debug_assert_ne!(best_value, Value::reserved_value());
319-
trace!("elaborate: value {} -> best {}", value, best_value,);
320-
321321
// Now resolve the value to its definition to see
322322
// how we can compute it.
323323
let (inst, result_idx) = match self.func.dfg.value_def(best_value) {
Lines changed: 41 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,41 @@
1+
;; This tests that a call does not get rematerialized, even if a remat flag is
2+
;; set on a different node in its eclass.
3+
;;
4+
;; Below, `v97` is an add of `v238` (the call's first return value) and a
5+
;; constant 0; a mid-end rule rewrites this to just `v238` (i.e., `v97` is unioned
6+
;; in). Separately, a rule states that an add of a value and a constant always
7+
;; gets rematerialized at use. When `v97` is used in a later block, it would have
8+
;; rematerialized the add; except, if we instead use the result of the call
9+
;; directly, we should *not* remat the call. If we do, a compile error results
10+
;; later.
11+
12+
test compile
13+
set opt_level=speed_and_size
14+
target aarch64
15+
16+
function u0:33() system_v {
17+
ss0 = explicit_slot 32
18+
sig0 = (i64, i64, i64, i64, i64) -> i64, i64 system_v
19+
fn0 = colocated u0:0 sig0
20+
jt0 = jump_table [block36, block38]
21+
block0:
22+
v80 = iconst.i32 0
23+
v91 = iconst.i64 0
24+
v92 = iconst.i64 0
25+
v96 = iconst.i64 0
26+
v235 = iconst.i64 0
27+
v236 = iconst.i64 0
28+
v237 = iconst.i64 0
29+
v238, v239 = call fn0(v236, v237, v91, v92, v235) ; v236 = 0, v237 = 0, v91 = 0, v92 = 0, v235 = 0
30+
v97 = iadd v238, v96 ; v96 = 0
31+
br_table v80, block37, jt0 ; v80 = 0
32+
block36:
33+
trap user0
34+
block37:
35+
trap unreachable
36+
block38:
37+
v98 = load.i8 notrap v97
38+
v99 = fcvt_from_uint.f64 v98
39+
stack_store v99, ss0
40+
trap user0
41+
}

0 commit comments

Comments
 (0)