@@ -76,9 +76,13 @@ When to merge instruction locations
 -----------------------------------
 
 A transformation should merge instruction locations if it replaces multiple
-instructions with a single merged instruction, *and* that merged instruction
-does not correspond to any of the original instructions' locations. The API to
-use is ``Instruction::applyMergedLocation``.
+instructions with one or more new instructions, *and* the new instruction(s)
+produce the output of more than one of the original instructions. The API to use
+is ``Instruction::applyMergedLocation``. For each new instruction I, its new
+location should be a merge of the locations of all instructions whose output is
+produced by I. Typically, this includes any instruction being RAUWed by a new
+instruction, and excludes any instruction that only produces an intermediate
+value used by the RAUWed instruction.
 
 The purpose of this rule is to ensure that a) the single merged instruction
 has a location with an accurate scope attached, and b) to prevent misleading
@@ -101,10 +105,15 @@ Examples of transformations that should follow this rule include:
 * Merging identical loop-invariant stores (see the LICM utility
   ``llvm::promoteLoopAccessesToScalars``).
 
-* Peephole optimizations which combine multiple instructions together, like
-  ``(add (mul A B) C) => llvm.fma.f32(A, B, C)``. Note that the location of
-  the ``fma`` does not exactly correspond to the locations of either the
-  ``mul`` or the ``add`` instructions.
+* Scalar instructions being combined into a vector instruction, like
+  ``(add A1, B1), (add A2, B2) => (add (A1, A2), (B1, B2))``. As the new vector
+  ``add`` computes the result of both original ``add`` instructions
+  simultaneously, it should use a merge of the two locations. Similarly, if
+  prior optimizations have already produced vectors ``(A1, A2)`` and
+  ``(B2, B1)``, then we might create a ``(shufflevector (1, 0), (B2, B1))``
+  instruction to produce ``(B1, B2)`` for the vector ``add``; in this case we've
+  created two instructions to replace the original ``adds``, so both new
+  instructions should use the merged location.
 
 Examples of transformations for which this rule *does not* apply include:
 
@@ -113,6 +122,11 @@ Examples of transformations for which this rule *does not* apply include:
   ``zext`` is modified but remains in its block, so the rule for
   :ref:`preserving locations<WhenToPreserveLocation>` should apply.
 
+* Peephole optimizations which combine multiple instructions together, like
+  ``(add (mul A B) C) => llvm.fma.f32(A, B, C)``. Note that the result of the
+  ``mul`` no longer appears in the program, while the result of the ``add`` is
+  now produced by the ``fma``, so the ``add``'s location should be used.
+
 * Converting an if-then-else CFG diamond into a ``select``. Preserving the
   debug locations of speculated instructions can make it seem like a condition
   is true when it's not (or vice versa), which leads to a confusing
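The updated rule in this diff can be illustrated with a small self-contained C++ sketch that does not depend on LLVM. `Loc`, `mergeLocs`, `locationFor`, `vectorAddLoc`, and `fmaLoc` are hypothetical names invented for this sketch, and the line numbers are made up. The real API is ``Instruction::applyMergedLocation``. The merge below is a deliberate simplification (differing locations collapse to line 0) and is only an assumption, not LLVM's actual merge algorithm.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a debug location (not LLVM's DebugLoc).
// Line 0 is used here to mean "no single source line".
struct Loc {
  unsigned Line;
};

// Simplified merge, an assumption of this sketch: equal locations are kept,
// differing locations collapse to line 0.
Loc mergeLocs(Loc A, Loc B) { return A.Line == B.Line ? A : Loc{0}; }

// Location for one new instruction: the merge of the locations of every
// original instruction whose output the new instruction produces.
// Instructions that only produced intermediate values are not passed in.
Loc locationFor(const std::vector<Loc> &OutputsProduced) {
  assert(!OutputsProduced.empty() && "a new instruction replaces something");
  Loc Result = OutputsProduced.front();
  for (std::size_t I = 1; I < OutputsProduced.size(); ++I)
    Result = mergeLocs(Result, OutputsProduced[I]);
  return Result;
}

// (add A1, B1) at line 10 and (add A2, B2) at line 11 become one vector add,
// which produces both outputs, so the two locations are merged.
Loc vectorAddLoc() { return locationFor({Loc{10}, Loc{11}}); }

// (add (mul A B) C) becomes fma: the mul (line 20) was only an intermediate
// value, so only the add (line 21) contributes and its location is kept.
Loc fmaLoc() { return locationFor({Loc{21}}); }
```

In this model the vector `add` ends up with the merged (line 0) location, while the `fma` keeps the `add`'s line. This matches the diff moving the `fma` example from the "should merge" list to the "does not apply" list.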