Commit 79ad749

[GlobalISel] Add multi-way splitting support for wide scalar shifts.
This patch implements direct N-way splitting for wide scalar shifts instead of recursive binary splitting. For example, an i512 G_SHL can now be split directly into 8 i64 operations rather than going through i256 -> i128 -> i64.

The main motivation is to alleviate (although not entirely fix) pathological compile-time issues with huge types, like i4224. The recursive splitting strategy, combined with our messy artifact combiner, leads to terribly long compiles: tons of intermediate artifacts are generated, and the combiner then attempts to combine them ad nauseam. Going directly from the large shifts to the destination types short-circuits a lot of these issues, but it's still an abuse of the backend, and front-ends should never be doing this sort of thing.
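
To make the "direct N-way" idea concrete, here is a minimal sketch on plain integers rather than on MIR. It is not the code from this patch: the function name `shlMultiway`, the fixed 64-bit part width, and the least-significant-part-first ordering are all assumptions chosen for illustration. Each output part is computed in one step from at most two source parts, with no intermediate half-width values.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): left-shift a wide integer stored as
// 64-bit parts, least significant part first, by a constant amount.
std::vector<uint64_t> shlMultiway(const std::vector<uint64_t> &Src,
                                  unsigned Amt) {
  const unsigned PartBits = 64;
  const size_t NumParts = Src.size();
  const size_t WordShift = Amt / PartBits;  // whole parts to move
  const unsigned BitShift = Amt % PartBits; // remaining shift inside a part

  std::vector<uint64_t> Dst(NumParts, 0); // zero fill for a left shift
  for (size_t I = WordShift; I < NumParts; ++I) {
    // Main contribution: the source part that lands in output part I.
    uint64_t Part = Src[I - WordShift] << BitShift;
    // Carry: the high bits of the next lower source part. Skipped when
    // BitShift == 0, because shifting a 64-bit value by 64 is undefined.
    if (BitShift != 0 && I > WordShift)
      Part |= Src[I - WordShift - 1] >> (PartBits - BitShift);
    Dst[I] = Part;
  }
  return Dst;
}
```

For instance, shifting the four-part value `{1, 0, 0, 0}` left by 68 moves the set bit one whole part up and four bits further, giving `{0, 16, 0, 0}`. Because the amount is known up front, the source index for every output part is a fixed offset, which is the property the constant-amount path in the patch relies on.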
1 parent 97b3cb2 commit 79ad749

4 files changed: +13568 -4697 lines changed

llvm/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h

Lines changed: 36 additions & 0 deletions
@@ -364,6 +364,42 @@ class LegalizerHelper {
                                                       LLT HalfTy,
                                                       LLT ShiftAmtTy);
 
+  /// Multi-way shift legalization: directly split wide shifts into target-sized
+  /// parts in a single step, avoiding recursive binary splitting.
+  LLVM_ABI LegalizeResult narrowScalarShiftMultiway(MachineInstr &MI,
+                                                    LLT TargetTy);
+
+  /// Optimized path for constant shift amounts using static indexing.
+  /// Directly calculates which source parts contribute to each output part
+  /// without generating runtime select chains.
+  LLVM_ABI LegalizeResult narrowScalarShiftByConstantMultiway(MachineInstr &MI,
+                                                              const APInt &Amt,
+                                                              LLT TargetTy,
+                                                              LLT ShiftAmtTy);
+
+  struct ShiftParams {
+    Register WordShift;   // Number of complete words to shift
+    Register BitShift;    // Number of bits to shift within words
+    Register InvBitShift; // Complement bit shift (TargetBits - BitShift)
+    Register Zero;        // Zero constant for SHL/LSHR fill
+    Register SignBit;     // Sign extension value for ASHR fill
+  };
+
+  /// Generates a single output part for constant shifts using direct indexing.
+  /// Calculates which source parts contribute and how they're combined.
+  Register buildConstantShiftPart(unsigned Opcode, unsigned PartIdx,
+                                  unsigned NumParts,
+                                  ArrayRef<Register> SrcParts,
+                                  const ShiftParams &Params, LLT TargetTy,
+                                  LLT ShiftAmtTy);
+
+  /// Generates a shift part with carry for variable shifts.
+  /// Combines main operand shifted by BitShift with carry bits from adjacent
+  /// operand.
+  Register buildVariableShiftPart(unsigned Opcode, Register MainOperand,
+                                  Register ShiftAmt, LLT TargetTy,
+                                  Register CarryOperand = Register());
+
   LLVM_ABI LegalizeResult fewerElementsVectorReductions(MachineInstr &MI,
                                                         unsigned TypeIdx,
                                                         LLT NarrowTy);
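
The roles of the `ShiftParams` fields and of the carry operand can be illustrated with a small model on plain integers. This is only a sketch under stated assumptions, not the implementation behind the declarations above: `ShiftParamsModel`, `computeParams`, `shiftRightPart`, and `shiftRightMultiway` are hypothetical names, parts are assumed to be 64 bits wide and stored least significant first, and the real helpers operate on virtual `Register`s and emit generic MIR rather than computing values. The model covers right shifts, where the fill value is zero for a logical shift and the replicated sign bit for an arithmetic one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Plain-integer stand-in for ShiftParams (assumption: the real struct holds
// Registers, this one holds the values those registers would contain).
struct ShiftParamsModel {
  size_t WordShift;     // number of complete 64-bit parts to shift
  unsigned BitShift;    // number of bits to shift within a part
  unsigned InvBitShift; // 64 - BitShift
  uint64_t Zero;        // fill for logical shifts
  uint64_t SignBit;     // fill for arithmetic shifts: all ones if negative
};

// Src must be non-empty; its last element is the most significant part.
ShiftParamsModel computeParams(const std::vector<uint64_t> &Src, unsigned Amt) {
  const unsigned PartBits = 64;
  ShiftParamsModel P;
  P.WordShift = Amt / PartBits;
  P.BitShift = Amt % PartBits;
  P.InvBitShift = PartBits - P.BitShift;
  P.Zero = 0;
  P.SignBit = (Src.back() >> 63) ? ~uint64_t{0} : uint64_t{0};
  return P;
}

// One output part of a right shift: Main shifted by BitShift, with the low
// bits of Carry (the next more significant part) filling the vacated bits.
uint64_t shiftRightPart(uint64_t Main, uint64_t Carry,
                        const ShiftParamsModel &P) {
  if (P.BitShift == 0) // avoid an undefined shift by 64 via InvBitShift
    return Main;
  return (Main >> P.BitShift) | (Carry << P.InvBitShift);
}

std::vector<uint64_t> shiftRightMultiway(const std::vector<uint64_t> &Src,
                                         unsigned Amt, bool Arithmetic) {
  const ShiftParamsModel P = computeParams(Src, Amt);
  const uint64_t Fill = Arithmetic ? P.SignBit : P.Zero;
  const size_t N = Src.size();
  // Parts past the most significant source part read as the fill value.
  auto PartOrFill = [&](size_t Idx) { return Idx < N ? Src[Idx] : Fill; };

  std::vector<uint64_t> Dst(N);
  for (size_t I = 0; I < N; ++I)
    Dst[I] = shiftRightPart(PartOrFill(I + P.WordShift),
                            PartOrFill(I + P.WordShift + 1), P);
  return Dst;
}
```

Treating out-of-range parts as the fill value keeps the per-part function uniform: the most significant output part of an arithmetic shift simply receives the sign fill as its carry. In this model the part index is computed with ordinary arithmetic at run time; how the patch selects source parts for a variable amount is not visible in this header hunk and is not modeled here.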
