@alex-t alex-t commented Oct 14, 2025

📌 Problem
When inserting new definitions into Machine IR (e.g., reloads after spills, rematerialized values, or optimization-driven value rewriting), maintaining SSA form is non-trivial, especially with subregister lanes:

  1. SSA Violation: Inserting a new def of an existing vreg creates multiple definitions, violating SSA

  2. Subregister Complexity: Partial redefinitions (e.g., redefining only sub2_3 of a 128-bit register) require lane-aware PHI insertion

  3. LiveInterval Corruption: Naive rewriting breaks LiveIntervals, causing downstream passes to fail

  4. PHI Placement: Computing where to insert PHI nodes requires pruned Iterated Dominance Frontier (IDF) analysis

  5. Use Rewriting: Rewriting uses must handle three cases:

    • Exact match: Direct substitution
    • Subset: Subreg remapping (use needs fewer lanes than def provides)
    • Super/Mixed: REG_SEQUENCE insertion (use needs more lanes than def provides)
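The three-way policy above can be sketched as a comparison between the lanes a use reads and the lanes the new def writes. This is only an illustration under stated assumptions: `LaneMask` is a plain 64-bit stand-in for `LaneBitmask`, and `RewriteKind` and `classifyUse` are hypothetical names that do not appear in the patch. The `Unaffected` case (a use that reads none of the redefined lanes) is likewise an assumption added to make the sketch total.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::LaneBitmask: one bit per lane.
using LaneMask = std::uint64_t;

// Hypothetical names; the patch describes the policy but not this enum.
enum class RewriteKind { Unaffected, Exact, Subset, SuperOrMixed };

// Classify a use by comparing the lanes it reads (UseMask) with the lanes
// written by the new definition (DefMask).
inline RewriteKind classifyUse(LaneMask UseMask, LaneMask DefMask) {
  assert(UseMask != 0 && DefMask != 0 && "empty lane masks are not meaningful");
  if ((UseMask & DefMask) == 0)
    return RewriteKind::Unaffected;   // reads only lanes the new def leaves alone
  if (UseMask == DefMask)
    return RewriteKind::Exact;        // direct substitution
  if ((UseMask & ~DefMask) == 0)
    return RewriteKind::Subset;       // subreg remapping into the new vreg
  return RewriteKind::SuperOrMixed;   // needs lanes from both old and new defs
}
```

For instance, with a new def covering lanes `0xF0`, a use reading `0x30` would be a Subset use, while a use reading `0xFF` would need a REG_SEQUENCE.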

Existing LLVM utilities (MachineSSAUpdater, LiveRangeEdit) either lack subregister awareness or are tightly coupled to specific passes, making them unsuitable for general-purpose SSA repair.

🧩 Solution
This patch introduces MachineLaneSSAUpdater, a universal SSA repair utility for Machine IR with full subregister lane awareness.
Core API:
class MachineLaneSSAUpdater {
public:
  // Main entry point: Repair SSA for a new definition
  // NewDefMI: Instruction with def operand that currently defines OrigVReg (violates SSA)
  // Returns: Newly created virtual register
  Register repairSSAForNewDef(MachineInstr &NewDefMI, Register OrigVReg);
};

Key Features:

  1. Lane-Aware PHI Insertion: Uses pruned IDF intersected with LiveInterval subranges to place PHIs only where necessary
  2. Intelligent Use Rewriting:
    • Exact match → direct substitution
    • Subset → subreg index remapping with lane mask shifting
    • Super/Mixed → REG_SEQUENCE synthesis with optimal subregister covering
  3. LiveInterval Preservation: Automatically extends/recomputes LiveIntervals for all modified vregs
  4. Dominance-Based Reachability: Uses MachineDominatorTree for robust def-use analysis during reconstruction
  5. Non-Contiguous Lane Support: Handles complex cases like "redefine sub2 of vreg_128, leaving sub0+sub1+sub3"

🔬 Design Highlights
Algorithm Overview:

  1. Rename: Create new vreg, update NewDefMI to define it
  2. PHI Placement: Compute pruned IDF(NewDef) ∩ LiveIn(OrigVReg, DefMask)
  3. Iterative PHI Insertion: Place PHIs with per-edge lane analysis
  4. Use Rewriting: Walk dominated uses, apply exact/subset/super policy
  5. LiveInterval Repair: Extend/recompute all affected intervals
  6. Verification: Update dead flags, optionally verify MachineFunction

Pruned IDF Computation:

  • Intersects IDF of new def blocks with blocks where OrigVReg[DefMask] is live-in
  • Uses LLVM's IDFCalculatorBase for correctness
  • Avoids placing PHIs in unreachable or irrelevant blocks
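The pruning idea can be illustrated on a toy CFG without any LLVM types. The sketch below assumes the dominance frontiers are already known and passed in as a map. `BlockId`, `DomFrontiers` and `prunedIDF` are hypothetical names, not part of the patch. It also assumes that pruning is applied while walking the frontier, so a block that is not live-in is neither reported nor expanded. The patch itself delegates this work to IDFCalculatorBase.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical toy types: blocks are integers, and the dominance frontier of
// each block is assumed to be precomputed.
using BlockId = int;
using DomFrontiers = std::map<BlockId, std::set<BlockId>>;

// Worklist computation of the iterated dominance frontier of DefBlocks,
// keeping only blocks where the value is live-in.
std::set<BlockId> prunedIDF(const DomFrontiers &DF,
                            const std::set<BlockId> &DefBlocks,
                            const std::set<BlockId> &LiveInBlocks) {
  assert(!DefBlocks.empty() && "need at least one defining block");
  std::set<BlockId> Result;
  std::set<BlockId> Visited(DefBlocks.begin(), DefBlocks.end());
  std::vector<BlockId> Worklist(DefBlocks.begin(), DefBlocks.end());
  while (!Worklist.empty()) {
    BlockId B = Worklist.back();
    Worklist.pop_back();
    auto It = DF.find(B);
    if (It == DF.end())
      continue; // block has an empty frontier
    for (BlockId F : It->second) {
      if (!LiveInBlocks.count(F))
        continue; // pruned: a PHI here would be dead
      if (!Result.insert(F).second)
        continue; // already scheduled for a PHI
      if (Visited.insert(F).second)
        Worklist.push_back(F); // the PHI is itself a new def, so iterate
    }
  }
  return Result;
}
```

On a diamond (blocks 1 and 2 both branching to join block 3), a new def in block 1 yields a PHI in block 3 only when the value is live-in there. If nothing after the join reads it, the result is empty.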

REG_SEQUENCE Synthesis:

  • For super-uses (use needs lanes from both old and new defs), builds REG_SEQUENCE merging:

    • Lanes from NewVReg (rewritten lanes)
    • Lanes from OrigVReg (unchanged lanes)
  • Handles non-contiguous lane masks via getCoveringSubRegsForLaneMask():

    • Sorts candidates by lane count (prefer larger subregs)
    • Greedily selects minimal covering set
    • Example: LanesFromOld = 0x0F | 0xF0 → [sub0, sub2] not [lo16, hi16, sub2, sub3]
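A greedy covering along these lines might look like the sketch below. It is an illustration, not the body of getCoveringSubRegsForLaneMask(): `LaneMask`, `SubReg` and `coverLanes` are hypothetical, the subregister table is supplied by the caller instead of coming from TargetRegisterInfo, and the lane bit assignments in the usage note are made up. Greedy selection prefers wide subregisters but is not guaranteed to find the smallest cover for every possible subregister table.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for LaneBitmask and a subregister index entry.
using LaneMask = std::uint64_t;
struct SubReg {
  std::string Name;
  LaneMask Mask;
};

// Pick subregisters that together cover exactly the lanes in Wanted,
// trying candidates with more lanes first. Returns an empty vector when
// no exact cover exists among the candidates.
std::vector<std::string> coverLanes(LaneMask Wanted,
                                    std::vector<SubReg> Candidates) {
  assert(Wanted != 0 && "nothing to cover");
  // Sort by lane count, largest first; stable to keep table order on ties.
  std::stable_sort(Candidates.begin(), Candidates.end(),
                   [](const SubReg &A, const SubReg &B) {
                     return std::bitset<64>(A.Mask).count() >
                            std::bitset<64>(B.Mask).count();
                   });
  std::vector<std::string> Picked;
  LaneMask Remaining = Wanted;
  for (const SubReg &S : Candidates) {
    // Accept only candidates lying entirely inside the still-uncovered lanes.
    if (S.Mask != 0 && (S.Mask & ~Remaining) == 0) {
      Picked.push_back(S.Name);
      Remaining &= ~S.Mask;
    }
  }
  if (Remaining != 0)
    return {}; // some lanes could not be covered
  return Picked;
}
```

With four single-lane subregisters (sub0 = 0x1 through sub3 = 0x8) plus sub0_sub1 = 0x3 and sub2_sub3 = 0xC, covering the lanes left over after redefining sub2 (mask 0xB) selects sub0_sub1 and sub3 rather than three single-lane pieces.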

🚧 Limitations / Future Work

  • Target Scope: Currently tested on AMDGPU; other targets should work but need validation
  • Integration: Not wired into any existing passes yet (self-contained infrastructure)
  • Optimization: REG_SEQUENCE synthesis could be optimized further for specific target constraints
  • Undef Handling: Policy for undef edges (UndefEdgeAsImplicitDef) is a placeholder and is not fully exercised

📖 Usage Example
Spill/Reload Scenario:
// Insert reload instruction that violates SSA
MachineInstr *ReloadMI = BuildMI(MBB, InsertPt, DL, TII->get(LoadOp), SpilledReg)
                             .addFrameIndex(StackSlot);

// Repair SSA form
MachineLaneSSAUpdater Updater(MF, LIS, MDT, TRI);
Register NewReg = Updater.repairSSAForNewDef(*ReloadMI, SpilledReg);

// Done! ReloadMI now defines NewReg, uses are rewritten, PHIs inserted, LiveIntervals updated

Partial Subreg Reload:
// Reload only sub2_3 (64-bit) of a 128-bit register
MachineInstr *ReloadMI = BuildMI(MBB, InsertPt, DL, TII->get(LoadOp))
                             .addReg(SpilledReg, RegState::Define, AMDGPU::sub2_3)
                             .addFrameIndex(StackSlot);

Updater.repairSSAForNewDef(*ReloadMI, SpilledReg);
// Updater automatically detects lane mask from subreg index, places lane-specific PHIs

🔗 Related Work
This patch provides the SSA repair infrastructure needed for:

🧪 Testing
Unit Tests (llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp, MachineLaneSSAUpdaterSpillReloadTest.cpp):
Test scenarios:
✅ Simple linear reload (no PHI needed)
✅ Diamond CFG (PHI insertion at join point)
✅ Partial subreg reloads (e.g., reload only sub2_3 of vreg_128)
✅ Multiple predecessor PHIs with mixed lane sources
✅ Non-contiguous lane masks
✅ Same-block def-use with multiple uses
✅ REG_SEQUENCE synthesis for super-uses

Verification:

  • All tests run MachineFunction::verify() to ensure correctness
  • LiveInterval consistency checked via LiveInterval::verify()
  • Tests use MIR format for reproducibility

alex-t added 6 commits October 2, 2025 19:05
This patch introduces MachineLaneSSAUpdater, a new utility for performing
SSA repair on Machine IR with full subregister lane awareness. This is
particularly important for AMDGPU and other targets with complex
subregister structures.

Key features:
- Two explicit entry points: addDefAndRepairNewDef() for new definitions
  and addDefAndRepairAfterSpill() for reload-after-spill scenarios
- Lane-aware pruned IDF computation using LLVM's IDFCalculatorBase
- Worklist-driven PHI placement algorithm that correctly handles
  iterative PHI insertion
- Per-edge lane analysis for complex PHI construction with dual-PHI
  support when both old and new register lanes are live
- SpillCutCollector helper for capturing liveness endpoints during
  spill operations with proper subrange refinement

The implementation follows standard SSA reconstruction algorithms but
extends them to handle subregister lanes properly. PHI placement uses
pruned iterated dominance frontiers, and the lane analysis ensures
correct PHI operand construction even in complex scenarios where
different lanes come from different predecessors.

The design separates concerns cleanly:
- Entry points handle scenario-specific setup (indexing, interval extension)
- Common performSSARepair() handles PHI placement and use rewriting
- Lane-aware analysis throughout maintains correctness for partial
  register operations

This is the foundation for efficient SSA repair in the presence of
complex subregister usage patterns, with rewriteDominatedUses()
implementation to follow in subsequent patches.
… SSA repair

This commit implements a comprehensive lane-aware SSA repair utility for
Machine IR with the following key changes:

- Simplify PHI creation logic: Replace complex per-edge lane analysis with
  simplified single PHI creation for reload scenarios
- Add utility function getSubRegIndexForLaneMask() for lane-to-subregister
  mapping with proper inline implementation
- Implement complete dominated use rewriting with three-case policy:
  * Exact match: direct register replacement
  * Super/Mixed: REG_SEQUENCE construction for multi-lane uses
  * Subset: preserve subregister indices for partial uses
- Add comprehensive internal helper methods:
  * incomingOnEdge() for PHI operand VNInfo analysis
  * reachedByThisVNI() for dominance and same-block ordering
  * operandLaneMask() for lane mask calculation
  * buildRSForSuperUse() for REG_SEQUENCE construction
  * extendAt() for precise LiveInterval extension
- Refactor header organization: move from forward declarations to inline
  implementations where appropriate and add missing includes

The implementation focuses on correctness for spill-reload scenarios while
maintaining lane-level precision for subregister operations.

Add 11 unit tests covering SSA repair for diamonds, loops, subregisters,
and spill/reload scenarios. Fix critical bugs in subregister remapping
and LiveInterval handling.

Simplify design by removing ~250 lines of unnecessary spill-specific code
(SpillCutCollector, CutEndPoints, addDefAndRepairAfterSpill). The unified
repairSSAForNewDef() method handles all scenarios, naturally pruning
LiveIntervals through use rewriting and recomputation.

github-actions bot commented Oct 14, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff origin/main HEAD --extensions h,cpp -- llvm/include/llvm/CodeGen/MachineLaneSSAUpdater.h llvm/lib/CodeGen/MachineLaneSSAUpdater.cpp llvm/unittests/CodeGen/MachineLaneSSAUpdaterSpillReloadTest.cpp llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp --diff_from_common_commit

⚠️
The reproduction instructions above might return results for more than one PR
in a stack if you are using a stacked PR workflow. You can limit the results by
changing origin/main to the base branch/commit you want to compare against.
⚠️

View the diff from clang-format here.
diff --git a/llvm/include/llvm/CodeGen/MachineLaneSSAUpdater.h b/llvm/include/llvm/CodeGen/MachineLaneSSAUpdater.h
index f0f814490..040b96c2c 100644
--- a/llvm/include/llvm/CodeGen/MachineLaneSSAUpdater.h
+++ b/llvm/include/llvm/CodeGen/MachineLaneSSAUpdater.h
@@ -1,4 +1,5 @@
-//===- MachineLaneSSAUpdater.h - SSA repair for Machine IR (lane-aware) -*- C++ -*-===//
+//===- MachineLaneSSAUpdater.h - SSA repair for Machine IR (lane-aware) -*- C++
+//-*-===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -14,12 +15,12 @@
 
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
-#include "llvm/ADT/SmallVector.h"        // SmallVector
-#include "llvm/CodeGen/LiveInterval.h"    // LiveRange
-#include "llvm/CodeGen/Register.h"       // Register
-#include "llvm/CodeGen/SlotIndexes.h"    // SlotIndex
+#include "llvm/ADT/SmallVector.h"            // SmallVector
+#include "llvm/CodeGen/LiveInterval.h"       // LiveRange
+#include "llvm/CodeGen/Register.h"           // Register
+#include "llvm/CodeGen/SlotIndexes.h"        // SlotIndex
 #include "llvm/CodeGen/TargetRegisterInfo.h" // For inline function
-#include "llvm/MC/LaneBitmask.h"        // LaneBitmask
+#include "llvm/MC/LaneBitmask.h"             // LaneBitmask
 
 namespace llvm {
 
@@ -36,61 +37,71 @@ class MachinePostDominatorTree; // optional if you choose to use it
 // MachineLaneSSAUpdater: universal SSA repair for Machine IR (lane-aware)
 //
 // Primary Use Case: repairSSAForNewDef()
-//   - Caller creates a new instruction that defines an existing vreg (violating SSA)
-//   - This function creates a new vreg (or uses a caller-provided one), 
+//   - Caller creates a new instruction that defines an existing vreg (violating
+//   SSA)
+//   - This function creates a new vreg (or uses a caller-provided one),
 //     replaces the operand, and repairs SSA
 //   - Example: Insert "OrigVReg = ADD ..." and call repairSSAForNewDef()
 //   - Works for full register and subregister definitions
 //   - Handles all scenarios including spill/reload
 //
 // Advanced Usage: Caller-provided NewVReg
-//   - By default, repairSSAForNewDef() creates a new virtual register automatically
-//   - For special cases (e.g., subregister reloads where the spiller already 
+//   - By default, repairSSAForNewDef() creates a new virtual register
+//   automatically
+//   - For special cases (e.g., subregister reloads where the spiller already
 //     created a register of a specific class), caller can provide NewVReg
 //   - This gives full control over register class selection when needed
 //===----------------------------------------------------------------------===//
 class MachineLaneSSAUpdater {
 public:
-  MachineLaneSSAUpdater(MachineFunction &MF,
-                        LiveIntervals &LIS,
+  MachineLaneSSAUpdater(MachineFunction &MF, LiveIntervals &LIS,
                         MachineDominatorTree &MDT,
                         const TargetRegisterInfo &TRI)
       : MF(MF), LIS(LIS), MDT(MDT), TRI(TRI) {}
 
   // Repair SSA for a new definition that violates SSA form
-  // 
+  //
   // Parameters:
-  //   NewDefMI: Instruction with a def operand that currently defines OrigVReg (violating SSA)
-  //   OrigVReg: The virtual register being redefined
-  //   NewVReg:  (Optional) Pre-allocated virtual register to use instead of auto-creating one
+  //   NewDefMI: Instruction with a def operand that currently defines OrigVReg
+  //   (violating SSA) OrigVReg: The virtual register being redefined NewVReg:
+  //   (Optional) Pre-allocated virtual register to use instead of auto-creating
+  //   one
   //
   // This function will:
   //   1. Find the def operand in NewDefMI that defines OrigVReg
   //   2. Derive the lane mask from the operand's subreg index (if any)
-  //   3. Use NewVReg if provided, or create a new virtual register with appropriate class
+  //   3. Use NewVReg if provided, or create a new virtual register with
+  //   appropriate class
   //   4. Replace the operand in NewDefMI to define the new vreg
   //   5. Perform SSA repair (insert PHIs, rewrite uses)
   //
   // When to provide NewVReg:
-  //   - Leave it empty (default) for most cases - automatic class selection works well
+  //   - Leave it empty (default) for most cases - automatic class selection
+  //   works well
   //   - Provide it when you need precise control over register class selection
-  //   - Common use case: subregister spill/reload where target-specific constraints apply
-  //   - Example: Reloading a 96-bit subregister requires vreg_96 class (not vreg_128)
+  //   - Common use case: subregister spill/reload where target-specific
+  //   constraints apply
+  //   - Example: Reloading a 96-bit subregister requires vreg_96 class (not
+  //   vreg_128)
   //
   // Returns: The SSA-repaired virtual register (either NewVReg or auto-created)
   Register repairSSAForNewDef(MachineInstr &NewDefMI, Register OrigVReg,
-                             Register NewVReg = Register());
+                              Register NewVReg = Register());
 
 private:
   // Common SSA repair logic
-  void performSSARepair(Register NewVReg, Register OrigVReg, 
+  void performSSARepair(Register NewVReg, Register OrigVReg,
                         LaneBitmask DefMask, MachineBasicBlock *DefBB);
 
   // Optional knobs (fluent style); no-ops until implemented in .cpp.
   MachineLaneSSAUpdater &setUndefEdgePolicy(bool MaterializeImplicitDef) {
-    UndefEdgeAsImplicitDef = MaterializeImplicitDef; return *this; }
+    UndefEdgeAsImplicitDef = MaterializeImplicitDef;
+    return *this;
+  }
   MachineLaneSSAUpdater &setVerifyOnExit(bool Enable) {
-    VerifyOnExit = Enable; return *this; }
+    VerifyOnExit = Enable;
+    return *this;
+  }
 
   // --- Internal helpers ---
 
@@ -104,41 +115,42 @@ private:
                          const SmallVector<LaneBitmask, 4> &LaneMasks,
                          const MachineInstr &AtMI);
 
-  // Compute pruned IDF for a set of definition blocks (usually {block(NewDef)}),
-  // intersected with blocks where OrigVReg lanes specified by DefMask are live-in.
-  void computePrunedIDF(Register OrigVReg,
-                        LaneBitmask DefMask,
+  // Compute pruned IDF for a set of definition blocks (usually
+  // {block(NewDef)}), intersected with blocks where OrigVReg lanes specified by
+  // DefMask are live-in.
+  void computePrunedIDF(Register OrigVReg, LaneBitmask DefMask,
                         ArrayRef<MachineBasicBlock *> NewDefBlocks,
                         SmallVectorImpl<MachineBasicBlock *> &OutIDFBlocks);
 
   // Insert lane-aware Machine PHIs with iterative worklist processing.
-  // Seeds with InitialVReg definition, computes IDF, places PHIs, repeats until convergence.
-  // Returns all PHI result registers created during the iteration.
+  // Seeds with InitialVReg definition, computes IDF, places PHIs, repeats until
+  // convergence. Returns all PHI result registers created during the iteration.
   SmallVector<Register> insertLaneAwarePHI(Register InitialVReg,
-                                            Register OrigVReg,
-                                            LaneBitmask DefMask,
-                                            MachineBasicBlock *InitialDefBB);
+                                           Register OrigVReg,
+                                           LaneBitmask DefMask,
+                                           MachineBasicBlock *InitialDefBB);
 
   // Helper: Create PHI in a specific block with per-edge lane analysis
-  Register createPHIInBlock(MachineBasicBlock &JoinMBB,
-                           Register OrigVReg,
-                           Register NewVReg,
-                           LaneBitmask DefMask);
+  Register createPHIInBlock(MachineBasicBlock &JoinMBB, Register OrigVReg,
+                            Register NewVReg, LaneBitmask DefMask);
 
   // Rewrite dominated uses of OrigVReg to NewSSA according to the
   // exact/subset/super policy; create REG_SEQUENCE only when needed.
-  void rewriteDominatedUses(Register OrigVReg,
-                            Register NewSSA,
+  void rewriteDominatedUses(Register OrigVReg, Register NewSSA,
                             LaneBitmask MaskToRewrite);
 
   // Internal helper methods for use rewriting
-  VNInfo *incomingOnEdge(LiveInterval &LI, MachineInstr *Phi, MachineOperand &PhiOp);
-  bool defReachesUse(MachineInstr *DefMI, MachineInstr *UseMI, MachineOperand &UseOp);
+  VNInfo *incomingOnEdge(LiveInterval &LI, MachineInstr *Phi,
+                         MachineOperand &PhiOp);
+  bool defReachesUse(MachineInstr *DefMI, MachineInstr *UseMI,
+                     MachineOperand &UseOp);
   LaneBitmask operandLaneMask(const MachineOperand &MO);
   Register buildRSForSuperUse(MachineInstr *UseMI, MachineOperand &MO,
-                             Register OldVR, Register NewVR, LaneBitmask MaskToRewrite,
-                             LiveInterval &LI, const TargetRegisterClass *OpRC,
-                             SlotIndex &OutIdx, SmallVectorImpl<LaneBitmask> &LanesToExtend);
+                              Register OldVR, Register NewVR,
+                              LaneBitmask MaskToRewrite, LiveInterval &LI,
+                              const TargetRegisterClass *OpRC,
+                              SlotIndex &OutIdx,
+                              SmallVectorImpl<LaneBitmask> &LanesToExtend);
   void extendAt(LiveInterval &LI, SlotIndex Idx, ArrayRef<LaneBitmask> Lanes);
   void updateDeadFlags(Register Reg);
 
@@ -154,12 +166,14 @@ private:
 
 /// Get the subregister index that corresponds to the given lane mask.
 /// \param Mask The lane mask to convert to a subregister index
-/// \param TRI The target register info (provides target-specific subregister mapping)
+/// \param TRI The target register info (provides target-specific subregister
+/// mapping)
 /// \return The subregister index, or 0 if no single subregister matches
-inline unsigned getSubRegIndexForLaneMask(LaneBitmask Mask, const TargetRegisterInfo *TRI) {
+inline unsigned getSubRegIndexForLaneMask(LaneBitmask Mask,
+                                          const TargetRegisterInfo *TRI) {
   if (Mask.none())
     return 0; // No subregister
-  
+
   // Iterate through all subregister indices to find a match
   for (unsigned SubIdx = 1; SubIdx < TRI->getNumSubRegIndices(); ++SubIdx) {
     LaneBitmask SubMask = TRI->getSubRegIndexLaneMask(SubIdx);
@@ -167,28 +181,28 @@ inline unsigned getSubRegIndexForLaneMask(LaneBitmask Mask, const TargetRegister
       return SubIdx;
     }
   }
-  
-  // No exact match found - this might be a composite mask requiring REG_SEQUENCE
+
+  // No exact match found - this might be a composite mask requiring
+  // REG_SEQUENCE
   return 0;
 }
 
 // DenseMapInfo specialization for LaneBitmask
-template<>
-struct DenseMapInfo<LaneBitmask> {
+template <> struct DenseMapInfo<LaneBitmask> {
   static inline LaneBitmask getEmptyKey() {
     // Use a specific bit pattern for empty key
     return LaneBitmask(~0U - 1);
   }
-  
+
   static inline LaneBitmask getTombstoneKey() {
-    // Use a different bit pattern for tombstone  
+    // Use a different bit pattern for tombstone
     return LaneBitmask(~0U);
   }
-  
+
   static unsigned getHashValue(const LaneBitmask &Val) {
     return (unsigned)Val.getAsInteger();
   }
-  
+
   static bool isEqual(const LaneBitmask &LHS, const LaneBitmask &RHS) {
     return LHS == RHS;
   }
diff --git a/llvm/lib/CodeGen/MachineLaneSSAUpdater.cpp b/llvm/lib/CodeGen/MachineLaneSSAUpdater.cpp
index 0ffd489a9..6c1416b61 100644
--- a/llvm/lib/CodeGen/MachineLaneSSAUpdater.cpp
+++ b/llvm/lib/CodeGen/MachineLaneSSAUpdater.cpp
@@ -12,9 +12,11 @@
 //
 // Key features:
 //  - Two explicit entry points:
-//    * repairSSAForNewDef - Common use case: caller creates instruction defining
+//    * repairSSAForNewDef - Common use case: caller creates instruction
+//    defining
 //      existing vreg (violating SSA), updater creates new vreg and repairs
-//    * addDefAndRepairAfterSpill - Spill/reload use case: caller creates instruction
+//    * addDefAndRepairAfterSpill - Spill/reload use case: caller creates
+//    instruction
 //      with new vreg, updater repairs SSA using spill-time EndPoints
 //  - Lane-aware PHI insertion with per-edge masks
 //  - Pruned IDF computation (NewDefBlocks ∩ LiveIn(OldVR))
@@ -50,17 +52,19 @@ using namespace llvm;
 //===----------------------------------------------------------------------===//
 
 Register MachineLaneSSAUpdater::repairSSAForNewDef(MachineInstr &NewDefMI,
-                                                    Register OrigVReg,
-                                                    Register NewVReg) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::repairSSAForNewDef VReg=" << OrigVReg);
+                                                   Register OrigVReg,
+                                                   Register NewVReg) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::repairSSAForNewDef VReg="
+                    << OrigVReg);
   if (NewVReg.isValid()) {
     LLVM_DEBUG(dbgs() << ", caller-provided NewVReg=" << NewVReg);
   }
   LLVM_DEBUG(dbgs() << "\n");
-  
+
   MachineRegisterInfo &MRI = MF.getRegInfo();
-  
-  // Step 1: Find the def operand that currently defines OrigVReg (violating SSA)
+
+  // Step 1: Find the def operand that currently defines OrigVReg (violating
+  // SSA)
   MachineOperand *DefOp = nullptr;
   unsigned DefOpIdx = 0;
   for (MachineOperand &MO : NewDefMI.defs()) {
@@ -70,72 +74,80 @@ Register MachineLaneSSAUpdater::repairSSAForNewDef(MachineInstr &NewDefMI,
     }
     ++DefOpIdx;
   }
-  
+
   assert(DefOp && "NewDefMI should have a def operand for OrigVReg");
   assert(DefOp->isDef() && "Found operand should be a definition");
-  
+
   // Step 2: Derive DefMask from the operand's subreg index (if any)
   unsigned SubRegIdx = DefOp->getSubReg();
   LaneBitmask DefMask;
-  
+
   if (SubRegIdx) {
     // Partial register definition - get lane mask for this subreg
     DefMask = TRI.getSubRegIndexLaneMask(SubRegIdx);
-    LLVM_DEBUG(dbgs() << "  Partial def with subreg " << TRI.getSubRegIndexName(SubRegIdx)
+    LLVM_DEBUG(dbgs() << "  Partial def with subreg "
+                      << TRI.getSubRegIndexName(SubRegIdx)
                       << ", DefMask=" << PrintLaneMask(DefMask) << "\n");
   } else {
     // Full register definition - get all lanes for this register class
     DefMask = MRI.getMaxLaneMaskForVReg(OrigVReg);
-    LLVM_DEBUG(dbgs() << "  Full register def, DefMask=" << PrintLaneMask(DefMask) << "\n");
+    LLVM_DEBUG(dbgs() << "  Full register def, DefMask="
+                      << PrintLaneMask(DefMask) << "\n");
   }
-  
+
   // Step 3: Create or use provided new virtual register
   Register NewSSAVReg;
   if (NewVReg.isValid()) {
     // Caller provided a register - use it
     NewSSAVReg = NewVReg;
     const TargetRegisterClass *RC = MRI.getRegClass(NewSSAVReg);
-    LLVM_DEBUG(dbgs() << "  Using caller-provided SSA vreg " << NewSSAVReg 
+    LLVM_DEBUG(dbgs() << "  Using caller-provided SSA vreg " << NewSSAVReg
                       << " with RC=" << TRI.getRegClassName(RC) << "\n");
   } else {
     // Create a new virtual register with appropriate register class
-    // If this is a subreg def, we need the class for the subreg, not the full reg
+    // If this is a subreg def, we need the class for the subreg, not the full
+    // reg
     const TargetRegisterClass *RC;
     if (SubRegIdx) {
       // For subreg defs, get the subreg class
       const TargetRegisterClass *OrigRC = MRI.getRegClass(OrigVReg);
       RC = TRI.getSubRegisterClass(OrigRC, SubRegIdx);
-      assert(RC && "Failed to get subregister class for subreg def - would create incorrect MIR");
+      assert(RC && "Failed to get subregister class for subreg def - would "
+                   "create incorrect MIR");
     } else {
       // For full reg defs, use the same class as OrigVReg
       RC = MRI.getRegClass(OrigVReg);
     }
-    
+
     NewSSAVReg = MRI.createVirtualRegister(RC);
-    LLVM_DEBUG(dbgs() << "  Created new SSA vreg " << NewSSAVReg << " with RC=" << TRI.getRegClassName(RC) << "\n");
+    LLVM_DEBUG(dbgs() << "  Created new SSA vreg " << NewSSAVReg
+                      << " with RC=" << TRI.getRegClassName(RC) << "\n");
   }
-  
+
   // Step 4: Replace the operand in NewDefMI to define the new vreg
-  // If this was a subreg def, the new vreg is a full register of the subreg class
-  // so we clear the subreg index (e.g., %1.sub0:vreg_64 becomes %3:vgpr_32)
+  // If this was a subreg def, the new vreg is a full register of the subreg
+  // class so we clear the subreg index (e.g., %1.sub0:vreg_64 becomes
+  // %3:vgpr_32)
   DefOp->setReg(NewSSAVReg);
   if (SubRegIdx) {
     DefOp->setSubReg(0);
-    LLVM_DEBUG(dbgs() << "  Replaced operand: " << OrigVReg << "." << TRI.getSubRegIndexName(SubRegIdx)
-                      << " -> " << NewSSAVReg << " (full register)\n");
+    LLVM_DEBUG(dbgs() << "  Replaced operand: " << OrigVReg << "."
+                      << TRI.getSubRegIndexName(SubRegIdx) << " -> "
+                      << NewSSAVReg << " (full register)\n");
   } else {
-    LLVM_DEBUG(dbgs() << "  Replaced operand: " << OrigVReg << " -> " << NewSSAVReg << "\n");
+    LLVM_DEBUG(dbgs() << "  Replaced operand: " << OrigVReg << " -> "
+                      << NewSSAVReg << "\n");
   }
-  
+
   // Step 5: Index the new instruction in SlotIndexes/LIS
   indexNewInstr(NewDefMI);
-  
+
   // Step 6: Perform common SSA repair (PHI placement + use rewriting)
   // LiveInterval for NewSSAVReg will be created by getInterval() as needed
   performSSARepair(NewSSAVReg, OrigVReg, DefMask, NewDefMI.getParent());
-  
-  // Step 7: If SSA repair created subregister uses of OrigVReg (e.g., in PHIs or REG_SEQUENCEs),
-  // recompute its LiveInterval to create subranges
+
+  // Step 7: If SSA repair created subregister uses of OrigVReg (e.g., in PHIs
+  // or REG_SEQUENCEs), recompute its LiveInterval to create subranges
   LaneBitmask AllLanes = MRI.getMaxLaneMaskForVReg(OrigVReg);
   if (DefMask != AllLanes) {
     LiveInterval &OrigLI = LIS.getInterval(OrigVReg);
@@ -148,17 +160,18 @@ Register MachineLaneSSAUpdater::repairSSAForNewDef(MachineInstr &NewDefMI,
           break;
         }
       }
-      
+
       if (HasSubregUses) {
-        LLVM_DEBUG(dbgs() << "  Recomputing LiveInterval for " << OrigVReg 
+        LLVM_DEBUG(dbgs() << "  Recomputing LiveInterval for " << OrigVReg
                           << " after SSA repair created subregister uses\n");
         LIS.removeInterval(OrigVReg);
         LIS.createAndComputeVirtRegInterval(OrigVReg);
       }
     }
   }
-  
-  LLVM_DEBUG(dbgs() << "  repairSSAForNewDef complete, returning " << NewSSAVReg << "\n");
+
+  LLVM_DEBUG(dbgs() << "  repairSSAForNewDef complete, returning " << NewSSAVReg
+                    << "\n");
   return NewSSAVReg;
 }
 
@@ -166,60 +179,65 @@ Register MachineLaneSSAUpdater::repairSSAForNewDef(MachineInstr &NewDefMI,
 // Common SSA Repair Logic
 //===----------------------------------------------------------------------===//
 
-void MachineLaneSSAUpdater::performSSARepair(Register NewVReg, Register OrigVReg, 
-                                              LaneBitmask DefMask, MachineBasicBlock *DefBB) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::performSSARepair NewVReg=" << NewVReg
-                    << " OrigVReg=" << OrigVReg << " DefMask=" << PrintLaneMask(DefMask) << "\n");
-  
+void MachineLaneSSAUpdater::performSSARepair(Register NewVReg,
+                                             Register OrigVReg,
+                                             LaneBitmask DefMask,
+                                             MachineBasicBlock *DefBB) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::performSSARepair NewVReg="
+                    << NewVReg << " OrigVReg=" << OrigVReg
+                    << " DefMask=" << PrintLaneMask(DefMask) << "\n");
+
   // Step 1: Use worklist-driven PHI placement
-  SmallVector<Register> AllPHIVRegs = insertLaneAwarePHI(NewVReg, OrigVReg, DefMask, DefBB);
-  
+  SmallVector<Register> AllPHIVRegs =
+      insertLaneAwarePHI(NewVReg, OrigVReg, DefMask, DefBB);
+
   // Step 2: Rewrite dominated uses once for each new register
   // Note: getInterval() will automatically create LiveIntervals if needed
   rewriteDominatedUses(OrigVReg, NewVReg, DefMask);
   for (Register PHIVReg : AllPHIVRegs) {
     rewriteDominatedUses(OrigVReg, PHIVReg, DefMask);
   }
-  
+
   // Step 3: Renumber values if needed
   LiveInterval &NewLI = LIS.getInterval(NewVReg);
   NewLI.RenumberValues();
-  
+
   // Also renumber PHI intervals
   for (Register PHIVReg : AllPHIVRegs) {
     LiveInterval &PHILI = LIS.getInterval(PHIVReg);
     PHILI.RenumberValues();
   }
-  
+
   // Recompute OrigVReg's LiveInterval to account for PHI operands
   // We do a full recomputation because PHI operands may reference subregisters
   // that weren't previously live on those paths, and we need to extend liveness
   // from the definition to the PHI use.
   LIS.removeInterval(OrigVReg);
   LIS.createAndComputeVirtRegInterval(OrigVReg);
-  
-  // Note: We do NOT call shrinkToUses on OrigVReg even after recomputation because:
-  // shrinkToUses has a fundamental bug with PHI operands - it doesn't understand
-  // that PHI operands require their source lanes to be live at the END of
-  // predecessor blocks. When it sees a PHI operand like "%0.sub2_sub3" from BB3,
-  // it only considers the PHI location (start of join block), not the predecessor
-  // end where the value must be available. This causes it to incorrectly shrink
-  // away lanes that ARE needed by PHI operands, leading to verification errors:
-  // "Not all lanes of PHI source live at use". The createAndComputeVirtRegInterval
-  // already produces correct, minimal liveness that includes PHI uses properly.
-  
+
+  // Note: We do NOT call shrinkToUses on OrigVReg even after recomputation
+  // because: shrinkToUses has a fundamental bug with PHI operands - it doesn't
+  // understand that PHI operands require their source lanes to be live at the
+  // END of predecessor blocks. When it sees a PHI operand like "%0.sub2_sub3"
+  // from BB3, it only considers the PHI location (start of join block), not the
+  // predecessor end where the value must be available. This causes it to
+  // incorrectly shrink away lanes that ARE needed by PHI operands, leading to
+  // verification errors: "Not all lanes of PHI source live at use". The
+  // createAndComputeVirtRegInterval already produces correct, minimal liveness
+  // that includes PHI uses properly.
+
   // Step 4: Update operand flags to match the LiveIntervals
   updateDeadFlags(NewVReg);
   for (Register PHIVReg : AllPHIVRegs) {
     updateDeadFlags(PHIVReg);
   }
-  
+
   // Step 5: Verification if enabled
   if (VerifyOnExit) {
     LLVM_DEBUG(dbgs() << "  Verifying after SSA repair...\n");
     // TODO: Add verification calls
   }
-  
+
   LLVM_DEBUG(dbgs() << "  performSSARepair complete\n");
 }
 
@@ -229,62 +247,64 @@ void MachineLaneSSAUpdater::performSSARepair(Register NewVReg, Register OrigVReg
 
 SlotIndex MachineLaneSSAUpdater::indexNewInstr(MachineInstr &MI) {
   LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::indexNewInstr: " << MI);
-  
+
   // Register the instruction in SlotIndexes and LiveIntervals
   // This is typically done automatically when instructions are inserted,
   // but we need to ensure it's properly indexed
   SlotIndexes *SI = LIS.getSlotIndexes();
-  
+
   // Check if instruction is already indexed
   if (SI->hasIndex(MI)) {
     SlotIndex Idx = SI->getInstructionIndex(MI);
     LLVM_DEBUG(dbgs() << "  Already indexed at " << Idx << "\n");
     return Idx;
   }
-  
+
   // Insert the instruction in maps - this should be done by the caller
   // before calling our SSA repair methods, but we can verify
   LIS.InsertMachineInstrInMaps(MI);
-  
+
   SlotIndex Idx = SI->getInstructionIndex(MI);
   LLVM_DEBUG(dbgs() << "  Indexed at " << Idx << "\n");
   return Idx;
 }
 
-void MachineLaneSSAUpdater::extendPreciselyAt(const Register VReg,
-                                              const SmallVector<LaneBitmask, 4> &LaneMasks,
-                                              const MachineInstr &AtMI) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::extendPreciselyAt VReg=" << VReg 
+void MachineLaneSSAUpdater::extendPreciselyAt(
+    const Register VReg, const SmallVector<LaneBitmask, 4> &LaneMasks,
+    const MachineInstr &AtMI) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::extendPreciselyAt VReg=" << VReg
                     << " at " << LIS.getInstructionIndex(AtMI) << "\n");
-  
+
   if (!VReg.isVirtual()) {
     return; // Only handle virtual registers
   }
-  
+
   SlotIndex DefIdx = LIS.getInstructionIndex(AtMI).getRegSlot();
-  
+
   // Create or get the LiveInterval for this register
   LiveInterval &LI = LIS.getInterval(VReg);
-  
+
   // Extend the main live range to include the definition point
-  SmallVector<SlotIndex, 2> DefPoint = { DefIdx };
+  SmallVector<SlotIndex, 2> DefPoint = {DefIdx};
   LIS.extendToIndices(LI, DefPoint);
-  
+
   // For each lane mask, ensure appropriate subranges exist and are extended
-  // For now, assume all lanes are valid - we'll refine this later based on register class
+  // For now, assume all lanes are valid - we'll refine this later based on
+  // register class
   LaneBitmask RegCoverageMask = MF.getRegInfo().getMaxLaneMaskForVReg(VReg);
-  
+
   for (LaneBitmask LaneMask : LaneMasks) {
-    if (LaneMask == MF.getRegInfo().getMaxLaneMaskForVReg(VReg) || LaneMask == LaneBitmask::getNone()) {
+    if (LaneMask == MF.getRegInfo().getMaxLaneMaskForVReg(VReg) ||
+        LaneMask == LaneBitmask::getNone()) {
       continue; // Main range handles getAll(), skip getNone()
     }
-    
+
     // Only process lanes that are valid for this register class
     LaneBitmask ValidLanes = LaneMask & RegCoverageMask;
     if (ValidLanes.none()) {
       continue;
     }
-    
+
     // Find or create the appropriate subrange
     LiveInterval::SubRange *SR = nullptr;
     for (LiveInterval::SubRange &Sub : LI.subranges()) {
@@ -296,49 +316,52 @@ void MachineLaneSSAUpdater::extendPreciselyAt(const Register VReg,
     if (!SR) {
       SR = LI.createSubRange(LIS.getVNInfoAllocator(), ValidLanes);
     }
-    
+
     // Extend this subrange to include the definition point
     LIS.extendToIndices(*SR, DefPoint);
-    
-    LLVM_DEBUG(dbgs() << "  Extended subrange " << PrintLaneMask(ValidLanes) << "\n");
+
+    LLVM_DEBUG(dbgs() << "  Extended subrange " << PrintLaneMask(ValidLanes)
+                      << "\n");
   }
-  
+
   LLVM_DEBUG(dbgs() << "  LiveInterval extension complete\n");
 }
 
-void MachineLaneSSAUpdater::computePrunedIDF(Register OrigVReg,
-                                              LaneBitmask DefMask,
-                                              ArrayRef<MachineBasicBlock *> NewDefBlocks,
-                                              SmallVectorImpl<MachineBasicBlock *> &OutIDFBlocks) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::computePrunedIDF VReg=" << OrigVReg 
-                    << " DefMask=" << PrintLaneMask(DefMask)
+void MachineLaneSSAUpdater::computePrunedIDF(
+    Register OrigVReg, LaneBitmask DefMask,
+    ArrayRef<MachineBasicBlock *> NewDefBlocks,
+    SmallVectorImpl<MachineBasicBlock *> &OutIDFBlocks) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::computePrunedIDF VReg="
+                    << OrigVReg << " DefMask=" << PrintLaneMask(DefMask)
                     << " with " << NewDefBlocks.size() << " new def blocks\n");
-  
+
   // Clear output vector at entry
   OutIDFBlocks.clear();
-  
+
   // Early bail-out checks for robustness
   if (!OrigVReg.isVirtual()) {
     LLVM_DEBUG(dbgs() << "  Skipping non-virtual register\n");
     return;
   }
-  
+
   if (!LIS.hasInterval(OrigVReg)) {
-    LLVM_DEBUG(dbgs() << "  OrigVReg not tracked by LiveIntervals, bailing out\n");
+    LLVM_DEBUG(
+        dbgs() << "  OrigVReg not tracked by LiveIntervals, bailing out\n");
     return;
   }
-  
+
   // Get the main LiveInterval for OrigVReg
   LiveInterval &LI = LIS.getInterval(OrigVReg);
-  
-  // Build prune set: blocks where specified lanes (DefMask) are live-in at entry
+
+  // Build prune set: blocks where specified lanes (DefMask) are live-in at
+  // entry
   SmallPtrSet<MachineBasicBlock *, 32> LiveIn;
   for (MachineBasicBlock &BB : MF) {
     SlotIndex Start = LIS.getMBBStartIdx(&BB);
-    
+
     // Collect live lanes at block entry
     LaneBitmask LiveLanes = LaneBitmask::getNone();
-    
+
     if (DefMask == MF.getRegInfo().getMaxLaneMaskForVReg(OrigVReg)) {
       // For full register (e.g., reload case), check main interval
       if (LI.liveAt(Start)) {
@@ -351,20 +374,20 @@ void MachineLaneSSAUpdater::computePrunedIDF(Register OrigVReg,
           LiveLanes |= S.LaneMask;
         }
       }
-      
+
       // If no subranges found but main interval is live,
       // assume all lanes are covered by the main interval
       if (LiveLanes == LaneBitmask::getNone() && LI.liveAt(Start)) {
         LiveLanes = MF.getRegInfo().getMaxLaneMaskForVReg(OrigVReg);
       }
     }
-    
+
     // Check if any of the requested lanes (DefMask) are live
     if ((LiveLanes & DefMask).any()) {
       LiveIn.insert(&BB);
     }
   }
-  
+
   // Seed set: the blocks where new defs exist (e.g., reload or prior PHIs)
   SmallPtrSet<MachineBasicBlock *, 8> DefBlocks;
   for (MachineBasicBlock *B : NewDefBlocks) {
@@ -372,82 +395,86 @@ void MachineLaneSSAUpdater::computePrunedIDF(Register OrigVReg,
       DefBlocks.insert(B);
     }
   }
-  
+
   // Early exit if either set is empty
   if (DefBlocks.empty() || LiveIn.empty()) {
-    LLVM_DEBUG(dbgs() << "  DefBlocks=" << DefBlocks.size() << " LiveIn=" << LiveIn.size() 
-                      << ", early exit\n");
+    LLVM_DEBUG(dbgs() << "  DefBlocks=" << DefBlocks.size()
+                      << " LiveIn=" << LiveIn.size() << ", early exit\n");
     return;
   }
-  
-  LLVM_DEBUG(dbgs() << "  DefBlocks=" << DefBlocks.size() << " LiveIn=" << LiveIn.size() << "\n");
-  
+
+  LLVM_DEBUG(dbgs() << "  DefBlocks=" << DefBlocks.size()
+                    << " LiveIn=" << LiveIn.size() << "\n");
+
   // Use LLVM's IDFCalculatorBase for MachineBasicBlock with forward dominance
   using NodeTy = MachineBasicBlock;
-  
+
   // Access the underlying DomTreeBase from MachineDominatorTree
   // MachineDominatorTree inherits from DomTreeBase<MachineBasicBlock>
   DomTreeBase<NodeTy> &DT = MDT;
-  
+
   // Compute pruned IDF (forward dominance, IsPostDom=false)
   llvm::IDFCalculatorBase<NodeTy, /*IsPostDom=*/false> IDF(DT);
   IDF.setDefiningBlocks(DefBlocks);
   IDF.setLiveInBlocks(LiveIn);
   IDF.calculate(OutIDFBlocks);
-  
+
   LLVM_DEBUG(dbgs() << "  Computed " << OutIDFBlocks.size() << " IDF blocks\n");
-  
-  // Note: We do not place PHIs here; this function only computes candidate 
+
+  // Note: We do not place PHIs here; this function only computes candidate
   // join blocks. The IDFCalculator handles deduplication automatically.
 }
 
-SmallVector<Register> MachineLaneSSAUpdater::insertLaneAwarePHI(Register InitialVReg,
-                                                                Register OrigVReg,
-                                                                LaneBitmask DefMask,
-                                                                MachineBasicBlock *InitialDefBB) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::insertLaneAwarePHI InitialVReg=" << InitialVReg
-                    << " OrigVReg=" << OrigVReg << " DefMask=" << PrintLaneMask(DefMask) << "\n");
-  
+SmallVector<Register> MachineLaneSSAUpdater::insertLaneAwarePHI(
+    Register InitialVReg, Register OrigVReg, LaneBitmask DefMask,
+    MachineBasicBlock *InitialDefBB) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::insertLaneAwarePHI InitialVReg="
+                    << InitialVReg << " OrigVReg=" << OrigVReg
+                    << " DefMask=" << PrintLaneMask(DefMask) << "\n");
+
   SmallVector<Register> AllCreatedPHIs;
-  
-  // Step 1: Compute IDF (Iterated Dominance Frontier) for the initial definition
-  // This gives us ALL blocks where PHI nodes need to be inserted
+
+  // Step 1: Compute IDF (Iterated Dominance Frontier) for the initial
+  // definition. This gives us ALL blocks where PHI nodes need to be inserted.
   SmallVector<MachineBasicBlock *> DefBlocks = {InitialDefBB};
   SmallVector<MachineBasicBlock *> IDFBlocks;
   computePrunedIDF(OrigVReg, DefMask, DefBlocks, IDFBlocks);
-  
-  LLVM_DEBUG(dbgs() << "  Computed IDF: found " << IDFBlocks.size() << " blocks needing PHIs\n");
+
+  LLVM_DEBUG(dbgs() << "  Computed IDF: found " << IDFBlocks.size()
+                    << " blocks needing PHIs\n");
   for (MachineBasicBlock *MBB : IDFBlocks) {
     LLVM_DEBUG(dbgs() << "    BB#" << MBB->getNumber() << "\n");
   }
-  
+
   // Step 2: Iterate through IDF blocks sequentially, creating PHIs
   // Key insight: After creating a PHI, update NewVReg to the PHI result
   // so subsequent PHIs use the correct register
   Register CurrentNewVReg = InitialVReg;
-  
+
   for (MachineBasicBlock *JoinMBB : IDFBlocks) {
-    LLVM_DEBUG(dbgs() << "  Creating PHI in BB#" << JoinMBB->getNumber() 
+    LLVM_DEBUG(dbgs() << "  Creating PHI in BB#" << JoinMBB->getNumber()
                       << " with CurrentNewVReg=" << CurrentNewVReg << "\n");
-    
+
     // Create PHI: merges OrigVReg and CurrentNewVReg based on dominance
-    Register PHIResult = createPHIInBlock(*JoinMBB, OrigVReg, CurrentNewVReg, DefMask);
-    
+    Register PHIResult =
+        createPHIInBlock(*JoinMBB, OrigVReg, CurrentNewVReg, DefMask);
+
     if (PHIResult.isValid()) {
       AllCreatedPHIs.push_back(PHIResult);
-      
+
       // Update CurrentNewVReg to be the PHI result
-      // This ensures the next PHI (if any) uses this PHI's result, not the original InitialVReg
+      // This ensures the next PHI (if any) uses this PHI's result, not the
+      // original InitialVReg
       CurrentNewVReg = PHIResult;
-      
-      LLVM_DEBUG(dbgs() << "    Created PHI result VReg=" << PHIResult 
+
+      LLVM_DEBUG(dbgs() << "    Created PHI result VReg=" << PHIResult
                         << ", will use this for subsequent PHIs\n");
     }
   }
-  
-  LLVM_DEBUG(dbgs() << "  PHI insertion complete. Created " 
+
+  LLVM_DEBUG(dbgs() << "  PHI insertion complete. Created "
                     << AllCreatedPHIs.size() << " PHI registers total.\n");
-  
+
   return AllCreatedPHIs;
 }
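
The pruning step in computePrunedIDF can be illustrated outside LLVM. The following is a minimal sketch, not the patch's implementation: blocks are plain integers, and the dominance frontier of each block is assumed to be supplied precomputed (LLVM's IDFCalculatorBase derives it from the dominator tree instead). The names `Block`, `FrontierMap`, `prunedIDF`, and `diamondFrontiers` are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins: a block is an int, and each block's dominance
// frontier is given up front instead of being derived from a dominator tree.
using Block = int;
using FrontierMap = std::map<Block, std::vector<Block>>;

// Returns the blocks that need a PHI: the iterated dominance frontier of
// DefBlocks, restricted to blocks where the value is live-in.
std::set<Block> prunedIDF(const FrontierMap &DF,
                          const std::set<Block> &DefBlocks,
                          const std::set<Block> &LiveIn) {
  std::set<Block> Result;
  std::vector<Block> Worklist(DefBlocks.begin(), DefBlocks.end());
  while (!Worklist.empty()) {
    Block B = Worklist.back();
    Worklist.pop_back();
    auto It = DF.find(B);
    if (It == DF.end())
      continue;
    for (Block F : It->second) {
      if (!LiveIn.count(F))
        continue; // Pruned: the value is dead on entry, so no PHI is needed.
      if (Result.insert(F).second)
        Worklist.push_back(F); // A PHI is itself a new definition.
    }
  }
  return Result;
}

// Diamond CFG 0 -> {1, 2} -> 3: the frontier of each arm is the join block.
FrontierMap diamondFrontiers() {
  return {{0, {}}, {1, {3}}, {2, {3}}, {3, {}}};
}
```

With a new definition in block 1 of the diamond, the join block 3 is returned only when the value is live-in there; dropping block 3 from the live-in set yields an empty result, which corresponds to the early-exit path above.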
 
@@ -457,112 +484,120 @@ Register MachineLaneSSAUpdater::createPHIInBlock(MachineBasicBlock &JoinMBB,
                                                  Register NewVReg,
                                                  LaneBitmask DefMask) {
   LLVM_DEBUG(dbgs() << "    createPHIInBlock in BB#" << JoinMBB.getNumber()
-                    << " OrigVReg=" << OrigVReg << " NewVReg=" << NewVReg 
+                    << " OrigVReg=" << OrigVReg << " NewVReg=" << NewVReg
                     << " DefMask=" << PrintLaneMask(DefMask) << "\n");
-  
+
   const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
   const LaneBitmask FullMask = MF.getRegInfo().getMaxLaneMaskForVReg(OrigVReg);
-  
+
   // Check if this is a partial lane redefinition
   const bool IsPartialReload = (DefMask != FullMask);
-  
+
   // Collect PHI operands for the specific reload lanes
   SmallVector<MachineOperand> PHIOperands;
-  
-  LLVM_DEBUG(dbgs() << "      Creating PHI for " << (IsPartialReload ? "partial reload" : "full reload")
+
+  LLVM_DEBUG(dbgs() << "      Creating PHI for "
+                    << (IsPartialReload ? "partial reload" : "full reload")
                     << " DefMask=" << PrintLaneMask(DefMask) << "\n");
-  
+
   // Get the definition block of NewVReg for dominance checks
   MachineRegisterInfo &MRI = MF.getRegInfo();
   MachineInstr *NewDefMI = MRI.getVRegDef(NewVReg);
   MachineBasicBlock *NewDefBB = NewDefMI->getParent();
-  
+
   for (MachineBasicBlock *PredMBB : JoinMBB.predecessors()) {
     // Use dominance check instead of liveness: if NewDefBB dominates PredMBB,
     // then NewVReg is available at the end of PredMBB
     bool UseNewReg = MDT.dominates(NewDefBB, PredMBB);
-    
+
     if (UseNewReg) {
-      // This is the reload path - use NewVReg (always full register for its class)
-      LLVM_DEBUG(dbgs() << "        Pred BB#" << PredMBB->getNumber() 
+      // This is the reload path - use NewVReg (always full register for its
+      // class)
+      LLVM_DEBUG(dbgs() << "        Pred BB#" << PredMBB->getNumber()
                         << " contributes NewVReg (reload path)\n");
-      
-      PHIOperands.push_back(MachineOperand::CreateReg(NewVReg, /*isDef*/ false));
+
+      PHIOperands.push_back(
+          MachineOperand::CreateReg(NewVReg, /*isDef*/ false));
       PHIOperands.push_back(MachineOperand::CreateMBB(PredMBB));
-      
+
     } else {
       // This is the original path - use OrigVReg with appropriate subregister
-      LLVM_DEBUG(dbgs() << "        Pred BB#" << PredMBB->getNumber() 
+      LLVM_DEBUG(dbgs() << "        Pred BB#" << PredMBB->getNumber()
                         << " contributes OrigVReg (original path)\n");
-      
+
       if (IsPartialReload) {
         // Partial case: z = PHI(y, BB1, x.sub2_3, BB0)
         // Use DefMask to find which subreg of OrigVReg was redefined
         unsigned SubIdx = getSubRegIndexForLaneMask(DefMask, &TRI);
-        PHIOperands.push_back(MachineOperand::CreateReg(OrigVReg, /*isDef*/ false,
-                                                       /*isImp*/ false, /*isKill*/ false,
-                                                       /*isDead*/ false, /*isUndef*/ false,
-                                                       /*isEarlyClobber*/ false, SubIdx));
+        PHIOperands.push_back(
+            MachineOperand::CreateReg(OrigVReg, /*isDef*/ false,
+                                      /*isImp*/ false, /*isKill*/ false,
+                                      /*isDead*/ false, /*isUndef*/ false,
+                                      /*isEarlyClobber*/ false, SubIdx));
       } else {
         // Full register case: z = PHI(y, BB1, x, BB0)
-        PHIOperands.push_back(MachineOperand::CreateReg(OrigVReg, /*isDef*/ false));
+        PHIOperands.push_back(
+            MachineOperand::CreateReg(OrigVReg, /*isDef*/ false));
       }
       PHIOperands.push_back(MachineOperand::CreateMBB(PredMBB));
     }
   }
-  
+
   // Create the single lane-specific PHI
   if (!PHIOperands.empty()) {
     const TargetRegisterClass *RC = MF.getRegInfo().getRegClass(NewVReg);
     Register PHIVReg = MF.getRegInfo().createVirtualRegister(RC);
-    
+
     auto PHINode = BuildMI(JoinMBB, JoinMBB.begin(), DebugLoc(),
-                          TII->get(TargetOpcode::PHI), PHIVReg);
+                           TII->get(TargetOpcode::PHI), PHIVReg);
     for (const MachineOperand &Op : PHIOperands) {
       PHINode.add(Op);
     }
-    
+
     MachineInstr *PHI = PHINode.getInstr();
     LIS.InsertMachineInstrInMaps(*PHI);
-    
+
     LLVM_DEBUG(dbgs() << "      Created lane-specific PHI: ");
     LLVM_DEBUG(PHI->print(dbgs()));
-    
+
     return PHIVReg;
   }
-  
+
   return Register();
 }
 
 void MachineLaneSSAUpdater::rewriteDominatedUses(Register OrigVReg,
-                                                  Register NewSSA,
-                                                  LaneBitmask MaskToRewrite) {
-  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::rewriteDominatedUses OrigVReg=" << OrigVReg
-                    << " NewSSA=" << NewSSA << " Mask=" << PrintLaneMask(MaskToRewrite) << "\n");
-  
+                                                 Register NewSSA,
+                                                 LaneBitmask MaskToRewrite) {
+  LLVM_DEBUG(dbgs() << "MachineLaneSSAUpdater::rewriteDominatedUses OrigVReg="
+                    << OrigVReg << " NewSSA=" << NewSSA
+                    << " Mask=" << PrintLaneMask(MaskToRewrite) << "\n");
+
   const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   MachineRegisterInfo &MRI = MF.getRegInfo();
-  
+
   // Find the definition instruction for NewSSA
   MachineInstr *DefMI = MRI.getVRegDef(NewSSA);
   if (!DefMI) {
     LLVM_DEBUG(dbgs() << "  No definition found for NewSSA, skipping\n");
     return;
   }
-  
+
   MachineBasicBlock *DefBB = DefMI->getParent();
   const TargetRegisterClass *NewRC = MRI.getRegClass(NewSSA);
 
-  LLVM_DEBUG(dbgs() << "  Rewriting uses dominated by definition in BB#" << DefBB->getNumber() << ": ");
+  LLVM_DEBUG(dbgs() << "  Rewriting uses dominated by definition in BB#"
+                    << DefBB->getNumber() << ": ");
   LLVM_DEBUG(DefMI->print(dbgs()));
 
   // Get OrigVReg's LiveInterval for reference
   LiveInterval &OrigLI = LIS.getInterval(OrigVReg);
 
   // Iterate through all uses of OrigVReg
-  for (MachineOperand &MO : llvm::make_early_inc_range(MRI.use_operands(OrigVReg))) {
+  for (MachineOperand &MO :
+       llvm::make_early_inc_range(MRI.use_operands(OrigVReg))) {
     MachineInstr *UseMI = MO.getParent();
-    
+
     // Skip the definition instruction itself
     if (UseMI == DefMI)
       continue;
@@ -576,7 +611,8 @@ void MachineLaneSSAUpdater::rewriteDominatedUses(Register OrigVReg,
     if ((OpMask & MaskToRewrite).none())
       continue;
 
-    LLVM_DEBUG(dbgs() << "    Processing use with OpMask=" << PrintLaneMask(OpMask) << ": ");
+    LLVM_DEBUG(dbgs() << "    Processing use with OpMask="
+                      << PrintLaneMask(OpMask) << ": ");
     LLVM_DEBUG(UseMI->print(dbgs()));
 
     const TargetRegisterClass *OpRC = MRI.getRegClass(OrigVReg);
@@ -586,40 +622,44 @@ void MachineLaneSSAUpdater::rewriteDominatedUses(Register OrigVReg,
       // Check register class compatibility
       // If operand uses a subreg, NewRC should match the subreg class
       // If operand uses full register, NewRC should match OpRC
-      const TargetRegisterClass *ExpectedRC = MO.getSubReg() != 0 
-          ? TRI.getSubRegisterClass(OpRC, MO.getSubReg()) 
-          : OpRC;
+      const TargetRegisterClass *ExpectedRC =
+          MO.getSubReg() != 0 ? TRI.getSubRegisterClass(OpRC, MO.getSubReg())
+                              : OpRC;
       bool Compatible = (ExpectedRC == NewRC);
-      
+
       if (Compatible) {
         LLVM_DEBUG(dbgs() << "      Exact match -> direct replacement\n");
         MO.setReg(NewSSA);
-        MO.setSubReg(0); // Clear subregister (NewSSA is a full register of NewRC)
-        
+        // Clear subregister (NewSSA is a full register of NewRC)
+        MO.setSubReg(0);
+
         // Extend NewSSA's live interval to cover this use
         SlotIndex UseIdx = LIS.getInstructionIndex(*UseMI).getRegSlot();
         LiveInterval &NewLI = LIS.getInterval(NewSSA);
         LIS.extendToIndices(NewLI, {UseIdx});
-        
+
         continue;
       }
-      
-      // Incompatible register classes with same lane mask indicates corrupted MIR
-      llvm_unreachable("Incompatible register classes with same lane mask - invalid MIR");
+
+      // Incompatible register classes with same lane mask indicates corrupted
+      // MIR
+      llvm_unreachable(
+          "Incompatible register classes with same lane mask - invalid MIR");
     }
 
     // Case 2: Super/Mixed - use needs more lanes than we're rewriting
     if ((OpMask & ~MaskToRewrite).any()) {
       LLVM_DEBUG(dbgs() << "      Super/Mixed case -> building REG_SEQUENCE\n");
-      
+
       SmallVector<LaneBitmask, 4> LanesToExtend;
       SlotIndex RSIdx;
-      Register RSReg = buildRSForSuperUse(UseMI, MO, OrigVReg, NewSSA, MaskToRewrite,
-                                          OrigLI, OpRC, RSIdx, LanesToExtend);
+      Register RSReg =
+          buildRSForSuperUse(UseMI, MO, OrigVReg, NewSSA, MaskToRewrite, OrigLI,
+                             OpRC, RSIdx, LanesToExtend);
       extendAt(OrigLI, RSIdx, LanesToExtend);
       MO.setReg(RSReg);
       MO.setSubReg(0);
-      
+
       // Extend RSReg's live interval to cover this use
       SlotIndex UseIdx;
       if (UseMI->isPHI()) {
@@ -632,62 +672,69 @@ void MachineLaneSSAUpdater::rewriteDominatedUses(Register OrigVReg,
       }
       LiveInterval &RSLI = LIS.getInterval(RSReg);
       LIS.extendToIndices(RSLI, {UseIdx});
-      
+
       // Update dead flag on REG_SEQUENCE result
       updateDeadFlags(RSReg);
-      
+
     } else {
       // Case 3: Subset - use needs fewer lanes than NewSSA provides
-      // Need to remap subregister index from OrigVReg's register class to NewSSA's register class
+      // Need to remap subregister index from OrigVReg's register class to
+      // NewSSA's register class
       //
-      // Example: OrigVReg is vreg_128, we redefine sub2_3 (64-bit), use accesses sub3 (32-bit)
+      // Example: OrigVReg is vreg_128, we redefine sub2_3 (64-bit), use
+      // accesses sub3 (32-bit)
       //   MaskToRewrite = 0xF0  // sub2_3: lanes 4-7 in vreg_128 space
       //   OpMask        = 0xC0  // sub3:   lanes 6-7 in vreg_128 space
-      //   NewSSA is vreg_64, has lanes 0-3 (but represents lanes 4-7 of OrigVReg)
+      //   NewSSA is vreg_64, has lanes 0-3 (but represents lanes 4-7 of
+      //     OrigVReg)
       //
-      // Algorithm: Shift OpMask down by the bit position of MaskToRewrite's LSB to map
-      // from OrigVReg's lane space into NewSSA's lane space, then find the subreg index.
+      // Algorithm: Shift OpMask down by the bit position of MaskToRewrite's LSB
+      // to map from OrigVReg's lane space into NewSSA's lane space, then find
+      // the subreg index.
       //
       // Why this works:
       //   1. MaskToRewrite is contiguous (comes from subreg definition)
       //   2. OpMask ⊆ MaskToRewrite (we're in subset case by construction)
-      //   3. Lane masks use bit positions that correspond to actual lane indices
+      //   3. Lane masks use bit positions that correspond to actual lane
+      //      indices
       //   4. Subreg boundaries are power-of-2 aligned in register class design
       //
       // Calculation:
-      //   Shift = countTrailingZeros(MaskToRewrite) = 4  // How far "up" MaskToRewrite is
-      //   NewMask = OpMask >> 4 = 0xC0 >> 4 = 0xC        // Map to NewSSA's lane space
-      //   0xC corresponds to sub1 in vreg_64 ✓
-      LLVM_DEBUG(dbgs() << "      Subset case -> remapping subregister index\n");
-      
+      //   Shift   = countTrailingZeros(MaskToRewrite) = 4  (offset of the mask)
+      //   NewMask = OpMask >> 4 = 0xC0 >> 4 = 0xC  (mapped to NewSSA's lanes)
+      //   0xC corresponds to sub1 in vreg_64 ✓
+      LLVM_DEBUG(
+          dbgs() << "      Subset case -> remapping subregister index\n");
+
       // Find the bit offset of MaskToRewrite (position of its lowest set bit)
       unsigned ShiftAmt = llvm::countr_zero(MaskToRewrite.getAsInteger());
       assert(ShiftAmt < 64 && "MaskToRewrite should have at least one bit set");
-      
+
       // Shift OpMask down into NewSSA's lane space
       LaneBitmask NewMask = LaneBitmask(OpMask.getAsInteger() >> ShiftAmt);
-      
+
       // Find the subregister index for NewMask in NewSSA's register class
       unsigned NewSubReg = getSubRegIndexForLaneMask(NewMask, &TRI);
       assert(NewSubReg && "Should find subreg index for remapped lanes");
-      
-      LLVM_DEBUG(dbgs() << "        Remapping subreg:\n"
-                        << "          OrigVReg lanes: OpMask=" << PrintLaneMask(OpMask) 
-                        << " MaskToRewrite=" << PrintLaneMask(MaskToRewrite) << "\n"
-                        << "          Shift amount: " << ShiftAmt << "\n"
-                        << "          NewSSA lanes: NewMask=" << PrintLaneMask(NewMask)
-                        << " -> SubReg=" << TRI.getSubRegIndexName(NewSubReg) << "\n");
-      
+
+      LLVM_DEBUG(
+          dbgs() << "        Remapping subreg:\n"
+                 << "          OrigVReg lanes: OpMask=" << PrintLaneMask(OpMask)
+                 << " MaskToRewrite=" << PrintLaneMask(MaskToRewrite) << "\n"
+                 << "          Shift amount: " << ShiftAmt << "\n"
+                 << "          NewSSA lanes: NewMask=" << PrintLaneMask(NewMask)
+                 << " -> SubReg=" << TRI.getSubRegIndexName(NewSubReg) << "\n");
+
       MO.setReg(NewSSA);
       MO.setSubReg(NewSubReg);
-      
+
       // Extend NewSSA's live interval to cover this use
       SlotIndex UseIdx = LIS.getInstructionIndex(*UseMI).getRegSlot();
       LiveInterval &NewLI = LIS.getInterval(NewSSA);
       LIS.extendToIndices(NewLI, {UseIdx});
     }
   }
-  
+
   LLVM_DEBUG(dbgs() << "  Completed rewriting dominated uses\n");
 }
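
The lane-mask shift used in the subset case can be checked in isolation. This is a minimal sketch under stated assumptions: lane masks are modelled as raw 64-bit integers rather than llvm::LaneBitmask, and `remapLaneMask` is a hypothetical helper, not part of the patch. It relies on the same preconditions the comment lists (non-empty, contiguous MaskToRewrite, and OpMask contained in it).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps OpMask from the original register's lane space
// into the lane space of a narrower register that holds only the lanes in
// MaskToRewrite. Assumes MaskToRewrite is non-zero and contiguous, and that
// OpMask is a subset of it.
uint64_t remapLaneMask(uint64_t OpMask, uint64_t MaskToRewrite) {
  assert(MaskToRewrite != 0 && "need at least one lane");
  assert((OpMask & ~MaskToRewrite) == 0 && "OpMask must be a subset");
  unsigned Shift = 0;
  while (((MaskToRewrite >> Shift) & 1) == 0)
    ++Shift; // Position of the lowest set bit of MaskToRewrite.
  return OpMask >> Shift;
}
```

For the example in the comment, `remapLaneMask(0xC0, 0xF0)` shifts by 4 and gives 0xC, the upper half of the narrower register's lanes.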
 
@@ -696,8 +743,9 @@ void MachineLaneSSAUpdater::rewriteDominatedUses(Register OrigVReg,
 //===----------------------------------------------------------------------===//
 
 /// Return the VNInfo reaching this PHI operand along its predecessor edge.
-VNInfo *MachineLaneSSAUpdater::incomingOnEdge(LiveInterval &LI, MachineInstr *Phi,
-                                               MachineOperand &PhiOp) {
+VNInfo *MachineLaneSSAUpdater::incomingOnEdge(LiveInterval &LI,
+                                              MachineInstr *Phi,
+                                              MachineOperand &PhiOp) {
   unsigned OpIdx = Phi->getOperandNo(&PhiOp);
   MachineBasicBlock *Pred = Phi->getOperand(OpIdx + 1).getMBB();
   SlotIndex EndB = LIS.getMBBEndIdx(Pred);
@@ -708,8 +756,8 @@ VNInfo *MachineLaneSSAUpdater::incomingOnEdge(LiveInterval &LI, MachineInstr *Ph
 /// During SSA reconstruction, LiveIntervals may not be complete yet, so we use
 /// dominance-based checking rather than querying LiveInterval reachability.
 bool MachineLaneSSAUpdater::defReachesUse(MachineInstr *DefMI,
-                                           MachineInstr *UseMI, 
-                                           MachineOperand &UseOp) {
+                                          MachineInstr *UseMI,
+                                          MachineOperand &UseOp) {
   // For PHI uses, check if DefMI dominates the predecessor block
   if (UseMI->isPHI()) {
     unsigned OpIdx = UseMI->getOperandNo(&UseOp);
@@ -723,7 +771,7 @@ bool MachineLaneSSAUpdater::defReachesUse(MachineInstr *DefMI,
     SlotIndex UseIdx = LIS.getInstructionIndex(*UseMI);
     return DefIdx < UseIdx;
   }
-  
+
   // For cross-block uses, check block dominance
   return MDT.dominates(DefMI->getParent(), UseMI->getParent());
 }
@@ -732,7 +780,7 @@ bool MachineLaneSSAUpdater::defReachesUse(MachineInstr *DefMI,
 LaneBitmask MachineLaneSSAUpdater::operandLaneMask(const MachineOperand &MO) {
   const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   MachineRegisterInfo &MRI = MF.getRegInfo();
-  
+
   if (unsigned Sub = MO.getSubReg())
     return TRI.getSubRegIndexLaneMask(Sub);
   return MRI.getMaxLaneMaskForVReg(MO.getReg());
@@ -745,34 +793,35 @@ LaneBitmask MachineLaneSSAUpdater::operandLaneMask(const MachineOperand &MO) {
 /// Key algorithm: Sort candidates by lane count (prefer larger subregs) to get
 /// minimal covering set with largest possible subregisters.
 ///
-/// Example: For vreg_128 with LaneMask = 0x0F | 0xF0 (sub0 + sub2, skipping sub1)
+/// Example: For vreg_128 with LaneMask = 0x0F | 0xF0 (sub0 + sub2, skipping
+/// sub1)
 ///          Returns: [sub0_idx, sub2_idx] (not lo16, hi16, sub2, sub3)
-static SmallVector<unsigned, 4> getCoveringSubRegsForLaneMask(
-    LaneBitmask Mask, const TargetRegisterInfo *TRI, 
-    const TargetRegisterClass *RC) {
+static SmallVector<unsigned, 4>
+getCoveringSubRegsForLaneMask(LaneBitmask Mask, const TargetRegisterInfo *TRI,
+                              const TargetRegisterClass *RC) {
   if (Mask.none())
     return {};
-  
+
   // Step 1: Collect all candidate subregisters that overlap with Mask
   SmallVector<unsigned, 4> Candidates;
   for (unsigned SubIdx = 1; SubIdx < TRI->getNumSubRegIndices(); ++SubIdx) {
     // Check if this subreg index is valid for this register class
     if (!TRI->getSubRegisterClass(RC, SubIdx))
       continue;
-    
+
     LaneBitmask SubMask = TRI->getSubRegIndexLaneMask(SubIdx);
     // Add if it covers any lanes we need
     if ((SubMask & Mask).any()) {
       Candidates.push_back(SubIdx);
     }
   }
-  
+
   // Step 2: Sort by number of lanes (descending) to prefer larger subregisters
   llvm::stable_sort(Candidates, [&](unsigned A, unsigned B) {
     return TRI->getSubRegIndexLaneMask(A).getNumLanes() >
            TRI->getSubRegIndexLaneMask(B).getNumLanes();
   });
-  
+
   // Step 3: Greedily select subregisters, largest first
   SmallVector<unsigned, 4> OptimalSubIndices;
   for (unsigned SubIdx : Candidates) {
@@ -781,12 +830,12 @@ static SmallVector<unsigned, 4> getCoveringSubRegsForLaneMask(
     if ((Mask & SubMask) == SubMask) {
       OptimalSubIndices.push_back(SubIdx);
       Mask &= ~SubMask; // Remove covered lanes
-      
+
       if (Mask.none())
         break; // All lanes covered
     }
   }
-  
+
   return OptimalSubIndices;
 }
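
The largest-first selection in getCoveringSubRegsForLaneMask can be sketched without TargetRegisterInfo. In this hedged example each candidate subregister is represented only by its lane mask (a raw 64-bit integer), and `laneCount` and `coverLanes` are hypothetical names. As in the function above, a candidate is picked only when it fits entirely inside the remaining mask.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Number of set bits in a lane mask.
static int laneCount(uint64_t M) {
  int N = 0;
  for (; M; M &= M - 1)
    ++N;
  return N;
}

// Hypothetical model: each candidate subregister is just its lane mask.
// Picks candidates largest-first, keeping only those that lie fully inside
// the lanes still to be covered.
std::vector<uint64_t> coverLanes(uint64_t Mask,
                                 std::vector<uint64_t> Candidates) {
  std::stable_sort(Candidates.begin(), Candidates.end(),
                   [](uint64_t A, uint64_t B) {
                     return laneCount(A) > laneCount(B);
                   });
  std::vector<uint64_t> Picked;
  for (uint64_t Sub : Candidates) {
    if (Sub == 0 || (Mask & Sub) != Sub)
      continue; // Candidate would cover lanes outside the request.
    Picked.push_back(Sub);
    Mask &= ~Sub; // Remove covered lanes.
    if (Mask == 0)
      break; // All lanes covered.
  }
  return Picked;
}
```

Because the choice is greedy, it yields few, large pieces for nested subregister layouts like the one modelled here, but it is not guaranteed to be minimal for arbitrary overlapping candidates.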
 
@@ -794,16 +843,15 @@ static SmallVector<unsigned, 4> getCoveringSubRegsForLaneMask(
 /// Inserts at the PHI predecessor terminator (for PHI uses) or right before
 /// UseMI otherwise. Returns the new full-width vreg, the RS index via OutIdx,
 /// and the subrange lane masks that should be extended to that point.
-Register MachineLaneSSAUpdater::buildRSForSuperUse(MachineInstr *UseMI, MachineOperand &MO,
-                                                   Register OldVR, Register NewVR,
-                                                   LaneBitmask MaskToRewrite, LiveInterval &LI,
-                                                   const TargetRegisterClass *OpRC,
-                                                   SlotIndex &OutIdx,
-                                                   SmallVectorImpl<LaneBitmask> &LanesToExtend) {
+Register MachineLaneSSAUpdater::buildRSForSuperUse(
+    MachineInstr *UseMI, MachineOperand &MO, Register OldVR, Register NewVR,
+    LaneBitmask MaskToRewrite, LiveInterval &LI,
+    const TargetRegisterClass *OpRC, SlotIndex &OutIdx,
+    SmallVectorImpl<LaneBitmask> &LanesToExtend) {
   const TargetInstrInfo &TII = *MF.getSubtarget().getInstrInfo();
   const TargetRegisterInfo &TRI = *MF.getSubtarget().getRegisterInfo();
   MachineRegisterInfo &MRI = MF.getRegInfo();
-  
+
   MachineBasicBlock *InsertBB = UseMI->getParent();
   MachineBasicBlock::iterator IP(UseMI);
   SlotIndex QueryIdx;
@@ -825,71 +873,76 @@ Register MachineLaneSSAUpdater::buildRSForSuperUse(MachineInstr *UseMI, MachineO
 
   // Determine what lanes the use needs
   LaneBitmask UseMask = operandLaneMask(MO);
-  
+
   // Decompose into lanes from NewVR (updated) and lanes from OldVR (unchanged)
   LaneBitmask LanesFromNew = UseMask & MaskToRewrite;
   LaneBitmask LanesFromOld = UseMask & ~MaskToRewrite;
-  
-  LLVM_DEBUG(dbgs() << "        Building REG_SEQUENCE: UseMask=" << PrintLaneMask(UseMask)
+
+  LLVM_DEBUG(dbgs() << "        Building REG_SEQUENCE: UseMask="
+                    << PrintLaneMask(UseMask)
                     << " LanesFromNew=" << PrintLaneMask(LanesFromNew)
                     << " LanesFromOld=" << PrintLaneMask(LanesFromOld) << "\n");
-  
+
   SmallDenseSet<unsigned, 8> AddedSubIdxs;
-  
+
   // Add source for lanes from NewVR (updated lanes)
   if (LanesFromNew.any()) {
     unsigned SubIdx = getSubRegIndexForLaneMask(LanesFromNew, &TRI);
     assert(SubIdx && "Failed to find subregister index for LanesFromNew");
-    RS.addReg(NewVR, 0, 0).addImm(SubIdx);  // NewVR is full register, no subreg
+    RS.addReg(NewVR, 0, 0).addImm(SubIdx); // NewVR is full register, no subreg
     AddedSubIdxs.insert(SubIdx);
     LanesToExtend.push_back(LanesFromNew);
   }
-  
+
   // Add source for lanes from OldVR (unchanged lanes)
   // Handle both contiguous and non-contiguous lane masks
-  // Non-contiguous example: Redefining only sub2 of vreg_128 leaves LanesFromOld = sub0+sub1+sub3
+  // Non-contiguous example: Redefining only sub2 of vreg_128 leaves
+  // LanesFromOld = sub0+sub1+sub3
   if (LanesFromOld.any()) {
     unsigned SubIdx = getSubRegIndexForLaneMask(LanesFromOld, &TRI);
-    
+
     if (SubIdx) {
       // Contiguous case: single subregister covers all lanes
-      RS.addReg(OldVR, 0, SubIdx).addImm(SubIdx);  // OldVR.subIdx
+      RS.addReg(OldVR, 0, SubIdx).addImm(SubIdx); // OldVR.subIdx
       AddedSubIdxs.insert(SubIdx);
       LanesToExtend.push_back(LanesFromOld);
     } else {
       // Non-contiguous case: decompose into multiple subregisters
       const TargetRegisterClass *OldRC = MRI.getRegClass(OldVR);
-      SmallVector<unsigned, 4> CoveringSubRegs = 
+      SmallVector<unsigned, 4> CoveringSubRegs =
           getCoveringSubRegsForLaneMask(LanesFromOld, &TRI, OldRC);
-      
-      assert(!CoveringSubRegs.empty() && 
-             "Failed to decompose non-contiguous lane mask into covering subregs");
-      
-      LLVM_DEBUG(dbgs() << "        Non-contiguous LanesFromOld=" << PrintLaneMask(LanesFromOld)
-                        << " decomposed into " << CoveringSubRegs.size() << " subregs\n");
-      
+
+      assert(
+          !CoveringSubRegs.empty() &&
+          "Failed to decompose non-contiguous lane mask into covering subregs");
+
+      LLVM_DEBUG(dbgs() << "        Non-contiguous LanesFromOld="
+                        << PrintLaneMask(LanesFromOld) << " decomposed into "
+                        << CoveringSubRegs.size() << " subregs\n");
+
       // Add each covering subregister as a source to the REG_SEQUENCE
       for (unsigned CoverSubIdx : CoveringSubRegs) {
         LaneBitmask CoverMask = TRI.getSubRegIndexLaneMask(CoverSubIdx);
-        RS.addReg(OldVR, 0, CoverSubIdx).addImm(CoverSubIdx);  // OldVR.CoverSubIdx
+        RS.addReg(OldVR, 0, CoverSubIdx)
+            .addImm(CoverSubIdx); // OldVR.CoverSubIdx
         AddedSubIdxs.insert(CoverSubIdx);
         LanesToExtend.push_back(CoverMask);
-        
-        LLVM_DEBUG(dbgs() << "          Added source: OldVR." 
-                          << TRI.getSubRegIndexName(CoverSubIdx)
-                          << " covering " << PrintLaneMask(CoverMask) << "\n");
+
+        LLVM_DEBUG(dbgs() << "          Added source: OldVR."
+                          << TRI.getSubRegIndexName(CoverSubIdx) << " covering "
+                          << PrintLaneMask(CoverMask) << "\n");
       }
     }
   }
-  
+
   assert(!AddedSubIdxs.empty() && "REG_SEQUENCE must have at least one source");
 
   LIS.InsertMachineInstrInMaps(*RS);
   OutIdx = LIS.getInstructionIndex(*RS);
-  
+
   // Create live interval for the REG_SEQUENCE result
   LIS.createAndComputeVirtRegInterval(Dest);
-  
+
   // Extend live intervals of all source registers to cover this REG_SEQUENCE
   // Use the register slot to ensure the live range covers the use
   SlotIndex UseSlot = OutIdx.getRegSlot();
@@ -908,7 +961,7 @@ Register MachineLaneSSAUpdater::buildRSForSuperUse(MachineInstr *UseMI, MachineO
 }
 
 /// Extend LI (and only the specified subranges) at Idx.
-void MachineLaneSSAUpdater::extendAt(LiveInterval &LI, SlotIndex Idx, 
+void MachineLaneSSAUpdater::extendAt(LiveInterval &LI, SlotIndex Idx,
                                      ArrayRef<LaneBitmask> Lanes) {
   SmallVector<SlotIndex, 1> P{Idx};
   LIS.extendToIndices(LI, P);
@@ -924,7 +977,7 @@ void MachineLaneSSAUpdater::updateDeadFlags(Register Reg) {
   MachineInstr *DefMI = MRI.getVRegDef(Reg);
   if (!DefMI)
     return;
-  
+
   for (MachineOperand &MO : DefMI->defs()) {
     if (MO.getReg() == Reg && MO.isDead()) {
       // Check if this register is actually live (has uses)
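
The lane split in `buildRSForSuperUse` and the greedy selection in `getCoveringSubRegsForLaneMask` (hunks above) can be sketched outside LLVM as follows. This is a minimal illustration under stated assumptions, not the patch's code: `SubRegIdx`, `LaneSplit`, `splitUseLanes`, `coverLanes`, and `toyTable` are hypothetical names, and the one-bit-per-32-bit-lane masks are invented for the example rather than taken from any real target's register description.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a subregister index: a name plus the lanes it
// covers. Real targets get this from TargetRegisterInfo.
struct SubRegIdx {
  std::string Name;
  uint32_t LaneMask;
};

static unsigned numLanes(uint32_t Mask) {
  return static_cast<unsigned>(std::bitset<32>(Mask).count());
}

// Lanes a use reads, split into lanes supplied by the new def (the rewritten
// lanes) and lanes that still come from the old register.
struct LaneSplit {
  uint32_t FromNew;
  uint32_t FromOld;
};

LaneSplit splitUseLanes(uint32_t UseMask, uint32_t MaskToRewrite) {
  return {UseMask & MaskToRewrite, UseMask & ~MaskToRewrite};
}

// Greedy cover: prefer the largest subregisters that lie fully inside Mask.
std::vector<std::string> coverLanes(uint32_t Mask,
                                    std::vector<SubRegIdx> Table) {
  // Step 1: keep only indices that overlap the lanes we need.
  Table.erase(std::remove_if(Table.begin(), Table.end(),
                             [&](const SubRegIdx &S) {
                               return (S.LaneMask & Mask) == 0;
                             }),
              Table.end());

  // Step 2: sort by lane count, descending; stable to keep table order on ties.
  std::stable_sort(Table.begin(), Table.end(),
                   [](const SubRegIdx &A, const SubRegIdx &B) {
                     return numLanes(A.LaneMask) > numLanes(B.LaneMask);
                   });

  // Step 3: take every candidate fully contained in the remaining lanes.
  std::vector<std::string> Picked;
  for (const SubRegIdx &S : Table) {
    if ((Mask & S.LaneMask) == S.LaneMask) {
      Picked.push_back(S.Name);
      Mask &= ~S.LaneMask; // Remove covered lanes
      if (Mask == 0)
        break; // All lanes covered
    }
  }
  assert(Mask == 0 && "toy table must contain every single-lane index");
  return Picked;
}

// Hypothetical index table for a 4-lane (128-bit) register.
std::vector<SubRegIdx> toyTable() {
  return {{"sub0", 0x1},           {"sub1", 0x2},
          {"sub2", 0x4},           {"sub3", 0x8},
          {"sub0_sub1", 0x3},      {"sub1_sub2", 0x6},
          {"sub2_sub3", 0xC},      {"sub0_sub1_sub2", 0x7},
          {"sub1_sub2_sub3", 0xE}};
}
```

With this toy table, a full-width use (`0xF`) after a redefinition of only `sub2` (`0x4`) splits into `FromNew = 0x4` and `FromOld = 0xB`, which is the non-contiguous `sub0+sub1+sub3` case described in the comments above. The greedy pass then covers `0xB` with `sub0_sub1` and `sub3`, so the REG_SEQUENCE would take three sources rather than four.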
diff --git a/llvm/unittests/CodeGen/MachineLaneSSAUpdaterSpillReloadTest.cpp b/llvm/unittests/CodeGen/MachineLaneSSAUpdaterSpillReloadTest.cpp
index 81a1f7703..7ce940e3e 100644
--- a/llvm/unittests/CodeGen/MachineLaneSSAUpdaterSpillReloadTest.cpp
+++ b/llvm/unittests/CodeGen/MachineLaneSSAUpdaterSpillReloadTest.cpp
@@ -9,7 +9,7 @@
 // Unit tests for MachineLaneSSAUpdater focusing on spill/reload scenarios.
 //
 // NOTE: This file is currently a placeholder for future spiller-specific tests.
-// Analysis showed that repairSSAForNewDef() is sufficient for spill/reload 
+// Analysis showed that repairSSAForNewDef() is sufficient for spill/reload
 // scenarios - no special spill handling is needed. The spiller workflow is:
 //   1. Insert reload instruction before use
 //   2. Call repairSSAForNewDef(ReloadMI, SpilledReg)
@@ -19,14 +19,14 @@
 //
 //===----------------------------------------------------------------------===//
 
-#include "llvm/CodeGen/MachineLaneSSAUpdater.h"
 #include "llvm/CodeGen/LiveIntervals.h"
+#include "llvm/CodeGen/MIRParser/MIRParser.h"
 #include "llvm/CodeGen/MachineDominators.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
+#include "llvm/CodeGen/MachineLaneSSAUpdater.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
-#include "llvm/CodeGen/MIRParser/MIRParser.h"
 #include "llvm/CodeGen/SlotIndexes.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
@@ -56,10 +56,10 @@ struct SpillReloadTestPass : public MachineFunctionPass {
 char SpillReloadTestPass::ID = 0;
 
 namespace llvm {
-  void initializeSpillReloadTestPassPass(PassRegistry &);
+void initializeSpillReloadTestPassPass(PassRegistry &);
 }
 
-INITIALIZE_PASS(SpillReloadTestPass, "spillreloadtestpass", 
+INITIALIZE_PASS(SpillReloadTestPass, "spillreloadtestpass",
                 "spillreloadtestpass", false, false)
 
 namespace {
@@ -82,7 +82,7 @@ std::unique_ptr<TargetMachine> createTargetMachine() {
   const Target *T = TargetRegistry::lookupTarget("", TT, Error);
   if (!T)
     return nullptr;
-    
+
   TargetOptions Options;
   return std::unique_ptr<TargetMachine>(
       T->createTargetMachine(TT, "gfx900", "", Options, std::nullopt,
@@ -116,13 +116,14 @@ std::unique_ptr<Module> parseMIR(LLVMContext &Context,
 
 template <typename AnalysisType>
 struct SpillReloadTestPassT : public SpillReloadTestPass {
-  typedef std::function<void(MachineFunction&, AnalysisType&)> TestFx;
+  typedef std::function<void(MachineFunction &, AnalysisType &)> TestFx;
 
   SpillReloadTestPassT() {
-    // We should never call this but always use PM.add(new SpillReloadTestPass(...))
+    // We should never call this but always use
+    // PM.add(new SpillReloadTestPass(...))
     abort();
   }
-  
+
   SpillReloadTestPassT(TestFx T, bool ShouldPass)
       : T(T), ShouldPass(ShouldPass) {
     initializeSpillReloadTestPassPass(*PassRegistry::getPassRegistry());
@@ -132,8 +133,8 @@ struct SpillReloadTestPassT : public SpillReloadTestPass {
     AnalysisType &A = getAnalysis<AnalysisType>();
     T(MF, A);
     bool VerifyResult = MF.verify(this, /* Banner=*/nullptr,
-                                   /*OS=*/&llvm::errs(),
-                                   /* AbortOnError=*/false);
+                                  /*OS=*/&llvm::errs(),
+                                  /* AbortOnError=*/false);
     EXPECT_EQ(VerifyResult, ShouldPass);
     return true;
   }
@@ -144,7 +145,7 @@ struct SpillReloadTestPassT : public SpillReloadTestPass {
     AU.addPreserved<AnalysisType>();
     MachineFunctionPass::getAnalysisUsage(AU);
   }
-  
+
 private:
   TestFx T;
   bool ShouldPass;
@@ -155,7 +156,7 @@ static void doTest(StringRef MIRFunc,
                    typename SpillReloadTestPassT<AnalysisType>::TestFx T,
                    bool ShouldPass = true) {
   initLLVM();
-  
+
   LLVMContext Context;
   std::unique_ptr<TargetMachine> TM = createTargetMachine();
   if (!TM)
@@ -171,9 +172,10 @@ static void doTest(StringRef MIRFunc,
   PM.run(*M);
 }
 
-static void liveIntervalsTest(StringRef MIRFunc,
-                              SpillReloadTestPassT<LiveIntervalsWrapperPass>::TestFx T,
-                              bool ShouldPass = true) {
+static void
+liveIntervalsTest(StringRef MIRFunc,
+                  SpillReloadTestPassT<LiveIntervalsWrapperPass>::TestFx T,
+                  bool ShouldPass = true) {
   SmallString<512> S;
   StringRef MIRString = (Twine(R"MIR(
 --- |
@@ -186,7 +188,8 @@ registers:
   - { id: 0, class: vgpr_32 }
 body: |
   bb.0:
-)MIR") + Twine(MIRFunc) + Twine("...\n")).toNullTerminatedStringRef(S);
+)MIR") + Twine(MIRFunc) + Twine("...\n"))
+                            .toNullTerminatedStringRef(S);
 
   doTest<LiveIntervalsWrapperPass>(MIRString, T, ShouldPass);
 }
@@ -226,7 +229,8 @@ body: |
 // - No PHI needed (linear CFG)
 //
 TEST(MachineLaneSSAUpdaterSpillReloadTest, SimpleLinearSpillReload) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 42, implicit $exec
     S_BRANCH %bb.1
 
@@ -239,94 +243,96 @@ TEST(MachineLaneSSAUpdaterSpillReloadTest, SimpleLinearSpillReload) {
     %2:vgpr_32 = V_ADD_U32_e32 %0, %1, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      
-      // Verify we have 3 blocks as expected
-      ASSERT_EQ(MF.size(), 3u) << "Should have bb.0, bb.1, bb.2";
-      
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
-      MachineBasicBlock *BB2 = MF.getBlockNumbered(2);
-      
-      // Find %0 definition in BB0 (first instruction should be V_MOV_B32)
-      MachineInstr *OrigDefMI = &*BB0->begin();
-      ASSERT_TRUE(OrigDefMI && OrigDefMI->getNumOperands() > 0);
-      Register OrigReg = OrigDefMI->getOperand(0).getReg();
-      ASSERT_TRUE(OrigReg.isValid()) << "Should have valid original register %0";
-      
-      // STEP 1: Insert reload instruction in BB2 before the use
-      // This creates a second definition of %0, violating SSA
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB2->getFirstNonPHI();
-      
-      // Get opcode and register from the existing V_MOV_B32 in BB0
-      unsigned MovOpcode = OrigDefMI->getOpcode();
-      Register ExecReg = OrigDefMI->getOperand(2).getReg();
-      
-      // Insert reload: %0 = V_MOV_B32 999 (simulating load from stack)
-      // This violates SSA because %0 is already defined in BB0
-      MachineInstr *ReloadMI = BuildMI(*BB2, InsertPt, DebugLoc(),
-                                        TII->get(MovOpcode), OrigReg)
-                                   .addImm(999)  // Simulated reload value
-                                   .addReg(ExecReg, RegState::Implicit);
-      
-      // Set MachineFunction properties to allow SSA
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // STEP 2: Call repairSSAForNewDef to fix the SSA violation
-      // This will:
-      //   - Rename the reload to define a new register
-      //   - Rewrite uses dominated by the reload
-      //   - Naturally prune OrigReg's LiveInterval via recomputation
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register ReloadReg = Updater.repairSSAForNewDef(*ReloadMI, OrigReg);
-      
-      // VERIFY RESULTS:
-      
-      // 1. ReloadReg should be valid and different from OrigReg
-      EXPECT_TRUE(ReloadReg.isValid()) << "Updater should return valid register";
-      EXPECT_NE(ReloadReg, OrigReg) << "Reload register should be different from original";
-      
-      // 2. ReloadMI should define the new ReloadReg (not OrigReg)
-      EXPECT_EQ(ReloadMI->getOperand(0).getReg(), ReloadReg) 
-          << "ReloadMI should define new reload register";
-      
-      // 3. Verify the ReloadReg has a valid LiveInterval
-      EXPECT_TRUE(LIS.hasInterval(ReloadReg)) 
-          << "Reload register should have live interval";
-      
-      // 4. No PHI should be inserted (linear CFG, reload dominates subsequent uses)
-      bool FoundPHI = false;
-      for (MachineBasicBlock &MBB : MF) {
-        for (MachineInstr &MI : MBB) {
-          if (MI.isPHI()) {
-            FoundPHI = true;
-            break;
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+
+        // Verify we have 3 blocks as expected
+        ASSERT_EQ(MF.size(), 3u) << "Should have bb.0, bb.1, bb.2";
+
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
+        MachineBasicBlock *BB2 = MF.getBlockNumbered(2);
+
+        // Find %0 definition in BB0 (first instruction should be V_MOV_B32)
+        MachineInstr *OrigDefMI = &*BB0->begin();
+        ASSERT_TRUE(OrigDefMI && OrigDefMI->getNumOperands() > 0);
+        Register OrigReg = OrigDefMI->getOperand(0).getReg();
+        ASSERT_TRUE(OrigReg.isValid())
+            << "Should have valid original register %0";
+
+        // STEP 1: Insert reload instruction in BB2 before the use
+        // This creates a second definition of %0, violating SSA
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB2->getFirstNonPHI();
+
+        // Get opcode and register from the existing V_MOV_B32 in BB0
+        unsigned MovOpcode = OrigDefMI->getOpcode();
+        Register ExecReg = OrigDefMI->getOperand(2).getReg();
+
+        // Insert reload: %0 = V_MOV_B32 999 (simulating load from stack)
+        // This violates SSA because %0 is already defined in BB0
+        MachineInstr *ReloadMI =
+            BuildMI(*BB2, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(999) // Simulated reload value
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Set MachineFunction properties to allow SSA
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // STEP 2: Call repairSSAForNewDef to fix the SSA violation
+        // This will:
+        //   - Rename the reload to define a new register
+        //   - Rewrite uses dominated by the reload
+        //   - Naturally prune OrigReg's LiveInterval via recomputation
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register ReloadReg = Updater.repairSSAForNewDef(*ReloadMI, OrigReg);
+
+        // VERIFY RESULTS:
+
+        // 1. ReloadReg should be valid and different from OrigReg
+        EXPECT_TRUE(ReloadReg.isValid())
+            << "Updater should return valid register";
+        EXPECT_NE(ReloadReg, OrigReg)
+            << "Reload register should be different from original";
+
+        // 2. ReloadMI should define the new ReloadReg (not OrigReg)
+        EXPECT_EQ(ReloadMI->getOperand(0).getReg(), ReloadReg)
+            << "ReloadMI should define new reload register";
+
+        // 3. Verify the ReloadReg has a valid LiveInterval
+        EXPECT_TRUE(LIS.hasInterval(ReloadReg))
+            << "Reload register should have live interval";
+
+        // 4. No PHI should be inserted (linear CFG, reload dominates
+        //    subsequent uses)
+        bool FoundPHI = false;
+        for (MachineBasicBlock &MBB : MF) {
+          for (MachineInstr &MI : MBB) {
+            if (MI.isPHI()) {
+              FoundPHI = true;
+              break;
+            }
           }
         }
-      }
-      EXPECT_FALSE(FoundPHI) 
-          << "Linear CFG should not require PHI nodes";
-      
-      // 5. Verify OrigReg's LiveInterval was naturally pruned
-      //    It should only cover BB0 now (definition to end of block)
-      EXPECT_TRUE(LIS.hasInterval(OrigReg)) 
-          << "Original register should still have live interval";
-      const LiveInterval &OrigLI = LIS.getInterval(OrigReg);
-      
-      // The performSSARepair recomputation naturally prunes OrigReg
-      // because all uses in BB2 were rewritten to ReloadReg
-      SlotIndex OrigEnd = OrigLI.endIndex();
-      
-      // OrigReg should not extend into BB2 where ReloadReg took over
-      SlotIndex BB2Start = LIS.getMBBStartIdx(BB2);
-      EXPECT_LE(OrigEnd, BB2Start) 
-          << "Original register should not extend into BB2 after reload";
-    });
+        EXPECT_FALSE(FoundPHI) << "Linear CFG should not require PHI nodes";
+
+        // 5. Verify OrigReg's LiveInterval was naturally pruned
+        //    It should only cover BB0 now (definition to end of block)
+        EXPECT_TRUE(LIS.hasInterval(OrigReg))
+            << "Original register should still have live interval";
+        const LiveInterval &OrigLI = LIS.getInterval(OrigReg);
+
+        // The performSSARepair recomputation naturally prunes OrigReg
+        // because all uses in BB2 were rewritten to ReloadReg
+        SlotIndex OrigEnd = OrigLI.endIndex();
+
+        // OrigReg should not extend into BB2 where ReloadReg took over
+        SlotIndex BB2Start = LIS.getMBBStartIdx(BB2);
+        EXPECT_LE(OrigEnd, BB2Start)
+            << "Original register should not extend into BB2 after reload";
+      });
 }
 
 } // anonymous namespace
-
diff --git a/llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp b/llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp
index 172cbf33d..fb59dbc5d 100644
--- a/llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp
+++ b/llvm/unittests/CodeGen/MachineLaneSSAUpdaterTest.cpp
@@ -1,4 +1,5 @@
-//===- MachineLaneSSAUpdaterTest.cpp - Unit tests for MachineLaneSSAUpdater -===//
+//===- MachineLaneSSAUpdaterTest.cpp - Unit tests for MachineLaneSSAUpdater
+//-===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
@@ -8,12 +9,12 @@
 
 #include "llvm/CodeGen/MachineLaneSSAUpdater.h"
 #include "llvm/CodeGen/LiveIntervals.h"
+#include "llvm/CodeGen/MIRParser/MIRParser.h"
 #include "llvm/CodeGen/MachineDominators.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
-#include "llvm/CodeGen/MIRParser/MIRParser.h"
 #include "llvm/CodeGen/SlotIndexes.h"
 #include "llvm/CodeGen/TargetInstrInfo.h"
 #include "llvm/CodeGen/TargetRegisterInfo.h"
@@ -45,7 +46,7 @@ struct TestPass : public MachineFunctionPass {
 char TestPass::ID = 0;
 
 namespace llvm {
-  void initializeTestPassPass(PassRegistry &);
+void initializeTestPassPass(PassRegistry &);
 }
 
 INITIALIZE_PASS(TestPass, "testpass", "testpass", false, false)
@@ -70,7 +71,7 @@ std::unique_ptr<TargetMachine> createTargetMachine() {
   const Target *T = TargetRegistry::lookupTarget("", TT, Error);
   if (!T)
     return nullptr;
-    
+
   TargetOptions Options;
   return std::unique_ptr<TargetMachine>(
       T->createTargetMachine(TT, "gfx900", "", Options, std::nullopt,
@@ -102,17 +103,15 @@ std::unique_ptr<Module> parseMIR(LLVMContext &Context,
   return M;
 }
 
-template <typename AnalysisType>
-struct TestPassT : public TestPass {
-  typedef std::function<void(MachineFunction&, AnalysisType&)> TestFx;
+template <typename AnalysisType> struct TestPassT : public TestPass {
+  typedef std::function<void(MachineFunction &, AnalysisType &)> TestFx;
 
   TestPassT() {
     // We should never call this but always use PM.add(new TestPass(...))
     abort();
   }
-  
-  TestPassT(TestFx T, bool ShouldPass)
-      : T(T), ShouldPass(ShouldPass) {
+
+  TestPassT(TestFx T, bool ShouldPass) : T(T), ShouldPass(ShouldPass) {
     initializeTestPassPass(*PassRegistry::getPassRegistry());
   }
 
@@ -120,8 +119,8 @@ struct TestPassT : public TestPass {
     AnalysisType &A = getAnalysis<AnalysisType>();
     T(MF, A);
     bool VerifyResult = MF.verify(this, /* Banner=*/nullptr,
-                                   /*OS=*/&llvm::errs(),
-                                   /* AbortOnError=*/false);
+                                  /*OS=*/&llvm::errs(),
+                                  /* AbortOnError=*/false);
     EXPECT_EQ(VerifyResult, ShouldPass);
     return true;
   }
@@ -132,7 +131,7 @@ struct TestPassT : public TestPass {
     AU.addPreserved<AnalysisType>();
     MachineFunctionPass::getAnalysisUsage(AU);
   }
-  
+
 private:
   TestFx T;
   bool ShouldPass;
@@ -143,7 +142,7 @@ static void doTest(StringRef MIRFunc,
                    typename TestPassT<AnalysisType>::TestFx T,
                    bool ShouldPass = true) {
   initLLVM();
-  
+
   LLVMContext Context;
   std::unique_ptr<TargetMachine> TM = createTargetMachine();
   if (!TM)
@@ -174,7 +173,8 @@ registers:
   - { id: 0, class: vgpr_32 }
 body: |
   bb.0:
-)MIR") + Twine(MIRFunc) + Twine("...\n")).toNullTerminatedStringRef(S);
+)MIR") + Twine(MIRFunc) + Twine("...\n"))
+                            .toNullTerminatedStringRef(S);
 
   doTest<LiveIntervalsWrapperPass>(MIRString, T, ShouldPass);
 }
@@ -195,7 +195,8 @@ body: |
 //       BB4 (use %1) → PHI expected
 //
 TEST(MachineLaneSSAUpdaterTest, NewDefInsertsPhiAndRewritesUses) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     S_BRANCH %bb.1
 
@@ -220,89 +221,97 @@ TEST(MachineLaneSSAUpdaterTest, NewDefInsertsPhiAndRewritesUses) {
     %5:vgpr_32 = V_ADD_U32_e32 %1, %1, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      
-      // Verify we have 5 blocks as expected
-      ASSERT_EQ(MF.size(), 5u) << "Should have bb.0, bb.1, bb.2, bb.3, bb.4";
-      
-      MachineBasicBlock *BB1 = MF.getBlockNumbered(1);
-      MachineBasicBlock *BB3 = MF.getBlockNumbered(3);
-      MachineBasicBlock *BB4 = MF.getBlockNumbered(4);
-      
-      // Get %1 which is defined in bb.1 (first non-PHI instruction)
-      MachineInstr *OrigDefMI = &*BB1->getFirstNonPHI();
-      ASSERT_TRUE(OrigDefMI) << "Could not find instruction in bb.1";
-      ASSERT_TRUE(OrigDefMI->getNumOperands() > 0) << "Instruction has no operands";
-      
-      Register OrigReg = OrigDefMI->getOperand(0).getReg();
-      ASSERT_TRUE(OrigReg.isValid()) << "Could not get destination register %1 from bb.1";
-      
-      // Count uses before SSA repair
-      unsigned UseCountBefore = 0;
-      for (const MachineInstr &MI : MRI.use_instructions(OrigReg)) {
-        (void)MI;
-        ++UseCountBefore;
-      }
-      ASSERT_GT(UseCountBefore, 0u) << "Original register should have uses";
-      
-      // Find V_MOV_B32_e32 instruction in bb.0 to get its opcode
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg(); // Get EXEC register
-      
-      // Create a new definition in bb.3 that defines OrigReg (violating SSA)
-      // This creates a scenario where bb.4 needs a PHI to merge values from bb.2 and bb.3
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB3->getFirstNonPHI();
-      MachineInstr *NewDefMI = BuildMI(*BB3, InsertPt, DebugLoc(), 
-                                        TII->get(MovOpcode), OrigReg)
-                                   .addImm(42)
-                                   .addReg(ExecReg, RegState::Implicit);
-      
-      // Set MachineFunction properties to allow PHIs and indicate SSA form
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // NOW TEST MachineLaneSSAUpdater: call repairSSAForNewDef
-      // Before: %1 defined in bb.1, used in bb.2 and bb.4
-      //         NewDefMI in bb.3 also defines %1 (violating SSA!)
-      // After repair: NewDefMI will define a new vreg, bb.4 gets PHI
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      // VERIFY RESULTS:
-      
-      // 1. NewReg should be valid and different from OrigReg
-      EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
-      EXPECT_NE(NewReg, OrigReg) << "New register should be different from original";
-      
-      // 2. NewDefMI should now define NewReg (not OrigReg)
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg) << "NewDefMI should now define the new register";
-      
-      
-      // 3. Check if PHI nodes were inserted in bb.4
-      bool FoundPHI = false;
-      for (MachineInstr &MI : *BB4) {
-        if (MI.isPHI()) {
-          FoundPHI = true;
-          break;
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+        MachineRegisterInfo &MRI = MF.getRegInfo();
+
+        // Verify we have 5 blocks as expected
+        ASSERT_EQ(MF.size(), 5u) << "Should have bb.0, bb.1, bb.2, bb.3, bb.4";
+
+        MachineBasicBlock *BB1 = MF.getBlockNumbered(1);
+        MachineBasicBlock *BB3 = MF.getBlockNumbered(3);
+        MachineBasicBlock *BB4 = MF.getBlockNumbered(4);
+
+        // Get %1 which is defined in bb.1 (first non-PHI instruction)
+        MachineInstr *OrigDefMI = &*BB1->getFirstNonPHI();
+        ASSERT_TRUE(OrigDefMI) << "Could not find instruction in bb.1";
+        ASSERT_TRUE(OrigDefMI->getNumOperands() > 0)
+            << "Instruction has no operands";
+
+        Register OrigReg = OrigDefMI->getOperand(0).getReg();
+        ASSERT_TRUE(OrigReg.isValid())
+            << "Could not get destination register %1 from bb.1";
+
+        // Count uses before SSA repair
+        unsigned UseCountBefore = 0;
+        for (const MachineInstr &MI : MRI.use_instructions(OrigReg)) {
+          (void)MI;
+          ++UseCountBefore;
         }
-      }
-      EXPECT_TRUE(FoundPHI) << "SSA repair should have inserted PHI node in bb.4";
-      
-      // 4. Verify LiveIntervals are still valid
-      EXPECT_TRUE(LIS.hasInterval(NewReg)) << "New register should have live interval";
-      EXPECT_TRUE(LIS.hasInterval(OrigReg)) << "Original register should still have live interval";
-      
-      // Verify the MachineFunction is still valid after SSA repair
-      EXPECT_TRUE(MF.verify(nullptr, /* Banner=*/nullptr, /*OS=*/nullptr, /* AbortOnError=*/false))
-          << "MachineFunction verification failed after SSA repair";
-    });
+        ASSERT_GT(UseCountBefore, 0u) << "Original register should have uses";
+
+        // Find V_MOV_B32_e32 instruction in bb.0 to get its opcode
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg(); // Get EXEC register
+
+        // Create a new definition in bb.3 that defines OrigReg (violating SSA)
+        // This creates a scenario where bb.4 needs a PHI to merge values from
+        // bb.2 and bb.3
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB3->getFirstNonPHI();
+        MachineInstr *NewDefMI =
+            BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(42)
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Set MachineFunction properties to allow PHIs and indicate SSA form
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // NOW TEST MachineLaneSSAUpdater: call repairSSAForNewDef
+        // Before: %1 defined in bb.1, used in bb.2 and bb.4
+        //         NewDefMI in bb.3 also defines %1 (violating SSA!)
+        // After repair: NewDefMI will define a new vreg, bb.4 gets PHI
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        // VERIFY RESULTS:
+
+        // 1. NewReg should be valid and different from OrigReg
+        EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
+        EXPECT_NE(NewReg, OrigReg)
+            << "New register should be different from original";
+
+        // 2. NewDefMI should now define NewReg (not OrigReg)
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg)
+            << "NewDefMI should now define the new register";
+
+        // 3. Check if PHI nodes were inserted in bb.4
+        bool FoundPHI = false;
+        for (MachineInstr &MI : *BB4) {
+          if (MI.isPHI()) {
+            FoundPHI = true;
+            break;
+          }
+        }
+        EXPECT_TRUE(FoundPHI)
+            << "SSA repair should have inserted PHI node in bb.4";
+
+        // 4. Verify LiveIntervals are still valid
+        EXPECT_TRUE(LIS.hasInterval(NewReg))
+            << "New register should have live interval";
+        EXPECT_TRUE(LIS.hasInterval(OrigReg))
+            << "Original register should still have live interval";
+
+        // Verify the MachineFunction is still valid after SSA repair
+        EXPECT_TRUE(MF.verify(nullptr, /* Banner=*/nullptr, /*OS=*/nullptr,
+                              /* AbortOnError=*/false))
+            << "MachineFunction verification failed after SSA repair";
+      });
 }
 
 //===----------------------------------------------------------------------===//
@@ -324,14 +333,17 @@ TEST(MachineLaneSSAUpdaterTest, NewDefInsertsPhiAndRewritesUses) {
 //        bb.8 (use)
 //
 // Key insight: IDF(bb.5) = {bb.6, bb.7}
-// - bb.6 needs PHI because it's reachable from bb.4 (has %1) and bb.5 (has new def)
-// - bb.7 needs PHI because it's reachable from bb.2 (has %1) and bb.6 (has PHI result %X)
+// - bb.6 needs PHI because it's reachable from bb.4 (has %1) and bb.5 (has new
+//   def)
+// - bb.7 needs PHI because it's reachable from bb.2 (has %1) and bb.6 (has PHI
+//   result %X)
 //
 // This truly requires TWO PHI nodes for proper SSA form!
 //===----------------------------------------------------------------------===//
 
 TEST(MachineLaneSSAUpdaterTest, MultiplePhiInsertion) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     S_BRANCH %bb.1
 
@@ -377,121 +389,134 @@ TEST(MachineLaneSSAUpdaterTest, MultiplePhiInsertion) {
     %6:vgpr_32 = V_OR_B32_e32 %1, %1, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      
-      // Verify we have the expected number of blocks
-      ASSERT_EQ(MF.size(), 9u) << "Should have bb.0 through bb.8";
-      
-      MachineBasicBlock *BB1 = MF.getBlockNumbered(1);
-      MachineBasicBlock *BB5 = MF.getBlockNumbered(5);
-      MachineBasicBlock *BB6 = MF.getBlockNumbered(6);
-      MachineBasicBlock *BB7 = MF.getBlockNumbered(7);
-      
-      // Get %1 which is defined in bb.1
-      MachineInstr *OrigDefMI = &*BB1->getFirstNonPHI();
-      Register OrigReg = OrigDefMI->getOperand(0).getReg();
-      ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register";
-      
-      // Count uses of %1 before SSA repair
-      unsigned UseCountBefore = 0;
-      for (const MachineInstr &MI : MRI.use_instructions(OrigReg)) {
-        (void)MI;
-        ++UseCountBefore;
-      }
-      ASSERT_GT(UseCountBefore, 0u) << "Original register should have uses";
-      LLVM_DEBUG(dbgs() << "Original register has " << UseCountBefore << " uses before SSA repair\n");
-      
-      // Get V_MOV opcode from bb.0
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      // Insert new definition in bb.5 that defines OrigReg (violating SSA)
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB5->getFirstNonPHI();
-      MachineInstr *NewDefMI = BuildMI(*BB5, InsertPt, DebugLoc(), 
-                                        TII->get(MovOpcode), OrigReg)
-                                   .addImm(100)
-                                   .addReg(ExecReg, RegState::Implicit);
-      
-      // Set MachineFunction properties
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Call MachineLaneSSAUpdater
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
-      EXPECT_NE(NewReg, OrigReg) << "New register should be different from original";
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg) << "NewDefMI should now define the new register";
-      
-      // Count PHI nodes inserted and track their locations
-      unsigned PHICount = 0;
-      std::map<unsigned, unsigned> PHIsPerBlock;
-      for (MachineBasicBlock &MBB : MF) {
-        unsigned BlockPHIs = 0;
-        for (MachineInstr &MI : MBB) {
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+        MachineRegisterInfo &MRI = MF.getRegInfo();
+
+        // Verify we have the expected number of blocks
+        ASSERT_EQ(MF.size(), 9u) << "Should have bb.0 through bb.8";
+
+        MachineBasicBlock *BB1 = MF.getBlockNumbered(1);
+        MachineBasicBlock *BB5 = MF.getBlockNumbered(5);
+        MachineBasicBlock *BB6 = MF.getBlockNumbered(6);
+        MachineBasicBlock *BB7 = MF.getBlockNumbered(7);
+
+        // Get %1 which is defined in bb.1
+        MachineInstr *OrigDefMI = &*BB1->getFirstNonPHI();
+        Register OrigReg = OrigDefMI->getOperand(0).getReg();
+        ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register";
+
+        // Count uses of %1 before SSA repair
+        unsigned UseCountBefore = 0;
+        for (const MachineInstr &MI : MRI.use_instructions(OrigReg)) {
+          (void)MI;
+          ++UseCountBefore;
+        }
+        ASSERT_GT(UseCountBefore, 0u) << "Original register should have uses";
+        LLVM_DEBUG(dbgs() << "Original register has " << UseCountBefore
+                          << " uses before SSA repair\n");
+
+        // Get V_MOV opcode from bb.0
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        // Insert new definition in bb.5 that defines OrigReg (violating SSA)
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB5->getFirstNonPHI();
+        MachineInstr *NewDefMI =
+            BuildMI(*BB5, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(100)
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Set MachineFunction properties
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Call MachineLaneSSAUpdater
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
+        EXPECT_NE(NewReg, OrigReg)
+            << "New register should be different from original";
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg)
+            << "NewDefMI should now define the new register";
+
+        // Count PHI nodes inserted and track their locations
+        unsigned PHICount = 0;
+        std::map<unsigned, unsigned> PHIsPerBlock;
+        for (MachineBasicBlock &MBB : MF) {
+          unsigned BlockPHIs = 0;
+          for (MachineInstr &MI : MBB) {
+            if (MI.isPHI()) {
+              ++PHICount;
+              ++BlockPHIs;
+              LLVM_DEBUG(dbgs()
+                         << "Found PHI in BB#" << MBB.getNumber() << ": ");
+              LLVM_DEBUG(MI.print(dbgs()));
+            }
+          }
+          if (BlockPHIs > 0) {
+            PHIsPerBlock[MBB.getNumber()] = BlockPHIs;
+          }
+        }
+
+        LLVM_DEBUG(dbgs() << "Total PHI nodes inserted: " << PHICount << "\n");
+
+        // Check for first PHI in bb.6 (joins bb.4 and bb.5)
+        bool FoundPHIInBB6 = false;
+        for (MachineInstr &MI : *BB6) {
           if (MI.isPHI()) {
-            ++PHICount;
-            ++BlockPHIs;
-            LLVM_DEBUG(dbgs() << "Found PHI in BB#" << MBB.getNumber() << ": ");
+            FoundPHIInBB6 = true;
+            LLVM_DEBUG(dbgs() << "First PHI in bb.6: ");
             LLVM_DEBUG(MI.print(dbgs()));
+            // Verify it has 2 incoming values (4 operands: 2 x (reg, mbb))
+            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+            EXPECT_EQ(NumIncoming, 2u)
+                << "First PHI in bb.6 should have 2 incoming values (from bb.4 "
+                   "and bb.5)";
+            break;
           }
         }
-        if (BlockPHIs > 0) {
-          PHIsPerBlock[MBB.getNumber()] = BlockPHIs;
-        }
-      }
-      
-      LLVM_DEBUG(dbgs() << "Total PHI nodes inserted: " << PHICount << "\n");
-      
-      // Check for first PHI in bb.6 (joins bb.4 and bb.5)
-      bool FoundPHIInBB6 = false;
-      for (MachineInstr &MI : *BB6) {
-        if (MI.isPHI()) {
-          FoundPHIInBB6 = true;
-          LLVM_DEBUG(dbgs() << "First PHI in bb.6: ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          // Verify it has 2 incoming values (4 operands: 2 x (reg, mbb))
-          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-          EXPECT_EQ(NumIncoming, 2u) << "First PHI in bb.6 should have 2 incoming values (from bb.4 and bb.5)";
-          break;
-        }
-      }
-      EXPECT_TRUE(FoundPHIInBB6) << "Should have first PHI in bb.6 (joins bb.4 with %1 and bb.5 with new def)";
-      
-      // Check for second PHI in bb.7 (joins bb.2 and bb.6)
-      bool FoundPHIInBB7 = false;
-      for (MachineInstr &MI : *BB7) {
-        if (MI.isPHI()) {
-          FoundPHIInBB7 = true;
-          LLVM_DEBUG(dbgs() << "Second PHI in bb.7: ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          // Verify it has 2 incoming values (4 operands: 2 x (reg, mbb))
-          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-          EXPECT_EQ(NumIncoming, 2u) << "Second PHI in bb.7 should have 2 incoming values (from bb.2 with %1 and bb.6 with first PHI result)";
-          break;
+        EXPECT_TRUE(FoundPHIInBB6) << "Should have first PHI in bb.6 (joins "
+                                      "bb.4 with %1 and bb.5 with new def)";
+
+        // Check for second PHI in bb.7 (joins bb.2 and bb.6)
+        bool FoundPHIInBB7 = false;
+        for (MachineInstr &MI : *BB7) {
+          if (MI.isPHI()) {
+            FoundPHIInBB7 = true;
+            LLVM_DEBUG(dbgs() << "Second PHI in bb.7: ");
+            LLVM_DEBUG(MI.print(dbgs()));
+            // Verify it has 2 incoming values (4 operands: 2 x (reg, mbb))
+            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+            EXPECT_EQ(NumIncoming, 2u)
+                << "Second PHI in bb.7 should have 2 incoming values (from "
+                   "bb.2 with %1 and bb.6 with first PHI result)";
+            break;
+          }
         }
-      }
-      EXPECT_TRUE(FoundPHIInBB7) << "Should have second PHI in bb.7 (joins bb.2 with %1 and bb.6 with first PHI)";
-      
-      // Should have exactly 2 PHIs
-      EXPECT_EQ(PHICount, 2u) << "Should have inserted exactly TWO PHI nodes (one at bb.6, one at bb.7)";
-      
-      // Verify LiveIntervals are valid
-      EXPECT_TRUE(LIS.hasInterval(NewReg)) << "New register should have live interval";
-      EXPECT_TRUE(LIS.hasInterval(OrigReg)) << "Original register should have live interval";
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundPHIInBB7) << "Should have second PHI in bb.7 (joins "
+                                      "bb.2 with %1 and bb.6 with first PHI)";
+
+        // Should have exactly 2 PHIs
+        EXPECT_EQ(PHICount, 2u) << "Should have inserted exactly TWO PHI nodes "
+                                   "(one at bb.6, one at bb.7)";
+
+        // Verify LiveIntervals are valid
+        EXPECT_TRUE(LIS.hasInterval(NewReg))
+            << "New register should have live interval";
+        EXPECT_TRUE(LIS.hasInterval(OrigReg))
+            << "Original register should have live interval";
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
 //===----------------------------------------------------------------------===//
@@ -506,7 +531,8 @@ TEST(MachineLaneSSAUpdaterTest, MultiplePhiInsertion) {
 //     1. Track that only sub0 lane is modified (not sub1)
 //     2. Create PHI that merges only the sub0 lane
 //     3. Preserve the original sub1 lane
-//     4. Generate REG_SEQUENCE to compose full register from PHI+unchanged lanes
+//     4. Generate REG_SEQUENCE to compose full register from PHI+unchanged
+//        lanes
 //
 // CFG Structure:
 //       BB0 (entry)
@@ -516,11 +542,12 @@ TEST(MachineLaneSSAUpdaterTest, MultiplePhiInsertion) {
 //    BB2   BB3 (INSERT: %3.sub0 = new_def)
 //     |     |
 //    use   (no use)
-//   sub0    
+//   sub0
 //      \   /
 //       BB4 (use sub0 + sub1) → PHI for sub0 lane only
 //        |
-//       BB5 (use full %3) → REG_SEQUENCE to compose full reg from PHI result + unchanged sub1
+//       BB5 (use full %3) → REG_SEQUENCE to compose full reg from PHI result +
+//           unchanged sub1
 //
 // Expected behavior:
 //   - PHI in BB4 merges only sub0 lane (changed)
@@ -529,7 +556,8 @@ TEST(MachineLaneSSAUpdaterTest, MultiplePhiInsertion) {
 //===----------------------------------------------------------------------===//
 
 TEST(MachineLaneSSAUpdaterTest, SubregisterLaneTracking) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     S_BRANCH %bb.1
 
@@ -565,126 +593,147 @@ TEST(MachineLaneSSAUpdaterTest, SubregisterLaneTracking) {
     %6:vreg_64 = V_LSHLREV_B64_e64 0, %3, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      
-      // Verify we have the expected number of blocks
-      ASSERT_EQ(MF.size(), 6u) << "Should have bb.0 through bb.5";
-      
-      MachineBasicBlock *BB3 = MF.getBlockNumbered(3);
-      
-      // Get the 64-bit register %3 (vreg_64) from the MIR
-      Register Reg64 = Register::index2VirtReg(3);
-      ASSERT_TRUE(Reg64.isValid()) << "Register %3 should be valid";
-      
-      const TargetRegisterClass *RC64 = MRI.getRegClass(Reg64);
-      ASSERT_EQ(TRI->getRegSizeInBits(*RC64), 64u) << "Register %3 should be 64-bit";
-      LLVM_DEBUG(dbgs() << "Using 64-bit register: %" << Reg64.virtRegIndex() << " (raw: " << Reg64 << ")\n");
-      
-      // Verify it has subranges for lane tracking
-      ASSERT_TRUE(LIS.hasInterval(Reg64)) << "Register should have live interval";
-      LiveInterval &LI = LIS.getInterval(Reg64);
-      if (LI.hasSubRanges()) {
-        LLVM_DEBUG(dbgs() << "Register has subranges (lane tracking active)\n");
-        for (const LiveInterval::SubRange &SR : LI.subranges()) {
-          LLVM_DEBUG(dbgs() << "  Lane mask: " << PrintLaneMask(SR.LaneMask) << "\n");
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+        MachineRegisterInfo &MRI = MF.getRegInfo();
+
+        // Verify we have the expected number of blocks
+        ASSERT_EQ(MF.size(), 6u) << "Should have bb.0 through bb.5";
+
+        MachineBasicBlock *BB3 = MF.getBlockNumbered(3);
+
+        // Get the 64-bit register %3 (vreg_64) from the MIR
+        Register Reg64 = Register::index2VirtReg(3);
+        ASSERT_TRUE(Reg64.isValid()) << "Register %3 should be valid";
+
+        const TargetRegisterClass *RC64 = MRI.getRegClass(Reg64);
+        ASSERT_EQ(TRI->getRegSizeInBits(*RC64), 64u)
+            << "Register %3 should be 64-bit";
+        LLVM_DEBUG(dbgs() << "Using 64-bit register: %" << Reg64.virtRegIndex()
+                          << " (raw: " << Reg64 << ")\n");
+
+        // Require a live interval and log any subranges (lane tracking)
+        ASSERT_TRUE(LIS.hasInterval(Reg64))
+            << "Register should have live interval";
+        LiveInterval &LI = LIS.getInterval(Reg64);
+        if (LI.hasSubRanges()) {
+          LLVM_DEBUG(dbgs()
+                     << "Register has subranges (lane tracking active)\n");
+          for (const LiveInterval::SubRange &SR : LI.subranges()) {
+            LLVM_DEBUG(dbgs() << "  Lane mask: " << PrintLaneMask(SR.LaneMask)
+                              << "\n");
+          }
+        } else {
+          LLVM_DEBUG(dbgs() << "Warning: Register does not have subranges\n");
         }
-      } else {
-        LLVM_DEBUG(dbgs() << "Warning: Register does not have subranges\n");
-      }
-      
-      // Find the subreg index for a 32-bit subreg of the 64-bit register
-      unsigned Sub0Idx = 0;
-      for (unsigned Idx = 1, E = TRI->getNumSubRegIndices(); Idx <= E; ++Idx) {
-        const TargetRegisterClass *SubRC = TRI->getSubRegisterClass(RC64, Idx);
-        if (SubRC && TRI->getRegSizeInBits(*SubRC) == 32) {
-          Sub0Idx = Idx;
-          break;
+
+        // Find the subreg index for a 32-bit subreg of the 64-bit register
+        unsigned Sub0Idx = 0;
+        for (unsigned Idx = 1, E = TRI->getNumSubRegIndices(); Idx <= E;
+             ++Idx) {
+          const TargetRegisterClass *SubRC =
+              TRI->getSubRegisterClass(RC64, Idx);
+          if (SubRC && TRI->getRegSizeInBits(*SubRC) == 32) {
+            Sub0Idx = Idx;
+            break;
+          }
         }
-      }
-      ASSERT_NE(Sub0Idx, 0u) << "Could not find 32-bit subregister index";
-      LaneBitmask Sub0Mask = TRI->getSubRegIndexLaneMask(Sub0Idx);
-      LLVM_DEBUG(dbgs() << "Sub0 index=" << Sub0Idx << " (" << TRI->getSubRegIndexName(Sub0Idx)
-                   << "), mask=" << PrintLaneMask(Sub0Mask) << "\n");
-      
-      // Insert new definition in bb.3 that defines Reg64.sub0 (partial update, violating SSA)
-      // Use V_MOV with immediate - no liveness dependencies
-      // It's the caller's responsibility to ensure source operands are valid
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB3->getFirstNonPHI();
-      
-      // Get V_MOV opcode and EXEC register from bb.0
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      // Create a 32-bit temporary register
-      Register TempReg = MRI.createVirtualRegister(TRI->getSubRegisterClass(RC64, Sub0Idx));
-      
-      // Insert both instructions first (V_MOV and COPY)
-      MachineInstr *TempMI = BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(MovOpcode), TempReg)
-          .addImm(99)
-          .addReg(ExecReg, RegState::Implicit);
-      
-      MachineInstr *NewDefMI = BuildMI(*BB3, InsertPt, DebugLoc(), 
-                                        TII->get(TargetOpcode::COPY))
-          .addReg(Reg64, RegState::Define, Sub0Idx)  // %3.sub0 = (violates SSA)
-          .addReg(TempReg);                           // COPY from temp
-      
-      // Caller's responsibility: index instructions and create live intervals
-      // Do this AFTER inserting both instructions so uses are visible
-      LIS.InsertMachineInstrInMaps(*TempMI);
-      LIS.InsertMachineInstrInMaps(*NewDefMI);
-      LIS.createAndComputeVirtRegInterval(TempReg);
-      
-      // Set MachineFunction properties
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Call MachineLaneSSAUpdater to repair the SSA violation
-      // This should create a new vreg for the subreg def and insert lane-aware PHIs
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, Reg64);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << " (raw: " << NewReg << ")\n");
-      
-      // VERIFY RESULTS:
-      
-      // 1. NewReg should be a 32-bit register (for sub0), not 64-bit
-      EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
-      EXPECT_NE(NewReg, Reg64) << "New register should be different from original";
-      
-      const TargetRegisterClass *NewRC = MRI.getRegClass(NewReg);
-      EXPECT_EQ(TRI->getRegSizeInBits(*NewRC), 32u) << "New register should be 32-bit (subreg class)";
-      
-      // 2. NewDefMI should now define NewReg (not Reg64.sub0)
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg) << "NewDefMI should now define new 32-bit register";
-      EXPECT_EQ(NewDefMI->getOperand(0).getSubReg(), 0u) << "NewDefMI should no longer have subreg index";
-      
-      // 3. Verify PHIs were inserted where needed
-      MachineBasicBlock *BB4 = MF.getBlockNumbered(4);
-      bool FoundPHI = false;
-      for (MachineInstr &MI : *BB4) {
-        if (MI.isPHI()) {
-          FoundPHI = true;
-          LLVM_DEBUG(dbgs() << "Found PHI in bb.4: ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          break;
+        ASSERT_NE(Sub0Idx, 0u) << "Could not find 32-bit subregister index";
+        LaneBitmask Sub0Mask = TRI->getSubRegIndexLaneMask(Sub0Idx);
+        LLVM_DEBUG(dbgs() << "Sub0 index=" << Sub0Idx << " ("
+                          << TRI->getSubRegIndexName(Sub0Idx)
+                          << "), mask=" << PrintLaneMask(Sub0Mask) << "\n");
+
+        // Insert new definition in bb.3 that defines Reg64.sub0 (partial
+        // update, violating SSA). Use V_MOV with an immediate so there are no
+        // liveness dependencies. It's the caller's responsibility to ensure
+        // source operands are valid.
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB3->getFirstNonPHI();
+
+        // Get V_MOV opcode and EXEC register from bb.0
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        // Create a 32-bit temporary register
+        Register TempReg =
+            MRI.createVirtualRegister(TRI->getSubRegisterClass(RC64, Sub0Idx));
+
+        // Insert both instructions first (V_MOV and COPY)
+        MachineInstr *TempMI =
+            BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(MovOpcode), TempReg)
+                .addImm(99)
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Build `%3.sub0 = COPY TempReg`, a second def of %3 (violates SSA)
+        MachineInstr *NewDefMI =
+            BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(TargetOpcode::COPY))
+                .addReg(Reg64, RegState::Define, Sub0Idx)
+                .addReg(TempReg);
+
+        // Caller's responsibility: index instructions and create live intervals
+        // Do this AFTER inserting both instructions so uses are visible
+        LIS.InsertMachineInstrInMaps(*TempMI);
+        LIS.InsertMachineInstrInMaps(*NewDefMI);
+        LIS.createAndComputeVirtRegInterval(TempReg);
+
+        // Set MachineFunction properties
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Call MachineLaneSSAUpdater to repair the SSA violation
+        // This should create a new vreg for the subreg def and insert
+        // lane-aware PHIs
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, Reg64);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << " (raw: " << NewReg
+                          << ")\n");
+
+        // VERIFY RESULTS:
+
+        // 1. NewReg should be a 32-bit register (for sub0), not 64-bit
+        EXPECT_TRUE(NewReg.isValid()) << "Updater should create a new register";
+        EXPECT_NE(NewReg, Reg64)
+            << "New register should be different from original";
+
+        const TargetRegisterClass *NewRC = MRI.getRegClass(NewReg);
+        EXPECT_EQ(TRI->getRegSizeInBits(*NewRC), 32u)
+            << "New register should be 32-bit (subreg class)";
+
+        // 2. NewDefMI should now define NewReg (not Reg64.sub0)
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg)
+            << "NewDefMI should now define new 32-bit register";
+        EXPECT_EQ(NewDefMI->getOperand(0).getSubReg(), 0u)
+            << "NewDefMI should no longer have subreg index";
+
+        // 3. Verify PHIs were inserted where needed
+        MachineBasicBlock *BB4 = MF.getBlockNumbered(4);
+        bool FoundPHI = false;
+        for (MachineInstr &MI : *BB4) {
+          if (MI.isPHI()) {
+            FoundPHI = true;
+            LLVM_DEBUG(dbgs() << "Found PHI in bb.4: ");
+            LLVM_DEBUG(MI.print(dbgs()));
+            break;
+          }
         }
-      }
-      EXPECT_TRUE(FoundPHI) << "Should have inserted PHI for sub0 lane in bb.4";
-      
-      // 4. Verify LiveIntervals are valid
-      EXPECT_TRUE(LIS.hasInterval(NewReg)) << "New register should have live interval";
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundPHI)
+            << "Should have inserted PHI for sub0 lane in bb.4";
+
+        // 4. Verify LiveIntervals are valid
+        EXPECT_TRUE(LIS.hasInterval(NewReg))
+            << "New register should have live interval";
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
 //===----------------------------------------------------------------------===//
@@ -716,7 +765,8 @@ TEST(MachineLaneSSAUpdaterTest, SubregisterLaneTracking) {
 //===----------------------------------------------------------------------===//
 
 TEST(MachineLaneSSAUpdaterTest, SubregDefToFullRegPHI) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     S_BRANCH %bb.1
 
@@ -770,123 +820,137 @@ TEST(MachineLaneSSAUpdaterTest, SubregDefToFullRegPHI) {
     %8:vreg_64 = V_LSHLREV_B64_e64 0, %7, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      
-      ASSERT_EQ(MF.size(), 9u) << "Should have bb.0 through bb.8";
-      
-      MachineBasicBlock *BB5 = MF.getBlockNumbered(5); // New def inserted here
-      MachineBasicBlock *BB6 = MF.getBlockNumbered(6); // First join (bb.4 + bb.5)
-      MachineBasicBlock *BB7 = MF.getBlockNumbered(7); // PHI block (bb.2 + bb.6)
-      
-      // Get register X (%3, the 64-bit register from bb.1)
-      Register RegX = Register::index2VirtReg(3);
-      ASSERT_TRUE(RegX.isValid()) << "Register %3 (X) should be valid";
-      
-      const TargetRegisterClass *RC64 = MRI.getRegClass(RegX);
-      ASSERT_EQ(TRI->getRegSizeInBits(*RC64), 64u) << "Register X should be 64-bit";
-      
-      // Find sub0 index (32-bit subregister)
-      unsigned Sub0Idx = 0;
-      for (unsigned Idx = 1, E = TRI->getNumSubRegIndices(); Idx <= E; ++Idx) {
-        const TargetRegisterClass *SubRC = TRI->getSubRegisterClass(RC64, Idx);
-        if (SubRC && TRI->getRegSizeInBits(*SubRC) == 32) {
-          Sub0Idx = Idx;
-          break;
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+        MachineRegisterInfo &MRI = MF.getRegInfo();
+
+        ASSERT_EQ(MF.size(), 9u) << "Should have bb.0 through bb.8";
+
+        MachineBasicBlock *BB5 =
+            MF.getBlockNumbered(5); // New def inserted here
+        MachineBasicBlock *BB6 =
+            MF.getBlockNumbered(6); // First join (bb.4 + bb.5)
+        MachineBasicBlock *BB7 =
+            MF.getBlockNumbered(7); // PHI block (bb.2 + bb.6)
+
+        // Get register X (%3, the 64-bit register from bb.1)
+        Register RegX = Register::index2VirtReg(3);
+        ASSERT_TRUE(RegX.isValid()) << "Register %3 (X) should be valid";
+
+        const TargetRegisterClass *RC64 = MRI.getRegClass(RegX);
+        ASSERT_EQ(TRI->getRegSizeInBits(*RC64), 64u)
+            << "Register X should be 64-bit";
+
+        // Find sub0 index (32-bit subregister)
+        unsigned Sub0Idx = 0;
+        for (unsigned Idx = 1, E = TRI->getNumSubRegIndices(); Idx <= E;
+             ++Idx) {
+          const TargetRegisterClass *SubRC =
+              TRI->getSubRegisterClass(RC64, Idx);
+          if (SubRC && TRI->getRegSizeInBits(*SubRC) == 32) {
+            Sub0Idx = Idx;
+            break;
+          }
         }
-      }
-      ASSERT_NE(Sub0Idx, 0u) << "Could not find 32-bit subregister index";
-      
-      // Insert new definition in bb.5: X.sub0 = 3
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB5->getFirstNonPHI();
-      
-      // Get V_MOV opcode and EXEC register
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      // Create temporary register
-      Register TempReg = MRI.createVirtualRegister(TRI->getSubRegisterClass(RC64, Sub0Idx));
-      
-      MachineInstr *TempMI = BuildMI(*BB5, InsertPt, DebugLoc(), TII->get(MovOpcode), TempReg)
-          .addImm(30)
-          .addReg(ExecReg, RegState::Implicit);
-      
-      MachineInstr *NewDefMI = BuildMI(*BB5, InsertPt, DebugLoc(), 
-                                        TII->get(TargetOpcode::COPY))
-          .addReg(RegX, RegState::Define, Sub0Idx)  // X.sub0 = 
-          .addReg(TempReg);
-      
-      // Index instructions and create live interval for temp
-      LIS.InsertMachineInstrInMaps(*TempMI);
-      LIS.InsertMachineInstrInMaps(*NewDefMI);
-      LIS.createAndComputeVirtRegInterval(TempReg);
-      
-      // Set MachineFunction properties
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Call SSA updater
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, RegX);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << " (raw: " << NewReg << ")\n");
-      
-      // VERIFY RESULTS:
-      
-      // 1. New register should be 32-bit (subreg class)
-      EXPECT_TRUE(NewReg.isValid());
-      EXPECT_NE(NewReg, RegX);
-      const TargetRegisterClass *NewRC = MRI.getRegClass(NewReg);
-      EXPECT_EQ(TRI->getRegSizeInBits(*NewRC), 32u) << "New register should be 32-bit";
-      
-      // 2. NewDefMI should now define NewReg without subreg index
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
-      EXPECT_EQ(NewDefMI->getOperand(0).getSubReg(), 0u);
-      
-      // 3. Check the existing PHI in bb.7
-      bool FoundPHI = false;
-      Register PHIReg;
-      for (MachineInstr &MI : *BB7) {
-        if (MI.isPHI()) {
-          FoundPHI = true;
-          PHIReg = MI.getOperand(0).getReg();
-          LLVM_DEBUG(dbgs() << "PHI in bb.7 after SSA repair: ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          break;
+        ASSERT_NE(Sub0Idx, 0u) << "Could not find 32-bit subregister index";
+
+        // Insert new definition in bb.5: X.sub0 = 30
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB5->getFirstNonPHI();
+
+        // Get V_MOV opcode and EXEC register
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0);
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        // Create temporary register
+        Register TempReg =
+            MRI.createVirtualRegister(TRI->getSubRegisterClass(RC64, Sub0Idx));
+
+        MachineInstr *TempMI =
+            BuildMI(*BB5, InsertPt, DebugLoc(), TII->get(MovOpcode), TempReg)
+                .addImm(30)
+                .addReg(ExecReg, RegState::Implicit);
+
+        MachineInstr *NewDefMI =
+            BuildMI(*BB5, InsertPt, DebugLoc(), TII->get(TargetOpcode::COPY))
+                .addReg(RegX, RegState::Define, Sub0Idx) // X.sub0 = TempReg
+                .addReg(TempReg);
+
+        // Index instructions and create live interval for temp
+        LIS.InsertMachineInstrInMaps(*TempMI);
+        LIS.InsertMachineInstrInMaps(*NewDefMI);
+        LIS.createAndComputeVirtRegInterval(TempReg);
+
+        // Set MachineFunction properties
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Call SSA updater
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, RegX);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << " (raw: " << NewReg
+                          << ")\n");
+
+        // VERIFY RESULTS:
+
+        // 1. New register should be 32-bit (subreg class)
+        EXPECT_TRUE(NewReg.isValid());
+        EXPECT_NE(NewReg, RegX);
+        const TargetRegisterClass *NewRC = MRI.getRegClass(NewReg);
+        EXPECT_EQ(TRI->getRegSizeInBits(*NewRC), 32u)
+            << "New register should be 32-bit";
+
+        // 2. NewDefMI should now define NewReg without subreg index
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
+        EXPECT_EQ(NewDefMI->getOperand(0).getSubReg(), 0u);
+
+        // 3. Check the existing PHI in bb.7
+        bool FoundPHI = false;
+        Register PHIReg;
+        for (MachineInstr &MI : *BB7) {
+          if (MI.isPHI()) {
+            FoundPHI = true;
+            PHIReg = MI.getOperand(0).getReg();
+            LLVM_DEBUG(dbgs() << "PHI in bb.7 after SSA repair: ");
+            LLVM_DEBUG(MI.print(dbgs()));
+            break;
+          }
         }
-      }
-      ASSERT_TRUE(FoundPHI) << "Should have PHI in bb.7 (from input MIR)";
-      
-      // 4. CRITICAL: Check for REG_SEQUENCE in bb.6 (first join, before branch to PHI)
-      // The updater must build REG_SEQUENCE to provide full register to the PHI
-      bool FoundREGSEQ = false;
-      for (MachineInstr &MI : *BB6) {
-        if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
-          FoundREGSEQ = true;
-          LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE in bb.6: ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          
-          // Should combine new sub0 with original sub1
-          EXPECT_GE(MI.getNumOperands(), 5u) << "REG_SEQUENCE should have result + 2 source pairs";
-          break;
+        ASSERT_TRUE(FoundPHI) << "Should have PHI in bb.7 (from input MIR)";
+
+        // 4. CRITICAL: Check for REG_SEQUENCE in bb.6 (first join, before
+        // the branch to the PHI). The updater must build a REG_SEQUENCE to
+        // provide the full register to the PHI.
+        bool FoundREGSEQ = false;
+        for (MachineInstr &MI : *BB6) {
+          if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
+            FoundREGSEQ = true;
+            LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE in bb.6: ");
+            LLVM_DEBUG(MI.print(dbgs()));
+
+            // Should combine new sub0 with original sub1
+            EXPECT_GE(MI.getNumOperands(), 5u)
+                << "REG_SEQUENCE should have result + 2 source pairs";
+            break;
+          }
         }
-      }
-      EXPECT_TRUE(FoundREGSEQ) << "Should have built REG_SEQUENCE in bb.6 to provide full register to PHI in bb.7";
-      
-      // 5. Verify LiveIntervals
-      EXPECT_TRUE(LIS.hasInterval(NewReg));
-      EXPECT_TRUE(LIS.hasInterval(PHIReg));
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundREGSEQ) << "Should have built REG_SEQUENCE in bb.6 to "
+                                    "provide full register to PHI in bb.7";
+
+        // 5. Verify LiveIntervals
+        EXPECT_TRUE(LIS.hasInterval(NewReg));
+        EXPECT_TRUE(LIS.hasInterval(PHIReg));
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
 //===----------------------------------------------------------------------===//
@@ -912,7 +976,8 @@ TEST(MachineLaneSSAUpdaterTest, SubregDefToFullRegPHI) {
 //     └──→ bb.1 (back edge)
 //
 // Key test: Dominance-based PHI construction should correctly use NewReg
-// for the back edge operand since NewDefBB (bb.2) dominates the loop latch (bb.2).
+// for the back edge operand since NewDefBB (bb.2) dominates the loop latch
+// (bb.2).
 //===----------------------------------------------------------------------===//
 
 // Test loop with new definition in loop body requiring PHI in loop header
@@ -930,7 +995,8 @@ TEST(MachineLaneSSAUpdaterTest, SubregDefToFullRegPHI) {
 //    +-(backedge) -> PHI needed in BB1 to merge initial value and loop value
 //
 TEST(MachineLaneSSAUpdaterTest, LoopWithDefInBody) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     ; Original definition of %1 (before loop)
     %1:vgpr_32 = V_ADD_U32_e32 %0, %0, implicit $exec
@@ -955,108 +1021,116 @@ TEST(MachineLaneSSAUpdaterTest, LoopWithDefInBody) {
     %3:vgpr_32 = V_ADD_U32_e32 %1, %1, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      
-      ASSERT_EQ(MF.size(), 4u) << "Should have bb.0 through bb.3";
-      
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0); // Entry with original def
-      MachineBasicBlock *BB1 = MF.getBlockNumbered(1); // Loop header
-      MachineBasicBlock *BB2 = MF.getBlockNumbered(2); // Loop body
-      
-      // Get %1 (defined in bb.0, used in loop)
-      // Skip the first V_MOV instruction, get the V_ADD
-      auto It = BB0->begin();
-      ++It; // Skip %0 = V_MOV
-      MachineInstr *OrigDefMI = &*It;
-      Register OrigReg = OrigDefMI->getOperand(0).getReg();
-      ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register";
-      
-      LLVM_DEBUG(dbgs() << "Original register: %" << OrigReg.virtRegIndex() << "\n");
-      
-      // Insert new definition in loop body (bb.2)
-      // This violates SSA because %1 is defined both in bb.0 and bb.2
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB2->getFirstNonPHI();
-      MachineInstr *NewDefMI = BuildMI(*BB2, InsertPt, DebugLoc(), 
-                                        TII->get(MovOpcode), OrigReg)
-                                   .addImm(99)
-                                   .addReg(ExecReg, RegState::Implicit);
-      
-      // Set MachineFunction properties
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Call SSA updater
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << "\n");
-      
-      // VERIFY RESULTS:
-      
-      // 1. NewReg should be valid and different from OrigReg
-      EXPECT_TRUE(NewReg.isValid());
-      EXPECT_NE(NewReg, OrigReg);
-      
-      // 2. NewDefMI should now define NewReg
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
-      
-      // 3. PHI should be inserted in loop header (bb.1)
-      bool FoundPHIInHeader = false;
-      for (MachineInstr &MI : *BB1) {
-        if (MI.isPHI()) {
-          FoundPHIInHeader = true;
-          LLVM_DEBUG(dbgs() << "Found PHI in loop header (bb.1): ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          
-          // Verify PHI has 2 incoming values
-          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-          EXPECT_EQ(NumIncoming, 2u) << "Loop header PHI should have 2 incoming values";
-          
-          // Check the operands
-          // One should be from bb.0 (entry, using OrigReg)
-          // One should be from bb.2 (back edge, using NewReg)
-          bool HasEntryPath = false;
-          bool HasBackEdge = false;
-          
-          for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-            Register IncomingReg = MI.getOperand(i).getReg();
-            MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
-            
-            if (IncomingMBB == BB0) {
-              HasEntryPath = true;
-              EXPECT_EQ(IncomingReg, OrigReg) << "Entry path should use OrigReg";
-              LLVM_DEBUG(dbgs() << "  Entry path (bb.0): %" << IncomingReg.virtRegIndex() << "\n");
-            } else if (IncomingMBB == BB2) {
-              HasBackEdge = true;
-              EXPECT_EQ(IncomingReg, NewReg) << "Back edge should use NewReg";
-              LLVM_DEBUG(dbgs() << "  Back edge (bb.2): %" << IncomingReg.virtRegIndex() << "\n");
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+
+        ASSERT_EQ(MF.size(), 4u) << "Should have bb.0 through bb.3";
+
+        MachineBasicBlock *BB0 =
+            MF.getBlockNumbered(0); // Entry with original def
+        MachineBasicBlock *BB1 = MF.getBlockNumbered(1); // Loop header
+        MachineBasicBlock *BB2 = MF.getBlockNumbered(2); // Loop body
+
+        // Get %1 (defined in bb.0, used in loop)
+        // Skip the first V_MOV instruction, get the V_ADD
+        auto It = BB0->begin();
+        ++It; // Skip %0 = V_MOV
+        MachineInstr *OrigDefMI = &*It;
+        Register OrigReg = OrigDefMI->getOperand(0).getReg();
+        ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register";
+
+        LLVM_DEBUG(dbgs() << "Original register: %" << OrigReg.virtRegIndex()
+                          << "\n");
+
+        // Insert new definition in loop body (bb.2)
+        // This violates SSA because %1 is defined both in bb.0 and bb.2
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB2->getFirstNonPHI();
+        MachineInstr *NewDefMI =
+            BuildMI(*BB2, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(99)
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Set MachineFunction properties
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Call SSA updater
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << "\n");
+
+        // VERIFY RESULTS:
+
+        // 1. NewReg should be valid and different from OrigReg
+        EXPECT_TRUE(NewReg.isValid());
+        EXPECT_NE(NewReg, OrigReg);
+
+        // 2. NewDefMI should now define NewReg
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
+
+        // 3. PHI should be inserted in loop header (bb.1)
+        bool FoundPHIInHeader = false;
+        for (MachineInstr &MI : *BB1) {
+          if (MI.isPHI()) {
+            FoundPHIInHeader = true;
+            LLVM_DEBUG(dbgs() << "Found PHI in loop header (bb.1): ");
+            LLVM_DEBUG(MI.print(dbgs()));
+
+            // Verify PHI has 2 incoming values
+            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+            EXPECT_EQ(NumIncoming, 2u)
+                << "Loop header PHI should have 2 incoming values";
+
+            // Check the operands
+            // One should be from bb.0 (entry, using OrigReg)
+            // One should be from bb.2 (back edge, using NewReg)
+            bool HasEntryPath = false;
+            bool HasBackEdge = false;
+
+            for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
+              Register IncomingReg = MI.getOperand(i).getReg();
+              MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
+
+              if (IncomingMBB == BB0) {
+                HasEntryPath = true;
+                EXPECT_EQ(IncomingReg, OrigReg)
+                    << "Entry path should use OrigReg";
+                LLVM_DEBUG(dbgs() << "  Entry path (bb.0): %"
+                                  << IncomingReg.virtRegIndex() << "\n");
+              } else if (IncomingMBB == BB2) {
+                HasBackEdge = true;
+                EXPECT_EQ(IncomingReg, NewReg) << "Back edge should use NewReg";
+                LLVM_DEBUG(dbgs() << "  Back edge (bb.2): %"
+                                  << IncomingReg.virtRegIndex() << "\n");
+              }
             }
+
+            EXPECT_TRUE(HasEntryPath) << "PHI should have entry path from bb.0";
+            EXPECT_TRUE(HasBackEdge) << "PHI should have back edge from bb.2";
+
+            break;
           }
-          
-          EXPECT_TRUE(HasEntryPath) << "PHI should have entry path from bb.0";
-          EXPECT_TRUE(HasBackEdge) << "PHI should have back edge from bb.2";
-          
-          break;
         }
-      }
-      EXPECT_TRUE(FoundPHIInHeader) << "Should have inserted PHI in loop header (bb.1)";
-      
-      // 4. Verify LiveIntervals are valid
-      EXPECT_TRUE(LIS.hasInterval(NewReg));
-      EXPECT_TRUE(LIS.hasInterval(OrigReg));
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundPHIInHeader)
+            << "Should have inserted PHI in loop header (bb.1)";
+
+        // 4. Verify LiveIntervals are valid
+        EXPECT_TRUE(LIS.hasInterval(NewReg));
+        EXPECT_TRUE(LIS.hasInterval(OrigReg));
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
 //===----------------------------------------------------------------------===//
@@ -1098,13 +1172,16 @@ TEST(MachineLaneSSAUpdaterTest, LoopWithDefInBody) {
 //                requiring PHI2 in the loop header for proper SSA form.
 //
 // Expected SSA repair:
-//   - PHI1 created in BB4 (diamond join): merges unchanged X from BB2, new def from BB3
-//   - PHI2 created in BB1 (loop header): merges entry X from BB0, PHI1 result from BB5
+//   - PHI1 created in BB4 (diamond join): merges unchanged X from BB2, new def
+//     from BB3
+//   - PHI2 created in BB1 (loop header): merges entry X from BB0, PHI1 result
+//     from BB5
 //   - Use in BB1 rewritten to PHI2
 //   - Use in BB5 rewritten to PHI1
 //===----------------------------------------------------------------------===//
 TEST(MachineLaneSSAUpdaterTest, ComplexLoopWithDiamondAndUseBeforeDef) {
-  liveIntervalsTest(R"MIR(
+  liveIntervalsTest(
+      R"MIR(
     %0:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
     ; X = 1 (the register we'll redefine in loop)
     %1:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
@@ -1160,190 +1237,206 @@ TEST(MachineLaneSSAUpdaterTest, ComplexLoopWithDiamondAndUseBeforeDef) {
     %11:vgpr_32 = V_OR_B32_e32 %1, %1, implicit $exec
     S_ENDPGM 0
 )MIR",
-    [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      
-      ASSERT_EQ(MF.size(), 7u) << "Should have bb.0 through bb.6";
-      
-      MachineBasicBlock *BB0 = MF.getBlockNumbered(0); // Entry
-      MachineBasicBlock *BB1 = MF.getBlockNumbered(1); // Loop header
-      MachineBasicBlock *BB3 = MF.getBlockNumbered(3); // Else (new def here)
-      MachineBasicBlock *BB4 = MF.getBlockNumbered(4); // Diamond join
-      MachineBasicBlock *BB5 = MF.getBlockNumbered(5); // Latch
-      
-      // Get %1 (X, defined in bb.0)
-      auto It = BB0->begin();
-      ++It; // Skip %0 = V_MOV_B32_e32 0
-      MachineInstr *OrigDefMI = &*It; // %1 = V_MOV_B32_e32 1
-      Register OrigReg = OrigDefMI->getOperand(0).getReg();
-      ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register X";
-      
-      LLVM_DEBUG(dbgs() << "Original register X: %" << OrigReg.virtRegIndex() << "\n");
-      
-      // Find the use-before-def in bb.1 (loop header)
-      MachineInstr *UseBeforeDefMI = nullptr;
-      for (MachineInstr &MI : *BB1) {
-        if (!MI.isPHI() && MI.getOpcode() != TargetOpcode::IMPLICIT_DEF) {
-          // First non-PHI instruction should be V_ADD using %1
-          if (MI.getNumOperands() >= 3 && MI.getOperand(1).isReg() && 
-              MI.getOperand(1).getReg() == OrigReg) {
-            UseBeforeDefMI = &MI;
+      [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+
+        ASSERT_EQ(MF.size(), 7u) << "Should have bb.0 through bb.6";
+
+        MachineBasicBlock *BB0 = MF.getBlockNumbered(0); // Entry
+        MachineBasicBlock *BB1 = MF.getBlockNumbered(1); // Loop header
+        MachineBasicBlock *BB3 = MF.getBlockNumbered(3); // Else (new def here)
+        MachineBasicBlock *BB4 = MF.getBlockNumbered(4); // Diamond join
+        MachineBasicBlock *BB5 = MF.getBlockNumbered(5); // Latch
+
+        // Get %1 (X, defined in bb.0)
+        auto It = BB0->begin();
+        ++It;                           // Skip %0 = V_MOV_B32_e32 0
+        MachineInstr *OrigDefMI = &*It; // %1 = V_MOV_B32_e32 1
+        Register OrigReg = OrigDefMI->getOperand(0).getReg();
+        ASSERT_TRUE(OrigReg.isValid()) << "Could not get original register X";
+
+        LLVM_DEBUG(dbgs() << "Original register X: %" << OrigReg.virtRegIndex()
+                          << "\n");
+
+        // Find the use-before-def in bb.1 (loop header)
+        MachineInstr *UseBeforeDefMI = nullptr;
+        for (MachineInstr &MI : *BB1) {
+          if (!MI.isPHI() && MI.getOpcode() != TargetOpcode::IMPLICIT_DEF) {
+            // First non-PHI instruction should be V_ADD using %1
+            if (MI.getNumOperands() >= 3 && MI.getOperand(1).isReg() &&
+                MI.getOperand(1).getReg() == OrigReg) {
+              UseBeforeDefMI = &MI;
+              break;
+            }
+          }
+        }
+        ASSERT_TRUE(UseBeforeDefMI)
+            << "Could not find use-before-def in loop header";
+        LLVM_DEBUG(
+            dbgs() << "Found use-before-def in bb.1: %"
+                   << UseBeforeDefMI->getOperand(0).getReg().virtRegIndex()
+                   << "\n");
+
+        // Insert new definition in bb.3 (else branch): X = 99
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        auto InsertPt = BB3->getFirstNonPHI();
+        MachineInstr *NewDefMI =
+            BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(99)
+                .addReg(ExecReg, RegState::Implicit);
+
+        // Set MachineFunction properties
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Call SSA updater
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << "\n");
+
+        // VERIFY RESULTS:
+
+        // 1. NewReg should be valid and different from OrigReg
+        EXPECT_TRUE(NewReg.isValid());
+        EXPECT_NE(NewReg, OrigReg);
+        EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
+
+        // 2. PHI1 should exist in diamond join (bb.4)
+        bool FoundPHI1 = false;
+        Register PHI1Reg;
+        for (MachineInstr &MI : *BB4) {
+          if (MI.isPHI()) {
+            FoundPHI1 = true;
+            PHI1Reg = MI.getOperand(0).getReg();
+            LLVM_DEBUG(dbgs() << "Found PHI1 in diamond join (bb.4): ");
+            LLVM_DEBUG(MI.print(dbgs()));
+
+            // Should have 2 incoming: OrigReg from bb.2, NewReg from bb.3
+            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+            EXPECT_EQ(NumIncoming, 2u)
+                << "Diamond join PHI should have 2 incoming";
             break;
           }
         }
-      }
-      ASSERT_TRUE(UseBeforeDefMI) << "Could not find use-before-def in loop header";
-      LLVM_DEBUG(dbgs() << "Found use-before-def in bb.1: %"
-                   << UseBeforeDefMI->getOperand(0).getReg().virtRegIndex() << "\n");
-      
-      // Insert new definition in bb.3 (else branch): X = 99
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      auto InsertPt = BB3->getFirstNonPHI();
-      MachineInstr *NewDefMI = BuildMI(*BB3, InsertPt, DebugLoc(), 
-                                        TII->get(MovOpcode), OrigReg)
-                                   .addImm(99)
-                                   .addReg(ExecReg, RegState::Implicit);
-      
-      // Set MachineFunction properties
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Call SSA updater
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << "\n");
-      
-      // VERIFY RESULTS:
-      
-      // 1. NewReg should be valid and different from OrigReg
-      EXPECT_TRUE(NewReg.isValid());
-      EXPECT_NE(NewReg, OrigReg);
-      EXPECT_EQ(NewDefMI->getOperand(0).getReg(), NewReg);
-      
-      // 2. PHI1 should exist in diamond join (bb.4)
-      bool FoundPHI1 = false;
-      Register PHI1Reg;
-      for (MachineInstr &MI : *BB4) {
-        if (MI.isPHI()) {
-          FoundPHI1 = true;
-          PHI1Reg = MI.getOperand(0).getReg();
-          LLVM_DEBUG(dbgs() << "Found PHI1 in diamond join (bb.4): ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          
-          // Should have 2 incoming: OrigReg from bb.2, NewReg from bb.3
-          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-          EXPECT_EQ(NumIncoming, 2u) << "Diamond join PHI should have 2 incoming";
-          break;
+        EXPECT_TRUE(FoundPHI1) << "Should have PHI1 in diamond join (bb.4)";
+
+        // 3. PHI2 should exist in loop header (bb.1)
+        // First, count all PHIs
+        unsigned TotalPHICount = 0;
+        for (MachineInstr &MI : *BB1) {
+          if (MI.isPHI())
+            TotalPHICount++;
         }
-      }
-      EXPECT_TRUE(FoundPHI1) << "Should have PHI1 in diamond join (bb.4)";
-      
-      // 3. PHI2 should exist in loop header (bb.1)
-      // First, count all PHIs
-      unsigned TotalPHICount = 0;
-      for (MachineInstr &MI : *BB1) {
-        if (MI.isPHI())
-          TotalPHICount++;
-      }
-      LLVM_DEBUG(dbgs() << "Total PHIs in loop header: " << TotalPHICount << "\n");
-      EXPECT_EQ(TotalPHICount, 2u) << "Loop header should have 2 PHIs (induction var + SSA repair)";
-      
-      // Now find the SSA repair PHI (not the induction variable PHI %3)
-      bool FoundPHI2 = false;
-      Register PHI2Reg;
-      Register InductionVarPHI = Register::index2VirtReg(3); // %3 from input MIR
-      for (MachineInstr &MI : *BB1) {
-        if (MI.isPHI()) {
-          Register PHIResult = MI.getOperand(0).getReg();
-          
-          // Skip the induction variable PHI (%3 from input MIR) when looking for SSA repair PHI
-          if (PHIResult == InductionVarPHI)
-            continue;
-          
-          FoundPHI2 = true;
-          PHI2Reg = PHIResult;
-          LLVM_DEBUG(dbgs() << "Found PHI2 (SSA repair) in loop header (bb.1): ");
-          LLVM_DEBUG(MI.print(dbgs()));
-          
-          // Should have 2 incoming: OrigReg from bb.0, PHI1Reg from bb.5
-          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-          EXPECT_EQ(NumIncoming, 2u) << "Loop header PHI2 should have 2 incoming";
-          
-          // Verify operands
-          bool HasEntryPath = false;
-          bool HasBackEdge = false;
-          for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-            Register IncomingReg = MI.getOperand(i).getReg();
-            MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
-            
-            if (IncomingMBB == BB0) {
-              HasEntryPath = true;
-              EXPECT_EQ(IncomingReg, OrigReg) << "Entry path should use OrigReg";
-            } else if (IncomingMBB == BB5) {
-              HasBackEdge = true;
-              EXPECT_EQ(IncomingReg, PHI1Reg) << "Back edge should use PHI1 result";
+        LLVM_DEBUG(dbgs() << "Total PHIs in loop header: " << TotalPHICount
+                          << "\n");
+        EXPECT_EQ(TotalPHICount, 2u)
+            << "Loop header should have 2 PHIs (induction var + SSA repair)";
+
+        // Now find the SSA repair PHI (not the induction variable PHI %3)
+        bool FoundPHI2 = false;
+        Register PHI2Reg;
+        Register InductionVarPHI =
+            Register::index2VirtReg(3); // %3 from input MIR
+        for (MachineInstr &MI : *BB1) {
+          if (MI.isPHI()) {
+            Register PHIResult = MI.getOperand(0).getReg();
+
+            // Skip the induction variable PHI (%3 from input MIR) when looking
+            // for SSA repair PHI
+            if (PHIResult == InductionVarPHI)
+              continue;
+
+            FoundPHI2 = true;
+            PHI2Reg = PHIResult;
+            LLVM_DEBUG(dbgs()
+                       << "Found PHI2 (SSA repair) in loop header (bb.1): ");
+            LLVM_DEBUG(MI.print(dbgs()));
+
+            // Should have 2 incoming: OrigReg from bb.0, PHI1Reg from bb.5
+            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+            EXPECT_EQ(NumIncoming, 2u)
+                << "Loop header PHI2 should have 2 incoming";
+
+            // Verify operands
+            bool HasEntryPath = false;
+            bool HasBackEdge = false;
+            for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
+              Register IncomingReg = MI.getOperand(i).getReg();
+              MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
+
+              if (IncomingMBB == BB0) {
+                HasEntryPath = true;
+                EXPECT_EQ(IncomingReg, OrigReg)
+                    << "Entry path should use OrigReg";
+              } else if (IncomingMBB == BB5) {
+                HasBackEdge = true;
+                EXPECT_EQ(IncomingReg, PHI1Reg)
+                    << "Back edge should use PHI1 result";
+              }
             }
+
+            EXPECT_TRUE(HasEntryPath)
+                << "PHI2 should have entry path from bb.0";
+            EXPECT_TRUE(HasBackEdge) << "PHI2 should have back edge from bb.5";
+            break;
           }
-          
-          EXPECT_TRUE(HasEntryPath) << "PHI2 should have entry path from bb.0";
-          EXPECT_TRUE(HasBackEdge) << "PHI2 should have back edge from bb.5";
-          break;
         }
-      }
-      EXPECT_TRUE(FoundPHI2) << "Should have PHI2 (SSA repair) in loop header (bb.1)";
-      
-      // 4. Use-before-def in bb.1 should be rewritten to PHI2
-      EXPECT_EQ(UseBeforeDefMI->getOperand(1).getReg(), PHI2Reg)
-          << "Use-before-def should be rewritten to PHI2 result";
-      LLVM_DEBUG(dbgs() << "Use-before-def correctly rewritten to PHI2: %"
-                   << PHI2Reg.virtRegIndex() << "\n");
-      
-      // 5. Use in latch (bb.5) should be rewritten to PHI1
-      // Find instruction using PHI1 (originally used %1)
-      bool FoundLatchUse = false;
-      for (MachineInstr &MI : *BB5) {
-        // Skip PHIs and branches
-        if (MI.isPHI() || MI.isBranch())
-          continue;
-        
-        // Look for any instruction that uses PHI1Reg
-        for (unsigned i = 0; i < MI.getNumOperands(); ++i) {
-          MachineOperand &MO = MI.getOperand(i);
-          if (MO.isReg() && MO.isUse() && MO.getReg() == PHI1Reg) {
-            LLVM_DEBUG(dbgs() << "Latch use correctly rewritten to PHI1: %"
-                         << PHI1Reg.virtRegIndex() << " in: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-            FoundLatchUse = true;
-            break;
+        EXPECT_TRUE(FoundPHI2)
+            << "Should have PHI2 (SSA repair) in loop header (bb.1)";
+
+        // 4. Use-before-def in bb.1 should be rewritten to PHI2
+        EXPECT_EQ(UseBeforeDefMI->getOperand(1).getReg(), PHI2Reg)
+            << "Use-before-def should be rewritten to PHI2 result";
+        LLVM_DEBUG(dbgs() << "Use-before-def correctly rewritten to PHI2: %"
+                          << PHI2Reg.virtRegIndex() << "\n");
+
+        // 5. Use in latch (bb.5) should be rewritten to PHI1
+        // Find instruction using PHI1 (originally used %1)
+        bool FoundLatchUse = false;
+        for (MachineInstr &MI : *BB5) {
+          // Skip PHIs and branches
+          if (MI.isPHI() || MI.isBranch())
+            continue;
+
+          // Look for any instruction that uses PHI1Reg
+          for (unsigned i = 0; i < MI.getNumOperands(); ++i) {
+            MachineOperand &MO = MI.getOperand(i);
+            if (MO.isReg() && MO.isUse() && MO.getReg() == PHI1Reg) {
+              LLVM_DEBUG(dbgs() << "Latch use correctly rewritten to PHI1: %"
+                                << PHI1Reg.virtRegIndex() << " in: ");
+              LLVM_DEBUG(MI.print(dbgs()));
+              FoundLatchUse = true;
+              break;
+            }
           }
+          if (FoundLatchUse)
+            break;
         }
-        if (FoundLatchUse)
-          break;
-      }
-      EXPECT_TRUE(FoundLatchUse) << "Should find use of PHI1 in latch (bb.5)";
-      
-      // 6. Verify LiveIntervals
-      EXPECT_TRUE(LIS.hasInterval(NewReg));
-      EXPECT_TRUE(LIS.hasInterval(PHI1Reg));
-      EXPECT_TRUE(LIS.hasInterval(PHI2Reg));
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundLatchUse) << "Should find use of PHI1 in latch (bb.5)";
+
+        // 6. Verify LiveIntervals
+        EXPECT_TRUE(LIS.hasInterval(NewReg));
+        EXPECT_TRUE(LIS.hasInterval(PHI1Reg));
+        EXPECT_TRUE(LIS.hasInterval(PHI2Reg));
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
-// Test 7: Multiple subreg redefinitions in loop (X.sub0 in one branch, X.sub1 in latch)
-// This tests the most complex scenario: two separate lane redefinitions with REG_SEQUENCE
-// composition at the backedge.
-// Test multiple subregister redefinitions in different paths within a loop
+// Test 7: Multiple subreg redefinitions in loop (X.sub0 in one branch, X.sub1
+// in latch). This tests the most complex scenario: two separate lane
+// redefinitions in different paths within a loop, with REG_SEQUENCE
+// composition at the backedge.
 //
 // CFG Structure:
 //         BB0 (entry, %1:vreg_64 = IMPLICIT_DEF)
@@ -1419,188 +1512,199 @@ body: |
   bb.4:
     S_ENDPGM 0
 ...
-)MIR")).toNullTerminatedStringRef(S);
-
-  doTest<LiveIntervalsWrapperPass>(MIRString,
-             [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      LLVM_DEBUG(dbgs() << "\n=== MultipleSubregRedefsInLoop Test ===\n");
-      
-      // Get basic blocks
-      auto BBI = MF.begin();
-      ++BBI;  // Skip BB0 (Entry)
-      MachineBasicBlock *BB1 = &*BBI++;  // Loop header
-      ++BBI;  // Skip BB2 (True branch)
-      MachineBasicBlock *BB5 = &*BBI++;  // False branch (uses X.LO, INSERT def X.LO)
-      MachineBasicBlock *BB3 = &*BBI++;  // Latch (increment, INSERT def X.HI)
-      // Skip BB4 (Exit)
-      
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      (void)MRI;  // May be unused, suppress warning
-      
-      // Find the 64-bit register and its subregister indices
-      Register OrigReg = Register::index2VirtReg(0); // %0 from MIR
-      ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
-      unsigned Sub0Idx = 0, Sub1Idx = 0;
-      
-      // Find sub0 (low 32 bits) and sub1 (high 32 bits)
-      for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
-        LaneBitmask Mask = TRI->getSubRegIndexLaneMask(Idx);
-        unsigned SubRegSize = TRI->getSubRegIdxSize(Idx);
-        
-        if (SubRegSize == 32) {
-          if (Mask.getAsInteger() == 0x3) { // Low lanes
-            Sub0Idx = Idx;
-          } else if (Mask.getAsInteger() == 0xC) { // High lanes
-            Sub1Idx = Idx;
-          }
+)MIR"))
+                            .toNullTerminatedStringRef(S);
+
+  doTest<LiveIntervalsWrapperPass>(MIRString, [](MachineFunction &MF,
+                                                 LiveIntervalsWrapperPass
+                                                     &LISWrapper) {
+    LiveIntervals &LIS = LISWrapper.getLIS();
+    MachineDominatorTree MDT(MF);
+    LLVM_DEBUG(dbgs() << "\n=== MultipleSubregRedefsInLoop Test ===\n");
+
+    // Get basic blocks
+    auto BBI = MF.begin();
+    ++BBI;                            // Skip BB0 (Entry)
+    MachineBasicBlock *BB1 = &*BBI++; // Loop header
+    ++BBI;                            // Skip BB2 (True branch)
+    MachineBasicBlock *BB5 =
+        &*BBI++; // False branch (uses X.LO, INSERT def X.LO)
+    MachineBasicBlock *BB3 = &*BBI++; // Latch (increment, INSERT def X.HI)
+    // Skip BB4 (Exit)
+
+    MachineRegisterInfo &MRI = MF.getRegInfo();
+    const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+    const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+    (void)MRI; // May be unused, suppress warning
+
+    // Find the 64-bit register and its subregister indices
+    Register OrigReg = Register::index2VirtReg(0); // %0 from MIR
+    ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
+    unsigned Sub0Idx = 0, Sub1Idx = 0;
+
+    // Find sub0 (low 32 bits) and sub1 (high 32 bits)
+    for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
+      LaneBitmask Mask = TRI->getSubRegIndexLaneMask(Idx);
+      unsigned SubRegSize = TRI->getSubRegIdxSize(Idx);
+
+      if (SubRegSize == 32) {
+        if (Mask.getAsInteger() == 0x3) { // Low lanes
+          Sub0Idx = Idx;
+        } else if (Mask.getAsInteger() == 0xC) { // High lanes
+          Sub1Idx = Idx;
         }
       }
-      
-      ASSERT_NE(Sub0Idx, 0u) << "Should find sub0 index";
-      ASSERT_NE(Sub1Idx, 0u) << "Should find sub1 index";
-      
-      LLVM_DEBUG(dbgs() << "Using 64-bit register: %" << OrigReg.virtRegIndex()
-                   << " with sub0=" << Sub0Idx << ", sub1=" << Sub1Idx << "\n");
-      
-      // Get V_MOV opcode and EXEC register from existing instruction
-      MachineInstr *MovInst = nullptr;
-      Register ExecReg;
-      for (MachineInstr &MI : *BB1) {
-        if (!MI.isPHI() && MI.getNumOperands() >= 3 && MI.getOperand(2).isReg()) {
-          MovInst = &MI;
-          ExecReg = MI.getOperand(2).getReg();
-          break;
-        }
+    }
+
+    ASSERT_NE(Sub0Idx, 0u) << "Should find sub0 index";
+    ASSERT_NE(Sub1Idx, 0u) << "Should find sub1 index";
+
+    LLVM_DEBUG(dbgs() << "Using 64-bit register: %" << OrigReg.virtRegIndex()
+                      << " with sub0=" << Sub0Idx << ", sub1=" << Sub1Idx
+                      << "\n");
+
+    // Get V_MOV opcode and EXEC register from existing instruction
+    MachineInstr *MovInst = nullptr;
+    Register ExecReg;
+    for (MachineInstr &MI : *BB1) {
+      if (!MI.isPHI() && MI.getNumOperands() >= 3 && MI.getOperand(2).isReg()) {
+        MovInst = &MI;
+        ExecReg = MI.getOperand(2).getReg();
+        break;
       }
-      ASSERT_NE(MovInst, nullptr) << "Should find V_MOV in BB1";
-      unsigned MovOpcode = MovInst->getOpcode();
-      
-      // === FIRST INSERTION: X.sub0 in BB5 (else branch) ===
-      LLVM_DEBUG(dbgs() << "\n=== First insertion: X.sub0 in BB5 ===\n");
-      
-      // Find insertion point in BB5 (after the use of X.sub0)
-      MachineInstr *InsertPoint1 = nullptr;
-      for (MachineInstr &MI : *BB5) {
-        if (MI.isBranch()) {
-          InsertPoint1 = &MI;
-          break;
-        }
-      }
-      ASSERT_NE(InsertPoint1, nullptr) << "Should find branch in BB5";
-      
-      // Create first new def: X.sub0 = 99
-      MachineInstrBuilder MIB1 = BuildMI(*BB5, InsertPoint1, DebugLoc(), 
-                                          TII->get(MovOpcode))
-          .addReg(OrigReg, RegState::Define, Sub0Idx)
-          .addImm(99)
-          .addReg(ExecReg, RegState::Implicit);
-      
-      MachineInstr &NewDefMI1 = *MIB1;
-      LLVM_DEBUG(dbgs() << "Created first def in BB5: ");
-      LLVM_DEBUG(NewDefMI1.print(dbgs()));
-      
-      // Create SSA updater and repair after first insertion
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg1 = Updater.repairSSAForNewDef(NewDefMI1, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair #1 created new register: %" << NewReg1.virtRegIndex() << "\n");
-      
-      // === SECOND INSERTION: X.sub1 in BB3 (after increment) ===
-      LLVM_DEBUG(dbgs() << "\n=== Second insertion: X.sub1 in BB3 (after increment) ===\n");
-      
-      // Find the increment instruction in BB3 (look for vreg_64 def)
-      MachineInstr *IncrementMI = nullptr;
-      Register IncrementReg;
-      for (MachineInstr &MI : *BB3) {
-        if (!MI.isPHI() && MI.getNumOperands() > 0 && MI.getOperand(0).isReg() && 
-            MI.getOperand(0).isDef()) {
-          Register DefReg = MI.getOperand(0).getReg();
-          if (DefReg.isVirtual() && DefReg == Register::index2VirtReg(3)) {
-            IncrementMI = &MI;
-            IncrementReg = DefReg; // This is %3
-            LLVM_DEBUG(dbgs() << "Found increment: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-            break;
-          }
-        }
+    }
+    ASSERT_NE(MovInst, nullptr) << "Should find V_MOV in BB1";
+    unsigned MovOpcode = MovInst->getOpcode();
+
+    // === FIRST INSERTION: X.sub0 in BB5 (else branch) ===
+    LLVM_DEBUG(dbgs() << "\n=== First insertion: X.sub0 in BB5 ===\n");
+
+    // Find insertion point in BB5 (after the use of X.sub0)
+    MachineInstr *InsertPoint1 = nullptr;
+    for (MachineInstr &MI : *BB5) {
+      if (MI.isBranch()) {
+        InsertPoint1 = &MI;
+        break;
       }
-      ASSERT_NE(IncrementMI, nullptr) << "Should find increment (def of %3) in BB3";
-      ASSERT_TRUE(IncrementReg.isValid()) << "Increment register should be valid";
-      
-      // Create second new def: %3.sub1 = 200 (redefine increment result's sub1)
-      MachineBasicBlock::iterator InsertPoint2 = std::next(IncrementMI->getIterator());
-      MachineInstrBuilder MIB2 = BuildMI(*BB3, InsertPoint2, DebugLoc(),
-                                          TII->get(MovOpcode))
-          .addReg(IncrementReg, RegState::Define, Sub1Idx)  // Redefine %3.sub1, not %0.sub1!
-          .addImm(200)
-          .addReg(ExecReg, RegState::Implicit);
-      
-      MachineInstr &NewDefMI2 = *MIB2;
-      LLVM_DEBUG(dbgs() << "Created second def in BB3 (redefining %3.sub1): ");
-      LLVM_DEBUG(NewDefMI2.print(dbgs()));
-      
-      // Repair SSA after second insertion (for %3, the increment result)
-      Register NewReg2 = Updater.repairSSAForNewDef(NewDefMI2, IncrementReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair #2 created new register: %" << NewReg2.virtRegIndex() << "\n");
-      
-      // === Verification ===
-      LLVM_DEBUG(dbgs() << "\n=== Verification ===\n");
-      
-      // Print final MIR
-      LLVM_DEBUG(dbgs() << "Final BB3 (latch):\n");
-      LLVM_DEBUG(for (MachineInstr &MI : *BB3) {
-        MI.print(dbgs());
-      });
-      
-      // 1. Should have PHI for 32-bit X.sub0 at BB3 (diamond join)
-      bool FoundSub0PHI = false;
-      for (MachineInstr &MI : *BB3) {
-        if (MI.isPHI()) {
-          Register PHIResult = MI.getOperand(0).getReg();
-          if (PHIResult != Register::index2VirtReg(3)) { // Not the increment result PHI
-            FoundSub0PHI = true;
-            LLVM_DEBUG(dbgs() << "Found sub0 PHI in BB3: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-          }
+    }
+    ASSERT_NE(InsertPoint1, nullptr) << "Should find branch in BB5";
+
+    // Create first new def: X.sub0 = 99
+    MachineInstrBuilder MIB1 =
+        BuildMI(*BB5, InsertPoint1, DebugLoc(), TII->get(MovOpcode))
+            .addReg(OrigReg, RegState::Define, Sub0Idx)
+            .addImm(99)
+            .addReg(ExecReg, RegState::Implicit);
+
+    MachineInstr &NewDefMI1 = *MIB1;
+    LLVM_DEBUG(dbgs() << "Created first def in BB5: ");
+    LLVM_DEBUG(NewDefMI1.print(dbgs()));
+
+    // Create SSA updater and repair after first insertion
+    MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+    Register NewReg1 = Updater.repairSSAForNewDef(NewDefMI1, OrigReg);
+
+    LLVM_DEBUG(dbgs() << "SSA repair #1 created new register: %"
+                      << NewReg1.virtRegIndex() << "\n");
+
+    // === SECOND INSERTION: X.sub1 in BB3 (after increment) ===
+    LLVM_DEBUG(
+        dbgs()
+        << "\n=== Second insertion: X.sub1 in BB3 (after increment) ===\n");
+
+    // Find the increment instruction in BB3 (look for vreg_64 def)
+    MachineInstr *IncrementMI = nullptr;
+    Register IncrementReg;
+    for (MachineInstr &MI : *BB3) {
+      if (!MI.isPHI() && MI.getNumOperands() > 0 && MI.getOperand(0).isReg() &&
+          MI.getOperand(0).isDef()) {
+        Register DefReg = MI.getOperand(0).getReg();
+        if (DefReg.isVirtual() && DefReg == Register::index2VirtReg(3)) {
+          IncrementMI = &MI;
+          IncrementReg = DefReg; // This is %3
+          LLVM_DEBUG(dbgs() << "Found increment: ");
+          LLVM_DEBUG(MI.print(dbgs()));
+          break;
         }
       }
-      EXPECT_TRUE(FoundSub0PHI) << "Should have PHI for sub0 lane in BB3";
-      
-      // 2. Should have REG_SEQUENCE in BB3 before backedge to compose full 64-bit
-      bool FoundREGSEQ = false;
-      for (MachineInstr &MI : *BB3) {
-        if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
-          FoundREGSEQ = true;
-          LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE in BB3: ");
+    }
+    ASSERT_NE(IncrementMI, nullptr)
+        << "Should find increment (def of %3) in BB3";
+    ASSERT_TRUE(IncrementReg.isValid()) << "Increment register should be valid";
+
+    // Create second new def: %3.sub1 = 200 (redefine increment result's sub1)
+    MachineBasicBlock::iterator InsertPoint2 =
+        std::next(IncrementMI->getIterator());
+    MachineInstrBuilder MIB2 =
+        BuildMI(*BB3, InsertPoint2, DebugLoc(), TII->get(MovOpcode))
+            .addReg(IncrementReg, RegState::Define,
+                    Sub1Idx) // Redefine %3.sub1, not %0.sub1!
+            .addImm(200)
+            .addReg(ExecReg, RegState::Implicit);
+
+    MachineInstr &NewDefMI2 = *MIB2;
+    LLVM_DEBUG(dbgs() << "Created second def in BB3 (redefining %3.sub1): ");
+    LLVM_DEBUG(NewDefMI2.print(dbgs()));
+
+    // Repair SSA after second insertion (for %3, the increment result)
+    Register NewReg2 = Updater.repairSSAForNewDef(NewDefMI2, IncrementReg);
+
+    LLVM_DEBUG(dbgs() << "SSA repair #2 created new register: %"
+                      << NewReg2.virtRegIndex() << "\n");
+
+    // === Verification ===
+    LLVM_DEBUG(dbgs() << "\n=== Verification ===\n");
+
+    // Print final MIR
+    LLVM_DEBUG(dbgs() << "Final BB3 (latch):\n");
+    LLVM_DEBUG(for (MachineInstr &MI : *BB3) { MI.print(dbgs()); });
+
+    // 1. Should have PHI for 32-bit X.sub0 at BB3 (diamond join)
+    bool FoundSub0PHI = false;
+    for (MachineInstr &MI : *BB3) {
+      if (MI.isPHI()) {
+        Register PHIResult = MI.getOperand(0).getReg();
+        if (PHIResult !=
+            Register::index2VirtReg(3)) { // Not the increment result PHI
+          FoundSub0PHI = true;
+          LLVM_DEBUG(dbgs() << "Found sub0 PHI in BB3: ");
           LLVM_DEBUG(MI.print(dbgs()));
-          
-          // Verify it composes both lanes
-          unsigned NumSources = (MI.getNumOperands() - 1) / 2;
-          EXPECT_GE(NumSources, 2u) << "REG_SEQUENCE should have at least 2 sources (sub0 and sub1)";
         }
       }
-      
-      EXPECT_TRUE(FoundREGSEQ) << "Should have REG_SEQUENCE at backedge in BB3";
-      
-      // 3. Verify LiveIntervals
-      EXPECT_TRUE(LIS.hasInterval(NewReg1));
-      EXPECT_TRUE(LIS.hasInterval(NewReg2));
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+    }
+    EXPECT_TRUE(FoundSub0PHI) << "Should have PHI for sub0 lane in BB3";
+
+    // 2. Should have REG_SEQUENCE in BB3 before backedge to compose full 64-bit
+    bool FoundREGSEQ = false;
+    for (MachineInstr &MI : *BB3) {
+      if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
+        FoundREGSEQ = true;
+        LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE in BB3: ");
+        LLVM_DEBUG(MI.print(dbgs()));
+
+        // Verify it composes both lanes
+        unsigned NumSources = (MI.getNumOperands() - 1) / 2;
+        EXPECT_GE(NumSources, 2u)
+            << "REG_SEQUENCE should have at least 2 sources (sub0 and sub1)";
+      }
+    }
+
+    EXPECT_TRUE(FoundREGSEQ) << "Should have REG_SEQUENCE at backedge in BB3";
+
+    // 3. Verify LiveIntervals
+    EXPECT_TRUE(LIS.hasInterval(NewReg1));
+    EXPECT_TRUE(LIS.hasInterval(NewReg2));
+
+    // Verify the MachineFunction is still valid
+    EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+        << "MachineFunction verification failed";
+  });
 }
 
 // Test 8: Nested loops with SSA repair across multiple loop levels
-// This tests SSA repair with a new definition in an inner loop body that propagates
-// to both the inner loop header and outer loop header PHIs.
-// Test nested loops with SSA repair across multiple loop levels
+// This tests SSA repair with a new definition in an inner loop body that
+// propagates to both the inner loop header and outer loop header PHIs,
+// exercising the repair across both loop levels.
 //
 // CFG Structure:
 //         BB0 (entry, %0 = 100)
@@ -1689,163 +1793,173 @@ body: |
     ; Exit
     S_ENDPGM 0
 ...
-)MIR")).toNullTerminatedStringRef(S);
-
-  doTest<LiveIntervalsWrapperPass>(MIRString,
-             [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      LLVM_DEBUG(dbgs() << "\n=== NestedLoopsWithSSARepair Test ===\n");
-      
-      // Get basic blocks
-      auto BBI = MF.begin();
-      MachineBasicBlock *BB0 = &*BBI++;  // Entry
-      MachineBasicBlock *BB1 = &*BBI++;  // Outer loop header
-      MachineBasicBlock *BB2 = &*BBI++;  // Inner loop header
-      MachineBasicBlock *BB3 = &*BBI++;  // Inner loop body (INSERT HERE)
-      MachineBasicBlock *BB4 = &*BBI++;  // Outer loop body (after inner)
-      // BB5 = Exit (not needed)
-      
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      
-      // Get the register that will be redefined (%0 is the initial value)
-      Register OrigReg = Register::index2VirtReg(0);
-      ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
-      
-      LLVM_DEBUG(dbgs() << "Original register: %" << OrigReg.virtRegIndex() << "\n");
-      
-      // Get V_MOV opcode and EXEC register
-      MachineInstr *MovInst = &*BB0->begin();
-      unsigned MovOpcode = MovInst->getOpcode();
-      Register ExecReg = MovInst->getOperand(2).getReg();
-      
-      // Print initial state
-      LLVM_DEBUG(dbgs() << "\nInitial BB2 (inner loop header):\n");
-      for (MachineInstr &MI : *BB2) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      LLVM_DEBUG(dbgs() << "\nInitial BB1 (outer loop header):\n");
-      for (MachineInstr &MI : *BB1) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      // Insert new definition in BB3 (inner loop body)
-      // Find insertion point before the branch
-      MachineInstr *InsertPt = nullptr;
-      for (MachineInstr &MI : *BB3) {
-        if (MI.isBranch()) {
-          InsertPt = &MI;
-          break;
+)MIR"))
+                            .toNullTerminatedStringRef(S);
+
+  doTest<LiveIntervalsWrapperPass>(
+      MIRString, [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        LLVM_DEBUG(dbgs() << "\n=== NestedLoopsWithSSARepair Test ===\n");
+
+        // Get basic blocks
+        auto BBI = MF.begin();
+        MachineBasicBlock *BB0 = &*BBI++; // Entry
+        MachineBasicBlock *BB1 = &*BBI++; // Outer loop header
+        MachineBasicBlock *BB2 = &*BBI++; // Inner loop header
+        MachineBasicBlock *BB3 = &*BBI++; // Inner loop body (INSERT HERE)
+        MachineBasicBlock *BB4 = &*BBI++; // Outer loop body (after inner)
+        // BB5 = Exit (not needed)
+
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+
+        // Get the register that will be redefined (%0 is the initial value)
+        Register OrigReg = Register::index2VirtReg(0);
+        ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
+
+        LLVM_DEBUG(dbgs() << "Original register: %" << OrigReg.virtRegIndex()
+                          << "\n");
+
+        // Get V_MOV opcode and EXEC register
+        MachineInstr *MovInst = &*BB0->begin();
+        unsigned MovOpcode = MovInst->getOpcode();
+        Register ExecReg = MovInst->getOperand(2).getReg();
+
+        // Print initial state
+        LLVM_DEBUG(dbgs() << "\nInitial BB2 (inner loop header):\n");
+        for (MachineInstr &MI : *BB2) {
+          LLVM_DEBUG(MI.print(dbgs()));
         }
-      }
-      ASSERT_NE(InsertPt, nullptr) << "Should find branch in BB3";
-      
-      // Insert: X = 999 (violates SSA)
-      MachineInstr *NewDefMI = BuildMI(*BB3, InsertPt, DebugLoc(),
-                                        TII->get(MovOpcode), OrigReg)
-          .addImm(999)
-          .addReg(ExecReg, RegState::Implicit);
-      
-      LLVM_DEBUG(dbgs() << "\nInserted new def in BB3 (inner loop body): ");
-      LLVM_DEBUG(NewDefMI->print(dbgs()));
-      
-      // Create SSA updater and repair
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << "\n");
-      
-      // === Verification ===
-      LLVM_DEBUG(dbgs() << "\n=== Verification ===\n");
-      
-      LLVM_DEBUG(dbgs() << "\nFinal BB2 (inner loop header):\n");
-      for (MachineInstr &MI : *BB2) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      LLVM_DEBUG(dbgs() << "\nFinal BB1 (outer loop header):\n");
-      for (MachineInstr &MI : *BB1) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      LLVM_DEBUG(dbgs() << "\nFinal BB4 (outer loop body after inner):\n");
-      for (MachineInstr &MI : *BB4) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      // 1. Inner loop header (BB2) should have NEW PHI created by SSA repair
-      bool FoundSSARepairPHI = false;
-      Register SSARepairPHIReg;
-      for (MachineInstr &MI : *BB2) {
-        if (MI.isPHI()) {
-          // Look for a PHI that has NewReg as one of its incoming values
-          for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-            Register IncomingReg = MI.getOperand(i).getReg();
-            MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
-            
-            if (IncomingMBB == BB3 && IncomingReg == NewReg) {
-              FoundSSARepairPHI = true;
-              SSARepairPHIReg = MI.getOperand(0).getReg();
-              LLVM_DEBUG(dbgs() << "Found SSA repair PHI in inner loop header: ");
-              LLVM_DEBUG(MI.print(dbgs()));
-              
-              // Should have incoming from BB1 and BB3
-              unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-              EXPECT_EQ(NumIncoming, 2u) << "SSA repair PHI should have 2 incoming";
-              break;
-            }
-          }
-          if (FoundSSARepairPHI)
+
+        LLVM_DEBUG(dbgs() << "\nInitial BB1 (outer loop header):\n");
+        for (MachineInstr &MI : *BB1) {
+          LLVM_DEBUG(MI.print(dbgs()));
+        }
+
+        // Insert new definition in BB3 (inner loop body)
+        // Find insertion point before the branch
+        MachineInstr *InsertPt = nullptr;
+        for (MachineInstr &MI : *BB3) {
+          if (MI.isBranch()) {
+            InsertPt = &MI;
             break;
+          }
         }
-      }
-      EXPECT_TRUE(FoundSSARepairPHI) << "Should find SSA repair PHI in BB2 (inner loop header)";
-      
-      // 2. Outer loop header (BB1) may have PHI updated if needed
-      bool FoundOuterPHI = false;
-      for (MachineInstr &MI : *BB1) {
-        if (MI.isPHI() && MI.getOperand(0).getReg() == Register::index2VirtReg(1)) {
-          FoundOuterPHI = true;
-          LLVM_DEBUG(dbgs() << "Found outer loop PHI: ");
+        ASSERT_NE(InsertPt, nullptr) << "Should find branch in BB3";
+
+        // Insert: X = 999 (violates SSA)
+        MachineInstr *NewDefMI =
+            BuildMI(*BB3, InsertPt, DebugLoc(), TII->get(MovOpcode), OrigReg)
+                .addImm(999)
+                .addReg(ExecReg, RegState::Implicit);
+
+        LLVM_DEBUG(dbgs() << "\nInserted new def in BB3 (inner loop body): ");
+        LLVM_DEBUG(NewDefMI->print(dbgs()));
+
+        // Create SSA updater and repair
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << "\n");
+
+        // === Verification ===
+        LLVM_DEBUG(dbgs() << "\n=== Verification ===\n");
+
+        LLVM_DEBUG(dbgs() << "\nFinal BB2 (inner loop header):\n");
+        for (MachineInstr &MI : *BB2) {
           LLVM_DEBUG(MI.print(dbgs()));
         }
-      }
-      EXPECT_TRUE(FoundOuterPHI) << "Should find outer loop PHI in BB1";
-      
-      // 3. Use in BB4 should be updated
-      bool FoundUseInBB4 = false;
-      for (MachineInstr &MI : *BB4) {
-        if (!MI.isPHI() && MI.getNumOperands() > 1) {
-          for (unsigned i = 0; i < MI.getNumOperands(); ++i) {
-            if (MI.getOperand(i).isReg() && MI.getOperand(i).isUse()) {
-              Register UseReg = MI.getOperand(i).getReg();
-              if (UseReg.isVirtual()) {
-                FoundUseInBB4 = true;
-                LLVM_DEBUG(dbgs() << "Found use in BB4: %" << UseReg.virtRegIndex() << " in ");
+
+        LLVM_DEBUG(dbgs() << "\nFinal BB1 (outer loop header):\n");
+        for (MachineInstr &MI : *BB1) {
+          LLVM_DEBUG(MI.print(dbgs()));
+        }
+
+        LLVM_DEBUG(dbgs() << "\nFinal BB4 (outer loop body after inner):\n");
+        for (MachineInstr &MI : *BB4) {
+          LLVM_DEBUG(MI.print(dbgs()));
+        }
+
+        // 1. Inner loop header (BB2) should have NEW PHI created by SSA repair
+        bool FoundSSARepairPHI = false;
+        Register SSARepairPHIReg;
+        for (MachineInstr &MI : *BB2) {
+          if (MI.isPHI()) {
+            // Look for a PHI that has NewReg as one of its incoming values
+            for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
+              Register IncomingReg = MI.getOperand(i).getReg();
+              MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
+
+              if (IncomingMBB == BB3 && IncomingReg == NewReg) {
+                FoundSSARepairPHI = true;
+                SSARepairPHIReg = MI.getOperand(0).getReg();
+                LLVM_DEBUG(dbgs()
+                           << "Found SSA repair PHI in inner loop header: ");
                 LLVM_DEBUG(MI.print(dbgs()));
+
+                // Should have incoming from BB1 and BB3
+                unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+                EXPECT_EQ(NumIncoming, 2u)
+                    << "SSA repair PHI should have 2 incoming";
+                break;
               }
             }
+            if (FoundSSARepairPHI)
+              break;
           }
         }
-      }
-      EXPECT_TRUE(FoundUseInBB4) << "Should find uses in outer loop body (BB4)";
-      
-      // 4. Verify LiveIntervals
-      EXPECT_TRUE(LIS.hasInterval(NewReg));
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundSSARepairPHI)
+            << "Should find SSA repair PHI in BB2 (inner loop header)";
+
+        // 2. Outer loop header (BB1) should have the PHI defining %1
+        bool FoundOuterPHI = false;
+        for (MachineInstr &MI : *BB1) {
+          if (MI.isPHI() &&
+              MI.getOperand(0).getReg() == Register::index2VirtReg(1)) {
+            FoundOuterPHI = true;
+            LLVM_DEBUG(dbgs() << "Found outer loop PHI: ");
+            LLVM_DEBUG(MI.print(dbgs()));
+          }
+        }
+        EXPECT_TRUE(FoundOuterPHI) << "Should find outer loop PHI in BB1";
+
+        // 3. BB4 should contain at least one virtual-register use
+        bool FoundUseInBB4 = false;
+        for (MachineInstr &MI : *BB4) {
+          if (!MI.isPHI() && MI.getNumOperands() > 1) {
+            for (unsigned i = 0; i < MI.getNumOperands(); ++i) {
+              if (MI.getOperand(i).isReg() && MI.getOperand(i).isUse()) {
+                Register UseReg = MI.getOperand(i).getReg();
+                if (UseReg.isVirtual()) {
+                  FoundUseInBB4 = true;
+                  LLVM_DEBUG(dbgs() << "Found use in BB4: %"
+                                    << UseReg.virtRegIndex() << " in ");
+                  LLVM_DEBUG(MI.print(dbgs()));
+                }
+              }
+            }
+          }
+        }
+        EXPECT_TRUE(FoundUseInBB4)
+            << "Should find uses in outer loop body (BB4)";
+
+        // 4. Verify LiveIntervals
+        EXPECT_TRUE(LIS.hasInterval(NewReg));
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
 //===----------------------------------------------------------------------===//
 // Test 9: 128-bit register with 64-bit subreg redef and multiple lane uses
 //
 // This comprehensive test covers:
-// 1. Large register (128-bit) with multiple subregisters (sub0, sub1, sub2, sub3)
+// 1. Large register (128-bit) with multiple subregisters (sub0, sub1, sub2,
+//    sub3)
 // 2. Partial redefinition (64-bit sub2_3 covering two lanes: sub2+sub3)
 // 3. Uses of changed lanes (sub2, sub3) in different paths
 // 4. Uses of unchanged lanes (sub0, sub1) in different paths
@@ -1864,7 +1978,8 @@ body: |
 //      use   use
 //      sub0  sub3 (changed)
 //        \   /
-//         BB5 (join) -> PHI for sub2_3 lanes (sub2+sub3 changed, sub0+sub1 unchanged)
+//         BB5 (join) -> PHI for sub2_3 lanes (sub2+sub3 changed, sub0+sub1
+//                       unchanged)
 //          |
 //         use sub1 (unchanged, flows from BB1)
 //          |
@@ -1974,161 +2089,176 @@ body: |
     dead %5:vgpr_32 = V_MOV_B32_e32 %0.sub0:vreg_128, implicit $exec
     S_ENDPGM 0
 ...
-)MIR")).toNullTerminatedStringRef(S);
-
-  doTest<LiveIntervalsWrapperPass>(MIRString,
-             [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      LLVM_DEBUG(dbgs() << "\n=== MultipleSubregUsesAcrossDiamonds Test ===\n");
-      
-      // Get basic blocks
-      auto BBI = MF.begin();
-      ++BBI; // Skip BB0 (entry)
-      ++BBI; // Skip BB1 (Initial def)
-      ++BBI; // Skip BB2 (Diamond1 split)
-      MachineBasicBlock *BB3 = &*BBI++; // Diamond1 true (no redef)
-      MachineBasicBlock *BB4 = &*BBI++; // Diamond1 false (INSERT HERE)
-      MachineBasicBlock *BB5 = &*BBI++; // Diamond1 join
-      
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      (void)MRI; // May be unused, suppress warning
-      
-      // Find the 128-bit register %0
-      Register OrigReg = Register::index2VirtReg(0);
-      ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
-      
-      LLVM_DEBUG(dbgs() << "Using 128-bit register: %" << OrigReg.virtRegIndex() << "\n");
-      
-      // Find sub2_3 subregister index (64-bit covering bits 64-127)
-      unsigned Sub2_3Idx = 0;
-      for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
-        unsigned SubRegSize = TRI->getSubRegIdxSize(Idx);
-        LaneBitmask Mask = TRI->getSubRegIndexLaneMask(Idx);
-        
-        // Looking for 64-bit subreg covering upper half (lanes for sub2+sub3)
-        // sub2_3 should have mask 0xF0 (lanes for bits 64-127)
-        if (SubRegSize == 64 && (Mask.getAsInteger() & 0xF0) == 0xF0) {
-          Sub2_3Idx = Idx;
-          LLVM_DEBUG(dbgs() << "Found sub2_3 index: " << Idx
-                       << " (size=" << SubRegSize
-                       << ", mask=0x" << llvm::format("%X", Mask.getAsInteger()) << ")\n");
-          break;
+)MIR"))
+                            .toNullTerminatedStringRef(S);
+
+  doTest<LiveIntervalsWrapperPass>(
+      MIRString, [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
+        LiveIntervals &LIS = LISWrapper.getLIS();
+        MachineDominatorTree MDT(MF);
+        LLVM_DEBUG(
+            dbgs() << "\n=== MultipleSubregUsesAcrossDiamonds Test ===\n");
+
+        // Get basic blocks
+        auto BBI = MF.begin();
+        ++BBI;                            // Skip BB0 (entry)
+        ++BBI;                            // Skip BB1 (Initial def)
+        ++BBI;                            // Skip BB2 (Diamond1 split)
+        MachineBasicBlock *BB3 = &*BBI++; // Diamond1 true (no redef)
+        MachineBasicBlock *BB4 = &*BBI++; // Diamond1 false (INSERT HERE)
+        MachineBasicBlock *BB5 = &*BBI++; // Diamond1 join
+
+        const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+        const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+        MachineRegisterInfo &MRI = MF.getRegInfo();
+        (void)MRI; // May be unused, suppress warning
+
+        // Find the 128-bit register %0
+        Register OrigReg = Register::index2VirtReg(0);
+        ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
+
+        LLVM_DEBUG(dbgs() << "Using 128-bit register: %"
+                          << OrigReg.virtRegIndex() << "\n");
+
+        // Find sub2_3 subregister index (64-bit covering bits 64-127)
+        unsigned Sub2_3Idx = 0;
+        for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
+          unsigned SubRegSize = TRI->getSubRegIdxSize(Idx);
+          LaneBitmask Mask = TRI->getSubRegIndexLaneMask(Idx);
+
+          // Looking for 64-bit subreg covering upper half (lanes for sub2+sub3)
+          // sub2_3 should have mask 0xF0 (lanes for bits 64-127)
+          if (SubRegSize == 64 && (Mask.getAsInteger() & 0xF0) == 0xF0) {
+            Sub2_3Idx = Idx;
+            LLVM_DEBUG(dbgs()
+                       << "Found sub2_3 index: " << Idx
+                       << " (size=" << SubRegSize << ", mask=0x"
+                       << llvm::format("%X", Mask.getAsInteger()) << ")\n");
+            break;
+          }
         }
-      }
-      
-      ASSERT_NE(Sub2_3Idx, 0u) << "Should find sub2_3 subregister index";
-      
-      // Insert new definition in BB4: %0.sub2_3 = IMPLICIT_DEF
-      // Find insertion point (before the use of sub3)
-      MachineInstr *UseOfSub3 = nullptr;
-      
-      for (MachineInstr &MI : *BB4) {
-        if (MI.getNumOperands() >= 2 && MI.getOperand(0).isReg() && 
-            MI.getOperand(1).isReg() && MI.getOperand(1).getReg() == OrigReg) {
-          UseOfSub3 = &MI;
-          break;
+
+        ASSERT_NE(Sub2_3Idx, 0u) << "Should find sub2_3 subregister index";
+
+        // Insert new definition in BB4: %0.sub2_3 = IMPLICIT_DEF
+        // Find insertion point (before the use of sub3)
+        MachineInstr *UseOfSub3 = nullptr;
+
+        for (MachineInstr &MI : *BB4) {
+          if (MI.getNumOperands() >= 2 && MI.getOperand(0).isReg() &&
+              MI.getOperand(1).isReg() &&
+              MI.getOperand(1).getReg() == OrigReg) {
+            UseOfSub3 = &MI;
+            break;
+          }
         }
-      }
-      ASSERT_NE(UseOfSub3, nullptr) << "Should find use of sub3 in BB4";
-      
-      // Create new def: %0.sub2_3 = IMPLICIT_DEF
-      // We use IMPLICIT_DEF because it works for any register size and the SSA updater
-      // doesn't care about the specific instruction semantics - we're just testing SSA repair
-      MachineInstrBuilder MIB = BuildMI(*BB4, UseOfSub3, DebugLoc(), 
-                                         TII->get(TargetOpcode::IMPLICIT_DEF))
-        .addDef(OrigReg, RegState::Define, Sub2_3Idx);
-      
-      MachineInstr *NewDefMI = MIB.getInstr();
-      LLVM_DEBUG(dbgs() << "Inserted new def in BB4: ");
-      LLVM_DEBUG(NewDefMI->print(dbgs()));
-      
-      // Index the new instruction
-      LIS.InsertMachineInstrInMaps(*NewDefMI);
-      
-      // Set MachineFunction properties to allow PHI insertion
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Create SSA updater and repair
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << "\n");
-      
-      // Print final state of key blocks
-      LLVM_DEBUG(dbgs() << "\nFinal BB5 (diamond1 join):\n");
-      for (MachineInstr &MI : *BB5) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      // Verify SSA repair results
-      
-      // 1. Should have PHI in BB5 for sub2+sub3 lanes
-      bool FoundPHI = false;
-      for (MachineInstr &MI : *BB5) {
-        if (MI.isPHI()) {
-          Register PHIResult = MI.getOperand(0).getReg();
-          if (PHIResult.isVirtual()) {
-            LLVM_DEBUG(dbgs() << "Found PHI in BB5: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-            
-            // Check that it has 2 incoming values
-            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-            EXPECT_EQ(NumIncoming, 2u) << "PHI should have 2 incoming values";
-            
-            // Check that one incoming is the new register from BB4
-            // and the other incoming from BB3 uses %0.sub2_3
-            bool HasNewRegFromBB4 = false;
-            bool HasCorrectSubregFromBB3 = false;
-            for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-              Register IncomingReg = MI.getOperand(i).getReg();
-              unsigned IncomingSubReg = MI.getOperand(i).getSubReg();
-              MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
-              
-              if (IncomingMBB == BB4) {
-                HasNewRegFromBB4 = (IncomingReg == NewReg);
-                LLVM_DEBUG(dbgs() << "  Incoming from BB4: %" << IncomingReg.virtRegIndex() << "\n");
-              } else if (IncomingMBB == BB3) {
-                // Should be %0.sub2_3 (the lanes we redefined)
-                LLVM_DEBUG(dbgs() << "  Incoming from BB3: %" << IncomingReg.virtRegIndex());
-                if (IncomingSubReg) {
-                  LLVM_DEBUG(dbgs() << "." << TRI->getSubRegIndexName(IncomingSubReg));
-                }
-                LLVM_DEBUG(dbgs() << "\n");
-                
-                // Verify it's using sub2_3
-                if (IncomingReg == OrigReg && IncomingSubReg == Sub2_3Idx) {
-                  HasCorrectSubregFromBB3 = true;
+        ASSERT_NE(UseOfSub3, nullptr) << "Should find use of sub3 in BB4";
+
+        // Create new def: %0.sub2_3 = IMPLICIT_DEF
+        // We use IMPLICIT_DEF because it works for any register size and the
+        // SSA updater doesn't care about the specific instruction semantics -
+        // we're just testing SSA repair
+        MachineInstrBuilder MIB =
+            BuildMI(*BB4, UseOfSub3, DebugLoc(),
+                    TII->get(TargetOpcode::IMPLICIT_DEF))
+                .addDef(OrigReg, RegState::Define, Sub2_3Idx);
+
+        MachineInstr *NewDefMI = MIB.getInstr();
+        LLVM_DEBUG(dbgs() << "Inserted new def in BB4: ");
+        LLVM_DEBUG(NewDefMI->print(dbgs()));
+
+        // Index the new instruction
+        LIS.InsertMachineInstrInMaps(*NewDefMI);
+
+        // Set MachineFunction properties to allow PHI insertion
+        MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+        MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+        // Create SSA updater and repair
+        MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+        Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+        LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                          << NewReg.virtRegIndex() << "\n");
+
+        // Print final state of key blocks
+        LLVM_DEBUG(dbgs() << "\nFinal BB5 (diamond1 join):\n");
+        for (MachineInstr &MI : *BB5) {
+          LLVM_DEBUG(MI.print(dbgs()));
+        }
+
+        // Verify SSA repair results
+
+        // 1. Should have PHI in BB5 for sub2+sub3 lanes
+        bool FoundPHI = false;
+        for (MachineInstr &MI : *BB5) {
+          if (MI.isPHI()) {
+            Register PHIResult = MI.getOperand(0).getReg();
+            if (PHIResult.isVirtual()) {
+              LLVM_DEBUG(dbgs() << "Found PHI in BB5: ");
+              LLVM_DEBUG(MI.print(dbgs()));
+
+              // Check that it has 2 incoming values
+              unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+              EXPECT_EQ(NumIncoming, 2u) << "PHI should have 2 incoming values";
+
+              // Check that one incoming is the new register from BB4
+              // and the other incoming from BB3 uses %0.sub2_3
+              bool HasNewRegFromBB4 = false;
+              bool HasCorrectSubregFromBB3 = false;
+              for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
+                Register IncomingReg = MI.getOperand(i).getReg();
+                unsigned IncomingSubReg = MI.getOperand(i).getSubReg();
+                MachineBasicBlock *IncomingMBB = MI.getOperand(i + 1).getMBB();
+
+                if (IncomingMBB == BB4) {
+                  HasNewRegFromBB4 = (IncomingReg == NewReg);
+                  LLVM_DEBUG(dbgs() << "  Incoming from BB4: %"
+                                    << IncomingReg.virtRegIndex() << "\n");
+                } else if (IncomingMBB == BB3) {
+                  // Should be %0.sub2_3 (the lanes we redefined)
+                  LLVM_DEBUG(dbgs() << "  Incoming from BB3: %"
+                                    << IncomingReg.virtRegIndex());
+                  if (IncomingSubReg) {
+                    LLVM_DEBUG(dbgs()
+                               << "."
+                               << TRI->getSubRegIndexName(IncomingSubReg));
+                  }
+                  LLVM_DEBUG(dbgs() << "\n");
+
+                  // Verify it's using sub2_3
+                  if (IncomingReg == OrigReg && IncomingSubReg == Sub2_3Idx) {
+                    HasCorrectSubregFromBB3 = true;
+                  }
                 }
               }
+              EXPECT_TRUE(HasNewRegFromBB4) << "PHI should use NewReg from BB4";
+              EXPECT_TRUE(HasCorrectSubregFromBB3)
+                  << "PHI should use %0.sub2_3 from BB3";
+              FoundPHI = true;
             }
-            EXPECT_TRUE(HasNewRegFromBB4) << "PHI should use NewReg from BB4";
-            EXPECT_TRUE(HasCorrectSubregFromBB3) << "PHI should use %0.sub2_3 from BB3";
-            FoundPHI = true;
           }
         }
-      }
-      EXPECT_TRUE(FoundPHI) << "Should find PHI in BB5 for sub2_3 lanes";
-      
-      // 2. Verify LiveIntervals
-      EXPECT_TRUE(LIS.hasInterval(NewReg));
-      EXPECT_TRUE(LIS.hasInterval(OrigReg));
-      
-      // 3. Verify LiveInterval for OrigReg has subranges for changed lanes
-      LiveInterval &OrigLI = LIS.getInterval(OrigReg);
-      EXPECT_TRUE(OrigLI.hasSubRanges()) << "OrigReg should have subranges after partial redef";
-      
-      // Verify the MachineFunction is still valid
-      EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
-          << "MachineFunction verification failed";
-    });
+        EXPECT_TRUE(FoundPHI) << "Should find PHI in BB5 for sub2_3 lanes";
+
+        // 2. Verify LiveIntervals
+        EXPECT_TRUE(LIS.hasInterval(NewReg));
+        EXPECT_TRUE(LIS.hasInterval(OrigReg));
+
+        // 3. Verify LiveInterval for OrigReg has subranges for changed lanes
+        LiveInterval &OrigLI = LIS.getInterval(OrigReg);
+        EXPECT_TRUE(OrigLI.hasSubRanges())
+            << "OrigReg should have subranges after partial redef";
+
+        // Verify the MachineFunction is still valid
+        EXPECT_TRUE(MF.verify(nullptr, nullptr, nullptr, false))
+            << "MachineFunction verification failed";
+      });
 }
 
-// Test 10: Non-contiguous lane mask - redefine sub1 of 128-bit, use full register
-// This specifically tests the multi-source REG_SEQUENCE code path for non-contiguous lanes
+// Test 10: Non-contiguous lane mask - redefine sub1 of 128-bit, use full
+// register. This specifically tests the multi-source REG_SEQUENCE code path
+// for non-contiguous lanes.
 //
 // CFG Structure:
 //        BB0 (entry)
@@ -2150,9 +2280,11 @@ body: |
 //         v
 //       BB6 (exit)
 //
-// Key Property: Redefining sub1 leaves LanesFromOld = sub0 + sub2 + sub3 (non-contiguous!)
-//               This requires getCoveringSubRegsForLaneMask to decompose into multiple subregs
-//               Expected REG_SEQUENCE: %RS = REG_SEQUENCE %6, sub1, %0.sub0, sub0, %0.sub2_3, sub2_3
+// Key Property: Redefining sub1 leaves LanesFromOld = sub0 + sub2 + sub3
+// (non-contiguous!).
+//               This requires getCoveringSubRegsForLaneMask to decompose into
+//               multiple subregs. Expected REG_SEQUENCE:
+//               %RS = REG_SEQUENCE %6, sub1, %0.sub0, sub0, %0.sub2_3, sub2_3
 //
 TEST(MachineLaneSSAUpdaterTest, NonContiguousLaneMaskREGSEQUENCE) {
   SmallString<4096> S;
@@ -2197,171 +2329,185 @@ body: |
     dead %1:vreg_128 = COPY %0:vreg_128
     S_ENDPGM 0
 ...
-)MIR")).toNullTerminatedStringRef(S);
-
-  doTest<LiveIntervalsWrapperPass>(MIRString,
-             [](MachineFunction &MF, LiveIntervalsWrapperPass &LISWrapper) {
-      LiveIntervals &LIS = LISWrapper.getLIS();
-      MachineDominatorTree MDT(MF);
-      LLVM_DEBUG(dbgs() << "\n=== NonContiguousLaneMaskREGSEQUENCE Test ===\n");
-      
-      const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
-      const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
-      MachineRegisterInfo &MRI = MF.getRegInfo();
-      (void)MRI; // May be unused, suppress warning
-      
-      // Find blocks
-      // bb.0 = entry
-      // bb.1 = IMPLICIT_DEF + diamond split
-      // bb.2 = left path (no redef)
-      // bb.3 = right path (INSERT sub1 def here)
-      // bb.4 = diamond join (use full register)
-      MachineBasicBlock *BB3 = MF.getBlockNumbered(3);  // Right path - where we insert
-      MachineBasicBlock *BB4 = MF.getBlockNumbered(4);  // Join - where we need REG_SEQUENCE
-      
-      // Find %0 (the vreg_128)
-      Register OrigReg = Register::index2VirtReg(0);
-      ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
-      LLVM_DEBUG(dbgs() << "Using 128-bit register: %" << OrigReg.virtRegIndex() << "\n");
-      
-      // Find sub1 subregister index
-      unsigned Sub1Idx = 0;
-      for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
-        StringRef Name = TRI->getSubRegIndexName(Idx);
-        if (Name == "sub1") {
-          Sub1Idx = Idx;
-          break;
-        }
-      }
-      
-      ASSERT_NE(Sub1Idx, 0u) << "Should find sub1 subregister index";
-      
-      // Insert new definition in BB3 (right path): %0.sub1 = IMPLICIT_DEF
-      MachineInstrBuilder MIB = BuildMI(*BB3, BB3->getFirstNonPHI(), DebugLoc(), 
-                                         TII->get(TargetOpcode::IMPLICIT_DEF))
-        .addDef(OrigReg, RegState::Define, Sub1Idx);
-      
-      MachineInstr *NewDefMI = MIB.getInstr();
-      LLVM_DEBUG(dbgs() << "Inserted new def in BB3: ");
-      LLVM_DEBUG(NewDefMI->print(dbgs()));
-      
-      // Index the new instruction
-      LIS.InsertMachineInstrInMaps(*NewDefMI);
-      
-      // Set MachineFunction properties to allow PHI insertion
-      MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
-      MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
-      
-      // Create SSA updater and repair
-      MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
-      Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
-      
-      LLVM_DEBUG(dbgs() << "SSA repair created new register: %" << NewReg.virtRegIndex() << "\n");
-      
-      // Print final state
-      LLVM_DEBUG(dbgs() << "\nFinal BB4 (diamond join):\n");
-      for (MachineInstr &MI : *BB4) {
-        LLVM_DEBUG(MI.print(dbgs()));
-      }
-      
-      // Verify SSA repair results
-      
-      // 1. Should have PHI in BB4 for sub1 lane
-      bool FoundPHI = false;
-      Register PHIReg;
-      for (MachineInstr &MI : *BB4) {
-        if (MI.isPHI()) {
-          PHIReg = MI.getOperand(0).getReg();
-          if (PHIReg.isVirtual()) {
-            LLVM_DEBUG(dbgs() << "Found PHI in BB4: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-            FoundPHI = true;
-            
-            // Check that it has 2 incoming values
-            unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
-            EXPECT_EQ(NumIncoming, 2u) << "PHI should have 2 incoming values";
-            
-            // One incoming should be the new register (vgpr_32 from BB3)
-            bool HasNewRegFromBB3 = false;
-            for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-              if (MI.getOperand(i).isReg() && MI.getOperand(i).getReg() == NewReg) {
-                EXPECT_EQ(MI.getOperand(i + 1).getMBB(), BB3) << "NewReg should come from BB3";
-                HasNewRegFromBB3 = true;
-              }
-            }
-            EXPECT_TRUE(HasNewRegFromBB3) << "PHI should have NewReg from BB3";
-            
-            break;
-          }
-        }
+)MIR"))
+                            .toNullTerminatedStringRef(S);
+
+  doTest<LiveIntervalsWrapperPass>(MIRString, [](MachineFunction &MF,
+                                                 LiveIntervalsWrapperPass
+                                                     &LISWrapper) {
+    LiveIntervals &LIS = LISWrapper.getLIS();
+    MachineDominatorTree MDT(MF);
+    LLVM_DEBUG(dbgs() << "\n=== NonContiguousLaneMaskREGSEQUENCE Test ===\n");
+
+    const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
+    const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
+    MachineRegisterInfo &MRI = MF.getRegInfo();
+    (void)MRI; // May be unused, suppress warning
+
+    // Find blocks
+    // bb.0 = entry
+    // bb.1 = IMPLICIT_DEF + diamond split
+    // bb.2 = left path (no redef)
+    // bb.3 = right path (INSERT sub1 def here)
+    // bb.4 = diamond join (use full register)
+    MachineBasicBlock *BB3 =
+        MF.getBlockNumbered(3); // Right path - where we insert
+    MachineBasicBlock *BB4 =
+        MF.getBlockNumbered(4); // Join - where we need REG_SEQUENCE
+
+    // Find %0 (the vreg_128)
+    Register OrigReg = Register::index2VirtReg(0);
+    ASSERT_TRUE(OrigReg.isValid()) << "Register %0 should be valid";
+    LLVM_DEBUG(dbgs() << "Using 128-bit register: %" << OrigReg.virtRegIndex()
+                      << "\n");
+
+    // Find sub1 subregister index
+    unsigned Sub1Idx = 0;
+    for (unsigned Idx = 1; Idx < TRI->getNumSubRegIndices(); ++Idx) {
+      StringRef Name = TRI->getSubRegIndexName(Idx);
+      if (Name == "sub1") {
+        Sub1Idx = Idx;
+        break;
       }
-      
-      EXPECT_TRUE(FoundPHI) << "Should create PHI in BB4 for sub1 lane";
-      
-      // 2. Most importantly: Should have REG_SEQUENCE with MULTIPLE sources for non-contiguous lanes
-      // After PHI for sub1, we need to compose full register:
-      // LanesFromOld = sub0 + sub2 + sub3 (non-contiguous!)
-      // This requires multiple REG_SEQUENCE operands
-      bool FoundREGSEQUENCE = false;
-      unsigned NumREGSEQSources = 0;
-      
-      for (MachineInstr &MI : *BB4) {
-        if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
-          LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE: ");
+    }
+
+    ASSERT_NE(Sub1Idx, 0u) << "Should find sub1 subregister index";
+
+    // Insert new definition in BB3 (right path): %0.sub1 = IMPLICIT_DEF
+    MachineInstrBuilder MIB = BuildMI(*BB3, BB3->getFirstNonPHI(), DebugLoc(),
+                                      TII->get(TargetOpcode::IMPLICIT_DEF))
+                                  .addDef(OrigReg, RegState::Define, Sub1Idx);
+
+    MachineInstr *NewDefMI = MIB.getInstr();
+    LLVM_DEBUG(dbgs() << "Inserted new def in BB3: ");
+    LLVM_DEBUG(NewDefMI->print(dbgs()));
+
+    // Index the new instruction
+    LIS.InsertMachineInstrInMaps(*NewDefMI);
+
+    // Set MachineFunction properties to allow PHI insertion
+    MF.getProperties().set(MachineFunctionProperties::Property::IsSSA);
+    MF.getProperties().reset(MachineFunctionProperties::Property::NoPHIs);
+
+    // Create SSA updater and repair
+    MachineLaneSSAUpdater Updater(MF, LIS, MDT, *TRI);
+    Register NewReg = Updater.repairSSAForNewDef(*NewDefMI, OrigReg);
+
+    LLVM_DEBUG(dbgs() << "SSA repair created new register: %"
+                      << NewReg.virtRegIndex() << "\n");
+
+    // Print final state
+    LLVM_DEBUG(dbgs() << "\nFinal BB4 (diamond join):\n");
+    for (MachineInstr &MI : *BB4) {
+      LLVM_DEBUG(MI.print(dbgs()));
+    }
+
+    // Verify SSA repair results
+
+    // 1. Should have PHI in BB4 for sub1 lane
+    bool FoundPHI = false;
+    Register PHIReg;
+    for (MachineInstr &MI : *BB4) {
+      if (MI.isPHI()) {
+        PHIReg = MI.getOperand(0).getReg();
+        if (PHIReg.isVirtual()) {
+          LLVM_DEBUG(dbgs() << "Found PHI in BB4: ");
           LLVM_DEBUG(MI.print(dbgs()));
-          FoundREGSEQUENCE = true;
-          
-          // Count sources (each source is: register + subregidx, so pairs)
-          NumREGSEQSources = (MI.getNumOperands() - 1) / 2;
-          LLVM_DEBUG(dbgs() << "  REG_SEQUENCE has " << NumREGSEQSources << " sources\n");
-          
-          // We expect at least 2 sources for non-contiguous case:
-          // 1. PHI result covering sub1
-          // 2. One or more sources from OrigReg covering sub0, sub2, sub3
-          EXPECT_GE(NumREGSEQSources, 2u) 
-              << "REG_SEQUENCE should have multiple sources for non-contiguous lanes";
-          
-          // Verify at least one source is the PHI result
-          bool HasPHISource = false;
+          FoundPHI = true;
+
+          // Check that it has 2 incoming values
+          unsigned NumIncoming = (MI.getNumOperands() - 1) / 2;
+          EXPECT_EQ(NumIncoming, 2u) << "PHI should have 2 incoming values";
+
+          // One incoming should be the new register (vgpr_32 from BB3)
+          bool HasNewRegFromBB3 = false;
           for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
-            if (MI.getOperand(i).isReg() && MI.getOperand(i).getReg() == PHIReg) {
-              HasPHISource = true;
-              break;
+            if (MI.getOperand(i).isReg() &&
+                MI.getOperand(i).getReg() == NewReg) {
+              EXPECT_EQ(MI.getOperand(i + 1).getMBB(), BB3)
+                  << "NewReg should come from BB3";
+              HasNewRegFromBB3 = true;
             }
           }
-          EXPECT_TRUE(HasPHISource) << "REG_SEQUENCE should use PHI result";
-          
+          EXPECT_TRUE(HasNewRegFromBB3) << "PHI should have NewReg from BB3";
+
           break;
         }
       }
-      
-      EXPECT_TRUE(FoundREGSEQUENCE) 
-          << "Should create REG_SEQUENCE to compose full register from non-contiguous lanes";
-      
-      // 3. The COPY use should now reference the REG_SEQUENCE result (not %0)
-      bool FoundRewrittenUse = false;
-      for (MachineInstr &MI : *BB4) {
-        if (MI.getOpcode() == TargetOpcode::COPY) {
-          MachineOperand &SrcOp = MI.getOperand(1);
-          if (SrcOp.isReg() && SrcOp.getReg().isVirtual() && SrcOp.getReg() != OrigReg) {
-            LLVM_DEBUG(dbgs() << "Found rewritten COPY: ");
-            LLVM_DEBUG(MI.print(dbgs()));
-            FoundRewrittenUse = true;
+    }
+
+    EXPECT_TRUE(FoundPHI) << "Should create PHI in BB4 for sub1 lane";
+
+    // 2. Most importantly: Should have REG_SEQUENCE with MULTIPLE sources for
+    // non-contiguous lanes. After PHI for sub1, we need to compose full
+    // register: LanesFromOld = sub0 + sub2 + sub3 (non-contiguous!). This
+    // requires multiple REG_SEQUENCE operands.
+    bool FoundREGSEQUENCE = false;
+    unsigned NumREGSEQSources = 0;
+
+    for (MachineInstr &MI : *BB4) {
+      if (MI.getOpcode() == TargetOpcode::REG_SEQUENCE) {
+        LLVM_DEBUG(dbgs() << "Found REG_SEQUENCE: ");
+        LLVM_DEBUG(MI.print(dbgs()));
+        FoundREGSEQUENCE = true;
+
+        // Count sources (each source is: register + subregidx, so pairs)
+        NumREGSEQSources = (MI.getNumOperands() - 1) / 2;
+        LLVM_DEBUG(dbgs() << "  REG_SEQUENCE has " << NumREGSEQSources
+                          << " sources\n");
+
+        // We expect at least 2 sources for non-contiguous case:
+        // 1. PHI result covering sub1
+        // 2. One or more sources from OrigReg covering sub0, sub2, sub3
+        EXPECT_GE(NumREGSEQSources, 2u) << "REG_SEQUENCE should have multiple "
+                                           "sources for non-contiguous lanes";
+
+        // Verify at least one source is the PHI result
+        bool HasPHISource = false;
+        for (unsigned i = 1; i < MI.getNumOperands(); i += 2) {
+          if (MI.getOperand(i).isReg() && MI.getOperand(i).getReg() == PHIReg) {
+            HasPHISource = true;
             break;
           }
         }
+        EXPECT_TRUE(HasPHISource) << "REG_SEQUENCE should use PHI result";
+
+        break;
+      }
+    }
+
+    EXPECT_TRUE(FoundREGSEQUENCE) << "Should create REG_SEQUENCE to compose "
+                                     "full register from non-contiguous lanes";
+
+    // 3. The COPY use should now reference the REG_SEQUENCE result (not %0)
+    bool FoundRewrittenUse = false;
+    for (MachineInstr &MI : *BB4) {
+      if (MI.getOpcode() == TargetOpcode::COPY) {
+        MachineOperand &SrcOp = MI.getOperand(1);
+        if (SrcOp.isReg() && SrcOp.getReg().isVirtual() &&
+            SrcOp.getReg() != OrigReg) {
+          LLVM_DEBUG(dbgs() << "Found rewritten COPY: ");
+          LLVM_DEBUG(MI.print(dbgs()));
+          FoundRewrittenUse = true;
+          break;
+        }
       }
-      
-      EXPECT_TRUE(FoundRewrittenUse) << "COPY should be rewritten to use REG_SEQUENCE result";
-      
-      // Print summary
-      LLVM_DEBUG(dbgs() << "\n=== Test Summary ===\n");
-      LLVM_DEBUG(dbgs() << "✓ Redefined sub1 (middle lane) of vreg_128\n");
-      LLVM_DEBUG(dbgs() << "✓ Created PHI for sub1 lane\n");
-      LLVM_DEBUG(dbgs() << "✓ Created REG_SEQUENCE with " << NumREGSEQSources
-                   << " sources to handle non-contiguous lanes (sub0 + sub2 + sub3)\n");
-      LLVM_DEBUG(dbgs() << "✓ This test exercises getCoveringSubRegsForLaneMask!\n");
-    });
+    }
+
+    EXPECT_TRUE(FoundRewrittenUse)
+        << "COPY should be rewritten to use REG_SEQUENCE result";
+
+    // Print summary
+    LLVM_DEBUG(dbgs() << "\n=== Test Summary ===\n");
+    LLVM_DEBUG(dbgs() << "✓ Redefined sub1 (middle lane) of vreg_128\n");
+    LLVM_DEBUG(dbgs() << "✓ Created PHI for sub1 lane\n");
+    LLVM_DEBUG(
+        dbgs()
+        << "✓ Created REG_SEQUENCE with " << NumREGSEQSources
+        << " sources to handle non-contiguous lanes (sub0 + sub2 + sub3)\n");
+    LLVM_DEBUG(
+        dbgs() << "✓ This test exercises getCoveringSubRegsForLaneMask!\n");
+  });
 }
 
 } // anonymous namespace
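
The non-contiguous case exercised by Test 10 can be illustrated outside LLVM with a small, self-contained sketch. Everything in it is an assumption made for illustration: the `ToySubReg` table, the `coverLanes` helper, and the lane-mask values (two lane bits per 32-bit sub-register, chosen to agree with the `0xF0` mask the test expects for `sub2_3`). It shows one plausible greedy strategy for covering a lane mask with sub-register indices; the patch's `getCoveringSubRegsForLaneMask` may work differently.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy sub-register table. Names and masks are illustrative assumptions
// (two lane bits per 32-bit sub-register), not values read from a real target.
struct ToySubReg {
  std::string Name;
  uint64_t Mask;
};

static const std::vector<ToySubReg> &toySubRegs() {
  static const std::vector<ToySubReg> Table = {
      {"sub0", 0x03},   {"sub1", 0x0C},   {"sub2", 0x30},   {"sub3", 0xC0},
      {"sub0_1", 0x0F}, {"sub1_2", 0x3C}, {"sub2_3", 0xF0},
  };
  return Table;
}

// Greedy cover: repeatedly take the widest sub-register that lies entirely
// inside the lanes still to be covered. Returns the chosen names in pick
// order; stops early if no table entry fits the remaining lanes.
std::vector<std::string> coverLanes(uint64_t Lanes) {
  std::vector<std::string> Picked;
  while (Lanes != 0) {
    const ToySubReg *Best = nullptr;
    std::size_t BestBits = 0;
    for (const ToySubReg &SR : toySubRegs()) {
      if ((SR.Mask & ~Lanes) != 0)
        continue; // would pull in lanes that must not come from the old value
      std::size_t Bits = std::bitset<64>(SR.Mask).count();
      if (Bits > BestBits) {
        Best = &SR;
        BestBits = Bits;
      }
    }
    if (!Best)
      break;
    Picked.push_back(Best->Name);
    Lanes &= ~Best->Mask;
  }
  return Picked;
}
```

With these toy masks, redefining `sub1` leaves `0xFF & ~0x0C = 0xF3` to take from the old value, and `coverLanes(0xF3)` picks `sub2_3` and `sub0`. Those are the same two old-value sources the test comment lists for the expected REG_SEQUENCE, though the pick order here is an artifact of the greedy loop.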

@vpykhtin
Contributor

vpykhtin commented Oct 16, 2025

Please merge this into a single commit.

Also add a test with a single-BB loop that covers the following case (taken from SSAUpdater.h):

  /// Construct SSA form, materializing a value that is live in the
  /// middle of the specified block.
  ///
  /// \c GetValueInMiddleOfBlock is the same as \c GetValueAtEndOfBlock except
  /// in one important case: if there is a definition of the rewritten value
  /// after the 'use' in BB.  Consider code like this:
  ///
  /// \code
  ///      X1 = ...
  ///   SomeBB:
  ///      use(X)
  ///      X2 = ...
  ///      br Cond, SomeBB, OutBB
  /// \endcode
  ///
  /// In this case, there are two values (X1 and X2) added to the AvailableVals
  /// set by the client of the rewriter, and those values are both live out of
  /// their respective blocks.  However, the use of X happens in the *middle* of
  /// a block.  Because of this, we need to insert a new PHI node in SomeBB to
  /// merge the appropriate values, and this value isn't live out of the block.
  Value *GetValueInMiddleOfBlock(BasicBlock *BB);
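
To make the quoted case concrete, here is a toy C++ simulation. It is not LLVM code, and the function name `reachingDefsAtUse` is hypothetical. It records which definition the `use(X)` at the top of `SomeBB` observes on each trip through the single-block loop.

```cpp
#include <string>
#include <vector>

// Simulates the single-block loop from the quoted comment and returns, per
// iteration, the name of the definition that reaches `use(X)`.
std::vector<std::string> reachingDefsAtUse(int Iterations) {
  std::vector<std::string> Seen;
  std::string X = "X1"; // X1 = ...   (defined before the loop)
  for (int I = 0; I < Iterations; ++I) {
    Seen.push_back(X);  // use(X)     (happens before the redefinition)
    X = "X2";           // X2 = ...   (defined after the use, same block)
  }
  return Seen;
}
```

The first iteration sees `X1` and every later iteration sees `X2`, so two different definitions reach the same use. That is why a PHI merging them is needed at the head of `SomeBB`, and why a single-BB loop test is a useful check for the updater.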

…rNewDef

This change adds an optional Register parameter to repairSSAForNewDef(),
allowing callers to provide a pre-allocated virtual register instead of
having the updater create one automatically.

Motivation:
- Enables precise control over register class selection when needed
- Required for subregister spill/reload scenarios where target-specific
  register class constraints apply (e.g., reloading a 96-bit subregister
  requires vreg_96 class, not vreg_128)
- Maintains backward compatibility - default behavior unchanged

The change is backward compatible: if NewVReg is not provided (default),
the updater creates a register automatically as before.
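
The calling convention described above can be sketched with stand-in types. `ToyReg` and `ToyUpdater` below are hypothetical mocks, not the LLVM `Register` class or the real `MachineLaneSSAUpdater`, and the actual signature in the patch may differ. The sketch only shows the defaulted-parameter pattern: an invalid (default) register means "create one for me".

```cpp
#include <cstdint>

// Toy stand-in for a virtual register handle; id 0 means "no register".
struct ToyReg {
  uint32_t Id;
  explicit ToyReg(uint32_t I = 0) : Id(I) {}
  bool isValid() const { return Id != 0; }
};

// Toy stand-in for the updater. Only the register-selection step is modelled.
class ToyUpdater {
  uint32_t NextId = 100; // arbitrary starting id for auto-created registers

public:
  ToyReg createReg() { return ToyReg(NextId++); }

  // If the caller supplies a valid register it is used as-is; otherwise a
  // fresh one is created, which matches the old behaviour.
  ToyReg repair(ToyReg NewVReg = ToyReg()) {
    return NewVReg.isValid() ? NewVReg : createReg();
  }
};
```

Existing call sites that pass nothing keep working unchanged, while a caller with a register-class constraint can create the register itself and hand it in.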