The inline pass (run as part of a `Walk` rewrite) has O(n^2) time
complexity, which is a performance issue.
The extra factor of `n` comes from `inline.py`, where each block is
split in two and the inlined region is inserted in the middle. Splitting
a block moves every statement after the call site into a new block, which
costs O(n) per inlined call:
```python
after_block = ir.Block()
stmt = call_like.next_stmt
while stmt is not None:
    stmt.detach()
    after_block.stmts.append(stmt)
    stmt = call_like.next_stmt
```
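To make the quadratic growth concrete, here is a toy cost model of the splitting loop above. It is not code from the pass: `moved_statements` is a hypothetical helper, and it assumes a block that holds `n_calls` call statements, each inlined in order by moving everything after it into a fresh block.

```python
def moved_statements(n_calls: int) -> int:
    """Count statements moved when every call in a block is inlined by splitting.

    Toy model (an assumption, not the real pass): the block starts with
    n_calls call statements, and inlining a call moves all statements
    that follow it into a new block.
    """
    moved = 0
    remaining = n_calls
    for _ in range(n_calls):
        remaining -= 1      # the call being inlined is not moved itself
        moved += remaining  # every statement after it is detached and re-appended
    return moved


print(moved_statements(4))     # 3 + 2 + 1 + 0 = 6
print(moved_statements(1000))  # 1000 * 999 / 2 = 499500
```

Under this model the total work is n(n-1)/2 statement moves, which matches the O(n^2) behavior described above.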
This PR introduces a partial workaround. Simple regions with just a
single block are inlined by inserting all of their statements directly
at the call site. Since statements form a linked list, each insertion is
O(1), and no statements of the enclosing block need to be moved.
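The idea behind the workaround can be sketched with a minimal doubly linked list. This is an illustration only: `Node`, `link`, `insert_before`, `unlink`, and `inline_single_block` are hypothetical names, not the IR classes or the actual change in this PR.

```python
from typing import Iterable, List, Optional


class Node:
    """Stand-in for a statement in a doubly linked list (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.prev: Optional["Node"] = None
        self.next: Optional["Node"] = None


def link(names: Iterable[str]) -> List[Node]:
    """Build a doubly linked chain and return its nodes in order."""
    nodes = [Node(name) for name in names]
    for left, right in zip(nodes, nodes[1:]):
        left.next, right.prev = right, left
    return nodes


def insert_before(anchor: Node, new: Node) -> None:
    """Link `new` directly in front of `anchor`; only neighboring pointers change."""
    new.prev, new.next = anchor.prev, anchor
    if anchor.prev is not None:
        anchor.prev.next = new
    anchor.prev = new


def unlink(node: Node) -> None:
    """Remove `node` from its chain, reconnecting its two neighbors."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None


def inline_single_block(call: Node, body: List[Node]) -> None:
    """Replace `call` with the statements of a single-block callee body."""
    for stmt in list(body):
        unlink(stmt)               # detach from the callee's block
        insert_before(call, stmt)  # O(1) per inserted statement
    unlink(call)                   # statements after the call are never visited


def names_from(head: Node) -> List[str]:
    """Walk the chain from `head` and collect statement names."""
    out: List[str] = []
    node: Optional[Node] = head
    while node is not None:
        out.append(node.name)
        node = node.next
    return out


caller = link(["a", "call", "b", "c"])
callee = link(["x", "y"])
inline_single_block(caller[1], callee)
print(names_from(caller[0]))  # ['a', 'x', 'y', 'b', 'c']
```

The cost here depends only on the size of the inlined body. The statements `b` and `c` that follow the call keep their place, apart from one pointer update on `b`.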
For my test case, this fix reduces runtime and brings the time
complexity of the inline pass back to O(n).
<img width="489" height="358" alt="image"
src="https://github.com/user-attachments/assets/3b11560f-670f-45d3-84b0-0959693f47b4"
/>
However, we should refactor the inline pass to scale linearly even in
the general case.
---------
Co-authored-by: Casey Duckering <[email protected]>