[AMD] enhance range analysis for buffer ops (#8372)
This change fixes bugs in the range analysis and lets the buffer-ops pass use
the range-analysis result to decide whether it is legal to convert a memory op
to a buffer op.
The highlights are as follows:
* Range Analysis
- fix the way `tl.assume` is used. Previously, the analysis did not consider the
control-flow relationship between an assumption, say `tl.assume(x > 0)`, and the
locations where x occurs.
- correct the value range of `make_range(begin, end)`: the previous value range was
[begin, end], now it is [begin, end-1]. This small conceptual change causes large
changes in the regression tests.
* Buffer-ops
- for large tensors (>2G), remove the ad-hoc and incorrect range analysis
in the pass. The pass now relies only on the result of the range-analysis pass.
- previously, the buffer-ops pass only checked element-index > 0. The correct
condition is that the byte offset is in [0, 2G-element-size].
- There is similar earlier work in
#7908, contributed by
@njriasan. My change to this part is similar but fixes some bugs in
PR7908 (e.g. the lattice could be nullptr) and updates a large number of
tests. That said, since @njriasan made the first change,
credit for this part belongs to him.
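
The two range conditions described above can be illustrated with a small, self-contained Python sketch. It is not the code in this change (the pass itself is implemented in the compiler). `ValueRange`, `make_range_value_range`, `byte_offset_range`, and `can_use_buffer_op` are hypothetical names used only for illustration, and "2G" is assumed to mean 2^31 bytes.

```python
from dataclasses import dataclass

TWO_GIB = 2 ** 31  # assumption: "2G" in the text means 2^31 bytes


@dataclass(frozen=True)
class ValueRange:
    """Closed integer interval [lo, hi]; hypothetical stand-in for an analysis result."""
    lo: int
    hi: int


def make_range_value_range(begin: int, end: int) -> ValueRange:
    # make_range(begin, end) produces begin, ..., end - 1,
    # so the inclusive upper bound is end - 1, not end.
    return ValueRange(begin, end - 1)


def byte_offset_range(index: ValueRange, element_size: int) -> ValueRange:
    # Scale an element-index range to a byte-offset range.
    # Assumes element_size > 0, so the bounds keep their order.
    return ValueRange(index.lo * element_size, index.hi * element_size)


def can_use_buffer_op(index: ValueRange, element_size: int) -> bool:
    # The conversion is treated as legal only if every possible byte offset
    # lies in [0, 2G - element_size].
    off = byte_offset_range(index, element_size)
    return off.lo >= 0 and off.hi <= TWO_GIB - element_size
```

For example, with 4-byte elements an index range of [0, 2^29 - 1] gives a largest byte offset of exactly 2^31 - 4, which is accepted, while an index range reaching 2^29 is rejected. A check on the element index alone would not catch the second case.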
---------
Co-authored-by: Shuxin Yang <[email protected]>