[DA] Add tests for nsw doesn't hold on entire iteration space (NFC) #162281
base: main
Conversation
This stack of pull requests is managed by Graphite.
amehsan
left a comment
The examples here seem very similar to the examples we have discussed in #159846
I still don't see the relevance of loop guards here.
amehsan
left a comment
While this is still a draft PR, I want to make sure my opinion on this is clear. I believe we still need to see a justification, and also details of how loop guards are relevant.
kasuga-fj
left a comment
I still think we need to handle loop guards properly in some way, and I believe the test case @nsw_under_loop_guard0 illustrates why. What I'm trying to say is:
- Monotonicity reasoning based on the nowrap property in addrecs may only be valid under loop guards
- Each dependence testing function in DA basically assumes monotonicity across the entire iteration space. E.g., it evaluates an addrec at the backedge-taken count (BTC) and treats the result as the minimum or maximum value of the addrec.
That is, there is a gap between the information provided by the monotonicity check and the assumptions made by the dependence testing functions. To bridge this gap, we may need to either:
- Pessimistically reject cases where loop guards are present, or
- Prove that the gap does not lead to unsoundness in dependence testing.
It's also true that I haven't found any cases where this gap actually causes problems, so it might be possible to prove that the gap is harmless.
@Meinersbur What do you think?
; for (i = 0; i < INT64_MAX - 1; i++)
;   if (i < 1000)
;     for (j = 0; j < 2000; j++)
;       a[i + j] = 0;
;
; FIXME: This is not monotonic. The nsw flag is valid under
; the condition i < 1000, not for all i.
define void @nsw_under_loop_guard0(ptr %a) {
; CHECK-LABEL: 'nsw_under_loop_guard0'
; CHECK-NEXT: Monotonicity check:
; CHECK-NEXT: Inst: store i8 0, ptr %idx, align 1
; CHECK-NEXT: Expr: {{\{\{}}0,+,1}<nuw><nsw><%loop.i.header>,+,1}<nuw><nsw><%loop.j>
; CHECK-NEXT: Monotonicity: MultivariateSignedMonotonic
; CHECK-EMPTY:
; CHECK-NEXT: Src: store i8 0, ptr %idx, align 1 --> Dst: store i8 0, ptr %idx, align 1
; CHECK-NEXT: da analyze - output [* *]!
;
i + j is monotonic under the condition i < 1000, but obviously not over the entire iteration space (0 <= i < INT64_MAX - 1).
Some background on how Polly handles this: DA doesn't have an equivalent of execution domains; it has to assume Src/Dst may be executed for any iteration. Although I wonder whether it is worth the effort. You get more complicated upper-bound SCEVs.
I'd like to look into this in detail, but I may not be able to do it today. Will get back to you on this ASAP.
Thanks for the details. I also think we don't need to bail out in all cases where loop guards exist. However, I believe some checks are necessary to justify the reasoning about monotonicity based on the nowrap properties of addrecs, and I haven't found a good way to detect only conditions like
I think we can make DA more precise by taking context such as branch conditions into account. For example, simply replacing
This is not different from the examples that we have discussed before. I really don't understand why you keep going back to this issue. All dependence testing, monotonicity proofs, etc. are only relevant when the loop is executed. If the guard condition is not satisfied, the loop is not executed for those values. Your concern about the correctness of DA when a loop guard exists is 100% unjustified. If this could result in a bug, it would be basically impossible to write a correct DA. I gave you an example in the previous discussion. Your nsw flags may be correct under assumptions that you have practically no way to discover.
@Meinersbur consider a loop like this. I may have optimizations that infer nsw flags for computations in
It looks to me that you are conflating "what dependence testing functions assume" with "when the result should be correct". It would be true that it's sufficient for the result to be correct only when it is actually executed, but I'm talking about the preconditions on which the testing functions rely. For example, as for an addrec
It is not clear to me where we make this assumption and what the consequences are. But before discussing that, I have another question: I believe you are talking about this example from your testcase, and you are saying that for DA to be correct we need to check that a loop guard exists (or maybe check what the loop guard is) and bail out. Now I can change your example to the following. Your comment is applicable to this example as well. Here
Wow, I didn't know about
Two cases:
The presence of a loop guard does not guarantee that the property holds globally everywhere. For instance:

for (i = 0; i < n; i++) {
  if (i < 3000) {
    for (j = .....) {
      ... j + i ...;
    }
  }
  for (k = .....) {
    ... k + i ...
  }
}

Obviously we cannot assume that
I believe this response misses a point in my objection; however, right now I am doing some hands-on work on the example provided. Let me finish that and I will post an update here. I believe that will be a more constructive discussion.
I think the following is a counterexample:

; stride = INT64_MAX;
; for (i = 0; i < 10; i++)
;   if (i % 2 == 0) {
;     A[stride*i + 100] = 1; // A[100], A[98], A[96], ...
;     A[stride*i + 102] = 2; // A[102], A[100], A[98], ...
;   }
;
define void @f(ptr %A) {
entry:
  br label %loop.header

loop.header:
  %i = phi i64 [ 0, %entry ], [ %i.inc, %loop.latch ]
  %offset.0 = phi i64 [ 100, %entry ], [ %offset.0.next, %loop.latch ]
  %offset.1 = phi i64 [ 102, %entry ], [ %offset.1.next, %loop.latch ]
  %odd = and i64 %i, 1
  %cond = icmp ne i64 %odd, 1
  br i1 %cond, label %if.then, label %loop.latch

if.then:
  %gep.0 = getelementptr inbounds i8, ptr %A, i64 %offset.0
  %gep.1 = getelementptr inbounds i8, ptr %A, i64 %offset.1
  store i8 1, ptr %gep.0
  store i8 2, ptr %gep.1
  br label %loop.latch

loop.latch:
  %i.inc = add nuw nsw i64 %i, 1
  %offset.0.next = add i64 %offset.0, 9223372036854775807
  %offset.1.next = add i64 %offset.1, 9223372036854775807
  %ec = icmp eq i64 %i.inc, 10
  br i1 %ec, label %exit, label %loop.header

exit:
  ret void
}

Both offsets are monotonic in the execution domain, and Strong SIV misses the dependency (godbolt). The culprit seems to be here. I think something strange happens if the execution domain is not consecutive.
I think we can ensure it by checking (post-)dominance of headers and/or latches between the outer loop and the inner loop. It might also be worth noting that
We need something like this. However, there are some subtleties to consider. Dominance directly does not work, because the inner loop might be

But before getting into the details of this, I believe we need to think about a couple of higher-level issues:

(1) I wonder if there is a proof of correctness of these tests in the literature that we can look into to figure out what the assumptions for each one are. It is much better if we can gather these issues in a systematic way. If not, it might be even better to try to write down the proofs, and in the process we can find what we need to assume. That is easier than discovering these bugs one by one.

(2) Different tests might have different assumptions. We need to be careful about this. One way to handle this could be to require loops to be in a stricter canonical form (perfect loop nests? maybe with no early exit in the innermost loop?) before applying any loop optimizations. However, this would be a large amount of work, and at this point I don't know whether we have a justification for it or not. But if this is something that we are going to be forced to do down the road, maybe we need to pause now and think about it.

I believe our best next step would be to start gathering the list of assumptions required for each of the tests (unless we have a reason to believe missing assumptions are very rare and not something to generally worry about).

One last thing: I still haven't done a careful debug of the example provided. I assume you have done this and there is no other issue (such as overflow in calculation, or other problems in DA computations).
This book may have the proofs. But first I will look more for online resources.
Do you mean the issue is related to the conditional statement in the loop? The problem is reproducible without that, too. StrongSIV is a relatively simple test. I think we should be able to work out the proof of this one and figure out what the assumptions are. (EDIT: BTW, this subscript doesn't pass your monotonicity check. But I think what you mean is that if we look at the values of the subscript, there is no wrapping and the values are constantly increasing/decreasing. Please let me know if I misunderstood something.)
I don't think execution context adds anything to this example. It is monotonic, and has dependences, even without the conditional:

for (i = 0; i < 10; i++) {
  A[stride*i + 100] = 1; // A[100], A[99], A[98], ...
  A[stride*i + 102] = 2; // A[102], A[101], A[100], ...
}

DA does not find the dependency in either case. It is just another bug in DA. Did you mean
That matches my intent. One thing I don't quite understand, and which makes things more complex, is the existence of delinearization. For example, in the following case:

int A[][10];
for (i = 0; i < INT64_MAX; i++)
  for (j = 0; j < 10; j++)
    A[i][j] = 0;

The original offset will be lowered to
I haven't read the full comment yet, but a point that I am going to mention in the issue that I have opened is this: what we need to focus on is the "no wrapping" property, which can be divided into signed and unsigned categories. Monotonicity is not a fundamental property that we need to care about; it just happens that the two are related for linear functions. To demonstrate this, assume we have a non-monotonic function; for example, assume we are checking the dependences between The question that we need to answer is this: are there values for This is the basic idea of the proofs that we need to show control flow is irrelevant. But monotonicity is not the issue here. What is important here is the "no wrapping" property, as long as it is used in a consistent way (signed or unsigned) for each proof.
I have opened this issue and will write down proofs that control flow is irrelevant (i.e. if we have no-wrapping flags, we don't need to worry about control flow). It will be updated gradually.
I'm going to be direct for a moment. If this is a misunderstanding, I apologize in advance. Looking back at your comments in this thread, I got the impression that you haven't thoroughly read the DA code, and you are only focusing on the parts you have checked. If so, it's only natural that the discussion takes time and/or goes in circles. I don't think continuing the discussion with you in this state will lead to a good outcome.
I suggest you show a specific technical mistake in my comments. That would be much more convincing. But now let me answer your comment above.
They are not equivalent, and that is important for your example, because here it doesn't matter whether you interpret the numbers as signed or unsigned. Do I misunderstand something here?
I agree that this is not a practically interesting example, but the point is this: we don't need monotonicity for proving the correctness of a given test. We just need the existence of nowrap flags.
Yes, I had the same understanding for some time, but now I believe we don't need to distinguish between them. I will write down proofs soon. You can look into them and try to find a mistake in them.
@Meinersbur I think it might be more productive if we have a loop meeting and discuss these issues there. Is that possible?
To be clear, here I am using a test that requires solving a quadratic equation. Sure, we don't have this in DA and we don't want it. The point is that, even for non-monotonic functions, we can devise a test and prove its correctness if needed.
I forgot to mention a very important precondition: they would be equivalent if the domain is consecutive. I wasn't really intending to discuss that extreme example. For the remaining parts, I'd strongly recommend you read the code. I think Symbolic RDIV is a good one. It assumes that an addrec takes its minimum value at the first iteration and its maximum value at the last iteration if the coefficient is non-negative. I think it's clear that "an addrec doesn't wrap when it's executed" is insufficient
OK. I give you a proof for the case that you mentioned. Note that we are talking about a mathematical problem. If you believe my proof is incorrect, I would be more than happy to see a counterexample and learn from it. Note that in addition to "an addrec doesn't wrap when it's executed", we also need the computations inside the DA code to not overflow. We have already fixed a couple of bugs related to that, so that should not be a controversial assumption. Now let's focus on the case of non-negative coefficients in Symbolic RDIV. Assume we have Also note that Now let's proceed by contradiction. Let's say there are two executed values of Case 1: I haven't written down the proof for Case 2, but I believe it will work out similarly. The key point is this: all the computations used in the proof are either computations of the executed subscript or computations inside the DA. Everything else is irrelevant. If this is incorrect, please give me a counterexample.
There is a minor mistake. Let me correct it: we know
This assumption seems almost equivalent to requiring monotonicity over the entire domain. If we cannot prove monotonicity over the entire domain, then in most cases we cannot prove I believe Symbolic RDIV should be rewritten to just compare the minimum/maximum values of the two subscripts, after ensuring they are monotonic over the entire domain. This approach is more natural and doesn't require any complicated proofs.
Are you suggesting using arbitrary-precision integers for solving dependency equations? Because this is why Integer Set Library and https://github.com/llvm/llvm-project/tree/main/mlir/include/mlir/Analysis/Presburger use arbitrary-precision integers.
Monotonicity is supposed to give the additional guarantee that expression evaluations stay between what it evaluates to in the first and last loop iterations, since otherwise, to get the range of an arbitrary expression, one would need to evaluate it for every in-between value. Every time I see
entier?
entire iteration space (EDIT: I assume you are asking about the word in the title of the PR)
Which isn't a word, so fix the typo?
Sure. I did not open the PR, but I believe fixing a typo should be fine.
I didn't expect to ask for opinions from others (including the LLVM Area Team) on this one, because
I think summarizing the contents and issues is necessary before involving others.
I'm very bad at speaking and listening in English, so I'm not sure if arranging a meeting is a good idea. That said, I don't mind joining the next loop opt meeting. Anyway, I'll mark this PR as ready for review in a few days, as it's clear that either the definition of monotonicity or the result of its check needs to be corrected.
The solution to this is to drop the notion of monotonicity and define everything in terms of "no-wrap".
As I mentioned in #159846 (comment), there are cases where entire monotonicity has benefits. My proposal is to prepare two types of domain. Have you read the previous reply?
Also, we have explained why we use the term monotonicity. Moreover, it's slightly off the topic of this issue.
As I mentioned in #159846 (comment), there are cases where entire monotonicity has benefits. My proposal is to prepare two types of domain. Have you read the previous reply?
And I responded to your comment there. Please, let's stop this never-ending discussion and wait for the area team to help us resolve this issue.
I don't expect the area team to assist even at this level of detail...
I'm crafting a prototype to show my approach. I'll share it once it's ready. If you still have any objections, please say so. I honestly have no idea what you are disagreeing with.
For now, I've implemented what I have in mind for Strong SIV. If you still see any issues, please let me know.
kasuga-fj/llvm-project@c0a7b15...f579161
I believe this approach is better than inserting overflow checks everywhere as we don't need complex proofs.
1- Looking at the Strong SIV code, you have removed 24 lines of code and added 36 lines of code. So your code is more complex.
But we don't need complex proofs like #162281 (comment). I really want to avoid relying on such a proof because:
- It's complex and easy to get wrong
- It strongly depends on specific parts of the implementation
- Some minor changes in the code can easily break the proof
2- How do you know nsw is valid for first or last iteration of the loop?
For the innermost addrec, if the exact BTC is computable, nowrap property should be valid for every iteration.
3- This is more compile-time intensive than the previous approach. Just from a quick look at the SCEV-related computations, your approach seems more expensive than the previous one.
It's not trivial. If you haven't seen it, I'd strongly recommend taking a look at the implementation of SCEV. For example, willNotOverflow performs the same operation twice, with two different bitwidths.
For the innermost addrec, if the exact BTC is computable, nowrap property should be valid for every iteration.
I don't think this is correct. I will give you an example in another comment later today. Will also respond to the rest of your comments later today.
Couple of quick comments for now:
But we don't need complex proofs like #162281 (comment).
The link is to the proof for Symbolic RDIV. Correct link to the proof for strong SIV is this: #168823 (comment) And I believe you talk about case 2, which is the test that you are talking about.
Case 1 which is about the other test in Strong SIV is quite simple.
For the innermost addrec, if the exact BTC is computable, nowrap property should be valid for every iteration.
I believe you rely on SCEV's conservative decision to drop nsw flags for instructions that are under a condition. That is a deliberately conservative choice on SCEV's part, and I don't think we should rely on it. In the future, SCEV may change or be extended or otherwise modified. There is no reason to create potential future limitations.
Will add more comments later on.
EDIT: This conservative decision of SCEV will hurt us. If two loops have control flow inside their body, very likely they cannot be fused because they fail dependence check. We may need to get to the bottom of this.
The link is to the proof for Symbolic RDIV. Correct link to the proof for strong SIV is this: #168823 (comment) And I believe you talk about case 2, which is the test that you are talking about.
Similar approach can be applied to Symbolic RDIV as well. So both #162281 (comment) and Case 2 in #168823 (comment) are unnecessary with my approach.
I believe you rely on the SCEV's conservative decision to drop nsw flags for instructions that are under a condition. That is a deliberately conservative choice on SCEV's part, and I don't think we should rely on that. In future SCEV may change or be extended or modified otherwise. There is no reason to create potential future limitations.
It is guaranteed that the nowrap property holds within the defining scope of the SCEV. This means that the nowrap property of an addrec holds on all executed iterations of the loop. It is therefore reasonable to rely on this contract. I'd recommend to read this article and #159846 (comment).
It is guaranteed that the nowrap property holds within the defining scope of the SCEV. This means that the nowrap property of an addrec holds on all executed iterations of the loop. It is therefore reasonable to rely on this contract. I'd recommend to read this article and #159846 (comment).
If we rely on SCEV behavior, I may be able to simplify my proofs too. Let me check if it is possible. Also I need to look into the compile time issue a little more. Give me some time for that.
Not directly related to your proposed change, but: the article explains the design, but it doesn't say why it is designed this way. By any chance, do you have an example where something breaks if we keep the IR's nsw flag for subscripts that are executed under a condition? (Assume the load/store that uses the subscript is also under the same condition.)
@llvm/pr-subscribers-llvm-analysis

Author: Ryotaro Kasuga (kasuga-fj)

Changes

The monotonicity definition states its domain as follows: The current monotonicity check implementation doesn't match this definition because:

Therefore we need to fix either the definition or the implementation. This patch adds test cases that demonstrate this mismatch.

Full diff: https://github.com/llvm/llvm-project/pull/162281.diff

1 file affected:
diff --git a/llvm/test/Analysis/DependenceAnalysis/monotonicity-loop-guard.ll b/llvm/test/Analysis/DependenceAnalysis/monotonicity-loop-guard.ll
new file mode 100644
index 0000000000000..5f19ca96badcd
--- /dev/null
+++ b/llvm/test/Analysis/DependenceAnalysis/monotonicity-loop-guard.ll
@@ -0,0 +1,141 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 6
+; RUN: opt < %s -disable-output -passes="print<da>" -da-dump-monotonicity-report \
+; RUN: -da-enable-monotonicity-check 2>&1 | FileCheck %s
+
+; FIXME: These cases are not monotonic because currently we define the domain
+; of monotonicity as "entire iteration space". However, the nsw property is
+; actually valid only under the loop guard conditions.
+
+; for (i = 0; i < INT64_MAX - 1; i++)
+; if (i < 1000)
+; for (j = 0; j < 2000; j++)
+; a[i + j] = 0;
+;
+define void @nsw_under_loop_guard0(ptr %a) {
+; CHECK-LABEL: 'nsw_under_loop_guard0'
+; CHECK-NEXT: Monotonicity check:
+; CHECK-NEXT: Inst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: Expr: {{\{\{}}0,+,1}<nuw><nsw><%loop.i.header>,+,1}<nuw><nsw><%loop.j>
+; CHECK-NEXT: Monotonicity: MultivariateSignedMonotonic
+; CHECK-EMPTY:
+; CHECK-NEXT: Src: store i8 0, ptr %idx, align 1 --> Dst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: da analyze - none!
+;
+entry:
+ br label %loop.i.header
+
+loop.i.header:
+ %i = phi i64 [ 0 , %entry ], [ %i.next, %loop.i.latch ]
+ br label %loop.j.pr
+
+loop.j.pr:
+ %guard.j = icmp slt i64 %i, 1000
+ br i1 %guard.j, label %loop.j, label %loop.i.latch
+
+loop.j:
+ %j = phi i64 [ 0, %loop.j.pr ], [ %j.next, %loop.j ]
+ %offset = add nsw i64 %i, %j
+ %idx = getelementptr inbounds i8, ptr %a, i64 %offset
+ store i8 0, ptr %idx
+ %j.next = add nsw i64 %j, 1
+ %ec.j = icmp eq i64 %j.next, 2000
+ br i1 %ec.j, label %loop.i.latch, label %loop.j
+
+loop.i.latch:
+ %i.next = add nsw i64 %i, 1
+ %ec.i = icmp eq i64 %i.next, 9223372036854775807
+ br i1 %ec.i, label %exit, label %loop.i.header
+
+exit:
+ ret void
+}
+
+; for (i = 0; i < INT64_MAX; i++)
+; if (100 < i)
+; for (j = 0; j < 100; j++)
+; a[INT64_MAX - i + j] = 0;
+;
+define void @nsw_under_loop_guard1(ptr %a) {
+; CHECK-LABEL: 'nsw_under_loop_guard1'
+; CHECK-NEXT: Monotonicity check:
+; CHECK-NEXT: Inst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: Expr: {{\{\{}}9223372036854775807,+,-1}<nsw><%loop.i.header>,+,1}<nuw><nsw><%loop.j>
+; CHECK-NEXT: Monotonicity: MultivariateSignedMonotonic
+; CHECK-EMPTY:
+; CHECK-NEXT: Src: store i8 0, ptr %idx, align 1 --> Dst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: da analyze - none!
+;
+entry:
+ br label %loop.i.header
+
+loop.i.header:
+ %i = phi i64 [ 0 , %entry ], [ %i.next, %loop.i.latch ]
+ br label %loop.j.pr
+
+loop.j.pr:
+ %guard.j = icmp sgt i64 %i, 100
+ br i1 %guard.j, label %loop.j, label %exit
+
+loop.j:
+ %j = phi i64 [ 0, %loop.j.pr ], [ %j.next, %loop.j ]
+ %val.0 = sub nsw i64 9223372036854775807, %i
+ %val = add nsw i64 %val.0, %j
+ %idx = getelementptr inbounds i8, ptr %a, i64 %val
+ store i8 0, ptr %idx
+ %j.next = add nsw i64 %j, 1
+ %ec.j = icmp eq i64 %j.next, 100
+ br i1 %ec.j, label %loop.i.latch, label %loop.j
+
+loop.i.latch:
+ %i.next = add nsw i64 %i, 1
+ %ec.i = icmp eq i64 %i.next, 9223372036854775807
+ br i1 %ec.i, label %exit, label %loop.i.header
+
+exit:
+ ret void
+}
+
+; for (i = 0; i < n; i++)
+; if (i < m)
+; for (j = 0; j < k; j++)
+; a[i + j] = 0;
+;
+define void @nsw_under_loop_guard2(ptr %a, i64 %n, i64 %m, i64 %k) {
+; CHECK-LABEL: 'nsw_under_loop_guard2'
+; CHECK-NEXT: Monotonicity check:
+; CHECK-NEXT: Inst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: Expr: {{\{\{}}0,+,1}<nuw><nsw><%loop.i.header>,+,1}<nuw><nsw><%loop.j>
+; CHECK-NEXT: Monotonicity: MultivariateSignedMonotonic
+; CHECK-EMPTY:
+; CHECK-NEXT: Src: store i8 0, ptr %idx, align 1 --> Dst: store i8 0, ptr %idx, align 1
+; CHECK-NEXT: da analyze - output [* *]!
+;
+entry:
+ br label %loop.i.header
+
+loop.i.header:
+ %i = phi i64 [ 0 , %entry ], [ %i.next, %loop.i.latch ]
+ br label %loop.j.pr
+
+loop.j.pr:
+ %guard.j = icmp slt i64 %i, %m
+ br i1 %guard.j, label %loop.j, label %exit
+
+loop.j:
+ %j = phi i64 [ 0, %loop.j.pr ], [ %j.next, %loop.j ]
+ %val = phi i64 [ %i, %loop.j.pr ], [ %val.next, %loop.j ]
+ %j.next = add nsw i64 %j, 1
+ %idx = getelementptr inbounds i8, ptr %a, i64 %val
+ store i8 0, ptr %idx
+ %val.next = add nsw i64 %val, 1
+ %ec.j = icmp eq i64 %j.next, %k
+ br i1 %ec.j, label %loop.i.latch, label %loop.j
+
+loop.i.latch:
+ %i.next = add nsw i64 %i, 1
+ %ec.i = icmp eq i64 %i.next, %n
+ br i1 %ec.i, label %exit, label %loop.i.header
+
+exit:
+ ret void
+}
With respect to SIV tests and nowrap-flag-based monotonicity inference, I noticed that monotonicity over the entire domain and over the effective domain are almost the same...
