-
Notifications
You must be signed in to change notification settings - Fork 1k
UNDERTOW-1794 DefaultAccessLogReceiver violates Closeable contract #1041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| if (stateUpdater.compareAndSet(this, 3, 1)) { | ||
| logWriteExecutor.execute(this); | ||
| for (int i = 0; i < 1000 && !this.closed && !pendingMessages.isEmpty(); ++i) { | ||
| final String msg = pendingMessages.poll(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use peek()+remove() here. Though Id assume follow up might be better than to alter PRs ad hoc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@baranowb because this class has concurrent access, poll is better than peek() + remove(). Between the moment we invoke peek() and the moment we invoke remove, we risk having another thread executing flushAndTerminate. This is why we need to run poll() here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is trade off. It is possible that XNIO thread will be terminated between poll() and write(), leaking one message.
If its peek()+remove(), it might double entry, but not loose it. Its a trade off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about it. My question is now is hot to prevent both scenarios. Is there some way we can synchronize just a little bit to prevent a duplicate entry in the log? I think we must prevent an output that will be considered buggy by users if someone finds a duplicate entry after shutting down the server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not unless we control threat termination, which we dont.
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
| closed = true; | ||
| if (stateUpdater.compareAndSet(this, 0, 1)) { | ||
| logWriteExecutor.execute(this); | ||
| this.closed = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method must take into account that the state could change between one if and the other. So, there is a possibility that all stateUpdater checks return false.
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
| try { | ||
| while (!this.pendingMessages.isEmpty()) { | ||
| final String msg = this.pendingMessages.poll(); | ||
| // TODO: clarify this, how is this possible? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might not be possible. Lets wait for the final version of the code of run() and close() to decide this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is very unlikely, but still somethiing to safeguard against. Because we are on a best effort basis in graceful shutdown when closing, we might leave the loop in line 372 still in state 2. If that occurs, we have a race when trying to write the last messages. So, it could happens that poll returns null (and it could also result in a duplicate write of the same message).
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
|
as this is a new feature, merging of this PR will require synchronization with the RFE process. I'll assist in having all the process followed as fast as we can to have this merged the soonest. |
febb374 to
7beb8be
Compare
bb266f8 to
8407e8d
Compare
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
| doRotate(); | ||
| private void writeMessage(final String message) { | ||
| //NOTE: is there a need to rotate on write? | ||
| //if (System.currentTimeMillis() > changeOverPoint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but only once per run. Now that this method is called per message it is no longer the adequate place for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably we should add the limit to 1000 here? I think it would perform a bit better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still stand my ground, I think the granularity of checking 1000 here performs better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what I meant by limit of 1000. I think it looks good the way it is because we are checking for rotation in run already, and, as I said, checking per message is not the ideal place for this.
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
| void accessLogWorkerFailureOnTransition(); | ||
|
|
||
| @LogMessage(level = DEBUG) | ||
| @Message(id = 6000, value = "Access Log Worker failed to reschedule.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big brain math....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand - no one noticed :)
I cant find this one to mark it as outdated, lets reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some pending issues here to consider.
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
| logWriteExecutor.execute(this); | ||
| for (int i = 0; i < 1000 && !this.closed && !pendingMessages.isEmpty(); ++i) { | ||
| final String msg = pendingMessages.peek(); | ||
| if (msg == null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not supposed to be null, because you are checking it is not empty
| if (stateUpdater.compareAndSet(this, 3, 1)) { | ||
| logWriteExecutor.execute(this); | ||
| for (int i = 0; i < 1000 && !this.closed && !pendingMessages.isEmpty(); ++i) { | ||
| final String msg = pendingMessages.peek(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be looking too much into details here, but why peeking and removing instead of just polling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case there is some intermitent IO failure, it should not purge entries without writing them first. Though, in such case, some overflow mechanism should be in place as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also "i < 1000 && !this.closed &&" checking this in loop might not be best, should be enough to loop through and check at the end IMHO
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
| } | ||
| return; | ||
| } else { | ||
| if (!pendingMessages.isEmpty() || forceLogRotation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been thinking... we still need a state for being at the finally block, as it prevents us from having more than one thread scheduling a new execution of run at the same time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how? if there is nothing in pendingMessages, it does not need scheduling, if someone writes, it will be able to schedule. In reality, pendingMessages are filled before state change, so its just a race between logMessage thread and run() to schedule new execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as a second thought, I because we verify if we can go from 1 to 2 in the beginning of the method, we can get rid of finally block because only one finally will be running
| //0 = not running | ||
| //1 = queued | ||
| //2 = running | ||
| //3 = final state of running (inside finally of run()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a comment down below that we will still need state 3 to be used to keep tracking of being at finally of run)
That being said what are we trying to achieve with the new state 3? (It will have to be moved to 4 if we decide to keep it)> Are we trying to prevent run from doing anything after closed is invoked? Is that its purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, it is clearly stated in the Jira: we want to make sure that close returns only after it is fully closed. The question that remains is if we need a new state for it, I think we need so we can fully skip the rest of the execution of run() in case it is closing, without having to resort to verifying the value of close.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dont think so. Its a bit different FSM/CAS that was present before. It relied only on worker threads to dump messages even when handler has been closed and once its done close. This cant be done now since those worker threads may be disabled( upstream issue)
| doRotate(); | ||
| private void writeMessage(final String message) { | ||
| //NOTE: is there a need to rotate on write? | ||
| //if (System.currentTimeMillis() > changeOverPoint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still stand my ground, I think the granularity of checking 1000 here performs better.
3b217e9 to
65e2b2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @baranowb
Thanks! The PR looks good, I have added some minor comments.
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/undertow/server/handlers/accesslog/DefaultAccessLogReceiver.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approved it by mistake, I oversaw the second commit. Leaving as changes requested until all Flavia's concerns are sorted out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase the PR against main and I think you can delete the question on how is poll returning null possible. The other notes I think we can leave for future reference, unless you are certain the questions have been fully answered, in that case, the cleaner the better. Thanks for looking into this!
Issue: https://issues.redhat.com/browse/UNDERTOW-1794
main: #1041
2.3.x: #1825
2.2.x: #1593