-
Notifications
You must be signed in to change notification settings - Fork 15
fix(s3-archive): Fixed multiple uploadTask attempts both when range available and unavailable #2003
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
28a3ee0 to
7ed4d4f
Compare
block-node/s3-archive/src/main/java/org/hiero/block/node/archive/s3/S3ArchivePlugin.java
Outdated
Show resolved
Hide resolved
block-node/s3-archive/src/main/java/org/hiero/block/node/archive/s3/S3ArchivePlugin.java
Outdated
Show resolved
Hide resolved
block-node/s3-archive/src/main/java/org/hiero/block/node/archive/s3/S3ArchivePlugin.java
Outdated
Show resolved
Hide resolved
| requires org.hiero.block.node.base; | ||
| requires org.hiero.block.protobuf.pbj; | ||
| requires com.github.spotbugs.annotations; | ||
| requires java.logging; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed. I think this could have been mistakenly added?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I should have add it to the testModuleInfo on the gradle file. 😅
✅ Fixed!
| package org.hiero.block.node.archive.s3; | ||
|
|
||
| import static java.util.concurrent.locks.LockSupport.parkNanos; | ||
| import static java.util.logging.Level.FINEST; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this import is possibly wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no its used to define the level of the TestLogHandler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, this is a dependency we do not use. I do not think we should start including it. We use the java system logger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed it, seems you are right, lets see what the CI says 🤞
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
import static java.util.logging.Level.FINEST;
^
(package java.util.logging is declared in module java.logging, but module org.hiero.block.node.archive.s3cloud does not read it)
reverting back to include the dep in the testModuleInfo in the gradle build for the module.
✅ Ready to resolve ;)
| * Unit tests for the {@link S3ArchivePlugin} class. | ||
| */ | ||
| @Timeout(value = 5, unit = TimeUnit.SECONDS) | ||
| @Timeout(value = 15, unit = TimeUnit.SECONDS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, why the increase? This is somewhat worry-some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were some flaky-ness on the test but it was due to timeout, due to the elaborate nature of these tests, at CI there was a high probability of timeout (at the time) but it also has happened to me as well locally, this is due to CPU and IO constraints and the setup that this test requires, still 15s is not worry some.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strange, that should not be the case at all. Everything has always passed with the previous setup. Are we absolutely certain we do not observe some async issue that would cause a stall or otherwise we have broken something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but I believe it has not happened again, I haven't seen it happen again on other PR UTs recently. I will revert and if it happens again in the future we know what is the solution.
✅ Removed increased timeout on test...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is, we should use system logger, not the util logging. We use system logger everywhere. I do not believe we have ever used util logging. Doing so, we will add code we should probably not be adding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
System logger is a facade that (currently) we write to the JUL implementation. We can write to other implementations (e.g. Log4J), though, which would probably cause this log-based verification to fail completely.
That's another concern with log-based test verification; changing out the preferred implementation breaks the tests.
It's probably OK to leave this code in for now, but we should have an issue filed to work out a more correct test approach and remove all of the log-based test verification from the system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that asserting on logs is convenient but not ideal, however is a good trade-off for having good coverage and prevent regressions when time constrained as we are right now.
I don't think we will be moving to Log4J anytime soon, and maybe that whatever we migrate to, if we migrate to, is compatible with current approach, so if we ever cross that bridge we deal with this tests then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole reason we use System.Logger is that a build can use Log4J (or other implementation) with only a dependency change.
It's not a major concern at the moment, but it is important to be aware we don't migrate between implementations; any given build could use a different implementation without code change.
| // set-up logger | ||
| Logger logger = Logger.getLogger(S3ArchivePlugin.class.getName()); | ||
| System.out.println("logger = " + logger); | ||
| logHandler = new TestLogHandler(); | ||
| logger.addHandler(logHandler); | ||
| logger.setLevel(FINEST); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have this logger setup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to be able to assert over logs, I know is not pretty but is effective and cheap, may be brittle but is also easy to fix...
I've done it other ways, but found this approach by Jasper at the codebase and decided that we can have it in testFixtures to re-use it across test suites...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asserting on logs is a really bad idea. Just because we have done a test like so, does not mean we should be doing it. Also, adding custom logic, handlers and more just so that we can assert over logs is code that we have to be responsible for, maintain and stand by. I do not think we necessarily want to do that. Is there really no way to assert otherwise? I feel like there should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand your concern, and I also try to avoid using log assertions, however Jasper has used them and even created this TestLogHandler that encapsulates the code, is a matter of having it on testFixtures so is easily re-usable.
Sometimes the alternative to assertions is worse, and the compromise is necessary, in this case those being:
- Mocking the executor, Mocks aren't pretty either and need more maintenance long term, they only useful for one scenario as opposed at the TestLogHandler.
- Exposing internal state, which breaks encapsulation and adds production code solely for testing purposes, so also ugly don't you think?
We need to prevent regressions and assert that the behaviour is no longer happening, I believe that even if is not "Perfect" nothing is and for the matter is the best choice available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to mock an executor, we must simply use the blocking executor we have. It's whole purpose is to make tests non flaky and give us the ability to assert.
To your second point, I agree we should not be asserting based on exposing internal state. This begs the question, what are we actually testing for? What is the user story behind the test? If our code is good, we should always be able to easily assert end results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this particular test assertion is preventing re-scheduling of a job task to prevent a regression, is good to have good coverage that covers previous bugs. Using a real executor is not gonna cut it, since we need to be able to access its internal state. I guess we could use reflection. but why overcomplicate things? I don't see us changing of Log Facade and this being an issue any time soon.
...node/app/src/testFixtures/java/org/hiero/block/node/app/fixtures/logging/TestLogHandler.java
Show resolved
Hide resolved
Prevents scheduling an archive task while the batch is not completely available. Improves UTs and Adds a new Not Available range scenario Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>
… its entry in a finally block within call(). This keeps the return type as Void (reserved for future use) and ensures cleanup happens immediately when the task completes. Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>
Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>
Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>
bacbb29 to
596e51e
Compare
Signed-off-by: Alfredo Gutierrez Grajeda <[email protected]>
Codecov Report❌ Patch coverage is
@@ Coverage Diff @@
## main #2003 +/- ##
============================================
+ Coverage 78.99% 79.08% +0.08%
- Complexity 1241 1242 +1
============================================
Files 130 130
Lines 5952 5961 +9
Branches 646 648 +2
============================================
+ Hits 4702 4714 +12
+ Misses 955 952 -3
Partials 295 295
... and 3 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Reviewer Notes
Related Issue(s)
Fixes #1941