-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Refactor: Rename DataSourceTaskIdHolder and extract load spec holder to its own class #18732
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ss to avoid confusion.
server/src/main/java/org/apache/druid/query/lookup/LookupListeningAnnouncerConfig.java
Fixed
Show fixed
Hide fixed
kfaraz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together, @abhishekrb19 !
I had a slightly different approach in mind.
In the original class, DatasourceTaskIdHolder all the fields are derived from Task itself, so we should probably just:
- Rename it to
PeonTaskHolderto denote that this class is meaningful only for a peon and that it contains the task that the peon is running - Just keep one field inside it
Task taskand add methodgetTask() - Retain the original utility methods
getDatasource(),getTaskId(),getLookupLoadinSpec()etc. - Create the
PeonTaskHolderin a@Providesmethod inCliPeon. - In non peon servers, bind the
PeonTaskHolderto null.
Let me know if this would make sense.
|
Thanks for the review, @kfaraz! I think the approach of keeping one renamed class would work well and makes sense. Just one caveat: the |
|
Ah, I see, thanks for clarifying that, @abhishekrb19 ! All the current usages of In that case, I think we should add an interface Sound good? |
|
@kfaraz sorry I ran into some hiccups with the bindings last week and didn't get a chance to take a closer look, but I'm hoping to revive this change again soon. Btw I noticed this deprecated comment on EmbeddedMiddleManager, so most (or all?) the tests seem to use the indexer for good reasons. Given that there are some subtle differences between indexers and MMs, do you have any suggestions for embedded tests that use MMs? At the very least, I was thinking of updating some of the embedded tests to use |
No worries, @abhishekrb19 ! 🙂
If the purpose is to test out peons in an embedded test setup, you could do a couple of things:
For local testing, please feel free to update any test to use an MM instead of Indexer. Most of them should work fine for both Indexer and MM. Please let me know if this meets yours needs. |
|
Sounds good, thanks for the pointer @kfaraz! I will give that a try |
98ff33e to
6fd60d6
Compare
| "Cannot specify both `lookupTier` and `lookupTierIsDatasource`" | ||
| ); | ||
| final String lookupTier = lookupTierIsDatasource ? dataSourceTaskIdHolder.getDataSource() : this.lookupTier; | ||
| final String lookupTier = lookupTierIsDatasource ? taskHolder.getDataSource() : this.lookupTier; |
Check notice
Code scanning / CodeQL
Possible confusion of local and field Note
getLookupTier
lookupTier
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
The best fix is to rename the local variable lookupTier in the getLookupTier() method to a different name, such as tierToReturn or effectiveLookupTier. This removes any ambiguity between the field and the local variable. This change should only be performed in the method getLookupTier() within LookupListeningAnnouncerConfig.java, specifically where the local variable is declared and used. No imports or additional definitions are necessary, as it's a straightforward renaming.
-
Copy modified line R59 -
Copy modified line R62
| @@ -56,10 +56,10 @@ | ||
| "Cannot specify both `lookupTier` and `lookupTierIsDatasource`" | ||
| ); | ||
|
|
||
| final String lookupTier = lookupTierIsDatasource ? taskHolder.getDataSource() : this.lookupTier; | ||
| final String effectiveLookupTier = lookupTierIsDatasource ? taskHolder.getDataSource() : this.lookupTier; | ||
|
|
||
| return Preconditions.checkNotNull( | ||
| lookupTier == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(lookupTier), | ||
| effectiveLookupTier == null ? DEFAULT_TIER : StringUtils.emptyToNullNonDruidDataString(effectiveLookupTier), | ||
| "Cannot have empty lookup tier from %s", | ||
| lookupTierIsDatasource ? "bound value" : LookupModule.PROPERTY_BASE | ||
| ); |
9faa21f to
9513f0c
Compare
9513f0c to
89e3418
Compare
| final Injector injector = Initialization.makeInjectorWithModules( | ||
| GuiceInjectors.makeStartupInjector(), | ||
| ImmutableList.of( | ||
| (Module) binder -> JsonConfigProvider.bindInstance( | ||
| binder, | ||
| Key.get(DruidNode.class, Self.class), | ||
| new DruidNode("test-inject", null, false, null, null, true, false) | ||
| ), | ||
| new LookupModule() | ||
| )); |
Check notice
Code scanning / CodeQL
Deprecated method or constructor invocation Note test
Initialization.makeInjectorWithModules
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 1 month ago
To address the issue, we need to replace the usage of the deprecated Initialization.makeInjectorWithModules method (line 174) with its recommended alternative. According to the Druid source code and upgrade notes, the recommended replacement is generally the static utility method GuiceInjectors.makeInjectorWithModules, which provides the same API for creating an Injector from an existing startup injector and extra modules.
Specifically, you should:
- Replace the invocation of
Initialization.makeInjectorWithModules(...)withGuiceInjectors.makeInjectorWithModules(...)in the test methodtestLookupTierDefaultsForNonPeonServers. - No changes to method parameters should be needed, as both methods accept the same arguments.
- All necessary imports are already present (
GuiceInjectorsis imported at line 27). - No further changes are required elsewhere in the provided snippet.
-
Copy modified line R174
| @@ -171,7 +171,7 @@ | ||
| @Test | ||
| public void testLookupTierDefaultsForNonPeonServers() | ||
| { | ||
| final Injector injector = Initialization.makeInjectorWithModules( | ||
| final Injector injector = GuiceInjectors.makeInjectorWithModules( | ||
| GuiceInjectors.makeStartupInjector(), | ||
| ImmutableList.of( | ||
| (Module) binder -> JsonConfigProvider.bindInstance( |
kfaraz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some final suggestions, otherwise looks good. Thanks for incorporating the changes, @abhishekrb19 !
server/src/main/java/org/apache/druid/query/lookup/LookupReferencesManager.java
Outdated
Show resolved
Hide resolved
| // CliPeon.runTask and TaskHolder.getDataSource()/TaskHolder.getTaskId(). The reason for this is unclear | ||
| // but by injecting TaskPropertiesHolder early this cycle is avoided. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this true even after the changes in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, good point, they're no longer required. I've removed the changes originally included in #16140 and added some tests that would otherwise fail if we remove this "hack".
server/src/main/java/org/apache/druid/server/metrics/MetricsModule.java
Outdated
Show resolved
Hide resolved
| @Nullable final String datasource, | ||
| @Nullable final String taskId | ||
| ) | ||
| public static Map<String, String[]> mapOfTaskHolderDimensions(final TaskHolder taskHolder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would now make sense to move this method into TaskHolder itself. The map would be created just once when the TaskHolder instance is created.
Then we can just call taskHolder.getMetricDimensions() and get the same immutable wherever needed.
| private Provider<Task> taskProvider; | ||
|
|
||
| @Inject | ||
| public PeonTaskHolder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: Do not use @Inject with this constructor and just pass a Task into it instead of a Provider.
Use an @Provides method in CliPeon to bind it correctly.
@Provides
public TaskHolder getTaskHolder(Task task)
{
return new PeonTaskHolder(task);
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are cyclic dependencies involving Task when it contains a TransformSpec, an AggregatorFactory or a monitor, which will pull in LookupModule, MetricsModule, etc, which depend on these holders. I’ve expanded the Javadoc to explain this in more detail. To break the cycle, this binding needs to use a Provider
instead of direct injection. Switching to @Provides pattern will break UTs and some ITs with a cyclic dependency Guice instantiation error:
End of classname legend:
========================
at org.apache.druid.cli.CliPeon.run(CliPeon.java:413)
at org.apache.druid.cli.Main.main(Main.java:111)
Caused by: java.lang.RuntimeException: com.google.inject.CreationException: Unable to create injector, see the following errors:
1) [Guice/ErrorInCustomProvider]: RuntimeException: ValueInstantiationException: Cannot construct instance of `ExpressionTransform`, problem: Unable to provision, see the following errors:
1) Problem parsing object at prefix[druid.lookup]: Cannot construct instance of `LookupListeningAnnouncerConfig`, problem: Unable to provision, see the following errors:
1) [Guice/ErrorInCustomProvider]: IllegalStateException: This is a proxy used to support circular references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the clarification!
| private Provider<Task> taskProvider; | ||
|
|
||
| @Inject | ||
| public PeonLoadSpecHolder( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to PeonTaskHolder, this constructor should not need @Inject. Use a @Provides method in CliPeon instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intentional - please see #18732 (comment)
server/src/test/java/org/apache/druid/server/metrics/GroupByStatsMonitorTest.java
Outdated
Show resolved
Hide resolved
kfaraz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only pending comment is moving the mapOfDimensions method to TaskHolder, but that is not a blocker for this PR.
| } | ||
|
|
||
| @Test | ||
| public void testTaskWithMonitorsAndMetricsSpecDoNotCauseCyclicDependency() throws IOException |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for adding these tests!
|
Thanks, @kfaraz! |
Refactor:
DataSourceTaskIdHolderin favor of:TaskHolderinterface: implemented byPeonTaskHolderfor peon processes and byNoopTaskHolderfor all other servers, where the methods return null. The alternative was to throwUnsupportedExceptions inNoopTaskHolderand have callers verify type, but that made the usages a bit difficult and the callers would still have to effectively assign null anyway.LoadSpecHolderinterface: implemented byPeonLoadSpecHolderfor peon processes andDefaultLoadSpecHolderfor all other servers.DefaultServerHolderModulethat binds the above modules as part of the core injector module for all servers except CliPeon. For CliPeon, the binding toPeonTaskHolderandPeonLoadSpecHolderis explicitly done in CliPeon itself.Some implementation specific choices have been documented as javadocs in the code. In the future, we can extend
TaskHolderto also includegroupId,taskType, etc., from the Task and add those dimensions to all the metric modules as applicable.This PR has: