- 
                Notifications
    You must be signed in to change notification settings 
- Fork 25.6k
Make data stream lifecycle project-aware #125476
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| Pinging @elastic/es-data-management (Team:Data Management) | 
        
          
                server/src/main/java/org/elasticsearch/action/ResultDeduplicator.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
      | FTR, DLM mostly relies on internal cluster tests. Currently, we don't have a way to run internal cluster tests in MP mode. I made some local hacks to be able to run the DLM internal cluster tests run in MP mode anyway and they all passed. Those hacks aren't for committing purposes, but the MP team is working with ES Delivery to make proper changes to the testing infrastructure to allow running all kinds of tests in MP mode. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some comments
        
          
                server/src/main/java/org/elasticsearch/action/ResultDeduplicator.java
              
                Outdated
          
            Show resolved
            Hide resolved
        
              
          
                ...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
          
            Show resolved
            Hide resolved
        
      | for (var projectId : state.metadata().projects().keySet()) { | ||
| // We catch inside the loop to avoid one broken project preventing DLM to run on other projects. | ||
| try { | ||
| projectResolver.executeOnProject(projectId, () -> run(state.projectState(projectId))); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we also pass projectId to all the downstream methods, I wonder whether the effect of using executeOnProject is a bit too far away for understanding the code, i.e. it was not clear to me that the later client.execute is going to do the right thing. An alternative would be using executeOnProject on the spots where client is invoked. We could have a method that returns dedicated client similar to the following
Client projectClient(ProjectId projectId) {
    return new FilterClient(client) {
        @Override
        protected <...> void doExecute(...) {
            projectResolver.executeOnProject(projectId, () -> super.doExecute(action, request, listener));
        }
    };
}and use it like
projectClient(projectId).admin().indices()...The downside is creating a client for each request. But that's probably not too bad since we create many other objects in request/response situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I think a FilterClient like that is interesting, but I'm also worried about the overhead. I could just wrap the current client calls in this class with executeOnProject, but I intentionally didn't do that to reduce the changes. If we do go for the FilterClient, I would like to try to add it in a generic place (e.g. ProjectResolver), as other places will benefit from it. Maybe we could even store a map of project clients to avoid re-creating them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How often this code runs? I'd avoid creating a map since that brings in new questions about expiration and size control and memory consumptions.
Overall I don't feel strongly between your change and my suggestion. I am OK with it as is if that is your preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DataStreamLifecycle#run(ClusterState) is invoked every 5 minutes (by default). I was suggesting creating a map in some generic place, that lives throughout the entire life of the node/cluster. Having one map of Map<ProjectId, Client> in the MultiProjectProjectResolver sounds pretty manageable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having a method like projectClient(ProjectId) on the ProjectResolver sounds like a good idea. I'd start out without maintaining a map. Not because I think it is problematic to maintain such map but because it is not obviously a meaningful optimization without more data. We can always add the map when it becomes more obviously necessary in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a projectClient method in 10758c2. The extra object creation makes me feel a little bit uneasy. How do you feel about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we don't want the default method to always create a new client since that is not necessary for DefaultProjectResolver. My original suggestion of having a local projectClient method had the same issue. A quick fix might be checking supportsMultipleProjects() and return the client as is if it is false, e.g. something like:
if (supportsMultipleProjects() == false && projectId.equals(getProjectId())) {
    return client;
} else {
    // return a new filterClient
}This will bypass DefaultProjectResolver#executeOnProject and its projectId check. I think that's OK since the new method provides the same check and error cases will still go through the original executeOnProject.
Ideally it would be great if the method takes only a projectId and the client is provided internally by the ProjectResolver. This would require ProjectResolver to hold a reference to client. It is not an issue for the actual ProjectResolver implementations but maybe a bit fiddly for those TestProjectResolvers, unless we are happy to use NoOpClient by default. These can be tackled separate if deemed worthwhile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the shortcut to the projectClient method. I thought some more about how we can deal with that client, but having a map of clients or passing a reference of the client to ProjectResolver itself might be difficult, as some places make use of OriginSettingClient - not in DataStreamLifecycleService though.
I think I do prefer this approach over my initial approach, as this specifies the project ID way more explicitly, but I do still feel like there is room for improvement. We can improve later/separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OriginSettingClient is a good point. I wonder whether the relationship should be actually be opposite, i.e. Client has a reference to projectResolver and has a new method like Client.projectClient(ProjectId) similar to filterWithHeaders. Definitely something outside of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great suggestion. I'll explore the possibilities there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
        
          
                ...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
          
            Show resolved
            Hide resolved
        
      | for (var projectId : state.metadata().projects().keySet()) { | ||
| // We catch inside the loop to avoid one broken project preventing DLM to run on other projects. | ||
| try { | ||
| projectResolver.executeOnProject(projectId, () -> run(state.projectState(projectId))); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OriginSettingClient is a good point. I wonder whether the relationship should be actually be opposite, i.e. Client has a reference to projectResolver and has a new method like Client.projectClient(ProjectId) similar to filterWithHeaders. Definitely something outside of this PR.
Now that all actions that DLM depends on are project-aware, we can make DLM itself project-aware.
There still exists only one instance of
DataStreamLifecycleService, it just loops over all the projects - which matches the approach we've taken for similar scenarios thus far.