Conversation

@PeteGillinElastic PeteGillinElastic commented Apr 2, 2025

This enhances DataStreamAutoShardingService so that it periodically logs at INFO level the most 'interesting' results it has produced in the last period.

In this PR, the most 'interesting' results are considered to be the ones with the highest load, keeping track separately of the top 10 that resulted in an increase decision and the top 10 that did not. In practice, increase recommendations are sufficiently rare that the top 10 will often be 'all of them', and they are all potentially interesting (but we cap the count to protect against taking up an unbounded amount of memory). We keep the high-load non-increase recommendations as well, since these are likely to be the interesting ones to look at when investigating why some data stream did not get an increase-shards recommendation when we might have expected one.

The mechanism could easily be extended to add other categories. For example, we may find that there are some categories of decrease decisions we consider 'interesting'. (N.B. The rollover service will not roll over just because the auto-sharding service recommended down-sharding; down-sharding only happens if the rollover was going to happen for some other reason (age, size, etc.). So it's normal for the auto-sharding service to return decrease recommendations for the same data streams every 5 minutes until those other conditions are met, which makes things a bit more complicated.) This PR just covers the cases that seem likely to be useful in the support cases we have seen.

The existing DEBUG and TRACE log lines in the service are replaced with a single DEBUG log which pulls together all the data. This is an improvement, since at the moment it is hard to figure out from the logs which lines refer to the same data stream (they are interleaved, and don't all include the data stream name).

The write load field in the AutoShardingResult was unused, and is removed.

@PeteGillinElastic (Member Author)

Hey @dakrone. I'm adding you here to flag it for your attention even though it's in draft — I've slacked you with context.

@PeteGillinElastic PeteGillinElastic force-pushed the ES-10037-data-stream-autosharding-extra-logging branch 4 times, most recently from b290bb9 to cb6e524 Compare April 3, 2025 12:12
@PeteGillinElastic PeteGillinElastic force-pushed the ES-10037-data-stream-autosharding-extra-logging branch from cb6e524 to 866aa07 Compare April 3, 2025 12:24
@PeteGillinElastic PeteGillinElastic added the :Data Management/Data streams and >non-issue labels Apr 3, 2025
@PeteGillinElastic PeteGillinElastic marked this pull request as ready for review April 3, 2025 12:31
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine elasticsearchmachine added the Team:Data Management label Apr 3, 2025
@gmarouli gmarouli self-requested a review April 4, 2025 11:01
@gmarouli gmarouli left a comment

LGTM, I left some minor comments but nothing blocking

}

private AutoShardingResult getDecreaseShardsResult(
private Decision.DecreaseCalculation calculateIncreaseShardsDecision(
Contributor:

Probably calculateIncreaseShardsDecision -> calculateDecreaseShardsDecision.

Member Author:

Oh, snap! Good spot. See, I told you that the copy-and-paste errors were my forte...

) {}

record DecreaseCalculation(
MaxLoadWithinCooldown maxLoadWithinCooldownForDecrease,
Contributor:

Nits:

  • Considering that MaxLoadWithinCooldown is an inner class of DecreaseCalculation, I am wondering if we could change the name to maxLoadWithinCooldown to make it a bit less verbose.
  • Maybe adding a comment about the previousIndexWithMaxLoad. It's not clear to me when I see this record what it represents. I understand it's an index with max load, but the previous is a bit confusing

Member Author:

  • Yeah, that's fair enough. For the increase one, I included the extra words at the end of writeIndexLoadForIncrease to differentiate it from the writeIndexLoad* fields in the Inputs record. But it's not really necessary here.
  • Good point. And I just learnt how to write javadoc on a record, which I'd never had to do before.
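For reference, a minimal sketch (with hypothetical component names, not the actual PR's record) of how javadoc on a record looks: the components are documented with `@param` tags in the class-level comment, since records have no separate field declarations to annotate.

```java
/**
 * Hypothetical example of a documented record. Record components are
 * described with {@code @param} tags in the class-level javadoc.
 *
 * @param dataStream the name of the data stream the decision applies to
 * @param writeLoad  the observed write load the decision was based on
 */
record DocumentedDecision(String dataStream, double writeLoad) {}
```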

@dakrone dakrone left a comment

LGTM, I also left some minor comments

Comment on lines 188 to 193
if (decisionLogger == null) {
PeriodicDecisionLogger periodicDecisionLogger = new PeriodicDecisionLogger(nowSupplier);
this.decisionLogger = periodicDecisionLogger::maybeLogDecision;
} else {
this.decisionLogger = decisionLogger;
}
Member:

I don't like the treatment of null as a default of "log the message." I think I'd rather make the constructor take a required non-null argument (with a version that leaves out the argument and passes in the maybeLogDecision method), and then leave it to the caller to override with a no-op if necessary. Otherwise someone would likely assume that they can pass in a null and have the logging disabled.

Member Author:

Yeah, the reason I'd done that was because the naive way of doing the alternative doesn't work, because of the restrictions on what you can do before the this call in a delegating constructor. I figured it was okay since the constructor with the extra nullable argument is package-private and labelled as for testing, so the scope for misuse was narrow.

But I think I've found a way of doing it with a static method, so I'll do that.
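A minimal sketch of the restriction and the workaround being discussed (hypothetical names, not the actual Elasticsearch code): no statement may run before the `this(...)` call in a delegating constructor, but the argument expression of `this(...)` may call a static method, which is where the default consumer can be built.

```java
import java.util.function.Consumer;

// Hypothetical sketch: the public constructor delegates, and the work of
// building the default logging consumer is pushed into a static method,
// because statements are not allowed before this(...).
class Service {
    private final Consumer<String> decisionLogger;

    // Public constructor: supplies the default logging consumer.
    Service() {
        this(defaultDecisionLogger());
    }

    // Package-private for testing: the caller supplies the consumer
    // (and may pass a no-op to disable logging).
    Service(Consumer<String> decisionLogger) {
        this.decisionLogger = decisionLogger;
    }

    private static Consumer<String> defaultDecisionLogger() {
        StringBuilder log = new StringBuilder(); // stands in for a real logger
        return decision -> log.append(decision).append('\n');
    }

    void record(String decision) {
        decisionLogger.accept(decision);
    }
}
```

This keeps the required argument non-null in both constructors, so a caller can never pass null expecting logging to be disabled.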

Member Author:

And I've changed the PeriodicLogger constructor to do something similar (only no need for a static method there).

Member Author:

(I guess the alternative would be to junk the public ctor in favour of a static factory method, but this is less invasive of a change.)

}

// package-private for testing
record Decision(
Member:

I think it's useful for these records (this one and its nested children) to document what the fields are, especially since devs may come here not aware of the interior bits of how autoscaling works.

Member Author:

Yeah, Mary already had me put javadoc on one of the records, I might as well do it for the rest...

Comment on lines +251 to +253
"For data stream %s: %s based on [inc/dec cooldowns %s/%s, %d-%d threads, "
+ "write index %s has all-time/recent/peak loads %g/%g/%g, current shards %d, "
+ "using %s value %g for increase gives %d shards%s]",
Member:

While I understand that this log message encapsulates a lot more metrics and info than the previous one, I have to say it's a bit harder to read than "Data stream auto-sharding service recommends increasing the number of shards from [2] to [3] after [5m] cooldown for data stream [logs-foo-bar]". Is it possible to keep the information, but make it a little more user-friendly? (I'm concerned that no one would be able to understand it without knowing details about how data stream autoscaling works, and thus the team would be pulled into SDHs just to decipher the message)

Member Author:

So, the current behaviour is like this: Data stream auto-sharding result: For data stream my-awesome-data-stream: Deferred recommendation to decrease shards from 3 to 2 after cooldown period 5h based on [...]. The nice human-readable bit comes from the toString() I added to AutoShardingResult. I think this is pretty much on a par with the old thing. The information which was present before is all presented in the same order as before and the additional information is at the end. (In fact, I have tweaked it so that it no longer says after [0s] cooldown or whatever in the cases where there's no cooldown, which I argue is a readability improvement.)

If there are further tweaks you'd like to make, let me know!

@PeteGillinElastic (Member Author) commented Apr 4, 2025:

The thing I quoted above was the debug logging. The periodic info logging looks like this:

Data stream auto-sharding decisions in the last 5m with highest load without an increase shards recommendation:
 - For data stream my-data-stream: Recommendation to leave shards unchanged at 1 based on [inc/dec cooldowns 4.5m/3d, 2-32 threads, write index .ds-my-data-stream-2025.04.04-000001 has all-time/recent/peak loads 0.000530127/0.000525674/0.00123915, current shards 1, using ALL_TIME value 0.000530127 for increase gives 1 shards, and using ALL_TIME value 0.000530127 for dec based on write index gives 1 shards]
 - For data stream my-other-data-stream: Recommendation to leave shards unchanged at 1 based on [inc/dec cooldowns 4.5m/3d, 2-32 threads, write index .ds-my-data-stream-2025.04.04-000001 has all-time/recent/peak loads 0.000128001/0.000123547/0.00123915, current shards 1, using ALL_TIME value 0.000128001 for increase gives 1 shards, and using ALL_TIME value 0.000128001 for dec based on write index gives 1 shards]

Member:

The … using ALL_TIME value 0.000530127 for dec based on write index gives 1 shards is the most unclear to me, do you think that one could be clarified?

@PeteGillinElastic (Member Author) commented Apr 4, 2025:

Can you say more about how this is confusing? Is it the ordering of the parts? The use of the abbreviation dec? (I'm conscious of the width of these lines, but that's probably less important than comprehensibility.) Is it that it's not clear that "based on write index" is telling you where that load comes from?

Would decrease calculation gives 3 shards based on PEAK load of 2.71828 for write index be better?

I don't think I can confidently change it without knowing more about the nature of your confusion... Or, even better, you could propose an alternative wording!

I'm running out of time to get this in before I have to leave for the weekend. Unless you say otherwise, I'm going to assume that you don't hate this so much that you can't bear to see it go out into the world, and merge the change which is currently going through CI, and we can wordsmith it in a follow-up PR.

Member Author:

Potential follow-up PR for word-smithing: #126339.

private static class DecisionBuffer {

private final Comparator<Decision> comparator;
private final PriorityQueue<Decision> queue;
Member:

Are you using Lucene's PriorityQueue here only for the insertWithOverflow? Can you add a comment to explain the reasoning so someone doesn't revert it to the JDK version without understanding?

Member Author:

The JDK version isn't bounded. I'll add a comment.

Member Author:

(Based on what I saw on SO, there are people out there who are depending on Lucene just to get their bounded PQ implementation.)
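A rough sketch of the bounded 'top N by load' idea using only the JDK (hypothetical names; the actual code uses Lucene's PriorityQueue, whose insertWithOverflow performs the size-capped insert in a single call). The JDK queue is unbounded, so the cap has to be enforced manually by evicting the smallest element:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: keep only the N highest loads seen so far.
// The min-heap's head is always the smallest retained load, i.e. the
// eviction candidate when a larger value arrives.
class TopNByLoad {
    private final int capacity;
    private final PriorityQueue<Double> queue;

    TopNByLoad(int capacity) {
        this.capacity = capacity;
        this.queue = new PriorityQueue<>(Comparator.naturalOrder());
    }

    void offer(double load) {
        if (queue.size() < capacity) {
            queue.add(load);
        } else if (queue.peek() < load) {
            queue.poll(); // drop the current smallest to stay bounded
            queue.add(load);
        }
    }

    // Returns the retained loads, highest first, and resets the buffer.
    List<Double> drainDescending() {
        List<Double> result = new ArrayList<>(queue);
        result.sort(Comparator.reverseOrder());
        queue.clear();
        return result;
    }
}
```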

@PeteGillinElastic (Member Author):

Thanks @dakrone and @gmarouli. I plan to merge this once it's got through CI, unless you have any more comments.

private static Consumer<Decision> createPeriodicLoggingDecisionConsumer(LongSupplier nowSupplier) {
PeriodicDecisionLogger periodicDecisionLogger = new PeriodicDecisionLogger(nowSupplier);
return periodicDecisionLogger::maybeLogDecision;
}
Member Author:

This static method isn't strictly necessary. It could be inlined as new PeriodicDecisionLogger(nowSupplier)::maybeLogDecision. But I think that's bad for readability. If I saw that, it would make me stop and think about what it's doing, and I don't like to have to think.

Also, fun fact, if you write new PeriodicDecisionLogger(nowSupplier)::maybeLogDecision, IntelliJ will offer to "refactor" it into decision -> new PeriodicDecisionLogger(nowSupplier).maybeLogDecision(decision), which would be super bad news. (It does pop up a warning about possible side effects, but still.)

Member:

Wow that refactoring (Intellij's) is bad. This static method works for me.
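The difference can be demonstrated with a small sketch (hypothetical class, standing in for PeriodicDecisionLogger): a bound method reference captures its receiver once at creation time, while the lambda that the IDE offers constructs a new object on every invocation, silently discarding any accumulated state.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

// Hypothetical stateful class: it keeps a running total and counts how
// many instances have been constructed.
class StatefulLogger {
    static final AtomicInteger instancesCreated = new AtomicInteger();
    private int total;

    StatefulLogger() {
        instancesCreated.incrementAndGet();
    }

    void record(int value) {
        total += value;
    }

    int total() {
        return total;
    }

    // Bound method reference: one receiver, created here, shared by all calls.
    static IntConsumer boundConsumer(StatefulLogger logger) {
        return logger::record;
    }

    // The IDE's "equivalent" lambda: a fresh (and immediately discarded)
    // instance is constructed on every accept() call, so no state survives.
    static IntConsumer perCallConsumer() {
        return value -> new StatefulLogger().record(value);
    }
}
```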

@PeteGillinElastic PeteGillinElastic merged commit 78aff25 into elastic:main Apr 4, 2025
17 checks passed
@PeteGillinElastic PeteGillinElastic deleted the ES-10037-data-stream-autosharding-extra-logging branch April 4, 2025 18:16

Labels

  • :Data Management/Data streams (Data streams and their lifecycles)
  • >non-issue
  • Team:Data Management (Meta label for data/management team)
  • v9.1.0
