Add new compaction policy to prioritize fragmented intervals #18802
Conversation
@JsonCreator
public MostFragmentedIntervalFirstPolicy(
    @JsonProperty("minUncompactedCount") @Nullable Integer minUncompactedCount,
    @JsonProperty("minUncompactedBytes") @Nullable Long minUncompactedBytes,
Use HumanReadableBytes?
Sure, will do.
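For reference, a minimal sketch of what accepting the byte threshold as HumanReadableBytes could look like; the class shape, defaults, and field handling here are assumptions for illustration, not the exact code in this PR:

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import org.apache.druid.java.util.common.HumanReadableBytes;

import javax.annotation.Nullable;

// Sketch only: taking the threshold as HumanReadableBytes lets configs specify
// values such as "500MiB" instead of raw byte counts. Defaults are placeholders.
public class MostFragmentedIntervalFirstPolicy
{
  private static final int DEFAULT_MIN_UNCOMPACTED_COUNT = 1;             // placeholder
  private static final long DEFAULT_MIN_UNCOMPACTED_BYTES = 100_000_000L; // placeholder

  private final int minUncompactedCount;
  private final long minUncompactedBytes;

  @JsonCreator
  public MostFragmentedIntervalFirstPolicy(
      @JsonProperty("minUncompactedCount") @Nullable Integer minUncompactedCount,
      @JsonProperty("minUncompactedBytes") @Nullable HumanReadableBytes minUncompactedBytes
  )
  {
    this.minUncompactedCount = minUncompactedCount == null ? DEFAULT_MIN_UNCOMPACTED_COUNT : minUncompactedCount;
    this.minUncompactedBytes = minUncompactedBytes == null
                               ? DEFAULT_MIN_UNCOMPACTED_BYTES
                               : minUncompactedBytes.getBytes();
  }
}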
 * compaction. Default value is {@link #SIZE_2_GB}.
 */
@JsonProperty
public long getMaxUncompactedSize()
The names of minUncompactedBytes and maxUncompactedSize are confusing, because they look very similar but one refers to total and one refers to average. How about minUncompactedBytes and maxAverageUncompactedBytesPerSegment?
Also, I wish this could be specified in terms of row counts rather than byte counts. The target segment size rowsPerSegment is specified as a number of rows, so it's most natural to specify the max average uncompacted size in terms of rows as well. Like, I might say that the target is 3M rows per segment but don't bother compacting if the segments average 2.5M already.
I recognize this requires additional metadata to be available that may not currently be available, so it doesn't have to be done in this PR. I'm just having a thought.
Yeah, I agree that average number of rows per segment would be a better parameter for this function.
I have used the average bytes here as a proxy until we have the row count for each segment available in the metadata.
final long avgUncompactedSize = Math.max(1, uncompacted.getTotalBytes() / uncompacted.getNumSegments());

// Priority increases as size decreases and number increases
final double normalizingFactor = 1000f;
What's the purpose of this factor? It seems like it wouldn't affect the results.
Yes, I think this doesn't serve any purpose any more.
I had started out keeping the priority as a long and used a multiplication factor in an effort to get whole number values. But I later realized that it would be difficult to have a universal factor that works for all cases, so best to use a double instead.
/**
 * Computes the priority of the given compaction candidate by checking the
 * total number and average size of uncompacted segments.
Intuitively what does the priority "mean"? It's helpful to add that to the javadocs.
Reading through, it seems like the priority boils down to pow(uncompacted.getNumSegments, 2) / uncompacted.getTotalBytes. Why that particular formula?
Unit tests for this function would be helpful, to illustrate its behavior in various situations.
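As a purely illustrative sketch of what such tests could look like: the local helper below mirrors the count² / bytes form discussed later in this thread, so the class name, method name, and exact formula are assumptions rather than the production code from this PR.

import org.junit.Assert;
import org.junit.Test;

// Illustrative-only sketch; the real method under test and its final formula may differ.
public class FragmentationPrioritySketchTest
{
  private static double computePriority(int numUncompactedSegments, long uncompactedBytes)
  {
    // priority = count / averageBytesPerSegment = count^2 / totalBytes
    final long avgBytesPerSegment = Math.max(1, uncompactedBytes / numUncompactedSegments);
    return (double) numUncompactedSegments / avgBytesPerSegment;
  }

  @Test
  public void testMoreSegmentsForSameTotalBytesRanksHigher()
  {
    // 1000 tiny segments totalling 1 GB should outrank 10 large segments totalling 1 GB
    Assert.assertTrue(computePriority(1000, 1_000_000_000L) > computePriority(10, 1_000_000_000L));
  }

  @Test
  public void testSmallerAverageSegmentSizeRanksHigher()
  {
    // Same segment count, fewer total bytes => smaller average segments => higher priority
    Assert.assertTrue(computePriority(100, 100_000_000L) > computePriority(100, 10_000_000_000L));
  }
}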
Sure, will update the javadocs.
I wanted it to be directly proportional to the number of uncompacted segments and inversely proportional to the size of uncompacted segments.
So,

    priority = uncompactedCount / avgUncompactedSize
             = uncompactedCount * (uncompactedCount / uncompactedBytes)
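As a purely illustrative example with made-up numbers: an interval with 500 uncompacted segments totalling 500 MB (average 1 MB per segment) gets priority 500 / 1,000,000 = 5.0e-4, while an interval with 50 uncompacted segments totalling 5 GB (average 100 MB per segment) gets 50 / 100,000,000 = 5.0e-7, so the interval with many small segments is compacted first.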
The other alternative would be to have something like
priority = factor1 * uncompactedCount + factor2 * avgUncompactedSize
but that seemed like it would be difficult to tune, as no single choice of factor1 and factor2 would apply universally to all cases.
Intuitively what does the priority "mean"? It's helpful to add that to the javadocs.
Would it also make sense to rename the method as computeFragmentationIndex(), since essentially that is what we are trying to achieve here?
Updated the javadocs. Also updated the fragmentation index formula. Details in the PR description.
gianm left a comment
Looks like a good place to start. We can evolve it based on experience in the real world.
Thanks for the review, @gianm!
Description
Druid currently supports two compaction policies to prioritize intervals for compaction.
- newestSegmentFirst: prioritizes compaction of intervals with the latest data
- fixedIntervalOrder: explicitly specifies the datasources and intervals which should be compacted

Proposed policy

mostFragmentedFirst - prioritizes compaction of the most fragmented intervals

An interval is eligible for compaction if and only if:
- its number of uncompacted segments is at least minUncompactedCount
- its total uncompacted bytes is at least minUncompactedBytes
- its average uncompacted bytes per segment is at most maxAverageUncompactedBytesPerSegment

Prioritizes an interval based on the "fragmentation index" of the interval, which is computed as follows:
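The final formula from the PR description is not reproduced in this excerpt. As a rough, non-authoritative sketch based on the review discussion above (the names, defaults, and exact formula in the PR may differ), eligibility and the fragmentation index could look something like this:

// Rough sketch only; field names and the final formula in the PR may differ.
public class FragmentationSketch
{
  // An interval is eligible only if it has enough uncompacted segments and bytes,
  // and those segments are on average smaller than the configured maximum.
  static boolean isEligible(
      int uncompactedCount,
      long uncompactedBytes,
      int minUncompactedCount,
      long minUncompactedBytes,
      long maxAverageUncompactedBytesPerSegment
  )
  {
    if (uncompactedCount <= 0
        || uncompactedCount < minUncompactedCount
        || uncompactedBytes < minUncompactedBytes) {
      return false;
    }
    final long avgBytesPerSegment = uncompactedBytes / uncompactedCount;
    return avgBytesPerSegment <= maxAverageUncompactedBytesPerSegment;
  }

  // Higher index = more, smaller uncompacted segments = compact sooner
  // (based on the count^2 / bytes form discussed in the review thread).
  static double fragmentationIndex(int uncompactedCount, long uncompactedBytes)
  {
    return (double) uncompactedCount * uncompactedCount / Math.max(1, uncompactedBytes);
  }
}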
Changes
- Updated CompactionStatus.compute() to track the number of compacted and uncompacted segments
- Added new policy mostFragmentedFirst which prioritizes intervals with most small uncompacted segments