@kamalcph
Contributor

@kamalcph kamalcph commented Dec 26, 2025

  • RemoteLogManager already calculates the segmentSizeInBytes of all the
    remote-log segments for a partition including the stale segments.
  • Once all the segments are validated, the sizes computed from
    listRemoteLogSegments(tpId) and listRemoteLogSegments(tpId, epoch)
    match as the deletion proceeds. Subsequent calls to
    buildRetentionSizeData should then skip the validation and not invoke
    listRemoteLogSegments(tpId, epoch) again.
  • This avoids unnecessary listRemoteLogSegments(tpId, epoch) calls to
    RLMM and saves CPU resources.
  • Added unit tests.
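
To make the idea above concrete, here is a minimal, self-contained sketch of the lazy validation. It is not the RemoteLogManager code: `Segment`, `MetadataStore`, `RetentionSizeCalculator`, and the `validated` flag are hypothetical stand-ins, and a single leader epoch per segment is a simplifying assumption. It only illustrates skipping the per-epoch listing once the per-partition total and the per-epoch total agree.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LazyRetentionSizeSketch {

    /** Hypothetical, simplified segment metadata: id, leader epoch, size. */
    static final class Segment {
        final String id;
        final int epoch;
        final long sizeInBytes;

        Segment(String id, int epoch, long sizeInBytes) {
            this.id = id;
            this.epoch = epoch;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Hypothetical in-memory stand-in for the RLMM; counts per-epoch list calls. */
    static final class MetadataStore {
        final List<Segment> segments = new ArrayList<>();
        int perEpochListCalls = 0;

        List<Segment> listAll() {
            return new ArrayList<>(segments);
        }

        List<Segment> listByEpoch(int epoch) {
            perEpochListCalls++;
            List<Segment> result = new ArrayList<>();
            for (Segment s : segments) {
                if (s.epoch == epoch) result.add(s);
            }
            return result;
        }
    }

    /** Computes the retention size; skips per-epoch validation once both totals agree. */
    static final class RetentionSizeCalculator {
        private boolean validated = false;

        long retentionSize(MetadataStore store, List<Integer> leaderEpochs) {
            long fullSize = 0;
            for (Segment s : store.listAll()) fullSize += s.sizeInBytes;
            if (validated) {
                return fullSize; // totals matched earlier, so no per-epoch listing
            }
            long epochSize = 0;
            Set<String> seen = new HashSet<>(); // avoid counting a segment twice
            for (int epoch : leaderEpochs) {
                for (Segment s : store.listByEpoch(epoch)) {
                    if (seen.add(s.id)) epochSize += s.sizeInBytes;
                }
            }
            validated = (epochSize == fullSize);
            return epochSize;
        }
    }

    public static void main(String[] args) {
        MetadataStore store = new MetadataStore();
        store.segments.add(new Segment("a", 0, 100));
        store.segments.add(new Segment("b", 1, 100));
        store.segments.add(new Segment("stale", 7, 100)); // epoch 7 is not in the lineage
        List<Integer> lineage = List.of(0, 1);
        RetentionSizeCalculator calc = new RetentionSizeCalculator();

        System.out.println(calc.retentionSize(store, lineage)); // stale segment still present
        store.segments.removeIf(s -> s.id.equals("stale"));     // cleanup removes the stale segment
        System.out.println(calc.retentionSize(store, lineage)); // totals now match
        System.out.println(calc.retentionSize(store, lineage)); // per-epoch listing skipped
        System.out.println("per-epoch list calls: " + store.perEpochListCalls);
    }
}
```

In this sketch the third call makes no per-epoch list calls at all, which is the saving the change is after.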

Note that the COPY_SEGMENT_STARTED state is excluded from the
remoteLogSizeInBytes metric calculation because it pollutes the metric
during upload retries.
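
As a rough illustration of the note above, the following sketch sums segment sizes while leaving out segments whose copy has only started. The `SegmentState` enum and `Segment` class are local stand-ins defined for this example, not the Kafka classes, and how other states are handled is not covered here.

```java
import java.util.List;

public class RemoteLogSizeMetricSketch {

    /** Local stand-in enum; only the two states needed for the example. */
    enum SegmentState { COPY_SEGMENT_STARTED, COPY_SEGMENT_FINISHED }

    /** Hypothetical, simplified segment metadata: state and size. */
    static final class Segment {
        final SegmentState state;
        final long sizeInBytes;

        Segment(SegmentState state, long sizeInBytes) {
            this.state = state;
            this.sizeInBytes = sizeInBytes;
        }
    }

    /** Sums segment sizes, skipping segments whose upload has only started. */
    static long remoteLogSizeInBytes(List<Segment> segments) {
        long total = 0;
        for (Segment s : segments) {
            // A retried upload can leave several COPY_SEGMENT_STARTED entries behind,
            // so counting them would inflate the metric.
            if (s.state != SegmentState.COPY_SEGMENT_STARTED) {
                total += s.sizeInBytes;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment(SegmentState.COPY_SEGMENT_FINISHED, 100),
            new Segment(SegmentState.COPY_SEGMENT_STARTED, 100), // first upload attempt
            new Segment(SegmentState.COPY_SEGMENT_STARTED, 100), // retry of the same upload
            new Segment(SegmentState.COPY_SEGMENT_FINISHED, 200));
        System.out.println(remoteLogSizeInBytes(segments)); // prints 300
    }
}
```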

Reviewers: Luke Chen <showuon@gmail.com>

@github-actions github-actions bot added the triage (PRs from the community), storage (Pull requests that target the storage module), and tiered-storage (Related to the Tiered Storage feature) labels Dec 26, 2025
@kamalcph kamalcph requested review from chia7712 and showuon December 26, 2025 17:22
@kamalcph kamalcph force-pushed the lazy-remote-size-eval branch from 95a8ab0 to 6ddd6b7 December 29, 2025 09:19
@kamalcph kamalcph changed the title from "MINOR: Lazily evaluate retention size during remote log cleanup" to "MINOR: Reduce the list metadata calls to RLMM during segment cleanup" Dec 29, 2025
@kamalcph kamalcph force-pushed the lazy-remote-size-eval branch from 6ddd6b7 to b9e6545 December 29, 2025 09:33
@kamalcph kamalcph changed the title from "MINOR: Reduce the list metadata calls to RLMM during segment cleanup" to "KAFKA-20026: Reduce the list metadata calls to RLMM during segment cleanup" Dec 30, 2025
@github-actions

github-actions bot commented Jan 3, 2026

A label of 'needs-attention' was automatically added to this PR in order to raise the
attention of the committers. Once this issue has been triaged, the triage label
should be removed to prevent this automation from happening again.

@kamalcph
Contributor Author

kamalcph commented Jan 6, 2026

@satishd @chia7712 @showuon

Ping for review. PTAL. Thanks!

@kamalcph kamalcph requested a review from satishd January 6, 2026 05:09
@kamalcph kamalcph force-pushed the lazy-remote-size-eval branch from 0b6a339 to 8ca1488 January 6, 2026 09:28
Member

@showuon showuon left a comment

LGTM! Thanks for the improvement!

@kamalcph kamalcph merged commit da9e0dd into apache:trunk Jan 6, 2026
47 of 62 checks passed
@kamalcph kamalcph deleted the lazy-remote-size-eval branch January 6, 2026 17:13
@github-actions github-actions bot removed the needs-attention and triage (PRs from the community) labels Jan 7, 2026
Labels

storage (Pull requests that target the storage module), tiered-storage (Related to the Tiered Storage feature)
