@TakaHiR07
Contributor

@TakaHiR07 TakaHiR07 commented Feb 13, 2025

Motivation

A ledger's zk path looks like "/ledgers/00/0601/L7170". However, the regex pattern currently used in PulsarLedgerManager is wrong, so zk data notifications are never handled.

ledgerPathRegex.matcher(n.getPath()).matches() is always false.
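
To illustrate the failure mode, here is a minimal sketch. Both patterns are hypothetical stand-ins, not the ones in PulsarLedgerManager, and the "/ledgers" root is taken from the example path above. The point is only that `matches()` must match the whole path, so a pattern that is wrong in one segment rejects every ledger path.

```java
import java.util.regex.Pattern;

public class LedgerPathRegexDemo {
    // Assumed root path, taken from the example path in the description.
    static final String LEDGER_ROOT = "/ledgers";

    // Illustrative pattern for the layout <root>/<digits>/<digits>/L<digits>.
    // This is an assumption, not the pattern used in PulsarLedgerManager.
    static final Pattern LEDGER_PATH =
            Pattern.compile(Pattern.quote(LEDGER_ROOT) + "/\\d+/\\d+/L\\d+");

    // Hypothetical stand-in for a wrong pattern: it omits the "L" prefix.
    static final Pattern BROKEN_LEDGER_PATH =
            Pattern.compile(Pattern.quote(LEDGER_ROOT) + "/\\d+/\\d+/\\d+");

    static boolean isLedgerPath(Pattern pattern, String path) {
        // matches() succeeds only if the entire path matches the pattern.
        return pattern.matcher(path).matches();
    }

    public static void main(String[] args) {
        String path = "/ledgers/00/0601/L7170";
        System.out.println(isLedgerPath(BROKEN_LEDGER_PATH, path));       // false
        System.out.println(isLedgerPath(LEDGER_PATH, path));              // true
        System.out.println(isLedgerPath(LEDGER_PATH, "/ledgers/00/0601")); // false
    }
}
```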

[Image: WeCom screenshot c58d5c24-43ee-4bb1-b1b5-9d9fbb3747e2]

Modifications

Use the correct pattern.

Alternative modification: remove the check in handleDataNotification(), since getLedgerId(n.getPath()) would throw an error anyway if the path is not a ledger path.
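
A rough sketch of that alternative, where parsing the path doubles as validation. `parseLedgerId` is a hypothetical helper, not BookKeeper's `getLedgerId`. It assumes the `<root>/<digits>/<digits>/L<digits>` layout from the example path and assumes the ledger ID is the concatenation of the digit groups.

```java
public class LedgerIdFromPath {
    // Hypothetical helper: joins the digit groups of
    // <root>/<digits>/<digits>/L<digits> into a ledger ID (assumed layout).
    static long parseLedgerId(String ledgerRoot, String path) {
        if (!path.startsWith(ledgerRoot + "/")) {
            throw new IllegalArgumentException("Not under ledger root: " + path);
        }
        String[] parts = path.substring(ledgerRoot.length() + 1).split("/");
        if (parts.length != 3 || !parts[2].startsWith("L")) {
            throw new IllegalArgumentException("Not a ledger path: " + path);
        }
        String digits = parts[0] + parts[1] + parts[2].substring(1);
        try {
            return Long.parseLong(digits);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a ledger path: " + path, e);
        }
    }

    // A notification handler could skip the regex check and simply ignore
    // paths that fail to parse. Returns true if the path was handled.
    static boolean handlePath(String ledgerRoot, String path) {
        try {
            long ledgerId = parseLedgerId(ledgerRoot, path);
            System.out.println("Ledger changed: " + ledgerId);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // not a ledger node, nothing to do
        }
    }

    public static void main(String[] args) {
        handlePath("/ledgers", "/ledgers/00/0601/L7170"); // prints "Ledger changed: 6017170"
        handlePath("/ledgers", "/ledgers/00/0601");       // ignored
    }
}
```

The trade-off is that exceptions become part of the normal control flow for non-ledger nodes, which the regex check avoids.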

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment
  • The public API

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Feb 13, 2025
@thetumbled
Member

A test is required to ensure the notification logic is triggered and works correctly.

@lhotari
Member

lhotari commented Feb 18, 2025

> A ledger's zk path looks like "/ledgers/00/0601/L7170". However, the regex pattern currently used in PulsarLedgerManager is wrong, so zk data notifications are never handled.

Great catch @TakaHiR07. What is the current impact of this in Pulsar & Bookkeeper (which is using PulsarLedgerManager in the Pulsar distribution of Bookkeeper)?

@TakaHiR07
Contributor Author

> Great catch @TakaHiR07. What is the current impact of this in Pulsar & Bookkeeper (which is using PulsarLedgerManager in the Pulsar distribution of Bookkeeper)?

One impact is that no asyncOpenLedgerNoRecovery call in Pulsar can successfully register a MetadataListener. The code is here: https://github.com/apache/bookkeeper/blob/606db747eae9856fed0aeb3f16ef01e7c9254e26/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadOnlyLedgerHandle.java#L95-L105

I am not sure whether other places use PulsarLedgerManager and register a zk listener.

@lhotari
Member

lhotari commented Feb 18, 2025

> A test is required to ensure the notification logic is triggered and works correctly.

@thetumbled You're right that there should be tests, but this just shows that the original code didn't have proper test coverage if it's currently broken.

One possible resolution would be to add an issue report about the missing test coverage and add the tests later. That moment usually never comes, but it's also bad to have this issue around.

@lhotari
Member

lhotari commented Feb 18, 2025

> > Great catch @TakaHiR07. What is the current impact of this in Pulsar & Bookkeeper (which is using PulsarLedgerManager in the Pulsar distribution of Bookkeeper)?
>
> One impact is that no asyncOpenLedgerNoRecovery call in Pulsar can successfully register a MetadataListener. The code is here: https://github.com/apache/bookkeeper/blob/606db747eae9856fed0aeb3f16ef01e7c9254e26/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadOnlyLedgerHandle.java#L95-L105
>
> I am not sure whether other places use PulsarLedgerManager and register a zk listener.

I wonder what parts of the metadata could change. My guess is LAC (lastAddConfirmed) and length based on this:
https://github.com/apache/bookkeeper/blob/54bdc0d60b32830b513089167cee67f52f4735eb/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L367-L370 .
I would assume that this would be relevant when the ledger is in recovery state.
States:
https://github.com/apache/bookkeeper/blob/2192caaf9738cf4efb799647cc5a5f68bf1823b2/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/api/LedgerMetadata.java#L154-L167

@TakaHiR07
Contributor Author

TakaHiR07 commented Feb 18, 2025

> > > Great catch @TakaHiR07. What is the current impact of this in Pulsar & Bookkeeper (which is using PulsarLedgerManager in the Pulsar distribution of Bookkeeper)?
> >
> > One impact is that no asyncOpenLedgerNoRecovery call in Pulsar can successfully register a MetadataListener. The code is here: https://github.com/apache/bookkeeper/blob/606db747eae9856fed0aeb3f16ef01e7c9254e26/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/ReadOnlyLedgerHandle.java#L95-L105
> > I am not sure whether other places use PulsarLedgerManager and register a zk listener.
>
> I wonder what parts of the metadata could change. My guess is LAC (lastAddConfirmed) and length based on this: https://github.com/apache/bookkeeper/blob/54bdc0d60b32830b513089167cee67f52f4735eb/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerHandle.java#L367-L370 . I would assume that this would be relevant when the ledger is in recovery state. States: https://github.com/apache/bookkeeper/blob/2192caaf9738cf4efb799647cc5a5f68bf1823b2/bookkeeper-server/src/main/java/org/apache/bookkeeper/client/api/LedgerMetadata.java#L154-L167

@lhotari I think that if a ledger is in the recovery state, the LAC would change. But in that case we should not use asyncOpenLedgerNoRecovery to register a zk metadata listener; we should instead use asyncOpenLedger to update the ledger handle's metadata. That case is not a problem, since it does not rely on zk.

But if a ledger is already closed and BookKeeper auto-recovery is then triggered by a disk error, the ledger's quorum would change, and so would the ledger's zk node.

Actually, I found this issue while fixing another one; see #21552.

@codecov-commenter

codecov-commenter commented Feb 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.25%. Comparing base (829df71) to head (88d0d80).

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #23977      +/-   ##
============================================
- Coverage     74.26%   74.25%   -0.02%     
+ Complexity    33213    32847     -366     
============================================
  Files          1885     1885              
  Lines        146953   146953              
  Branches      16928    16928              
============================================
- Hits         109136   109119      -17     
- Misses        29116    29125       +9     
- Partials       8701     8709       +8     
Flag Coverage Δ
inttests 26.66% <100.00%> (+0.07%) ⬆️
systests 22.66% <100.00%> (+0.03%) ⬆️
unittests 73.75% <100.00%> (-0.02%) ⬇️

Files with missing lines Coverage Δ
...ulsar/metadata/bookkeeper/PulsarLedgerManager.java 57.45% <100.00%> (+7.01%) ⬆️

... and 79 files with indirect coverage changes

@lhotari lhotari requested a review from Technoboy- August 27, 2025 06:50
Member

@lhotari lhotari left a comment

LGTM. The pattern looks correct, although there aren't any tests. That gap isn't caused by this PR, since there don't seem to be any existing tests for PulsarLedgerManager.

@lhotari lhotari changed the title [fix][broker] pattern regex error in PulsarLedgerManager cause zk data notification can not execute [fix][broker] Invalid regex in PulsarLedgerManager causes zk change notification to be ignored Aug 27, 2025
@lhotari lhotari changed the title [fix][broker] Invalid regex in PulsarLedgerManager causes zk change notification to be ignored [fix][broker] Invalid regex in PulsarLedgerManager causes zk data notification to be ignored Aug 27, 2025
@lhotari lhotari merged commit a532798 into apache:master Aug 27, 2025
100 of 102 checks passed
lhotari pushed a commit that referenced this pull request Aug 28, 2025
…ification to be ignored (#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
manas-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 28, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
(cherry picked from commit 3da4c7a)
ganesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Aug 29, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
(cherry picked from commit a54cb1a)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 3, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
(cherry picked from commit a54cb1a)
nodece pushed a commit to ascentstream/pulsar that referenced this pull request Sep 8, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
Technoboy- pushed a commit to Technoboy-/pulsar that referenced this pull request Sep 10, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
srinath-ctds pushed a commit to datastax/pulsar that referenced this pull request Sep 12, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
(cherry picked from commit 3da4c7a)
nborisov pushed a commit to nborisov/pulsar that referenced this pull request Sep 12, 2025
…ification to be ignored (apache#23977)

Co-authored-by: fanjianye <[email protected]>
(cherry picked from commit a532798)
KannarFr pushed a commit to CleverCloud/pulsar that referenced this pull request Sep 22, 2025
walkinggo pushed a commit to walkinggo/pulsar that referenced this pull request Oct 8, 2025