sync_diff_inspector checkpoint integration test flakes on resumed chunk selection #12553
Open
Labels
- component/test: Unit tests and integration tests component.
- severity/minor
- type/bug: The issue is confirmed as a bug.
Description
Which jobs are flaking?
pingcap/tiflow/pull_syncdiff_integration_test
Which test(s) are flaking?
sync_diff_inspector/tests/sync_diff_inspector/checkpoint/run.sh
- The flaky assertion is in the bucket-checkpoint resume path, where the script picks the first resumed chunk by sorting on `lowerBounds` only.
Jenkins logs or GitHub Actions link
https://do.pingcap.net/jenkins/blue/organizations/jenkins/pingcap%2Ftiflow%2Fpull_syncdiff_integration_test/detail/pull_syncdiff_integration_test/815/pipeline
Observed failure from build 815:
- resumed candidate logged as `39 upperBounds= indexCode=0:1-250:0:1`
- old assertion expected a fixed resumed index pattern and failed on `first_chunk_index`
Anything else we need to know
- Does this test exist for other branches as well?
- Likely yes, if the same checkpoint test script is present.
- Has there been a high frequency of failure lately?
- The same symptom has been seen in other PRs.
- Related parent tracking issue:
- Subtask of Tracking issue for flaky tests #2246
- Root cause summary:
- The checkpoint test assumes that the first resumed chunk selected from the logs is stable after sorting by `lowerBounds` only. In practice, multiple resumed chunks can share the same lower bound, so the script may pick a different valid resumed chunk and fail nondeterministically.
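
The tie described in the root cause can be illustrated with a minimal Python sketch. It is not the code in `run.sh`: the `ResumedChunk` type, both helper functions, and the sample bounds and index codes are hypothetical and chosen only for illustration. It shows that selecting by lower bound alone depends on the order in which the log lines appear, whereas adding a secondary key (here the index code, compared as a string) gives the same answer regardless of log order.

```python
from typing import NamedTuple, Sequence, Tuple


class ResumedChunk(NamedTuple):
    """Hypothetical record for one resumed chunk parsed from the logs."""
    lower_bounds: Tuple[int, ...]
    index_code: str


def first_by_lower_bounds(chunks: Sequence[ResumedChunk]) -> ResumedChunk:
    # Sort on lower bounds only. Python's sort is stable, so chunks with
    # equal lower bounds keep their log order, which is not guaranteed.
    return sorted(chunks, key=lambda c: c.lower_bounds)[0]


def first_deterministic(chunks: Sequence[ResumedChunk]) -> ResumedChunk:
    # Break ties with a secondary key so the result no longer depends
    # on the order in which the log lines were emitted.
    return sorted(chunks, key=lambda c: (c.lower_bounds, c.index_code))[0]


# Illustrative values only: two chunks share the same lower bound.
a = ResumedChunk((39,), "0:1-250:0:1")
b = ResumedChunk((39,), "0:0-250:0:1")
c = ResumedChunk((75,), "0:2-250:0:1")

order_1 = [a, b, c]
order_2 = [b, a, c]

# Sorting on lower bounds alone picks a different chunk per log order.
print(first_by_lower_bounds(order_1) == first_by_lower_bounds(order_2))
# With the tie-breaker, both orders yield the same chunk.
print(first_deterministic(order_1) == first_deterministic(order_2))
```

An alternative that avoids choosing a tie-breaker is to have the assertion accept any chunk whose lower bound equals the minimum, since each of those is a valid resumed chunk.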