[DSIP-95][API] Complete the functionality of using dependencies in the complement data#18003
[DSIP-95][API] Complete the functionality of using dependencies in the complement data#18003det101 wants to merge 7 commits intoapache:devfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Implements downstream workflow triggering for complement/backfill runs in the API layer, adding support for “trigger dependent workflows” behavior and accompanying unit tests.
Changes:
- Implemented
doBackfillDependentWorkflowto fetch downstream workflow definitions and trigger backfill runs for them. - Added visited-code tracking intended to prevent self/cyclic triggering and duplicate downstream triggers.
- Added
BackfillWorkflowExecutorDelegateTestwith basic scenarios for downstream triggering and filtering.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| dolphinscheduler-api/src/main/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java | Adds dependent workflow backfill triggering logic and wiring for lineage + workflow definition lookup. |
| dolphinscheduler-api/src/test/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegateTest.java | Adds unit tests for the new dependent backfill triggering logic (single-hop scenarios). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java
Show resolved
Hide resolved
.../org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegateTest.java
Show resolved
Hide resolved
...java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java
Show resolved
Hide resolved
SbloodyS
left a comment
There was a problem hiding this comment.
You didn't fill in the content according to the PR template, please fix it.
...java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java
Outdated
Show resolved
Hide resolved
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.../org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegateTest.java
Show resolved
Hide resolved
...java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java
Show resolved
Hide resolved
...java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java
Show resolved
Hide resolved
…endent workflow - Problem 1: Replace single-level self-dep check with ThreadLocal visited set to detect and skip indirect circular dependencies (A→B→A), preventing StackOverflowError when allLevelDependent=true - Problem 2: Set startNodes=null for downstream workflows; upstream task node codes are not valid in a different workflow definition - Add tests for OFFLINE skip, not-found skip, and startNodes null assertion Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tion ThreadLocal is unnecessary here since the call chain is synchronous and private. Passing visitedCodes as a parameter is simpler, clearer, and avoids ThreadLocal lifecycle management. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Align Java source and test formatting with project spotless rules to keep style checks consistent. Made-with: Cursor
Ensure dependent backfill triggering is evaluated per parallel date chunk, and batch-load downstream workflow definitions to avoid N+1 queries. Add regression coverage for parallel mode visited-code isolation. Made-with: Cursor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
dolphinscheduler-api/src/main/java/org/apache/dolphinscheduler/api/executor/workflow/BackfillWorkflowExecutorDelegate.java:119
- In parallel mode,
expectedParallelismNumbercan be 0 (the validator only rejects values < 0). If it is 0,splitDateTime(listDate, expectedParallelismNumber)will divide by zero and throwArithmeticException. Treat 0 the same as null (default tolistDate.size()), or explicitly guard against<= 0before callingsplitDateTime.
final BackfillWorkflowDTO.BackfillParamsDTO backfillParams = backfillWorkflowDTO.getBackfillParams();
Integer expectedParallelismNumber = backfillParams.getExpectedParallelismNumber();
List<ZonedDateTime> listDate = backfillParams.getBackfillDateList();
if (expectedParallelismNumber != null) {
expectedParallelismNumber = Math.min(listDate.size(), expectedParallelismNumber);
} else {
expectedParallelismNumber = listDate.size();
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| void doBackfillDependentWorkflowForTesting(final BackfillWorkflowDTO backfillWorkflowDTO, | ||
| final List<String> backfillTimeList, | ||
| final Set<Long> visitedCodes) { | ||
| doBackfillDependentWorkflow(backfillWorkflowDTO, backfillTimeList, visitedCodes); | ||
| } |
There was a problem hiding this comment.
doBackfillDependentWorkflowForTesting introduces a production method purely as a test seam. Consider making the underlying dependency-triggering method package-private/protected (or extract a collaborator) and annotate with something like @VisibleForTesting, instead of keeping a "ForTesting" method in the main class API.
| public void testDoParallelBackfillWorkflow_ShouldIsolateVisitedCodesAcrossChunks() { | ||
| long upstreamCode = 500L; | ||
| WorkflowDefinition upstreamWorkflow = | ||
| WorkflowDefinition.builder().code(upstreamCode).releaseState(ReleaseState.ONLINE).build(); | ||
| List<ZonedDateTime> dates = Arrays.asList( | ||
| ZonedDateTime.parse("2026-02-01T00:00:00Z"), | ||
| ZonedDateTime.parse("2026-02-02T00:00:00Z"), | ||
| ZonedDateTime.parse("2026-02-03T00:00:00Z")); | ||
| BackfillWorkflowDTO.BackfillParamsDTO params = BackfillWorkflowDTO.BackfillParamsDTO.builder() | ||
| .runMode(RunMode.RUN_MODE_PARALLEL) | ||
| .backfillDateList(dates) | ||
| .expectedParallelismNumber(2) | ||
| .backfillDependentMode(ComplementDependentMode.ALL_DEPENDENT) | ||
| .allLevelDependent(true) | ||
| .executionOrder(ExecutionOrder.ASC_ORDER) | ||
| .build(); | ||
| BackfillWorkflowDTO dto = BackfillWorkflowDTO.builder() | ||
| .workflowDefinition(upstreamWorkflow) | ||
| .backfillParams(params) | ||
| .build(); | ||
| Set<Long> baseVisitedCodes = new HashSet<>(Collections.singleton(upstreamCode)); | ||
| List<Set<Long>> visitedSnapshotPerChunk = new java.util.ArrayList<>(); | ||
|
|
||
| doAnswer(invocation -> { | ||
| Set<Long> chunkVisited = invocation.getArgument(2); | ||
| visitedSnapshotPerChunk.add(new HashSet<>(chunkVisited)); | ||
| chunkVisited.add(9000L + visitedSnapshotPerChunk.size()); | ||
| return null; | ||
| }).when(backfillWorkflowExecutorDelegate).doBackfillDependentWorkflowForTesting(any(), any(), any()); | ||
|
|
||
| List<Integer> result = backfillWorkflowExecutorDelegate.executeWithVisitedCodes(dto, baseVisitedCodes); | ||
|
|
There was a problem hiding this comment.
This test calls executeWithVisitedCodes, which will run the real doBackfillWorkflow and attempt to use registryClient / Clients.withService(IWorkflowControlClient) to contact a master. Since neither is mocked/stubbed in this test, it will fail with NPE or a ServiceException before exercising the visited-codes isolation assertions. Consider refactoring to unit-test the chunk visited-code cloning without invoking the master trigger, or add a test seam/mocking for the backfill trigger step.
| Method method = BackfillWorkflowExecutorDelegate.class.getDeclaredMethod( | ||
| "doBackfillDependentWorkflow", BackfillWorkflowDTO.class, List.class, Set.class); | ||
| method.setAccessible(true); | ||
|
|
There was a problem hiding this comment.
These tests rely on reflection to access the private doBackfillDependentWorkflow method, which is brittle (renames/signature changes will silently break tests) and forces setAccessible(true). Since the behavior is now part of the public backfill flow, consider exposing a package-private/protected method (or extracting a helper class) so tests can call it directly without reflection.
Made-with: Cursor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // 2) Convert upstream backfill time from string to ZonedDateTime as the base business dates for downstream | ||
| // backfill | ||
| final List<ZonedDateTime> upstreamBackfillDates = backfillTimeList.stream() | ||
| .map(DateUtils::stringToZoneDateTime) | ||
| .collect(Collectors.toList()); |
There was a problem hiding this comment.
doBackfillDependentWorkflow converts backfillTimeList (strings) to ZonedDateTime via DateUtils.stringToZoneDateTime, then later downstream execution converts those ZonedDateTimes back to strings via DateUtils.dateToString. Because stringToZoneDateTime interprets the input using the thread-local/default timezone but returns a ZonedDateTime in ZoneId.systemDefault(), this round-trip can shift the local date/time when the request timezone differs from the JVM default, causing downstream workflows to be backfilled for the wrong business dates. Avoid the string->ZonedDateTime->string round-trip by propagating the original ZonedDateTime chunk list into downstream params, or ensure parsing/formatting uses the same timezone consistently (e.g., the request/thread-local timezone).
| @Override | ||
| public List<Integer> execute(final BackfillWorkflowDTO backfillWorkflowDTO) { | ||
| return executeWithVisitedCodes(backfillWorkflowDTO, null); | ||
| } | ||
|
|
||
| List<Integer> executeWithVisitedCodes(final BackfillWorkflowDTO backfillWorkflowDTO, | ||
| final Set<Long> visitedCodes) { | ||
| // todo: directly call the master api to do backfill | ||
| if (backfillWorkflowDTO.getBackfillParams().getRunMode() == RunMode.RUN_MODE_SERIAL) { | ||
| return doSerialBackfillWorkflow(backfillWorkflowDTO); | ||
| return doSerialBackfillWorkflow(backfillWorkflowDTO, visitedCodes); | ||
| } else { | ||
| return doParallelBackfillWorkflow(backfillWorkflowDTO); | ||
| return doParallelBackfillWorkflow(backfillWorkflowDTO, visitedCodes); | ||
| } | ||
| } |
There was a problem hiding this comment.
The PR description currently states "This pull request is code cleanup without any test coverage", but this change introduces new dependency-triggering behavior and adds a new test class. Please update the PR description/verify section to reflect the actual behavior change and how it is tested (or why tests are sufficient).


Was this PR generated or assisted by AI?
close #17748
Purpose of the pull request
Brief change log
Verify this pull request
This pull request is code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(or)
Pull Request Notice
Pull Request Notice
If your pull request contains incompatible change, you should also add it to
docs/docs/en/guide/upgrade/incompatible.md