fix(executions): loop until restart and replay with max duration reached limit#14354
Conversation
|
@MilosPaunovic It seems like the last PR action is buggy, since I’m not a new contributor 😂 |
|
@MTarek165 It's alright, as if you were new contributor, it would send you an automatic welcome message, for example, see #14352 (comment). |
|
Nice but it seems like for this PR the tests are disabled, no triggered CI as it is the case for first time contributors. |
|
It's unrelated, we seem to have some other CI problems, it will be sorted swiftly, thanks for pointing it out! 🍀 |
|
Hi @brian-mulier-p, could you give some time and give a look on this PR? |
|
Hi @loicmathieu , sorry about that but could you please review this PR? |
|
@MTarek165, @brian-mulier-p is already reviewing this PR. |
…d and keep track of it in state history
…licable max duration state
edaaf84 to
2875d31
Compare
|
@MTarek165 sorry it's been really busy weeks lately, I'm into it rn, I had to rebase everything but I'm not exactly sure I pulled all your latest changes, can you double check all your changes are there? Otherwise I recommand you to force push again from your local changes and I'll rebase it better. |
brian-mulier-p
left a comment
There was a problem hiding this comment.
LGTM except one of the change where I'm not confident about it.
Sorry for the review delay...
Also I tested locally and the issue example works perfectly, well done!
core/src/main/java/io/kestra/core/models/executions/TaskRun.java
Outdated
Show resolved
Hide resolved
core/src/main/java/io/kestra/core/services/ExecutionService.java
Outdated
Show resolved
Hide resolved
| } else { | ||
| Task task = flow.findTaskByTaskId(originalTaskRun.getTaskId()); | ||
| if (!task.isFlowable() || task instanceof WorkingDirectory) { | ||
| if (!task.isFlowable() || task instanceof WorkingDirectory || task instanceof LoopUntil) { |
There was a problem hiding this comment.
@brian-mulier-p Before merging this PR I have opinion on the change here beyond the context of the PR, related to Flowable transition to Running directly for restart/replay except for WorkingDirectory and LoopUntil tasks here as I see if there is no specific reason for that it is better to give all flowables a RESTARTED state as it is more natural and may have benefits like we have for LoopUntil.
There was a problem hiding this comment.
idk, is there an impact on the resolution to remove the LoopUntil exception here? I can see why WorkingDirectory is there because if a subtask is restarted then the WorkingDirectory is also restarted because it's a special wrapper task that forces any subtask to be performed under the same worker so it means if any subtask or the WorkingDirectory task itself is restarted, the whole is restarted so it makes more sense to see the WorkingDirectory task also as restarted
There was a problem hiding this comment.
Keeping LoopUntil as Restarted is essential step as it is a reference for the max duration when checking reaching maximum at LoopUntil between iterations.
I was wondering if we should make all flowables being Restarted, but since we don't know, let's keep aligned with the scope of the PR only.
You can merge it now 😀
|
@brian-mulier-p I agree also with your point about clearing history, so after change now all history is preserved between iterations. |

closes #14326
Description
1- The problem is the same for replay and restart and it happens at LoopUntil.reachedMaximums() when checking creationDate with maxDuration against current time, it always gets the first time the task is created adding maxDuration and compares against current time which is always passed since the last executed time that is why it is failing instantly after restart.
2- For now the state is changed to Running directly when restarting, keeping it difficult to track the first start time since Running states is cleaned from history between LoopUntil iterations.
Solution
1- This can be done by adding Restart state for LoopUntil task before changing to Running to record time of restart and keeping it at the history as like Created to be easy to use it at comparison at LoopUntil.reachedMaximums() for subsequent iterations for the new restarted loop until execution.
2- Preserve Created and Restarted cases only for flowable task run history and get the recent to use it in checks (which is Restarted in our case)
Testing
1- Added a LoopUntil restart and replay test which should assert that the final restarted or replayed execution is failed and we have restarted state for the LoopUntil task and ends with Failed state and the time elapsed between both is the same for max duration to make sure that the max duration period is respected after restart/replay of LoopUntil task
Screen after fix
1- This screen shows the state history for replayed LoopUntil task with Max duration 3M and Interval 1M
2- A small note: the old history of running and failed doesn't appear between created and restarted after an iteration is done and starting a new one, since we are cleaning flowable task run history each time a new iteration start keeping only created and restarted for avoiding large history as stated but max duration is respected now as shown.