-
Notifications
You must be signed in to change notification settings - Fork 1.8k
feat: resolve steps referencing StepActions concurrently #8925
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: resolve steps referencing StepActions concurrently #8925
Conversation
|
The following is the coverage report on the affected files.
|
/kind feature |
fbe1cd7
to
e48f5d1
Compare
The following is the coverage report on the affected files.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, it looks good!
A few minor comments, nothing blocking.
/approve
pkg/apis/config/testdata/config-defaults-step-action-parallelism-limit-err.yaml
Outdated
Show resolved
Hide resolved
pkg/apis/config/testdata/config-defaults-step-action-parallelism-limit.yaml
Outdated
Show resolved
Hide resolved
} | ||
|
||
// HasStepRefs provides a fast check to see if any steps in a TaskSpec contain a reference to a StepAction. | ||
func HasStepRefs(taskSpec *v1.TaskSpec) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this exported only so that it may have dedicated unit tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, initially the function was private but not wanting to modify the test file too much, I made it public, thinking that it could be a function that could be useful in other contexts.
However, I introduced this commit to make it private if that makes more sense to you.
Let me know what your preference is and I'd be happy to squash or remove that commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Maximilien-R for the extra commit. Either way is probably ok.
We usually don't export functions unless we need to, but we also have a policy (which we don't always honour), to only tests for exported functions, meaning that other functions can only be tested indirectly through their calling function.
In this case, I'm not sure which of the two policies would win, it seems reasonable to have unit tests specifically for that function. @vdemeester any preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@afrittoli @vdemeester should we document this under the contribution guide if not documented already ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default-step-action-parallelism-limit
doesn't specify exactly what is being parallelized, and don't think these would be running in parallel as your PR adds changes to throttle the concurrency of StepAction resolution go routines (g.Go() usage) which doesn't guarantee parallel CPU execution. I believe using the term concurrency
here is much better.
What do you think about updating the config key to another identifier which reflects this better default-step-ref-concurrency-limit
(considering this is to resolve the ref
erences), or default-step-action-concurrency-limit
(replacing the existing parallemism
from the key mentioned in this PR to concurrency
) or maybe something else ...
/retest |
c3c4e4a
to
c2ae737
Compare
The following is the coverage report on the affected files.
|
The following is the coverage report on the affected files.
|
@afrittoli, I have applied the various proposed corrections as well as converting the public @waveywaves As suggested, I replaced the various occurrences of the notion of "parallelism" with "concurrency" and "step action" with "step ref", indeed, this seems more coherent to me. I also took the opportunity to expand the documentation around the configuration key to make things more explicit and detailed. Note: I've pushed unit commits to make it easier to review and identify the changes made. When you're satisfied with the results, I'd be glad to squash everything into one commit and update the body and message of my initial commit. |
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on #8925 (comment), I see a -2.9% delta on one of the files wrt unit test coverage. Can we try having a smaller delta <0.3-0.5%?
Apart from that it looks good, I can lgtm after we have this one update 👼
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: afrittoli, waveywaves The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
c2ae737
to
72fc43f
Compare
The following is the coverage report on the affected files.
|
Hi @waveywaves, I added two more commits:
I also took the opportunity to rebase my branch on If the two added commits are fine with you I can squash them too 👍 |
@Maximilien-R thank you for your work, I'll review it soon, if you can squash it that would be great |
Avoids unnecessary DeepCopy operations on steps that do not reference a StepAction. Introduces concurrent resolution of steps that reference StepActions to improve the performance of TaskRun reconciliation, especially when using remote resolvers like git. The key changes include: - `hasStepRefs` function: A new function that quickly checks if a `TaskSpec` contains any steps referencing `StepActions`. This allows for an early exit if no resolution is needed, avoiding unnecessary work. - `resolveStepRef` function: This new function encapsulates the logic for resolving a single `StepAction` reference. It handles fetching the remote resource, merging the `StepAction` with the step's specification, and returning the resolved step - Two-phase resolution: The `GetStepActionsData` function is now split into two distinct phases: - Concurrent Resolution: All `StepAction` references are resolved concurrently using an `errgroup`. - Sequential Merging: The resolved steps and their provenance are merged into the final step list and the `TaskRun` status sequentially. - `updateTaskRunProvenance` function: A dedicated function for updating the TaskRun's status with provenance information. The maximum number of StepActions that can be resolved concurrently is defined by the default config and its `default-step-ref-concurrency-limit` key.
72fc43f
to
9b1f2e7
Compare
The following is the coverage report on the affected files.
|
/ok-to-test |
/lgtm thank you for your work on this very useful feature! |
@waveywaves It seems the e2e-tests failed, is it possible to rerun it ? Thank you |
/retest |
This PR improves the performance of
TaskRun
reconciliation by resolvingStepActions
concurrently and refactors the resolution logic for better efficiency.The problem
Currently, when a
Task
contains multiple steps that referenceStepActions
, the resolution of these references is performed sequentially. This can lead to significant delays in starting aTaskRun
, particularly when using remote resolvers like git, as each resolution adds to the total time.Additionally, the existing code performs a deep copy of every step, regardless of whether it references a
StepAction
, leading to unnecessary memory allocations.The changes
This pull request introduces two main improvements to
StepAction
resolution:Concurrent resolution:
StepActions
are now resolved concurrently using anerrgroup
. This reduces the time required to processTaskRuns
that contain multiple steps with remoteStepAction
references, such as those from a git repository.Code refactoring: The resolution logic in
taskspec.go
has been refactored for clarity and maintainability. This includes:HasStepRefs
function for an early exit if noStepActions
need to be resolved.resolveStepRef
function to encapsulate the logic of resolving a singleStepAction
.updateTaskRunProvenance
function to handle status updates cleanly.DeepCopy
to only occur when astep.Ref
is present./kind feature