Skip to content

GREP-291: OnDelete update strategy for PodCliqueSet#403

Open
renormalize wants to merge 7 commits intoai-dynamo:mainfrom
renormalize:291-ondelete-update-standalone-podclique
Open

GREP-291: OnDelete update strategy for PodCliqueSet#403
renormalize wants to merge 7 commits intoai-dynamo:mainfrom
renormalize:291-ondelete-update-standalone-podclique

Conversation

@renormalize
Copy link
Collaborator

@renormalize renormalize commented Feb 6, 2026

What type of PR is this?

What this PR does / why we need it:

/kind feature
/kind api

Introduces GREP-291.

#291 mentions certain realities of performing day 2 operations on a PodCliqueSet. The author mentions changing nodeAffinity in the pod template of the PodCliqueSet to ensure that pods are not scheduled onto failed nodes.

However, changing nodeAffinity will cause all pods of the PodClique to roll, which is undesirable since there might only be a small subset of pods that were on failed nodes. Therefore, to avoid the rolling of all pods in the PodClique, taking inspiration from Kubernetes StatefulSet, I would like to introduce an OnDelete update strategy for PodCliqueSet, where users can then update the nodeAffinity, and avoid rolling of all the pods in a PodClique.

Which issue(s) this PR fixes:

Fixes #291

Special notes for your reviewer:

N/A

Does this PR introduce a API change?

Introduce the new `spec.updateStrategy` field to specify `OnDelete` update strategy for `PodCliqueSet`

Additional documentation e.g., enhancement proposals, usage docs, etc.:

Introduce GREP-291 for the `OnDelete` update strategy for `PodCliqueSet`

@copy-pr-bot
Copy link

copy-pr-bot bot commented Feb 6, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@renormalize
Copy link
Collaborator Author

In a discussion with the maintainers, they had expressed the following opinions:

  • Change the scope of the GREP to define OnDelete for both PodCliques and ScalingGroups. Implementation can be split into different stages.
  • Rename RollingUpdate to RollingRecreate, and define what RollingRecreate is. The update strategy currently implemented in grove is not strictly RollingUpdate as referred to by the community, and therefore it would be incorrect to call it that.
  • Keep update strategy only at the PodCliqueSet level. In the future, update strategy can be defined at each hierarchy to specify different behavior for different PodCliques/ScalingGroups. This would align with the convention used for topology configuration in PCS.
  • ScalingGroups and (standalone) PodCliques both will follow the same strategy. This can be configured hierarchically at all levels later on if needed, as mentioned above.

@unmarshall @sanjaychatterjee @Ronkahn21 @gflarity had suggested the above.

Please refrain from reviews before I adapt the GREP with the above suggestions, thanks.

@renormalize renormalize marked this pull request as draft February 9, 2026 17:52
Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
@renormalize renormalize force-pushed the 291-ondelete-update-standalone-podclique branch from 9bbebcf to 1aed557 Compare February 10, 2026 13:20
@renormalize renormalize changed the title GREP-291: OnDelete update strategy for standalone PodCliques. GREP-291: OnDelete update strategy for PodCliqueSet Feb 10, 2026
@renormalize renormalize marked this pull request as ready for review February 10, 2026 13:22

Introduce a new `spec.updateStrategy` field in the `PodCliqueSet` specification. This field will accept an `UpdateStrategy` object that defines how replicas should be updated when templates change.

Two update strategy types will be supported:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine for now since the default RollingRecreate strategy is the currently implemented strategy. At some point, we also need to introduce a rolling recreate strategy that rolls entire PCS replicas. How would you name that, PCSRecreate? Just trying to ensure we name these strategies correctly. cc @nvrohanv

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
Copy link
Collaborator

@sanjaychatterjee sanjaychatterjee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good to me. Not convinced about the need for the extra condition.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
Copy link
Collaborator

@sanjaychatterjee sanjaychatterjee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks!

* Remove (unnecessary) API changes proposed to `PodClique` and `PodCliqueScalingGroup`

* Rename `*RollingUpdateProgress` status fields to `*UpdateProgress` status fields to
  more generically track update information.

Signed-off-by: Saketh Kalaga <saketh.kalaga@sap.com>
@renormalize
Copy link
Collaborator Author

After a proof of concept implementation, I had second thoughts about certain fields. I've reworked the GREP based on these opinions.

  • I've removed the .spec changes to PodClique and PodCliqueScalingGroup. I feel we should not include the update strategy as a part of the spec for these resources now, and only include them when absolutely necessary. The update strategy can be fetched from the PodCliqueSet resource. I think this aligns with the original comments from the codeowners as well.
  • I've renamed the rollingUpdateProgress status fields to updateProgress, to more generically track information about updates no matter what the update strategy is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enhance the rolling update approach for PodClique

2 participants