@FAUST-BENCHOU
Contributor

What type of PR is this?
/kind documentation

What this PR does / why we need it:

Which issue(s) this PR fixes:
Part of #603

@volcano-sh-bot added the kind/documentation label on Jan 25, 2026
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign git-malu for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gemini-code-assist

Summary of Changes

Hello @FAUST-BENCHOU, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request adds a design proposal for improving the observability of ModelServing resources. The core change is a new RoleStatuses field in ModelServing.Status that exposes the status of each individual role within ServingGroups. Users can then monitor the activation state of all roles and spot roles stuck in the Creating or Deleting state, which gives a clearer picture of a ModelServing's overall readiness and health.

Highlights

  • Enhanced ModelServing Status: A proposal to add a new RoleStatuses field to ModelServing.Status to provide detailed visibility into the status of individual roles across all ServingGroups.
  • New API Structure: Introduction of a RoleStatusInfo struct, which will contain ServingGroupName, RoleName, RoleID, and Status to uniquely identify and report on each role instance.
  • Implementation Strategy: Outlines how role statuses will be collected from ServingGroups and updated in ModelServing.Status only when changes are detected, minimizing API calls.
  • Test Plan: Specifies unit tests (TestUpdateModelServingStatusRoleStatuses and TestUpdateModelServingStatusRoleStatusesChangeDetection) to validate the new status collection and update logic under various scenarios.
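The structure and update strategy summarized above could look roughly like the following Go sketch. The four field names come from the summary; the JSON tags, the plain string type of Status, the `role` input type, the example role names, and the helpers `collectRoleStatuses` and `roleStatusesChanged` are assumptions made for illustration and are not taken from the PR.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// RoleStatusInfo mirrors the fields listed in the summary.
// The JSON tags and the string-typed Status are assumptions.
type RoleStatusInfo struct {
	ServingGroupName string `json:"servingGroupName"`
	RoleName         string `json:"roleName"`
	RoleID           string `json:"roleID"`
	Status           string `json:"status"`
}

// role is a hypothetical stand-in for whatever the controller tracks per role.
type role struct{ Name, ID, Status string }

// collectRoleStatuses (hypothetical) flattens group -> roles into one slice.
// Sorting makes the result deterministic, so two collections of the same
// state compare equal regardless of map iteration order.
func collectRoleStatuses(groups map[string][]role) []RoleStatusInfo {
	out := make([]RoleStatusInfo, 0)
	for group, roles := range groups {
		for _, r := range roles {
			out = append(out, RoleStatusInfo{
				ServingGroupName: group,
				RoleName:         r.Name,
				RoleID:           r.ID,
				Status:           r.Status,
			})
		}
	}
	sort.Slice(out, func(i, j int) bool {
		a, b := out[i], out[j]
		if a.ServingGroupName != b.ServingGroupName {
			return a.ServingGroupName < b.ServingGroupName
		}
		if a.RoleName != b.RoleName {
			return a.RoleName < b.RoleName
		}
		return a.RoleID < b.RoleID
	})
	return out
}

// roleStatusesChanged (hypothetical) reports whether a status write is needed.
// A nil slice and an empty slice are treated as the same state.
func roleStatusesChanged(prev, next []RoleStatusInfo) bool {
	if len(prev) == 0 && len(next) == 0 {
		return false
	}
	return !reflect.DeepEqual(prev, next)
}

func main() {
	groups := map[string][]role{
		"sg-0": {
			{Name: "prefill", ID: "prefill-0", Status: "Running"},
			{Name: "decode", ID: "decode-0", Status: "Creating"},
		},
	}
	current := collectRoleStatuses(groups)
	// First observation: the stored status is empty, so an update is needed.
	fmt.Println(roleStatusesChanged(nil, current))
	// Same state collected again: the update can be skipped.
	fmt.Println(roleStatusesChanged(current, collectRoleStatuses(groups)))
}
```

Sorting before comparing is one way to keep the change check cheap and stable; whether the proposal orders entries this way is not stated in the summary.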


@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a design proposal to add role-level status to the ModelServing.Status object. The proposal is well-structured and clearly outlines the motivation and design. However, it is missing some critical sections. I've added comments to suggest populating the 'Non-Goals', 'Risks and Mitigations', and 'Alternatives' sections. In particular, the risk of the status object becoming too large with many roles and replicas needs to be carefully considered and mitigated.

Comment on lines +107 to +109
#### Risks and Mitigations


high

The Risks and Mitigations section is currently empty. It's crucial to identify potential risks and outline mitigation strategies. Please consider adding risks such as:

  • Increased size of the ModelServing status object: With a large number of ServingGroups and Roles, the RoleStatuses array could become very large, potentially exceeding etcd's object size limits and increasing load on the API server. A mitigation could be to set a limit on the number of statuses reported or to summarize them if the list grows too large.
  • Increased controller load: Collecting and comparing role statuses for every reconciliation loop could increase the controller's CPU and memory usage, especially for large-scale deployments.
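One way to read the suggested size mitigation is sketched below in Go: cap the number of reported entries and record how many were left out. The helper `capRoleStatuses`, the choice to keep non-Running roles first, and the trimmed-down struct are all assumptions for illustration; the review comment only proposes a limit or a summary in general terms.

```go
package main

import (
	"fmt"
	"sort"
)

// RoleStatusInfo is a reduced copy of the proposed struct, without JSON tags.
type RoleStatusInfo struct {
	ServingGroupName, RoleName, RoleID, Status string
}

// capRoleStatuses (hypothetical) returns at most limit entries and the number
// of entries that were dropped. Roles that are not Running are kept first,
// since those are the ones users need in order to find stuck roles.
func capRoleStatuses(all []RoleStatusInfo, limit int) (kept []RoleStatusInfo, omitted int) {
	if limit < 0 {
		limit = 0
	}
	// Copy so the caller's slice is not reordered.
	sorted := append([]RoleStatusInfo(nil), all...)
	// Stable sort: non-Running entries move ahead, relative order is preserved.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Status != "Running" && sorted[j].Status == "Running"
	})
	if len(sorted) <= limit {
		return sorted, 0
	}
	return sorted[:limit], len(sorted) - limit
}

func main() {
	all := []RoleStatusInfo{
		{"sg-0", "prefill", "prefill-0", "Running"},
		{"sg-0", "decode", "decode-0", "Creating"},
		{"sg-1", "prefill", "prefill-0", "Running"},
	}
	kept, omitted := capRoleStatuses(all, 2)
	fmt.Println(len(kept), omitted, kept[0].Status)
}
```

The omitted count could be surfaced in a separate summary field so that truncation is visible to users; that field would also be an addition beyond what the proposal describes.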

- Enable users to identify which roles are not yet activated (Creating/Deleting states)
- Maintain consistency with existing ServingGroup status display pattern

#### Non-Goals

medium

The Non-Goals section is currently empty. It would be beneficial to explicitly state what is out of scope for this proposal to clarify its boundaries. For example:

  • Exposing detailed pod-level information (e.g., pod IPs, node names).
  • Modifying the role lifecycle management, as this proposal focuses only on status reporting.
  • Providing metrics or events for role status changes (if that's handled separately).

- `TestUpdateModelServingStatusRoleStatuses`: Validates `status.roleStatuses` population across scenarios (Running, Creating, Deleting states; multiple ServingGroups; skipping Deleting groups; empty cases)
- `TestUpdateModelServingStatusRoleStatusesChangeDetection`: Validates roleStatuses update only when status changes

### Alternatives

medium

The Alternatives section is empty. Discussing alternative designs and explaining why the proposed solution was chosen strengthens the proposal. For example, you could consider and discuss the trade-offs of:

  • Exposing status via a separate CRD: A new ModelServingRoleStatus CRD could hold this information. This would avoid bloating the main ModelServing object but might be harder for users to discover and correlate.
  • Using Kubernetes events: Role status changes could be published as events. This is less declarative and might be harder to inspect for the current state.

// RoleStatuses tracks the status of roles across all ServingGroups.
// This allows users to view which roles have not been activated.
// +optional
RoleStatuses []RoleStatusInfo `json:"roleStatuses,omitempty"`
Member

Looks very inefficient to update the slice

Member

How about emitting k8s events?

Contributor Author

Added a recorder to emit role status events.
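For readers following the thread, here is a minimal Go sketch of what emitting events on role status transitions can look like. It is not the code added in the PR. The `eventRecorder` interface is a hypothetical, simplified stand-in loosely modelled on client-go's `record.EventRecorder` (whose `Eventf` additionally takes the object the event is about), and the reason string `RoleStatusChanged`, the helper names, and the example role ID are assumptions.

```go
package main

import "fmt"

// RoleStatusInfo is a reduced copy of the proposed struct, without JSON tags.
type RoleStatusInfo struct {
	ServingGroupName, RoleName, RoleID, Status string
}

// eventRecorder is a hypothetical, simplified recorder interface.
type eventRecorder interface {
	Eventf(eventType, reason, messageFmt string, args ...any)
}

// memoryRecorder collects events in memory so the sketch runs without a cluster.
type memoryRecorder struct{ events []string }

func (m *memoryRecorder) Eventf(eventType, reason, messageFmt string, args ...any) {
	m.events = append(m.events, eventType+" "+reason+" "+fmt.Sprintf(messageFmt, args...))
}

// roleKey identifies one role instance across ServingGroups.
func roleKey(r RoleStatusInfo) string {
	return r.ServingGroupName + "/" + r.RoleName + "/" + r.RoleID
}

// emitRoleStatusEvents (hypothetical) records one event for every role that
// is new or whose status differs from the previous observation.
func emitRoleStatusEvents(rec eventRecorder, prev, next []RoleStatusInfo) {
	seen := make(map[string]string, len(prev))
	for _, r := range prev {
		seen[roleKey(r)] = r.Status
	}
	for _, r := range next {
		if old, ok := seen[roleKey(r)]; !ok || old != r.Status {
			rec.Eventf("Normal", "RoleStatusChanged",
				"role %s in %s is now %s", r.RoleID, r.ServingGroupName, r.Status)
		}
	}
}

func main() {
	prev := []RoleStatusInfo{{"sg-0", "decode", "decode-0", "Creating"}}
	next := []RoleStatusInfo{{"sg-0", "decode", "decode-0", "Running"}}
	rec := &memoryRecorder{}
	emitRoleStatusEvents(rec, prev, next)
	for _, e := range rec.events {
		fmt.Println(e)
	}
}
```

Events describe transitions rather than current state, so they complement a status field instead of replacing it, which matches the trade-off noted under Alternatives.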

Signed-off-by: zhoujinyu <[email protected]>

Labels

kind/documentation, size/L

3 participants