@ejaifeobuks
Contributor

This PR enhances deployment diagnostics by providing more specific and actionable error reporting during rollout failures. Key improvements include:

  • Improved Manifest Stability: Adds detailed error reporting and logging to help identify deployment issues more precisely.
  • Aggregated Rollout Errors: Collects and throws a comprehensive error message summarizing all rollout failures.
  • Container Diagnostics: Introduces getContainerErrors to extract container status for better pod-level troubleshooting.
  • Verbose Failure Logging: Captures detailed kubectl describe outputs for rollout status, pod, and service checks to aid debugging.

Additionally, this PR includes unit tests to verify:

  • Proper aggregation of detailed error messages.
  • Accurate logging of describe outputs during failure scenarios.

This PR will close issue #288.
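
As a rough illustration of the "Aggregated Rollout Errors" item, the sketch below collects every failure into one error message instead of stopping at the first one. It is not the code from this PR: the `RolloutFailure` shape and the names `summarizeRolloutFailures` and `throwIfRolloutFailed` are hypothetical and chosen only for this example.

```typescript
// Hypothetical record of one failed rollout check.
// The field names are assumptions for this sketch, not taken from the PR.
interface RolloutFailure {
  kind: string
  name: string
  namespace: string
  message: string
}

// Build one multi-line summary that lists every failed resource.
function summarizeRolloutFailures(failures: RolloutFailure[]): string {
  const lines = failures.map(
    (f) => `${f.kind}/${f.name} in namespace ${f.namespace}: ${f.message}`
  )
  return `Rollout failed for ${failures.length} resource(s):\n${lines.join('\n')}`
}

// Throw a single aggregated error, but only when at least one check failed.
function throwIfRolloutFailed(failures: RolloutFailure[]): void {
  if (failures.length > 0) {
    throw new Error(summarizeRolloutFailures(failures))
  }
}
```

A caller would typically push a `RolloutFailure` inside the `catch` of each per-resource check and call `throwIfRolloutFailed` once after the loop, so the final error names every resource that failed.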

@ejaifeobuks ejaifeobuks requested a review from a team as a code owner July 15, 2025 16:56
@benjaminbob21 benjaminbob21 requested a review from Copilot July 15, 2025 17:29

This comment was marked as outdated.

@ejaifeobuks ejaifeobuks self-assigned this Jul 15, 2025
Member

@Tatsinnit Tatsinnit left a comment

Thank you so much for this. All the changes look fairly comprehensive, so it would be great to get regular eyes on this for review, like @bosesuneha or @davidgamero. I noticed something you could simplify, but it's for inspiration only: https://github.com/Azure/k8s-deploy/pull/440/files#r2213055224

Thank you once again for this PR. ❤️

@ejaifeobuks ejaifeobuks requested a review from davidgamero July 18, 2025 17:53
Copilot AI left a comment

Pull Request Overview

This PR enhances deployment diagnostics by adding comprehensive error reporting, detailed logging, and container-level troubleshooting capabilities to the manifest stability checking functionality. The improvements focus on providing actionable feedback when Kubernetes deployments fail.

  • Aggregated rollout error collection with detailed error messages including resource type, name, and namespace
  • Enhanced container diagnostics with a new getContainerErrors function to extract specific container failure reasons
  • Verbose failure logging that captures and displays kubectl describe outputs for failed resources
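
To make the container diagnostics point concrete, here is a minimal sketch of pulling failure reasons out of a pod's `status.containerStatuses`. It assumes the standard Kubernetes pod status fields (`state.waiting.reason`, `state.terminated.exitCode`) and is not the `getContainerErrors` implementation from this PR, whose signature is not shown in this conversation. The name `extractContainerErrors` and the trimmed-down interfaces are hypothetical.

```typescript
// Trimmed-down view of the Kubernetes pod status fields this sketch reads.
interface ContainerState {
  waiting?: {reason?: string; message?: string}
  terminated?: {reason?: string; message?: string; exitCode?: number}
}
interface ContainerStatus {
  name: string
  state?: ContainerState
}
interface PodLike {
  status?: {containerStatuses?: ContainerStatus[]}
}

// Hypothetical helper: returns one human-readable line per container problem.
// Simplification: any waiting reason is reported, even transient ones such as
// ContainerCreating; a real check would likely filter those out.
function extractContainerErrors(pod: PodLike): string[] {
  const errors: string[] = []
  for (const c of pod.status?.containerStatuses ?? []) {
    const waiting = c.state?.waiting
    if (waiting?.reason) {
      const detail = waiting.message ? ` - ${waiting.message}` : ''
      errors.push(`${c.name}: waiting (${waiting.reason})${detail}`)
    }
    const terminated = c.state?.terminated
    if (terminated?.exitCode !== undefined && terminated.exitCode !== 0) {
      const reason = terminated.reason ? ` (${terminated.reason})` : ''
      errors.push(
        `${c.name}: terminated with exit code ${terminated.exitCode}${reason}`
      )
    }
  }
  return errors
}
```

The input would typically be the parsed JSON for a single pod, for example from `kubectl get pod <name> -o json`.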

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/utilities/manifestStabilityUtils.ts | Adds error aggregation, detailed logging, container diagnostics, and improved error handling for deployment failures |
| src/utilities/manifestStabilityUtils.test.ts | Comprehensive test suite covering error scenarios, resource-specific behaviors, and the new container error extraction functionality |

Collaborator

@davidgamero davidgamero left a comment

lgtm

@ejaifeobuks ejaifeobuks requested a review from Tatsinnit July 23, 2025 14:50
@ReinierCC ReinierCC merged commit bf3422c into Azure:main Jul 29, 2025
13 checks passed
@bosesuneha bosesuneha mentioned this pull request Aug 6, 2025
6 participants