fix: pipeline execution failures and hide dev features#1216
Conversation
- fix(budcluster): handle explicit None scaleToZeroConfig value
The scaleToZeroConfig could be explicitly set to None, causing
None.get() to fail. Changed to use `or {}` pattern.
- fix(budpipeline): finalize execution status on step failure
When a step failed, the execution status was never marked as FAILED
because the failure wasn't committed and _check_and_finalize_execution()
was never called. Added commit and finalization after both failure paths.
- feat(budadmin): hide Tools and Pipelines behind dev mode flag
Added hide: !enableDevMode to Tools and Pipelines menu items so they
are only visible when NEXT_PUBLIC_ENABLE_DEV_MODE=true at build time.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary of ChangesHello @dittops, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses critical bug fixes in the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces fixes for pipeline execution failures, handles a potential NoneType error, and hides development features behind a feature flag. The changes in budadmin and budcluster look good and correctly address the described issues. However, the fix in budpipeline for handling pipeline failures is not complete and may still result in pipelines getting stuck. I've added a critical comment explaining the issue. I've also added a medium-severity comment about code duplication in the same file.
| # Commit the failure and finalize execution | ||
| await session.commit() | ||
| updated_steps = await step_crud.get_by_execution_id(execution_id) | ||
| await _check_and_finalize_execution(session, execution_id, updated_steps) | ||
| continue |
There was a problem hiding this comment.
This change attempts to finalize the pipeline on step failure, but it is unlikely to work as intended. The _check_and_finalize_execution function will only set the final status if there are no PENDING or RUNNING steps. When a step fails, other steps in the pipeline will likely remain PENDING, so _check_and_finalize_execution will return early, and the pipeline will remain stuck instead of being marked as FAILED. The continue also means other parallel steps will be executed even after a failure, which may be undesirable.
There was a problem hiding this comment.
Fixed in commit 22afccb. Implemented fail-fast behavior:
- When a step fails and no steps are running, all pending steps are marked as SKIPPED
- The execution is then finalized as FAILED
This ensures pipelines don't get stuck when a step fails.
| # Commit the failure and finalize execution | ||
| await session.commit() | ||
| updated_steps = await step_crud.get_by_execution_id(execution_id) | ||
| await _check_and_finalize_execution(session, execution_id, updated_steps) |
There was a problem hiding this comment.
This block of failure handling logic is duplicated from lines 756-758. To improve maintainability, consider extracting this logic into a private helper function that can be called from both failure paths (here and in the except block).
For example:
async def _handle_failure_and_finalize(session: AsyncSession, step_crud: StepExecutionCRUD, execution_id: UUID):
"""Commits session changes and finalizes the execution status after a step failure."""
await session.commit()
# A more robust failure handling logic should be implemented here
# to ensure the pipeline doesn't get stuck.
updated_steps = await step_crud.get_by_execution_id(execution_id)
await _check_and_finalize_execution(session, execution_id, updated_steps)This would also be a good place to implement a more robust failure handling strategy that addresses the issue of stuck pipelines mentioned in the other comment.
There was a problem hiding this comment.
Fixed in commit 22afccb. Extracted the duplicated logic into _commit_and_finalize_on_failure() helper function which is now used in both failure paths.
Address Gemini code review feedback:
1. Critical fix: Implement fail-fast behavior in _check_and_finalize_execution
- When a step fails and no steps are running, mark all pending steps
as SKIPPED and finalize the execution as FAILED
- Prevents pipelines from getting stuck when a step fails
2. Medium fix: Extract duplicated failure handling into helper function
- Created _commit_and_finalize_on_failure() to eliminate code duplication
- Used in both exception path and result.success=False path
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
NoneforscaleToZeroConfig- preventsNoneType.get()error when scaling deployments to 0 replicasstatus=FAILEDinstead of staying stuckChanges
budcluster/deployment/handler.py
Changed
budaiscaler.get("scaleToZeroConfig", {})tobudaiscaler.get("scaleToZeroConfig") or {}to handle the case wherescaleToZeroConfigis explicitly set toNone.budpipeline/handlers/event_router.py
Added
session.commit()and_check_and_finalize_execution()calls after both failure paths (exception andresult.success=False) to ensure the execution status is properly marked as FAILED.budadmin/pages/home/layout.tsx
Added
hide: !enableDevModeto Tools and Pipelines menu items so they are only visible in dev builds.Test plan
NEXT_PUBLIC_ENABLE_DEV_MODE- verify Tools/Pipelines hiddenNEXT_PUBLIC_ENABLE_DEV_MODE=true- verify Tools/Pipelines visibleDeployment notes
docker build --build-arg NEXT_PUBLIC_ENABLE_DEV_MODE=true ...docker build ...(default hides dev features)🤖 Generated with Claude Code