@dittops dittops commented Jan 25, 2026

Summary

  • Add Scale Deployment action to set fixed replica count for deployments
  • Refactor deployment actions (create, delete, ratelimit) with proper service invocation
  • Add internal authentication support to budapp autoscale endpoints for pipeline calls
  • Fix empty endpoint dropdown in budadmin pipeline editor

Changes

BudPipeline Actions

  • Scale Deployment: New action to set minReplicas = maxReplicas = target_replicas
  • Deploy Model: Refactored with simulation support and smart mode
  • Delete Deployment: Refactored with proper cleanup
  • Rate Limit: Refactored with multiple algorithm support (token_bucket, fixed_window, sliding_window)
  • Removed deprecated autoscale action (replaced by scale)
  • Removed cluster create/delete actions (not yet implemented)
  • Added CREATING_ACTIONS.md documentation guide
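
To make the token_bucket option listed under Rate Limit concrete, here is a minimal sketch of that algorithm in Python. The class name `TokenBucket` and its parameters (`rate_per_second`, `capacity`, `clock`) are hypothetical and are not taken from the budpipeline code; the PR description does not show how the limit is actually enforced.

```python
import time


class TokenBucket:
    """Hypothetical token-bucket limiter; not the budpipeline implementation."""

    def __init__(self, rate_per_second: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_second      # tokens added per second
        self.capacity = capacity         # maximum burst size
        self.tokens = float(capacity)    # the bucket starts full
        self.clock = clock               # injectable clock, useful for tests
        self.last = clock()

    def allow(self) -> bool:
        """Refill based on elapsed time, then try to spend one token."""
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bucket with `capacity=2` and `rate_per_second=1.0` admits a burst of two requests, then one further request per second. Fixed-window and sliding-window limiters count requests per time window instead of refilling tokens continuously.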

BudApp Internal Authentication

  • Added internal auth support to autoscale endpoints (/endpoints/{id}/autoscale)
  • Uses require_permissions_or_internal decorator pattern
  • Uses get_current_active_user_or_internal dependency
  • Enables budpipeline to call autoscale endpoints via Dapr service invocation

BudAdmin UI

  • Fixed empty endpoint dropdown in pipeline editor
  • Added endpoints to dataSources for ActionConfigPanel
  • Fetches endpoints when entering edit mode

ActionContext Enhancement

  • Added invoke_service() method for Dapr service invocation
  • Supports dapr-api-token header for internal authentication
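
As a rough illustration of what a Dapr service invocation call looks like on the wire, the sketch below builds (but does not send) an HTTP request against the Dapr sidecar. The function name `build_dapr_invoke_request`, the app ID `"budapp"`, and the default sidecar port 3500 are assumptions made for the example; the real `invoke_service()` signature is not shown in this conversation.

```python
import json
import urllib.request


def build_dapr_invoke_request(app_id, method_path, data=None, http_method="POST",
                              dapr_http_port=3500, api_token=None):
    """Build a request for Dapr's HTTP service invocation API (hypothetical helper)."""
    # Dapr routes /v1.0/invoke/<app-id>/method/<path> to the target application.
    url = (
        f"http://localhost:{dapr_http_port}"
        f"/v1.0/invoke/{app_id}/method/{method_path.lstrip('/')}"
    )
    headers = {"Content-Type": "application/json"}
    if api_token:
        # Token header named in the PR description for internal authentication.
        headers["dapr-api-token"] = api_token
    body = json.dumps(data).encode("utf-8") if data is not None else None
    return urllib.request.Request(url, data=body, headers=headers, method=http_method)


# Example: an autoscale call for a placeholder endpoint ID.
request = build_dapr_invoke_request(
    "budapp", "endpoints/abc/autoscale", data={"enabled": True}, api_token="secret"
)
```

Building the request separately from sending it keeps the URL and header logic easy to test without a running sidecar.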

Test plan

  • Verified scale action works via pipeline execution
  • Verified endpoint dropdown populates in pipeline editor (localhost:3000)
  • Pre-commit hooks pass (ruff, mypy, bandit, eslint, typecheck)

🤖 Generated with Claude Code

@gemini-code-assist

Summary of Changes

Hello @dittops, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the BudPipeline service by introducing a new action for precise deployment scaling and extensively refactoring existing deployment-related actions. It integrates internal authentication for secure service-to-service communication, particularly for autoscale endpoints, and improves the model deployment process with a "Smart Mode" that leverages simulation for optimal configuration. Additionally, the changes address a critical UI bug in the BudAdmin pipeline editor and streamline the action set by removing deprecated functionalities, all while providing detailed documentation for future action development.

Highlights

  • New Deployment Scaling Action: Introduced a deployment_scale action to set a fixed replica count for deployments by configuring minReplicas and maxReplicas in the BudAIScaler specification.
  • Refactored Deployment Actions: The deployment_create, deployment_delete, and deployment_ratelimit actions have been extensively refactored to use proper service invocation via budapp and handle asynchronous event-driven workflows for improved reliability and responsiveness.
  • Internal Authentication Support: Budapp's autoscale endpoints now support internal authentication, allowing budpipeline to securely invoke these services using Dapr service invocation and require_permissions_or_internal decorator.
  • "Smart Mode" for Model Deployment: The deployment_create action now includes a "Smart Mode" that automatically runs budsim simulations to determine optimal parser configurations (tool calling, reasoning, chat template) for local model deployments with SLO targets, before proceeding with the actual deployment.
  • Enhanced Rate Limiting: The deployment_ratelimit action now supports multiple algorithms (token bucket, fixed window, sliding window) and allows configuration of requests per second, minute, or hour.
  • Improved BudAdmin UI for Pipelines: Fixed an issue where the endpoint dropdown in the budadmin pipeline editor was empty, and added support for displaying credentials and endpoints as data sources.
  • Removed Deprecated Actions: The deployment_autoscale action and placeholder cluster_create/cluster_delete actions have been removed from budpipeline, streamlining the available action set.
  • New Documentation for Action Creation: A comprehensive CREATING_ACTIONS.md guide has been added to document best practices for developing new BudPipeline actions, covering structure, sync/event-driven actions, registration, external service calls, common pitfalls, and testing.



@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces significant enhancements to the budpipeline service, adding a new scale action and refactoring existing deployment actions for better service invocation and asynchronous handling. The introduction of internal authentication for budapp endpoints is a great security improvement, and the fix for the budadmin UI dropdown enhances usability.

My review has identified a critical security vulnerability due to hardcoded credentials in a new documentation file, which must be addressed immediately. I've also pointed out a high-risk issue with how database migrations are being run at application startup, and I've provided a recommendation for a safer approach using an init container. Additionally, there's a minor suggestion to improve code consistency in the budapp service.

Overall, the refactoring of the pipeline actions is well-executed, with robust event handling and validation. The new documentation for creating actions is also a valuable addition to the repository.

Comment on lines 22 to 23
Admin Email: admin@bud.studio
Admin Password: Budadmin@stud!0

security-critical critical

Hardcoding credentials, especially passwords, in source code is a major security risk. The admin password Budadmin@stud!0 is exposed in this file. These credentials should be removed from the repository and managed securely using a secret management tool (e.g., Kubernetes Secrets, HashiCorp Vault, AWS Secrets Manager). The application should then be configured to read these secrets from the environment at runtime.

Suggested change
- Admin Email: admin@bud.studio
- Admin Password: Budadmin@stud!0
+ Admin Email: admin@bud.studio
+ Admin Password: [REDACTED - Load from a secure secret store]

  image: {{ .Values.microservices.budpipeline.image }}
  imagePullPolicy: {{ .Values.imagePullPolicy }}
- command: ["sh", "-c", "uvicorn budpipeline.main:app --host 0.0.0.0 --port 8010"]
+ command: ["sh", "-c", "alembic upgrade head && uvicorn budpipeline.main:app --host 0.0.0.0 --port 8010"]

high

Running database migrations (alembic upgrade head) as part of the application's startup command can be problematic in a distributed environment like Kubernetes. If multiple pods start simultaneously during a deployment or scaling event, they could all attempt to run migrations concurrently, leading to race conditions, database locks, and potentially a corrupted schema. A safer pattern is to run migrations as a separate, one-off task before the application starts.

A recommended approach is to use a Kubernetes initContainer to run the migration. This ensures the migration completes successfully once before any application containers start. Here is an example of how you could structure it:

spec:
  template:
    spec:
      initContainers:
      - name: budpipeline-migrations
        image: {{ .Values.microservices.budpipeline.image }}
        command: ["alembic", "upgrade", "head"]
        envFrom:
        - secretRef:
            name: {{ .Values.secrets.dbSecretName }}
        - configMapRef:
            name: {{ .Values.configmaps.budConfig }}
      containers:
      - name: budpipeline
        ...
        command: ["sh", "-c", "uvicorn budpipeline.main:app --host 0.0.0.0 --port 8010"]

)

# Build source_topic - include callback_topic for direct budpipeline notification
source_topics = [app_settings.source_topic, callback_topic] if callback_topic else app_settings.source_topic

medium

The source_topics variable is assigned either a list or a string depending on whether callback_topic is present. This can lead to inconsistent types being sent in the payload to the budcluster service, which might not be robust enough to handle both. For better type safety and maintainability, it's recommended to always use a list.

        source_topics = [app_settings.source_topic]
        if callback_topic:
            source_topics.append(callback_topic)

app_id=settings.budapp_app_id,
method_path=f"endpoints/{endpoint_id}/delete-workflow",
http_method="POST",
data={"callback_topic": CALLBACK_TOPIC},

medium

It's good practice to include the force parameter in the payload to the downstream service if it's intended to be used. Although the parameter is not yet implemented as per the description, forwarding it now will make future implementation easier without requiring another change in this action.

Suggested change
- data={"callback_topic": CALLBACK_TOPIC},
+ data={"callback_topic": CALLBACK_TOPIC, "force": force},

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8d7dbb13de

Comment on lines +99 to +105
budaiscaler_spec: dict[str, Any] = {
    "enabled": True,  # Must be enabled for scaling to work
    "minReplicas": target_replicas,
    "maxReplicas": target_replicas,
}


P2: Disallow zero replicas or map to valid autoscale spec

When target_replicas is 0 (explicitly allowed/mentioned as valid), the action sends minReplicas = maxReplicas = 0. Budapp’s BudAIScalerSpecification enforces maxReplicas >= 1, so /endpoints/{id}/autoscale will reject this with a 422 and the scale action will fail on any “scale to zero” request. Either disallow 0 in validation or translate it into a supported “disable autoscaling” shape so the API accepts it.
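
One way to satisfy the maxReplicas >= 1 constraint described above is to clamp the upper bound while still passing the requested minimum through. The helper below is a sketch under that assumption; the name `build_fixed_replica_spec` is hypothetical, and whether budapp accepts minReplicas = 0 is not confirmed anywhere in this conversation.

```python
from typing import Any


def build_fixed_replica_spec(target_replicas: int) -> dict[str, Any]:
    """Build a fixed-size autoscale spec (hypothetical helper, not the PR's code)."""
    if target_replicas < 0:
        raise ValueError("target_replicas must be >= 0")
    return {
        "enabled": True,
        "minReplicas": target_replicas,
        # Assumption: the schema rejects maxReplicas < 1, so clamp the upper bound.
        "maxReplicas": max(target_replicas, 1),
    }
```

For any target of 1 or more this produces minReplicas = maxReplicas, matching the action's stated behavior; only the zero case differs. The alternative the review mentions, rejecting 0 during validation, would replace the clamp with a second `ValueError`.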


Comment on lines 157 to 160
  @require_permissions(permissions=[PermissionEnum.ENDPOINT_MANAGE])
  async def delete_endpoint(
-     current_user: Annotated[User, Depends(get_current_active_user)],
+     current_user: Annotated[User, Depends(get_current_active_user_or_internal)],
      session: Annotated[Session, Depends(get_session)],


P1: Allow internal Dapr auth for delete-workflow

This route is still wrapped by require_permissions, whose dependency is get_current_active_user, so internal Dapr calls (like the new budpipeline delete action that uses service invocation without a JWT) will be rejected before your get_current_active_user_or_internal parameter ever runs. In practice, pipeline-driven deletes will 401/403 unless you switch the decorator to require_permissions_or_internal (or otherwise allow internal auth for this endpoint).
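
The ordering problem can be shown with a framework-free sketch. The two guards below, `strict_guard` and `internal_aware_guard`, are hypothetical stand-ins for require_permissions and require_permissions_or_internal; they model a request as a plain dict and skip the real permission lookup, so they only illustrate which check runs first.

```python
from functools import wraps


class Forbidden(Exception):
    """Stand-in for an HTTP 401/403 response."""


def strict_guard(handler):
    """Rejects any request without an authenticated user."""
    @wraps(handler)
    def wrapper(request: dict):
        if request.get("user") is None:
            raise Forbidden("no authenticated user")
        return handler(request)
    return wrapper


def internal_aware_guard(handler):
    """Accepts internal service calls as well as authenticated users."""
    @wraps(handler)
    def wrapper(request: dict):
        if request.get("internal") or request.get("user") is not None:
            return handler(request)
        raise Forbidden("neither internal nor authenticated")
    return wrapper


def delete_endpoint(request: dict) -> str:
    # The handler body is only reached if the guard lets the request through.
    return "deleted"


strict = strict_guard(delete_endpoint)
relaxed = internal_aware_guard(delete_endpoint)
internal_call = {"internal": True, "user": None}  # e.g. a pipeline call without a JWT
```

Calling `strict(internal_call)` raises `Forbidden` before `delete_endpoint` runs, regardless of what the handler's own parameters would accept, while `relaxed(internal_call)` reaches the handler.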


@dittops dittops force-pushed the feat/pipeline-deployment-actions branch 2 times, most recently from ea7b6b9 to d30cca7 Compare January 25, 2026 12:15
## Changes

### BudPipeline Actions
- Add Scale Deployment action to set fixed replica count
- Refactor Deploy Model action with simulation support and smart mode
- Refactor Delete Deployment action with proper cleanup
- Refactor Rate Limit action with multiple algorithm support
- Remove deprecated autoscale action (replaced by scale)
- Remove cluster create/delete actions (not yet implemented)
- Add CREATING_ACTIONS.md documentation guide

### BudApp Internal Authentication
- Add internal auth support to autoscale endpoints for pipeline calls
- Add internal auth support to delete-workflow endpoint
- Use require_permissions_or_internal decorator pattern
- Use get_current_active_user_or_internal dependency

### BudAdmin UI
- Fix empty endpoint dropdown in pipeline editor
- Add endpoints to dataSources for ActionConfigPanel
- Fetch endpoints when entering edit mode

### ActionContext Enhancement
- Add invoke_service method for Dapr service invocation
- Support dapr-api-token header for internal auth

### Helm Chart
- Add alembic migration to budpipeline startup

### Fixes from Review
- Fix source_topics type consistency (always use list)
- Add force param to delete action
- Update cluster tests for removed actions
- Fix scale to 0: ensure maxReplicas >= 1 per schema constraint

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dittops dittops force-pushed the feat/pipeline-deployment-actions branch from d30cca7 to 9940b41 Compare January 25, 2026 12:18
dittops and others added 5 commits January 26, 2026 09:27
…ormat

The health.py action expects response in format {"data": {...}} but the
tests were mocking {"health": {...}}. This caused the test to fail because
the code would not find any health data and mark all checks as unavailable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update deployment action tests to use endpoint_id instead of deployment_id
- Update test mocks to properly simulate successful execution
- Update action count expectation from 20 to 17 (actions were consolidated)
- Add credential_ref to expected ParamType values

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@dittops dittops merged commit cf4208a into master Jan 26, 2026
9 of 10 checks passed