poc/kagenti-integration#2354

Draft
jezekra1 wants to merge 2 commits into main from poc/kagenti-integration

@jezekra1 jezekra1 commented Mar 9, 2026

Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

Summary

Refs #2304

Linked Issues

Documentation

  • No Docs Needed:

                         "skopeo",
                         "copy",
-                        *(["--src-username", "x-access-token", "--src-password", github_token] if github_token else []),
+                        *(["--src-username", "x-access-token", "--src-password", github_token] if github_token and image.startswith("ghcr.io/") else []),

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

The string ghcr.io/ may be at an arbitrary position in the sanitized URL.

Copilot Autofix

AI 3 days ago

In general, the fix is to avoid treating the entire image string as a URL-like blob and instead extract the registry/host portion using a proper parser or a dedicated image-reference parsing helper. Once we have the registry part (e.g., ghcr.io), we compare that exact value against the allowlisted host, rather than using startswith("ghcr.io/") on the whole string.

For this code, the most targeted change is to replace the image.startswith("ghcr.io/") checks with a helper that derives the registry from the image string in a Docker/OCI-compatible way. A simple and safe approach without changing behavior is:

  • Split on the first / to separate a potential registry from the remainder.
  • If the part before the / contains a . or : or equals localhost, treat it as a registry (matching Docker’s reference rules). Otherwise, fall back to the default Docker registry, which we can treat as not ghcr.io for our purposes.
  • Compare that derived registry to ghcr.io.

This keeps semantics consistent with how Docker interprets image references but removes the fragile prefix check. To keep the change minimal and localized, we’ll:

  1. Add a small internal function _is_ghcr_image(image: str) -> bool near the top of the file (after the globals), which implements registry extraction logic.
  2. Replace both uses of image.startswith("ghcr.io/") (in the *(...) expansion for Skopeo arguments and in the env={...} expression) with _is_ghcr_image(image).

No external dependencies are needed; we can implement the parsing with basic string operations.

Suggested changeset 1
apps/agentstack-cli/src/agentstack_cli/commands/platform.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/apps/agentstack-cli/src/agentstack_cli/commands/platform.py b/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
--- a/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
+++ b/apps/agentstack-cli/src/agentstack_cli/commands/platform.py
@@ -44,6 +44,30 @@
 configuration = Configuration()
 
 
+def _is_ghcr_image(image: str) -> bool:
+    """
+    Return True if the given image reference is hosted on ghcr.io.
+
+    This parses the image reference in a Docker/OCI-compatible way:
+    - If the part before the first '/' contains a '.' or ':' or is 'localhost',
+      it is treated as the registry host.
+    - Otherwise, the default Docker registry is assumed, which is not ghcr.io.
+    """
+    if not image:
+        return False
+    # Split "<registry>/<rest>" once
+    first, sep, rest = image.partition("/")
+    if not sep:
+        # No '/' at all; implicit default registry
+        return False
+    registry_candidate = first
+    # Docker treats components with '.' or ':' or equal to 'localhost' as registries
+    if "." in registry_candidate or ":" in registry_candidate or registry_candidate == "localhost":
+        return registry_candidate == "ghcr.io"
+    # Otherwise, this is a namespace on the default Docker registry
+    return False
+
+
 @functools.cache
 def detect_driver() -> typing.Literal["lima", "wsl"]:
     has_lima = (importlib.resources.files("agentstack_cli") / "data" / "bin" / "limactl").is_file() or shutil.which("limactl")
@@ -778,12 +802,12 @@
                     [
                         "skopeo",
                         "copy",
-                        *(["--src-username", "x-access-token", "--src-password", github_token] if github_token and image.startswith("ghcr.io/") else []),
+                        *(["--src-username", "x-access-token", "--src-password", github_token] if github_token and _is_ghcr_image(image) else []),
                         f"docker://{image}",
                         f"containers-storage:{image}",
                     ],
                     f"Pulling image {image}",
-                    env={"GITHUB_TOKEN": github_token} if github_token and image.startswith("ghcr.io/") else None,
+                    env={"GITHUB_TOKEN": github_token} if github_token and _is_ghcr_image(image) else None,
                 )
 
         # --- Kagenti platform installation ---
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
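
The registry rule described in the autofix can be probed with a few sample references. The sketch below is a condensed restatement of that rule rather than the patch itself; the name `is_ghcr_image` and all sample image names are made up for illustration.

```python
def is_ghcr_image(image: str) -> bool:
    """Condensed form of the registry rule (illustrative, not the patch itself)."""
    first, sep, _rest = image.partition("/")
    if not sep:
        # No '/' at all: implicit default registry, so not ghcr.io.
        return False
    # A leading component counts as a registry host only if it contains
    # '.' or ':' or is exactly 'localhost'.
    looks_like_registry = "." in first or ":" in first or first == "localhost"
    return looks_like_registry and first == "ghcr.io"


# Hypothetical image references, chosen only to probe the edge cases.
SAMPLES = [
    "ghcr.io/example-org/agent:1.0",
    "ghcr.io.example.com/agent:1.0",
    "docker.io/ghcr.io/agent",
    "example-org/agent",
    "ubuntu",
]

for ref in SAMPLES:
    print(f"{ref!r}: helper={is_ghcr_image(ref)} prefix={ref.startswith('ghcr.io/')}")
```

On these samples the helper and the original `startswith("ghcr.io/")` check return the same result, which is consistent with the autofix's statement that the change does not alter behavior.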
                     ],
                     f"Pulling image {image}",
-                    env={"GITHUB_TOKEN": github_token} if github_token else None,
+                    env={"GITHUB_TOKEN": github_token} if github_token and image.startswith("ghcr.io/") else None,

Check failure

Code scanning / CodeQL

Incomplete URL substring sanitization High

The string ghcr.io/ may be at an arbitrary position in the sanitized URL.

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant architectural shift by integrating Kagenti into the Agent Stack CLI. This integration simplifies agent management, enhances scalability, and provides a more streamlined local development experience. The changes involve removing custom Kubernetes management components and adopting Kagenti's deployment and discovery mechanisms.

Highlights

  • Kagenti Integration: This PR integrates Kagenti for agent scaling, deployment, and discovery, replacing the custom Kubernetes provider management.
  • Architecture Refactoring: The architecture is refactored to delegate agent lifecycle management to Kagenti, streamlining the local developer experience and enabling optional enterprise features.
  • Dependency Updates: Removes dependencies related to the previous Kubernetes provider management and updates configurations to align with Kagenti's architecture.
  • UI and API Changes: Updates the UI and API endpoints to reflect the new architecture, including changes to agent URLs and authentication methods.

Changelog
  • apps/agentstack-cli/src/agentstack_cli/__init__.py
    • Removed the 'build' command and related UI elements.
    • Updated UI URL to agentstack.localtest.me:8080.
  • apps/agentstack-cli/src/agentstack_cli/api.py
    • Modified the OpenAI client to remove the Authorization header from default headers.
  • apps/agentstack-cli/src/agentstack_cli/auth_manager.py
    • Added 'login_with_password' method for authentication using resource owner password grant.
  • apps/agentstack-cli/src/agentstack_cli/commands/agent.py
    • Removed GitHub repository related logic and Dockerfile options from the 'add' and 'update' commands.
    • Modified 'add' and 'update' commands to accept network URLs instead of Docker images or GitHub URLs.
    • Removed discovery timeout constants.
    • Removed environment variable management commands.
    • Modified agent listing to remove state sorting and missing environment variable display.
  • apps/agentstack-cli/src/agentstack_cli/commands/build.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-cli/src/agentstack_cli/commands/platform.py
    • Modified the 'start' command to include Kagenti dependencies and configuration.
    • Added logic to parse chart-scoped values from YAML files and command-line flags.
    • Modified image pulling and Helm installation processes to accommodate Kagenti.
  • apps/agentstack-cli/src/agentstack_cli/commands/self.py
    • Updated the UI URL to agentstack.localtest.me:8080.
  • apps/agentstack-cli/src/agentstack_cli/commands/server.py
    • Added a shortcut for local development login using resource owner password grant.
  • apps/agentstack-cli/src/agentstack_cli/configuration.py
    • Removed 'agent_registry' and added auto-recovery for local dev authentication.
    • Removed HttpUrl import.
  • apps/agentstack-cli/src/agentstack_cli/data/vm/common/etc/systemd/system/kubectl-port-forward@.service
    • Modified the kubectl port-forward service to support namespace specification.
  • apps/agentstack-cli/src/agentstack_cli/utils.py
    • Removed 'print_log' function and GitHub URL related logic.
  • apps/agentstack-cli/uv.lock
    • Updated the required Python version to 3.14.
  • apps/agentstack-sdk-py/src/agentstack_sdk/a2a/extensions/services/platform.py
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/__init__.py
    • Removed imports related to provider builds and discovery.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/client.py
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/common.py
    • Removed GithubVersionType, ResolvedGithubUrl, and ResolvedDockerImageID classes.
  • apps/agentstack-sdk-py/src/agentstack_sdk/platform/provider.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Added source_type and simplified the Provider model.
    • Removed environment variable management methods.
  • apps/agentstack-sdk-py/src/agentstack_sdk/server/server.py
    • Removed AgentExtension import and environment variable reloading logic.
    • Modified the serve method to handle self-registration without a client factory.
  • apps/agentstack-sdk-ts/src/experimental/server/core/config/schemas.ts
    • Updated the default platform URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-server/src/agentstack_server/api/auth/auth.py
    • Removed provider_builds and provider_variables permissions.
  • apps/agentstack-server/src/agentstack_server/api/dependencies.py
    • Removed ProviderBuildServiceDependency and ProviderDiscoveryServiceDependency.
  • apps/agentstack-server/src/agentstack_server/api/routes/provider_builds.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/api/routes/provider_discovery.py
    • Removed the entire file, as discovery functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/api/routes/providers.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Modified the create_provider and patch_provider methods to align with the new architecture.
  • apps/agentstack-server/src/agentstack_server/application.py
    • Removed imports and routes related to provider builds and discovery.
  • apps/agentstack-server/src/agentstack_server/bootstrap.py
    • Removed KubernetesProviderBuildManager and KubernetesProviderDeploymentManager injection.
  • apps/agentstack-server/src/agentstack_server/configuration.py
    • Removed AgentRegistryConfiguration and added KagentiConfiguration.
    • Updated the default issuer URLs for Keycloak.
  • apps/agentstack-server/src/agentstack_server/domain/constants.py
    • Removed DOCKER_MANIFEST_LABEL_NAME and DEFAULT_AUTO_STOP_TIMEOUT.
  • apps/agentstack-server/src/agentstack_server/domain/models/permissions.py
    • Removed provider_variables and provider_builds permissions.
  • apps/agentstack-server/src/agentstack_server/domain/models/provider.py
    • Removed auto_stop_timeout, version_info, registry, and related logic.
    • Added source_type and simplified the Provider model.
  • apps/agentstack-server/src/agentstack_server/domain/repositories/provider.py
    • Removed type and unmanaged_state parameters from the list method.
    • Removed update_unmanaged_state method and replaced it with update_state.
  • apps/agentstack-server/src/agentstack_server/exceptions.py
    • Removed BuildAlreadyFinishedError and MissingConfigurationError.
  • apps/agentstack-server/src/agentstack_server/infrastructure/kagenti/__init__.py
    • Added an empty init file.
  • apps/agentstack-server/src/agentstack_server/infrastructure/kagenti/client.py
    • Added a KagentiClient class for interacting with the Kagenti API.
  • apps/agentstack-server/src/agentstack_server/infrastructure/persistence/migrations/alembic/versions/c0095389475b_.py
    • Removed managed provider features and added Kagenti sync support.
  • apps/agentstack-server/src/agentstack_server/infrastructure/persistence/repositories/provider.py
    • Removed type and unmanaged_state parameters from the list method.
    • Removed update_unmanaged_state method and replaced it with update_state.
  • apps/agentstack-server/src/agentstack_server/service_layer/build_manager.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/deployment_manager.py
    • Removed the entire file, as deployment functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/a2a.py
    • Removed deployment manager and related logic.
    • Simplified the ensure_agent method.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/provider_build.py
    • Removed the entire file, as build functionality is now handled by Kagenti.
  • apps/agentstack-server/src/agentstack_server/service_layer/services/provider_discovery.py
    • Removed the entire file, as discovery functionality is now handled by Kagenti.
  • apps/agentstack-server/tasks.toml
    • Removed provider_build and provider_discovery queues.
  • apps/agentstack-ui/src/utils/constants.ts
    • Updated the default API URL to agentstack-api.localtest.me:8080.
  • apps/agentstack-ui/tasks.toml
    • Updated the schema generation script to use the new API URL.
  • apps/agentstack-ui/template.env
    • Updated the default API URL to agentstack-api.localtest.me:8080 and NEXTAUTH_URL to agentstack.localtest.me:8080.
Activity
  • Removed KubernetesProviderDeploymentManager and KubernetesProviderBuildManager.
  • Integrated Kagenti for agent scaling, deployment, and discovery.
  • Updated configurations to align with Kagenti's architecture.
  • Modified UI and API endpoints to reflect the new architecture.
  • Removed environment variable management commands.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant refactoring to integrate kagenti for agent lifecycle management, replacing the custom Kubernetes provider deployment and build system. The changes are extensive, touching many parts of the codebase from the CLI to the server-side services and database models. Key changes include removing the build command, simplifying the add and update agent commands, and introducing a new kagenti client for agent discovery. The provider model has been greatly simplified, and a new database migration reflects these changes. Overall, the changes are consistent with the goal of delegating agent management to kagenti. I've identified a few areas where maintainability could be improved by reducing hardcoded values in the platform setup scripts.

Note: Security Review did not run due to the size of the PR.

Comment on lines +771 to +782
await run_in_vm(
    vm_name,
    [
        "bash",
        "-c",
        textwrap.dedent("""\
            kubectl --kubeconfig=/kubeconfig apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.0/standard-install.yaml
            kubectl --kubeconfig=/kubeconfig apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml
            kubectl --kubeconfig=/kubeconfig wait --for=condition=Available deployment -n cert-manager cert-manager-webhook --timeout=120s
            """),
    ],
    "Installing kagenti prerequisites (Gateway API CRDs, cert-manager)",

medium

The URLs for installing Gateway API CRDs and cert-manager are hardcoded with specific versions (v1.4.0 and v1.17.2 respectively). While this ensures reproducibility, it makes updates require code changes. To improve maintainability, consider defining these versions as constants at the top of the file.
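
One way the suggestion above could look is sketched here. The constant and function names are hypothetical and do not come from the PR; only the version strings, URLs, and kubectl commands are taken from the reviewed snippet.

```python
# Hypothetical module-level constants; versions are the ones quoted in the review.
GATEWAY_API_VERSION = "v1.4.0"
CERT_MANAGER_VERSION = "v1.17.2"

GATEWAY_API_MANIFEST_URL = (
    "https://github.com/kubernetes-sigs/gateway-api/releases/download/"
    f"{GATEWAY_API_VERSION}/standard-install.yaml"
)
CERT_MANAGER_MANIFEST_URL = (
    "https://github.com/cert-manager/cert-manager/releases/download/"
    f"{CERT_MANAGER_VERSION}/cert-manager.yaml"
)


def prerequisites_script() -> str:
    """Build the shell script from the constants (hypothetical helper)."""
    lines = [
        f"kubectl --kubeconfig=/kubeconfig apply -f {GATEWAY_API_MANIFEST_URL}",
        f"kubectl --kubeconfig=/kubeconfig apply -f {CERT_MANAGER_MANIFEST_URL}",
        "kubectl --kubeconfig=/kubeconfig wait --for=condition=Available "
        "deployment -n cert-manager cert-manager-webhook --timeout=120s",
    ]
    return "\n".join(lines) + "\n"


print(prerequisites_script())
```

With this shape, a version bump touches one constant instead of a URL buried in a shell heredoc.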

Comment on lines +785 to +802
await run_in_vm(
    vm_name,
    [
        "bash",
        "-c",
        textwrap.dedent("""\
            ISTIO_VERSION=1.28.0
            ISTIO_REPO=https://istio-release.storage.googleapis.com/charts/
            helm repo add istio "$ISTIO_REPO" 2>/dev/null || true
            helm repo update istio
            kubectl --kubeconfig=/kubeconfig create namespace istio-system --dry-run=client -o yaml | kubectl --kubeconfig=/kubeconfig apply -f -
            helm upgrade --install istio-base istio/base --version=$ISTIO_VERSION --namespace=istio-system --kubeconfig=/kubeconfig --wait --force-conflicts
            helm upgrade --install istiod istio/istiod --version=$ISTIO_VERSION --namespace=istio-system --kubeconfig=/kubeconfig --wait --force-conflicts \
                --set pilot.resources.requests.cpu=50m \
                --set pilot.resources.requests.memory=256Mi
            """),
    ],
    "Installing Istio (Gateway API controller)",

medium

The Istio version (1.28.0) is hardcoded within this shell script block. This could lead to maintenance issues if kagenti's Istio dependency changes in the future. It would be more maintainable to define this version as a constant at the top of the file, making it easier to update.
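
A possible shape for this, assuming the script keeps reading `$ISTIO_VERSION` and `$ISTIO_REPO` as shell variables: the Python side owns the values and injects them at the top of the script. The function name is hypothetical, and only the first lines of the script are shown; the remaining helm and kubectl commands would follow unchanged.

```python
import shlex
import textwrap

# Hypothetical module-level constants; values are the ones in the reviewed snippet.
ISTIO_VERSION = "1.28.0"
ISTIO_REPO = "https://istio-release.storage.googleapis.com/charts/"


def istio_script_header(version: str = ISTIO_VERSION, repo: str = ISTIO_REPO) -> str:
    """Return the opening lines of the install script with the values injected."""
    # shlex.quote guards against shell metacharacters if a value ever comes
    # from configuration instead of a literal.
    return textwrap.dedent(f"""\
        ISTIO_VERSION={shlex.quote(version)}
        ISTIO_REPO={shlex.quote(repo)}
        helm repo add istio "$ISTIO_REPO" 2>/dev/null || true
        helm repo update istio
        """)


print(istio_script_header())
```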

agent_namespace = agent.get("namespace", namespace)

# Construct service URL from k8s naming convention (service port 8080)
url = f"http://{name}.{agent_namespace}.svc.cluster.local:8080"

medium

The agent service URL is constructed with a hardcoded port 8080. If kagenti allows agents to run on different ports, this could cause connection issues. If this is a fixed convention from kagenti, adding a comment to clarify this would be helpful. For more robustness, consider if the port can be discovered from the service definition rather than being hardcoded.
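
A minimal sketch of the port-discovery idea, under a loud assumption: that a list of Kubernetes Service port entries (the `spec.ports` list, each with a `port` field) is available for the agent. Whether the kagenti API response exposes this is not shown in the PR, so the `service_ports` parameter and both names below are hypothetical.

```python
from typing import Any, Optional

DEFAULT_AGENT_PORT = 8080  # the port hardcoded in the reviewed code


def agent_service_url(
    name: str,
    namespace: str,
    service_ports: Optional[list[dict[str, Any]]] = None,
) -> str:
    """Build the in-cluster URL, preferring a discovered port over the default."""
    port = DEFAULT_AGENT_PORT
    for entry in service_ports or []:
        candidate = entry.get("port")
        if isinstance(candidate, int):
            # Take the first declared port; fall back to the default otherwise.
            port = candidate
            break
    return f"http://{name}.{namespace}.svc.cluster.local:{port}"


print(agent_service_url("demo-agent", "team1"))
print(agent_service_url("demo-agent", "team1", [{"name": "http", "port": 9000}]))
```

If 8080 is a fixed kagenti convention, the named constant with a comment already addresses the review note and the discovery branch can be dropped.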

- Replace Docker/registry-based providers with network-only providers
- Add kagenti agent sync cron and provider health check refresh
- Expose otel-collector via HTTPRoute for local agent telemetry
- Upgrade Phoenix image to 12.31.2 for GraphQL API compatibility
- Fix server traces reaching otel-collector (port 4318→8335)
- Set default OTEL endpoint in Python SDK for local deployments
- Simplify provider model: ProviderState (online/offline) replaces
  ProviderType/ProviderStatus/ProviderUnmanagedStatus
- Remove Docker image labels, GitHub version resolving, provider builds
- Fix checkbox selection UX in agent remove command

Signed-off-by: Radek Ježek <radek.jezek@ibm.com>

jezekra1 force-pushed the poc/kagenti-integration branch from a95917a to 898c888 on March 12, 2026 07:26