feat: add instance upgrade endpoint #186

Merged
nickpismenkov merged 7 commits into main from feat/instance-upgrade
Feb 27, 2026

@PierreLeGuen PierreLeGuen commented Feb 23, 2026

Summary

  • Adds POST /v1/agents/instances/{id}/upgrade endpoint that upgrades an instance to the latest image for its service type
  • The endpoint fetches current images from the owning compose-api's /version endpoint, maps service_type to the correct image key (worker/ironclaw), and restarts with that digest
  • Includes OpenAPI docs and follows existing restart endpoint patterns
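
The mapping step described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the function names `image_key_for` and `latest_digest` are hypothetical, and the flat key-to-digest map is an assumed shape for the already-parsed /version response rather than the real compose-api schema.

```rust
use std::collections::HashMap;

/// Maps an instance's service type to the image key used for the digest lookup.
/// Per the PR description, "ironclaw" keeps its own key and everything else
/// (including a missing service type) falls back to "worker".
fn image_key_for(service_type: Option<&str>) -> &'static str {
    match service_type {
        Some("ironclaw") => "ironclaw",
        _ => "worker",
    }
}

/// Picks the digest to restart with from an already-parsed /version response.
/// ASSUMPTION: the response is modeled as a flat image-key -> digest map.
fn latest_digest<'a>(
    images: &'a HashMap<String, String>,
    service_type: Option<&str>,
) -> Result<&'a str, String> {
    let key = image_key_for(service_type);
    images
        .get(key)
        .map(String::as_str)
        .ok_or_else(|| format!("no image published for key '{}'", key))
}
```

Returning an error when the key is absent, rather than restarting with an empty digest, keeps a malformed /version response from triggering a restart at all.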

Test plan

  • Deploy chat-api and verify POST /v1/agents/instances/{id}/upgrade returns 200
  • Verify the instance restarts with the latest image digest from compose-api
  • Verify ownership check — upgrading another user's instance returns 403

Users can upgrade their instance to the latest image via
POST /v1/agents/instances/{id}/upgrade. The endpoint fetches
the current images from the owning compose-api and restarts
the instance with the latest digest for its service type.
@gemini-code-assist

Summary of Changes

Hello @PierreLeGuen, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new API endpoint and underlying service logic to enable users to upgrade their agent instances. The functionality ensures that instances can be updated to the latest available image for their specific service type, enhancing maintainability and allowing for seamless deployment of new versions. It integrates with existing systems to fetch image information and performs necessary ownership checks before proceeding with the upgrade.

Highlights

  • New Upgrade Endpoint: A new POST /v1/agents/instances/{id}/upgrade endpoint has been added to allow users to upgrade their instances to the latest available image for their service type.
  • Image Fetching Logic: The upgrade process involves fetching current image versions from the owning compose-api's /version endpoint, mapping the instance's service_type to the correct image key (e.g., 'worker' or 'ironclaw'), and then initiating a restart with the newly acquired image digest.
  • API Integration and Documentation: The new endpoint includes comprehensive OpenAPI documentation and follows the established patterns of existing instance management endpoints like restart, ensuring consistency and ease of use.

Changelog

  • crates/api/src/openapi.rs
    • Registered the new upgrade_instance endpoint with the OpenAPI specification.
  • crates/api/src/routes/agents.rs
    • Defined a new POST endpoint /v1/agents/instances/{id}/upgrade.
    • Implemented the upgrade_instance handler, including instance ID parsing, ownership verification, and delegation to the agent service.
    • Added OpenAPI documentation annotations for the new upgrade endpoint.
  • crates/services/src/agent/ports.rs
    • Introduced upgrade_instance method to the AgentService trait, defining its signature and purpose.
  • crates/services/src/agent/service.rs
    • Implemented the upgrade_instance logic within AgentServiceImpl.
    • Added functionality to fetch the latest image digests from the compose-api /version endpoint.
    • Included logic to map instance service_type to the appropriate image key (e.g., 'ironclaw' to 'ironclaw', others to 'worker').
    • Integrated a call to the Agent API's restart endpoint, passing the newly fetched image digest.
    • Implemented comprehensive error handling and logging for the upgrade process.

Activity

  • The author has outlined a test plan to verify the new upgrade endpoint's functionality, including checking for a 200 OK response, successful instance restarts with the latest image, and correct 403 Forbidden responses for unauthorized upgrade attempts.
claude bot commented Feb 23, 2026

PR Review: feat: add instance upgrade endpoint

⚠️ Issues found — please address before merging.


1. Duplicate ownership check and redundant DB queries

The route handler (agents.rs) fetches the instance and checks ownership, then passes both IDs to agent_service.upgrade_instance(), which fetches the instance and checks ownership again. This results in two DB round-trips per request and duplicated logic.

crates/api/src/routes/agents.rs — fetches instance + checks user_id:

let instance = app_state.agent_repository.get_instance(instance_uuid).await...?
    .ok_or_else(|| ApiError::not_found("Instance not found"))?;

if instance.user_id != user.user_id {
    return Err(ApiError::forbidden(...));
}

crates/services/src/agent/service.rs — then does the same:

let instance = self.repository.get_instance(instance_id).await?.ok_or_else(...)?;
if instance.user_id != user_id { return Err(anyhow!("Access denied")); }

Fix: Remove the ownership check from the route handler (trust the service layer, consistent with how restart_instance works), or remove it from the service layer if all callers always pre-check. Pick one layer — the service layer is preferable since it enforces the invariant regardless of caller.
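
A minimal sketch of the single-layer approach, assuming simplified stand-in types: `Instance`, `Repo`, `UpgradeError`, `authorize_upgrade`, and `status_code` are hypothetical names for illustration, and an in-memory map replaces the database. The typed error is a suggestion of this sketch; the service code quoted above returns an untyped `anyhow!("Access denied")`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified instance record; the real one has more fields.
#[derive(Clone, Debug, PartialEq)]
struct Instance {
    id: u32,
    user_id: u32,
}

/// Typed errors let the route layer pick a status code without re-checking.
#[derive(Debug, PartialEq)]
enum UpgradeError {
    NotFound,
    Forbidden,
}

/// Stand-in for the repository: an in-memory map instead of a database.
struct Repo {
    instances: HashMap<u32, Instance>,
}

impl Repo {
    fn get_instance(&self, id: u32) -> Option<&Instance> {
        self.instances.get(&id)
    }
}

/// Service-layer entry point: the only place that loads the instance
/// and verifies ownership, so each request costs one lookup.
fn authorize_upgrade(
    repo: &Repo,
    instance_id: u32,
    user_id: u32,
) -> Result<&Instance, UpgradeError> {
    let instance = repo
        .get_instance(instance_id)
        .ok_or(UpgradeError::NotFound)?;
    if instance.user_id != user_id {
        return Err(UpgradeError::Forbidden);
    }
    Ok(instance)
}

/// Route-layer mapping from the service result to an HTTP status code.
fn status_code(result: &Result<&Instance, UpgradeError>) -> u16 {
    match result {
        Ok(_) => 200,
        Err(UpgradeError::NotFound) => 404,
        Err(UpgradeError::Forbidden) => 403,
    }
}
```

With this split the handler only parses the ID, calls the service, and translates the result, which still yields the 403 that the test plan expects for another user's instance.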


2. No timeouts on external HTTP calls

Both HTTP calls to compose-api have no timeout:

self.http_client.get(&version_url).bearer_auth(&manager.token).send().await
self.http_client.post(&restart_url).bearer_auth(&manager.token).json(...).send().await

If compose-api is slow or unresponsive, these requests can hang indefinitely. The awaiting task never completes, and under load the stalled requests can exhaust connection pool resources.

Fix: Add a timeout, e.g.:

self.http_client
    .get(&version_url)
    .bearer_auth(&manager.token)
    .timeout(std::time::Duration::from_secs(30))
    .send()
    .await

(Check how restart_instance handles this — apply the same pattern for consistency.)


3. Privacy: instance.name logged in production

Per CLAUDE.md logging rules, metadata that could reveal customer activity must not be logged at info level (which runs in production). instance.name is a user-visible identifier that could expose customer information:

tracing::info!(
    "Instance upgraded successfully: instance_id={}, name={}, image={}",
    instance_id,
    instance.name,  // ⚠️ may reveal customer data
    image_key
);

Fix: Remove name={} from the log, or move to debug! level:

tracing::info!(
    "Instance upgraded successfully: instance_id={}, image_key={}",
    instance_id,
    image_key
);

Minor

  • The unwrap_or("openclaw") default for service_type is inconsistent with the comment and PR description (which say the fallback maps to "worker"). Consider unwrap_or("worker") directly, or add a comment explaining why "openclaw" is the sentinel value here.

Reviewed by Claude Sonnet 4.6

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new endpoint to upgrade agent instances. The implementation is consistent with existing patterns for instance management. My review focuses on improving maintainability by reducing code duplication, enhancing logging for better observability, and addressing a performance issue caused by a redundant database query. I've also pointed out opportunities to centralize string representations, particularly for enum variants using the std::fmt::Display trait, and to replace other magic strings with constants.
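
One way to centralize those strings is sketched below. It is an illustration under assumptions, not code from the PR: the `ServiceType` enum, its `from_stored` helper, and the two constants are hypothetical, with the variant and key names taken from the strings mentioned in this conversation ("openclaw", "ironclaw", "worker").

```rust
use std::fmt;

/// Image keys expected in the /version response (names from the PR description).
const IMAGE_KEY_WORKER: &str = "worker";
const IMAGE_KEY_IRONCLAW: &str = "ironclaw";

/// Hypothetical enum covering the service types mentioned in this PR.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ServiceType {
    Openclaw,
    Ironclaw,
}

impl ServiceType {
    /// Parses the stored string; a missing or unknown value falls back to
    /// Openclaw, mirroring the `unwrap_or("openclaw")` default under review.
    fn from_stored(raw: Option<&str>) -> Self {
        match raw {
            Some("ironclaw") => ServiceType::Ironclaw,
            _ => ServiceType::Openclaw,
        }
    }

    /// Image key used to look up the digest for this service type.
    fn image_key(self) -> &'static str {
        match self {
            ServiceType::Ironclaw => IMAGE_KEY_IRONCLAW,
            ServiceType::Openclaw => IMAGE_KEY_WORKER,
        }
    }
}

/// Single source of truth for the string form used in logs and API payloads.
impl fmt::Display for ServiceType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            ServiceType::Openclaw => "openclaw",
            ServiceType::Ironclaw => "ironclaw",
        })
    }
}
```

Keeping the fallback and the key mapping in one type also makes it explicit that the "openclaw" default resolves to the "worker" image key.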

@henrypark133

Code review

Found 2 issues:

  1. instance.name is logged at warn level (production-enabled) in check_upgrade_available, violating CLAUDE.md which says to never log "Metadata that reveals customer information" and to log "IDs only" at production log levels.

if instance_resp.status() == reqwest::StatusCode::NOT_FOUND {
    tracing::warn!(
        "Instance not found on Agent Manager: instance_id={}, instance_name={}. Blocking upgrade until instance is synced.",
        instance_id,
        instance.name
    );

  2. check_upgrade_available is missing from the OpenAPI spec registration in openapi.rs. upgrade_instance was added but its paired endpoint was not, so it won't appear in generated API docs.

crate::routes::agents::start_instance,
crate::routes::agents::stop_instance,
crate::routes::agents::restart_instance,
crate::routes::agents::upgrade_instance,
crate::routes::admin::admin_create_backup,
crate::routes::admin::admin_list_backups,
crate::routes::admin::admin_get_backup,

🤖 Generated with Claude Code

@nickpismenkov nickpismenkov merged commit 408f9e5 into main Feb 27, 2026
1 of 2 checks passed