Conversation

@oded996 commented on Aug 25, 2025

This PR introduces a new tool, cloud_run_deploy_model, which simplifies the deployment of Large Language Models (LLMs) to Google Cloud Run.

Key Features:

  • New Tool: A cloud_run_deploy_model tool that supports deploying models using the Ollama and vLLM frameworks.
  • Writable GCS Mounts: The GCS volume is now mounted with write permissions, allowing Ollama to download and store models.
  • Improved Error Handling: The tool now provides clearer error messages for authentication issues (403) and service unavailability (503) during model download.
  • Retry Mechanism: A retry mechanism has been added to handle transient 503 errors, making the deployment process more robust (a minimal sketch follows this list).
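
For concreteness, the retry behavior described above could look roughly like the following sketch (the helper name, error shape, and backoff parameters are illustrative assumptions, not the tool's actual implementation):

```ts
// Sketch of retry-with-backoff for transient 503s during model download.
// All names and parameters here are illustrative, not the tool's real code.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = (err as { status?: number }).status;
      // Retry only transient 503s; 403s (authentication problems) fail
      // immediately with a clear error message, as described above.
      if (status !== 503 || attempt >= maxAttempts) throw err;
      // Exponential backoff between attempts.
      await new Promise((resolve) =>
        setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)),
      );
    }
  }
}
```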

Example

Prompt:

please deploy llama2 on cloud run, use ollama, project my-gcp-project, region europe-west1

Output:

Cloud Run service llama2 deployed in project my-gcp-project
Cloud Console: https://console.cloud.google.com/run/detail/europe-west1/llama2?project=my-gcp-project
Service URL: https://llama2-....a.run.app

@oded996 changed the title from "fix: Allow writable GCS mounts for Ollama models" to "feat: Add tool to deploy LLM models to Cloud Run" on Aug 25, 2025
@steren (Collaborator) left a comment

Thanks for the PR.
While I like the idea, I wonder if we should add it. What are the use cases for an Agent deploying a model?

Note that later, we will have presets that will cover this use case.


remove file


remove file

```ts
);

server.registerTool(
  'cloud_run_deploy_model',
```

remove the cloud_run_ prefix.

Suggestion: deploy_ai_model

```ts
framework: z.enum(['ollama', 'vllm']).describe('The framework to use for serving the model.'),
model: z.string().describe('The model to deploy from Ollama library or Hugging Face Hub.'),
```

Can you add more detail? Give examples of the accepted formats.
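
For example, the descriptions could spell out the accepted formats along these lines (one possible wording; the model IDs shown are illustrative):

```ts
framework: z
  .enum(['ollama', 'vllm'])
  .describe(
    'Serving framework: "ollama" pulls models from the Ollama library; "vllm" serves models from the Hugging Face Hub.',
  ),
model: z
  .string()
  .describe(
    'The model to deploy. For Ollama, a library tag such as "llama2" or "gemma:2b"; for vLLM, a Hugging Face repo ID such as "meta-llama/Llama-2-7b-chat-hf".',
  ),
```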

Refactors the vLLM deployment strategy to use a dedicated Cloud Function for streaming models from Hugging Face to GCS. This avoids slow local downloads and network bottlenecks.

Also includes:
- Hardening the vLLM container with --max-model-len and HF_HUB_OFFLINE.
- Correcting the container port to 8000.
- Cleaning up unused code in the deployment scripts.
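
For illustration, the hardened container settings mentioned above might look roughly like the following (the object shape loosely follows a Cloud Run container spec; names, paths, and values are assumptions, not the PR's actual code):

```ts
// Illustrative vLLM container settings; not the PR's actual deployment code.
const vllmContainer = {
  image: 'vllm/vllm-openai:latest',
  // vLLM's OpenAI-compatible server listens on port 8000 by default.
  ports: [{ containerPort: 8000 }],
  args: [
    '--model', '/mnt/models/my-model', // weights pre-staged via the GCS mount (path hypothetical)
    '--max-model-len', '4096',         // bound context length to cap GPU memory use
  ],
  env: [
    // With weights already streamed to GCS, force offline loading so vLLM
    // never contacts the Hugging Face Hub at startup.
    { name: 'HF_HUB_OFFLINE', value: '1' },
  ],
};
```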
