
Commit a1c7150

vladkol and Holt Skinner authored

feat: add example of deploying Qwen 3 with Ollama in Cloud Run for Agents (#2044)

Co-authored-by: Holt Skinner <[email protected]>
1 parent 919e674 · commit a1c7150

File tree

4 files changed: +734 −0 lines


.github/CODEOWNERS

Lines changed: 1 addition & 0 deletions
@@ -62,3 +62,4 @@
 /generative-ai/vision/gradio/gradio_image_generation_sdk.ipynb @GoogleCloudPlatform/generative-ai-devrel @jbrache
 /generative-ai/vision/use-cases @GoogleCloudPlatform/generative-ai-devrel @iamthuya
 /generative-ai/vision/use-cases/hey_llm @GoogleCloudPlatform/generative-ai-devrel @tushuhei
+/generative-ai/open-models/serving/cloud_run_ollama_qwen3_inference.ipynb @GoogleCloudPlatform/generative-ai-devrel @vladkol

.github/actions/spelling/allow.txt

Lines changed: 1 addition & 0 deletions
@@ -1139,6 +1139,7 @@ quadrotor
 qubit
 qubits
 quippy
+qwen
 rag
 ragas
 ragdemos

open-models/README.md

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ This repository contains examples for deploying and fine-tuning open source mode
 
 - [serving/cloud_run_ollama_gemma3_inference.ipynb](./serving/cloud_run_ollama_gemma3_inference.ipynb) - This notebook showcase how to deploy Google Gemma 3 in Cloud Run using Ollama, with the objective to build a simple API for chat.
 - [serving/cloud_run_vllm_gemma3_inference.ipynb](./serving/cloud_run_vllm_gemma3_inference.ipynb) - This notebook showcase how to deploy Google Gemma 3 in Cloud Run using vLLM, with the objective to build a simple API for chat.
+- [serving/cloud_run_ollama_qwen3_inference.ipynb](./serving/cloud_run_ollama_qwen3_inference.ipynb) - This notebook shows how to deploy Qwen 3 in Cloud Run using Ollama, with the objective to build a simple AI Agent.
 - [serving/vertex_ai_ollama_gemma2_rag_agent.ipynb](./serving/vertex_ai_ollama_gemma2_rag_agent.ipynb) - This notebooks provides steps and code to deploy an open source agentic RAG pipeline to Vertex AI Prediction using Ollama and a Gemma 2 model adapter.
 - [serving/vertex_ai_pytorch_inference_paligemma_with_custom_handler.ipynb](./serving/vertex_ai_pytorch_inference_paligemma_with_custom_handler.ipynb) - This notebooks provides steps and code to deploy Google PaliGemma with the Hugging Face Python Inference DLC using a custom handler on Vertex AI.
 - [serving/vertex_ai_pytorch_inference_pllum_with_custom_handler.ipynb](./serving/vertex_ai_pytorch_inference_pllum_with_custom_handler.ipynb) - This notebook shows how to deploy Polish Large Language Model (PLLuM) from the Hugging Face Hub on Vertex AI using the Hugging Face Deep Learning Container (DLC) for Pytorch Inference in combination with a custom handler.
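The new notebook's pattern is an Ollama server fronting Qwen 3 on Cloud Run, which a client then calls over Ollama's standard HTTP API (`POST /api/chat`). As a minimal sketch of that client side (the service URL and model tag below are placeholders, not taken from this commit, and the notebook itself may use a different client library):

```python
import json
import urllib.request

# Hypothetical Cloud Run service URL -- replace with your deployed endpoint.
SERVICE_URL = "https://ollama-qwen3-example-uc.a.run.app"


def build_chat_payload(model: str, messages: list[dict]) -> bytes:
    """Serialize a non-streaming Ollama /api/chat request body."""
    return json.dumps(
        {"model": model, "messages": messages, "stream": False}
    ).encode("utf-8")


def chat(prompt: str, model: str = "qwen3") -> str:
    """Send a single chat turn to the Ollama endpoint and return the reply text."""
    req = urllib.request.Request(
        f"{SERVICE_URL}/api/chat",
        data=build_chat_payload(model, [{"role": "user", "content": prompt}]),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Ollama's non-streaming /api/chat response carries the reply
        # under message.content.
        return json.loads(resp.read())["message"]["content"]


# Example (requires a deployed service, so it is left commented out):
# print(chat("List three uses of Cloud Run."))
```

An agent framework would wrap `chat()` in a tool-calling loop; this sketch only shows the raw request/response plumbing against the Ollama API.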
