diff --git a/app/_data/entity_examples/gateway/routes/gemini-route.yaml b/app/_data/entity_examples/gateway/routes/gemini-route.yaml
new file mode 100644
index 0000000000..789cb975a6
--- /dev/null
+++ b/app/_data/entity_examples/gateway/routes/gemini-route.yaml
@@ -0,0 +1,5 @@
+name: gemini-route
+paths:
+  - /gemini
+service:
+  name: gemini-service
\ No newline at end of file
diff --git a/app/_data/entity_examples/gateway/services/gemini-service.yaml b/app/_data/entity_examples/gateway/services/gemini-service.yaml
new file mode 100644
index 0000000000..4120571b93
--- /dev/null
+++ b/app/_data/entity_examples/gateway/services/gemini-service.yaml
@@ -0,0 +1,2 @@
+name: gemini-service
+url: http://httpbin.konghq.com/gemini
\ No newline at end of file
diff --git a/app/_how-tos/use-gemini-sdk-chat.md b/app/_how-tos/use-gemini-sdk-chat.md
new file mode 100644
index 0000000000..0e0656c3e8
--- /dev/null
+++ b/app/_how-tos/use-gemini-sdk-chat.md
@@ -0,0 +1,154 @@
+---
+title: Use Google Generative AI SDK for Gemini AI service chats with Kong AI Gateway
+content_type: how_to
+related_resources:
+  - text: AI Gateway
+    url: /ai-gateway/
+  - text: AI Proxy
+    url: /plugins/ai-proxy/
+  - text: Google Generative AI SDK
+    url: https://ai.google.dev/gemini-api/docs/sdks
+
+description: "Configure the AI Proxy plugin for Gemini and test with the Google Generative AI SDK using the standard Gemini API format."
+
+products:
+  - gateway
+  - ai-gateway
+
+works_on:
+  - on-prem
+  - konnect
+
+min_version:
+  gateway: '3.10'
+
+plugins:
+  - ai-proxy
+
+entities:
+  - service
+  - route
+  - plugin
+
+tags:
+  - ai
+
+tldr:
+  q: How do I use the Google Generative AI SDK with Kong AI Gateway?
+  a: Configure the AI Proxy plugin with `llm_format` set to `gemini`, then use the Google Generative AI SDK to send requests through Kong AI Gateway.
+
+tools:
+  - deck
+
+prereqs:
+  inline:
+    - title: Gemini AI
+      include_content: prereqs/gemini
+      icon_url: /assets/icons/gcp.svg
+    - title: Python
+      include_content: prereqs/python
+      icon_url: /assets/icons/python.svg
+    - title: Google Generative AI SDK
+      content: |
+        Install the Google Generative AI SDK:
+        ```sh
+        pip install google-genai
+        ```
+      icon_url: /assets/icons/gcp.svg
+  entities:
+    services:
+      - gemini-service
+    routes:
+      - gemini-route
+
+cleanup:
+  inline:
+    - title: Clean up Konnect environment
+      include_content: cleanup/platform/konnect
+      icon_url: /assets/icons/gateway.svg
+    - title: Destroy the {{site.base_gateway}} container
+      include_content: cleanup/products/gateway
+      icon_url: /assets/icons/gateway.svg
+
+automated_tests: false
+---
+
+## Configure the AI Proxy plugin
+
+The AI Proxy plugin supports Google's Gemini models and works with the Google Generative AI SDK. With `llm_format` set to `gemini`, the gateway accepts requests in the Gemini API's native format, so you can use the standard SDK without modification. Apply the plugin configuration with your Gemini API key:
+
+{% entity_examples %}
+entities:
+  plugins:
+    - name: ai-proxy
+      service: gemini-service
+      config:
+        route_type: llm/v1/chat
+        llm_format: gemini
+        auth:
+          param_name: key
+          param_value: ${gcp_api_key}
+          param_location: query
+        model:
+          provider: gemini
+          name: gemini-2.0-flash-exp
+variables:
+  gcp_api_key:
+    value: $GEMINI_API_KEY
+{% endentity_examples %}
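+
+Once the configuration is applied, you can optionally sanity-check the route with a raw Gemini-format request before involving the SDK. This is an illustrative check, assuming {{site.base_gateway}} is listening on the default proxy port `8000` and that the request path mirrors what the SDK would send; no API key is included because the plugin injects it as a query parameter:
+
+```sh
+curl -X POST "http://localhost:8000/gemini/v1beta/models/gemini-2.0-flash-exp:generateContent" \
+  -H "Content-Type: application/json" \
+  -d '{"contents": [{"parts": [{"text": "Hello!"}]}]}'
+```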
+
+## Test with Google Generative AI SDK
+
+Create a test script that uses the Google Generative AI SDK. The script initializes a client with the API key from the `DECK_GEMINI_API_KEY` environment variable, points it at the Kong route through `base_url`, and sends a generation request through the gateway:
+
+```sh
+cat << 'EOF' > gemini.py
+#!/usr/bin/env python3
+import os
+from google import genai
+
+# Kong route that proxies to the Gemini API
+BASE_URL = "http://localhost:8000/gemini"
+
+def gemini_chat():
+    try:
+        print(f"Connecting to: {BASE_URL}")
+
+        # Point the SDK at the Kong route instead of the default Gemini endpoint
+        client = genai.Client(
+            api_key=os.environ.get("DECK_GEMINI_API_KEY"),
+            vertexai=False,
+            http_options={
+                "base_url": BASE_URL
+            }
+        )
+
+        print("Sending message...")
+        response = client.models.generate_content(
+            model="gemini-2.0-flash-exp",
+            contents="Hello! How are you?"
+        )
+
+        print(f"Response: {response.text}")
+
+    except Exception as e:
+        print(f"Error: {e}")
+        import traceback
+        traceback.print_exc()
+
+if __name__ == "__main__":
+    gemini_chat()
+EOF
+```
+
+Run the script:
+```sh
+python3 gemini.py
+```
+
+Expected output:
+
+```text
+Connecting to: http://localhost:8000/gemini
+Sending message...
+Response: Hello! I'm doing well, thank you for asking. As a large language model, I don't experience feelings or emotions in the way humans do, but I'm functioning properly and ready to assist you. How can I help you today?
+```
\ No newline at end of file
diff --git a/app/_how-tos/use-vertex-sdk-chat.md b/app/_how-tos/use-vertex-sdk-chat.md
new file mode 100644
index 0000000000..65203365ab
--- /dev/null
+++ b/app/_how-tos/use-vertex-sdk-chat.md
@@ -0,0 +1,183 @@
+---
+title: Use Google Generative AI SDK for Vertex AI service chats with Kong AI Gateway
+content_type: how_to
+related_resources:
+  - text: AI Gateway
+    url: /ai-gateway/
+  - text: AI Proxy Advanced
+    url: /plugins/ai-proxy-advanced/
+  - text: Vertex AI Authentication
+    url: https://cloud.google.com/vertex-ai/docs/authentication
+
+description: "Configure the AI Proxy Advanced plugin to authenticate with Google's Gemini API using GCP service account credentials and test with the native Vertex AI request format."
+
+products:
+  - gateway
+  - ai-gateway
+
+works_on:
+  - on-prem
+  - konnect
+
+min_version:
+  gateway: '3.10'
+
+plugins:
+  - ai-proxy-advanced
+
+entities:
+  - service
+  - route
+  - plugin
+
+tags:
+  - ai
+
+tldr:
+  q: How do I use Vertex AI's native format with Kong AI Gateway?
+  a: Configure the AI Proxy Advanced plugin with `llm_format` set to `gemini`, then send requests using Vertex AI's native API format with the `contents` array structure.
+
+tools:
+  - deck
+
+prereqs:
+  inline:
+    - title: Vertex AI
+      include_content: prereqs/vertex-ai
+      icon_url: /assets/icons/gcp.svg
+    - title: Python
+      include_content: prereqs/python
+      icon_url: /assets/icons/python.svg
+    - title: Google Generative AI SDK
+      content: |
+        Install the Google Generative AI SDK:
+        ```sh
+        pip install google-genai
+        ```
+      icon_url: /assets/icons/gcp.svg
+  entities:
+    services:
+      - gemini-service
+    routes:
+      - gemini-route
+
+cleanup:
+  inline:
+    - title: Clean up Konnect environment
+      include_content: cleanup/platform/konnect
+      icon_url: /assets/icons/gateway.svg
+    - title: Destroy the {{site.base_gateway}} container
+      include_content: cleanup/products/gateway
+      icon_url: /assets/icons/gateway.svg
+
+automated_tests: false
+---
+
+## Configure the AI Proxy Advanced plugin
+
+The AI Proxy Advanced plugin supports Google's Vertex AI models with service account authentication. This configuration allows you to route requests in Vertex AI's native format through Kong AI Gateway: the plugin handles authentication with GCP, manages the connection to Vertex AI endpoints, and proxies requests without modifying the Gemini-specific request structure.
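+
+The `gcp_service_account_json` value must be the entire service account key as a single-line JSON string. As an illustrative shortcut, assuming `jq` is installed and your key file is named `service-account.json` (a hypothetical filename), you can flatten and export it in one step; depending on where you paste the value, you may still need to escape the embedded quotes as described in the prerequisites:
+
+```sh
+# Compact the key file to a single line and export it for decK
+export DECK_GCP_SERVICE_ACCOUNT_JSON="$(jq -c . service-account.json)"
+```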
+
+Apply the plugin configuration with your GCP service account credentials:
+
+{% entity_examples %}
+entities:
+  plugins:
+    - name: ai-proxy-advanced
+      service: gemini-service
+      config:
+        llm_format: gemini
+        genai_category: text/generation
+        targets:
+          - route_type: llm/v1/chat
+            logging:
+              log_payloads: false
+              log_statistics: true
+            model:
+              provider: gemini
+              name: gemini-2.0-flash-exp
+              options:
+                gemini:
+                  api_endpoint: ${gcp_api_endpoint}
+                  project_id: ${gcp_project_id}
+                  location_id: ${gcp_location_id}
+            auth:
+              allow_override: false
+              gcp_use_service_account: true
+              gcp_service_account_json: ${gcp_service_account_json}
+variables:
+  gcp_api_endpoint:
+    value: $GCP_API_ENDPOINT
+  gcp_project_id:
+    value: $GCP_PROJECT_ID
+  gcp_service_account_json:
+    value: $GCP_SERVICE_ACCOUNT_JSON
+  gcp_location_id:
+    value: $GCP_LOCATION_ID
+{% endentity_examples %}
+
+## Create a Python script
+
+Create a test script that sends requests in Vertex AI's native API format. The script initializes a Vertex AI client with your project ID and location, points it at the Kong route, and sends a generation request while a spinner indicates progress:
+
+```sh
+cat << 'EOF' > vertex.py
+#!/usr/bin/env python3
+import os
+import sys
+import threading
+import time
+
+from google import genai
+
+def spinner():
+    # Animate a progress indicator until the main thread sets stop_spinner
+    chars = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏']
+    idx = 0
+    while not stop_spinner:
+        sys.stdout.write(f'\r{chars[idx % len(chars)]} Generating response...')
+        sys.stdout.flush()
+        idx += 1
+        time.sleep(0.1)
+    # Clear the spinner line
+    sys.stdout.write('\r' + ' ' * 30 + '\r')
+    sys.stdout.flush()
+
+# Point the SDK at the Kong route; vertexai=True makes it use Vertex AI's request format
+client = genai.Client(
+    vertexai=True,
+    project=os.environ.get("DECK_GCP_PROJECT_ID", "gcp-sdet-test"),
+    location=os.environ.get("DECK_GCP_LOCATION_ID", "us-central1"),
+    http_options={
+        "base_url": "http://localhost:8000/gemini"
+    }
+)
+
+stop_spinner = False
+spinner_thread = threading.Thread(target=spinner)
+spinner_thread.start()
+
+try:
+    response = client.models.generate_content(
+        model="gemini-2.0-flash-exp",
+        contents="Hello! Say hello back to me!"
+    )
+    stop_spinner = True
+    spinner_thread.join()
+    print(f"Model: {response.model_version}")
+    print(response.text)
+except Exception as e:
+    stop_spinner = True
+    spinner_thread.join()
+    print(f"Error: {e}")
+EOF
+```
+
+## Validate the configuration
+
+Run the script you created in the previous step:
+
+```sh
+python3 vertex.py
+```
+
+Expected output:
+
+```text
+Model: gemini-2.0-flash-exp
+Hello there!
+```
\ No newline at end of file
diff --git a/app/_includes/prereqs/gemini.md b/app/_includes/prereqs/gemini.md
new file mode 100644
index 0000000000..093fae22aa
--- /dev/null
+++ b/app/_includes/prereqs/gemini.md
@@ -0,0 +1,18 @@
+Before you begin, you must get a Gemini API key from Google Cloud:
+
+1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
+2. Select or create a project.
+3. Enable the **Generative Language API**:
+   - Navigate to **APIs & Services > Library**.
+   - Search for "Generative Language API".
+   - Click **Enable**.
+4. Create an API key:
+   - Navigate to **APIs & Services > Credentials**.
+   - Click **Create Credentials > API Key**.
+   - Copy the generated API key.
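+
+Optionally, verify the key before configuring the gateway. The following illustrative check lists the models available to your key (replace `<your-api-key>`, a placeholder, with the key you just copied); a JSON model list in the response indicates the key is valid:
+
+```sh
+curl "https://generativelanguage.googleapis.com/v1beta/models?key=<your-api-key>"
+```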
+
+Export the API key as an environment variable:
+```sh
+export DECK_GEMINI_API_KEY=""
+```
\ No newline at end of file
diff --git a/app/_includes/prereqs/vertex-ai.md b/app/_includes/prereqs/vertex-ai.md
index aa413e92e4..56fb019e6c 100644
--- a/app/_includes/prereqs/vertex-ai.md
+++ b/app/_includes/prereqs/vertex-ai.md
@@ -2,14 +2,25 @@ Before you begin, you must get the following credentials from Google Cloud:
 
 - **Service Account Key**: A JSON key file for a service account with Vertex AI permissions
 - **Project ID**: Your Google Cloud project identifier
+- **Location ID**: Your Google Cloud project location identifier
 - **API Endpoint**: The global Vertex AI API endpoint `https://aiplatform.googleapis.com`
 
 After creating the key, convert the contents of `modelarmor-admin-key.json` into a **single-line JSON string**.
-Escape all necessary characters — quotes (`"`) and newlines (`\n`) — so that it becomes a valid one-line JSON string.
+Escape all necessary characters. Quotes (`"`) become `\"` and newlines become `\n`. The result must be a valid one-line JSON string.
+
 Then export your credentials as environment variables:
 
-```bash
+```sh
 export DECK_GCP_SERVICE_ACCOUNT_JSON=""
-export DECK_GCP_SERVICE_ACCOUNT_JSON="your-service-account-json"
-export DECK_GCP_PROJECT_ID="your-project-id"
-```
\ No newline at end of file
+export DECK_GCP_LOCATION_ID=""
+export DECK_GCP_API_ENDPOINT=""
+export DECK_GCP_PROJECT_ID=""
+```
+
+Set up GCP Application Default Credentials (ADC) with your quota project:
+
+```sh
+gcloud auth application-default set-quota-project <project-id>
+```
+
+Replace `<project-id>` with your actual project ID. This configures ADC to use your project for API quota and billing.
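+
+To confirm that ADC is set up correctly, you can request an access token; a token printed to stdout means the credentials are usable:
+
+```sh
+gcloud auth application-default print-access-token
+```
\ No newline at end of file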