-
Notifications
You must be signed in to change notification settings - Fork 80
feat(ai-gateway): Add how-tos for Gemini/Vertex SDKs #3786
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Changes from 15 commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
6a9f582
Add how-tos
tomek-labuk 76603e8
Fix frontmatter
tomek-labuk db6ce79
Fix spellechecks
tomek-labuk c03fe05
Update tldrs
tomek-labuk 3be5071
Fix spelling
tomek-labuk 92dda8c
change py script
tomek-labuk f55c143
Merge branch 'main' into feat/google-sdk-how-tos
tomek-labuk 2e42346
Add corrects Gemini prereq
tomek-labuk 1cd2b03
Fix and update scripts
tomek-labuk 179ab8c
Update vertex prerea
tomek-labuk c8ba43d
Fix prereqs and scripts
tomek-labuk 42f1256
fixes
tomek-labuk baa850a
set automated tests to false
tomek-labuk 3357f2d
Merge branch 'main' into feat/google-sdk-how-tos
tomek-labuk fe1e3d9
add gemini prereq
tomek-labuk 17f3fe4
Update vertex prereq
tomek-labuk b72a6d6
Apply suggestions from code review
tomek-labuk 2956015
Merge branch 'main' into feat/google-sdk-how-tos
tomek-labuk File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| name: gemini-route | ||
| paths: | ||
| - /gemini | ||
| service: | ||
| name: gemini-service |
2 changes: 2 additions & 0 deletions
2
app/_data/entity_examples/gateway/services/gemini-service.yaml
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| name: example-service | ||
| url: http://httpbin.konghq.com/gemini | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,152 @@ | ||
| --- | ||
| title: Use Google Generative AI SDK for Gemini AI service chats with Kong AI Gateway | ||
| content_type: how_to | ||
| related_resources: | ||
| - text: AI Gateway | ||
| url: /ai-gateway/ | ||
| - text: AI Proxy Advanced | ||
| url: /plugins/ai-proxy-advanced/ | ||
| - text: Google Generative AI SDK | ||
| url: https://ai.google.dev/gemini-api/docs/sdks | ||
|
|
||
| description: "Configure the AI Proxy plugin for Gemini and test with the Google Generative AI SDK using the standard Gemini API format." | ||
|
|
||
| products: | ||
| - gateway | ||
| - ai-gateway | ||
|
|
||
| works_on: | ||
| - on-prem | ||
| - konnect | ||
|
|
||
| min_version: | ||
| gateway: '3.10' | ||
|
|
||
| plugins: | ||
| - ai-proxy-advanced | ||
|
|
||
| entities: | ||
| - service | ||
| - route | ||
| - plugin | ||
|
|
||
| tags: | ||
| - ai | ||
|
|
||
| tldr: | ||
| q: How do I use the Google Generative AI SDK with Kong AI Gateway? | ||
| a: Configure the AI Proxy Advanced plugin with `llm_format` set to `gemini`, then use the Google Generative AI SDK to send requests through Kong AI Gateway. | ||
|
|
||
| tools: | ||
| - deck | ||
|
|
||
| prereqs: | ||
| inline: | ||
| - title: Gemini AI | ||
| include_content: prereqs/gemini | ||
| icon_url: /assets/icons/gcp.svg | ||
| - title: Python | ||
| include_content: prereqs/python | ||
| icon_url: /assets/icons/python.svg | ||
| - title: Google Generative AI SDK | ||
| content: | | ||
| Install the Google Generative AI SDK: | ||
| ```sh | ||
| pip install google-generativeai | ||
| ``` | ||
| icon_url: /assets/icons/gcp.svg | ||
| entities: | ||
| services: | ||
| - gemini-service | ||
| routes: | ||
| - gemini-route | ||
|
|
||
| cleanup: | ||
| inline: | ||
| - title: Clean up Konnect environment | ||
| include_content: cleanup/platform/konnect | ||
| icon_url: /assets/icons/gateway.svg | ||
| - title: Destroy the {{site.base_gateway}} container | ||
| include_content: cleanup/products/gateway | ||
| icon_url: /assets/icons/gateway.svg | ||
|
|
||
| automated_tests: false | ||
| --- | ||
|
|
||
| ## Configure the AI Proxy plugin | ||
|
|
||
| The AI Proxy plugin supports Google's Gemini models and works with the Google Generative AI SDK. This configuration allows you to use the standard Gemini SDK. Apply the plugin configuration with your GCP service account credentials: | ||
|
|
||
| {% entity_examples %} | ||
| entities: | ||
| plugins: | ||
| - name: ai-proxy | ||
| service: gemini-service | ||
| config: | ||
| route_type: llm/v1/chat | ||
| llm_format: gemini | ||
| auth: | ||
| param_name: key | ||
| param_value: ${gcp_api_key} | ||
| param_location: query | ||
| model: | ||
| provider: gemini | ||
| name: gemini-2.0-flash-exp | ||
| variables: | ||
| gcp_api_key: | ||
| value: $GCP_API_KEY | ||
tomek-labuk marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
| {% endentity_examples %} | ||
|
|
||
| ## Test with Google Generative AI SDK | ||
|
|
||
| Create a test script that uses the Google Generative AI SDK. The script initializes a client with a dummy API key because Kong AI Gateway handles authentication, then sends a generation request through the gateway: | ||
|
|
||
| ```py | ||
| #!/usr/bin/env python3 | ||
tomek-labuk marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| import os | ||
| from google import genai | ||
|
|
||
| BASE_URL = "http://localhost:8000/gemini" | ||
|
|
||
| def gemini_chat(): | ||
|
|
||
| try: | ||
| print(f"Connecting to: {BASE_URL}") | ||
|
|
||
| client = genai.Client( | ||
| api_key=os.environ.get("DECK_GEMINI_API_KEY"), | ||
| vertexai=False, | ||
| http_options={ | ||
| "base_url": BASE_URL | ||
| } | ||
| ) | ||
|
|
||
| print("Sending message...") | ||
| response = client.models.generate_content( | ||
| model="gemini-2.0-flash-exp", | ||
| contents="Hello! How are you?" | ||
| ) | ||
|
|
||
| print(f"Response: {response.text}") | ||
|
|
||
| except Exception as e: | ||
| print(f"Error: {e}") | ||
| import traceback | ||
| traceback.print_exc() | ||
|
|
||
| if __name__ == "__main__": | ||
| gemini_chat() | ||
tomek-labuk marked this conversation as resolved.
Show resolved
Hide resolved
|
||
| ``` | ||
|
|
||
| Run the script: | ||
| ```sh | ||
| python3 gemini.py | ||
| ``` | ||
|
|
||
| Expected output: | ||
|
|
||
| ```text | ||
| Connecting to: http://localhost:8000/gemini | ||
| Sending message... | ||
| Response: Hello! I'm doing well, thank you for asking. As a large language model, I don't experience feelings or emotions in the way humans do, but I'm functioning properly and ready to assist you. How can I help you today? | ||
| ``` | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
| --- | ||
| title: Use Google Generative AI SDK for Vertex AI service chats with Kong AI Gateway | ||
| content_type: how_to | ||
| related_resources: | ||
| - text: AI Gateway | ||
| url: /ai-gateway/ | ||
| - text: AI Proxy Advanced | ||
| url: /plugins/ai-proxy-advanced/ | ||
| - text: Vertex AI Authentication | ||
| url: https://cloud.google.com/vertex-ai/docs/authentication | ||
|
|
||
| description: "Configure the AI Proxy Advanced plugin to authenticate with Google's Gemini API using GCP service account credentials and test with the native Vertex AI request format." | ||
|
|
||
| products: | ||
| - gateway | ||
| - ai-gateway | ||
|
|
||
| works_on: | ||
| - on-prem | ||
| - konnect | ||
|
|
||
| min_version: | ||
| gateway: '3.10' | ||
|
|
||
| plugins: | ||
| - ai-proxy-advanced | ||
|
|
||
| entities: | ||
| - service | ||
| - route | ||
| - plugin | ||
|
|
||
| tags: | ||
| - ai | ||
|
|
||
| tldr: | ||
| q: How do I use Vertex AI's native format with Kong AI Gateway? | ||
| a: Configure the AI Proxy Advanced plugin with `llm_format` set to `gemini`, then send requests using Vertex AI's native API format with the contents array structure. | ||
|
|
||
| tools: | ||
| - deck | ||
|
|
||
| prereqs: | ||
| inline: | ||
| - title: Vertex AI | ||
| include_content: prereqs/vertex-ai | ||
| icon_url: /assets/icons/gcp.svg | ||
| - title: Python | ||
| include_content: prereqs/python | ||
| icon_url: /assets/icons/python.svg | ||
| - title: Google Generative AI SDK | ||
| content: | | ||
| Install the Google Generative AI SDK: | ||
| ```sh | ||
| pip install google-generativeai | ||
| ``` | ||
| icon_url: /assets/icons/gcp.svg | ||
| entities: | ||
| services: | ||
| - gemini-service | ||
| routes: | ||
| - gemini-route | ||
|
|
||
| cleanup: | ||
| inline: | ||
| - title: Clean up Konnect environment | ||
| include_content: cleanup/platform/konnect | ||
| icon_url: /assets/icons/gateway.svg | ||
| - title: Destroy the {{site.base_gateway}} container | ||
| include_content: cleanup/products/gateway | ||
| icon_url: /assets/icons/gateway.svg | ||
|
|
||
| automated_tests: false | ||
| --- | ||
|
|
||
| ## Configure the AI Proxy Advanced plugin | ||
|
|
||
| The AI Proxy Advanced plugin supports Google's Vertex AI models with service account authentication. This configuration allows you to route requests in Vertex AI's native format through Kong AI Gateway. The plugin handles authentication with GCP, manages the connection to Vertex AI endpoints, and proxies requests without modifying the Gemini-specific request structure. | ||
|
|
||
| Apply the plugin configuration with your GCP service account credentials: | ||
|
|
||
| {% entity_examples %} | ||
| entities: | ||
| plugins: | ||
| - name: ai-proxy-advanced | ||
| service: gemini-service | ||
| config: | ||
| llm_format: gemini | ||
| genai_category: text/generation | ||
| targets: | ||
| - route_type: llm/v1/chat | ||
| logging: | ||
| log_payloads: false | ||
| log_statistics: true | ||
| model: | ||
| provider: gemini | ||
| name: gemini-2.0-flash-exp | ||
| options: | ||
| gemini: | ||
| api_endpoint: ${gcp_api_endpoint} | ||
| project_id: ${gcp_project_id} | ||
| location_id: ${gcp_location_id} | ||
| auth: | ||
| allow_override: false | ||
| gcp_use_service_account: true | ||
| gcp_service_account_json: ${gcp_service_account_json} | ||
| variables: | ||
| gcp_api_endpoint: | ||
| value: $GCP_API_ENDPOINT | ||
| gcp_project_id: | ||
| value: $GCP_PROJECT_ID | ||
| gcp_service_account_json: | ||
| value: $GCP_SERVICE_ACCOUNT_JSON | ||
| gcp_location_id: | ||
| value: $GCP_LOCATION_ID | ||
| {% endentity_examples %} | ||
|
|
||
| ## Create Python script | ||
|
|
||
| Create a test script that sends a request using Vertex AI's native API format. The script constructs the Vertex AI endpoint URL with your project ID and location, then sends a properly formatted request: | ||
|
|
||
| ```py | ||
| cat << 'EOF' > vertex.py | ||
| #!/usr/bin/env python3 | ||
| import os | ||
| from google import genai | ||
| import sys | ||
| import time | ||
| import threading | ||
|
|
||
| def spinner(): | ||
| chars = ['⠋', '⠙', '⠹', '⠸', '⠼', '⠴', '⠦', '⠧', '⠇', '⠏'] | ||
| idx = 0 | ||
| while not stop_spinner: | ||
| sys.stdout.write(f'\r{chars[idx % len(chars)]} Generating response...') | ||
| sys.stdout.flush() | ||
| idx += 1 | ||
| time.sleep(0.1) | ||
| sys.stdout.write('\r' + ' ' * 30 + '\r') | ||
| sys.stdout.flush() | ||
|
|
||
| client = genai.Client( | ||
| vertexai=True, | ||
| project=os.environ.get("DECK_GCP_PROJECT_ID", "gcp-sdet-test"), | ||
| location=os.environ.get("DECK_GCP_LOCATION_ID", "us-central1"), | ||
| http_options={ | ||
| "base_url": "http://localhost:8000/gemini" | ||
| } | ||
| ) | ||
|
|
||
| stop_spinner = False | ||
| spinner_thread = threading.Thread(target=spinner) | ||
| spinner_thread.start() | ||
|
|
||
| try: | ||
| response = client.models.generate_content( | ||
| model="gemini-2.0-flash-exp", | ||
| contents="Hello! Say hello back to me!" | ||
| ) | ||
| stop_spinner = True | ||
| spinner_thread.join() | ||
| print(f"Model: {response.model_version}") | ||
| print(response.text) | ||
| except Exception as e: | ||
| stop_spinner = True | ||
| spinner_thread.join() | ||
| print(f"Error: {e}") | ||
| EOF | ||
| ``` | ||
|
|
||
| ## Validate the configuration | ||
|
|
||
| Now, let's run the script we created in the previous step: | ||
|
|
||
| ```sh | ||
| python3 vertex.py | ||
| ``` | ||
|
|
||
| Expected output: | ||
|
|
||
| ```text | ||
| Hello there! | ||
| ``` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| Before you begin, you must get the Gemini API key from Google Cloud: | ||
|
|
||
| 1. Go to the [Google Cloud Console](https://console.cloud.google.com/). | ||
| 2. Select or create a project. | ||
| 3. Enable the **Generative Language API**: | ||
| - Navigate to **APIs & Services > Library**. | ||
| - Search for "Generative Language API". | ||
| - Click **Enable**. | ||
| 4. Create an API key: | ||
| - Navigate to **APIs & Services > Credentials**. | ||
| - Click **Create Credentials > API Key**. | ||
| - Copy the generated API key. | ||
|
|
||
|
|
||
| Export the API key as an environment variable: | ||
| ```sh | ||
| export DECK_GEMINI_API_KEY="<your_gemini_api_key>" | ||
| ``` |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.