
Commit e24bd07
fix: remove deploy/backend targets from Makefile when deployment_target is none (#773)
- Wrap deploy and backend Makefile targets in conditional so they are omitted when deployment_target is 'none', allowing enhance to properly add them later
- Condense /invoke endpoint documentation in GEMINI.md
- Show enhance command in GEMINI.md table when no deployment target
1 parent 7e6fdb9 commit e24bd07
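The mechanism behind the Makefile change is plain Jinja2 conditional rendering of the template files. Below is a minimal sketch of how the wrapped targets drop out when `deployment_target` is `'none'`. The template text is abridged from the Makefile diff further down (the recipe body is a placeholder), and rendering with bare `jinja2` is an assumption; cookiecutter's real pipeline supplies more context than this.

```python
# Minimal sketch: how the conditional added in this commit omits the deploy
# targets at generation time. Bare jinja2 rendering is an assumption here;
# cookiecutter's actual pipeline wraps this with more context.
from jinja2 import Template

MAKEFILE_TEMPLATE = """\
{%- if cookiecutter.deployment_target != 'none' %}
deploy:
\t@echo "deploying..."  # placeholder recipe body

# Alias for 'make deploy' for backward compatibility
backend: deploy
{%- endif %}
"""

for target in ("cloud_run", "none"):
    rendered = Template(MAKEFILE_TEMPLATE).render(
        cookiecutter={"deployment_target": target}
    )
    print(f"--- deployment_target={target!r} ---")
    # With 'none' the deploy/backend targets are absent, so a later
    # `uvx agent-starter-pack enhance` can add them without conflicting.
    print(rendered.strip() or "(no deploy targets rendered)")
```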

2 files changed: +33 / -283 lines changed
agent_starter_pack/base_templates/python/GEMINI.md

Lines changed: 31 additions & 280 deletions
````diff
@@ -954,7 +954,11 @@ gh run watch RUN_ID --repo OWNER/REPO
 {%- endif %}
 | `make lint` | Check code quality |
 | `make setup-dev-env` | Set up dev infrastructure (Terraform) |
-| `make deploy` | Deploy to dev (requires human approval) |
+{%- if cookiecutter.deployment_target != 'none' %}
+| `make deploy` | Deploy to dev |
+{%- else %}
+| `uvx agent-starter-pack enhance` (or equivalent) | Add a deployment target to enable `make deploy` |
+{%- endif %}
 
 ## Testing Your Deployed Agent
 
@@ -964,8 +968,6 @@ After deployment, you can test your agent. The method depends on your deployment
 
 The deployment endpoint is stored in `deployment_metadata.json` after `make deploy` completes.
 
-{%- if cookiecutter.deployment_target == "agent_engine" %}
-
 ### Testing Agent Engine Deployment
 
 Your agent is deployed to Vertex AI Agent Engine.
@@ -1008,8 +1010,6 @@ make playground
 # Open http://localhost:8000 in your browser
 ```
 
-{%- elif cookiecutter.deployment_target == "cloud_run" %}
-
 ### Testing Cloud Run Deployment
 
 Your agent is deployed to Cloud Run.
@@ -1087,57 +1087,6 @@ gcloud beta iap web add-iam-policy-binding \
 ```
 
 **Note:** Use `iap web add-iam-policy-binding` for IAP access, not `run services add-iam-policy-binding` (which is for `roles/run.invoker`).
-{%- if cookiecutter.is_adk and cookiecutter.session_type == "cloud_sql" %}
-
-### Testing Cloud SQL Session Persistence
-
-Your agent uses Cloud SQL (PostgreSQL) for session storage. To verify sessions persist correctly:
-
-**1. Test Session Creation and Resume:**
-
-```bash
-# First request - create session and have a conversation
-curl -X POST $SERVICE_URL/run \
-  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
-  -H "Content-Type: application/json" \
-  -d '{"input": "Qualify lead #123"}' | jq -r '.session_id'
-
-# Save the session_id from the response, then test resume:
-curl -X POST $SERVICE_URL/run \
-  -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
-  -H "Content-Type: application/json" \
-  -d '{"input": "What was the lead score?", "session_id": "SESSION_ID_FROM_ABOVE"}'
-```
-
-The agent should recall details from the first conversation.
-
-**2. Verify Cloud SQL Connection:**
-
-```bash
-# Check Cloud Run service logs for successful DB connection
-gcloud run services logs read {{cookiecutter.project_name}} \
-  --project=YOUR_DEV_PROJECT_ID \
-  --region=YOUR_REGION \
-  --limit=50 | grep -i "database\|cloud_sql"
-
-# Verify Cloud SQL instance is running
-gcloud sql instances describe {{cookiecutter.project_name}}-db-dev \
-  --project=YOUR_DEV_PROJECT_ID
-```
-
-**3. Common Cloud SQL Issues:**
-
-| Issue | Symptom | Resolution |
-|-------|---------|------------|
-| Connection timeout | `Connection refused` errors | Check Cloud SQL instance is in same region as Cloud Run |
-| IAM auth failed | `Login failed` errors | Verify service account has `roles/cloudsql.client` |
-| Session not found | `Session does not exist` | Verify session_id matches and DB tables were created |
-| Volume mount failed | `cloudsql volume not found` | Check terraform applied Cloud SQL volume configuration |
-
-{%- endif %}
-
-{%- endif %}
-{%- if cookiecutter.is_a2a %}
 
 ### Testing A2A Protocol Agents
 
@@ -1198,7 +1147,6 @@ curl -X POST \
 curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
   "$SERVICE_URL/a2a/app/.well-known/agent-card.json"
 ```
-{%- endif %}
 
 ### Running Load Tests
 
@@ -1238,245 +1186,48 @@ Your agent currently runs as an interactive service. However, many use cases req
 
 ### Adding an /invoke Endpoint
 
-To enable batch/event processing, add an `/invoke` endpoint to your FastAPI app that auto-detects the input format:
-
-```python
-# Add to {{cookiecutter.agent_directory}}/fast_api_app.py
-
-from typing import List, Any, Dict
-import asyncio
-import base64
-import json
-from pydantic import BaseModel
-
-# Request/Response models for different sources
-class BQResponse(BaseModel):
-    replies: List[str]
-
-# Concurrency control (module-level for reuse)
-MAX_CONCURRENT = 10
-semaphore = asyncio.Semaphore(MAX_CONCURRENT)
-
-
-async def run_agent(prompt: str) -> str:
-    """Run the agent with concurrency control.
-
-    Uses Runner + InMemorySessionService for stateless batch processing.
-    Each invocation creates a fresh session (no conversation history).
-    """
-    async with semaphore:
-        try:
-            from {{cookiecutter.agent_directory}}.agent import root_agent
-            from google.adk.runners import Runner
-            from google.adk.sessions import InMemorySessionService
-            from google.genai import types as genai_types
-
-            # Create ephemeral session for this request
-            session_service = InMemorySessionService()
-            await session_service.create_session(
-                app_name="app", user_id="invoke_user", session_id="invoke_session"
-            )
-            runner = Runner(
-                agent=root_agent, app_name="app", session_service=session_service
-            )
-
-            # Run agent and collect final response
-            final_response = ""
-            async for event in runner.run_async(
-                user_id="invoke_user",
-                session_id="invoke_session",
-                new_message=genai_types.Content(
-                    role="user",
-                    parts=[genai_types.Part.from_text(text=prompt)]
-                ),
-            ):
-                if event.is_final_response() and event.content and event.content.parts:
-                    final_response = event.content.parts[0].text
-            return final_response
-        except Exception as e:
-            return json.dumps({"error": str(e)})
+Add an `/invoke` endpoint to `{{cookiecutter.agent_directory}}/fast_api_app.py` for batch/event processing. The endpoint auto-detects the input format (BigQuery Remote Function, Pub/Sub, Eventarc, or direct HTTP).
 
+**Core pattern:** Create a `run_agent` helper using `Runner` + `InMemorySessionService` for stateless processing, with a semaphore for concurrency control. Then route by request shape:
 
+```python
 @app.post("/invoke")
 async def invoke(request: Dict[str, Any]):
-    """
-    Universal endpoint that auto-detects input format and routes accordingly.
-
-    Supported formats:
-    - BigQuery Remote Function: {"calls": [[row1], [row2], ...]}
-    - Pub/Sub Push: {"message": {"data": "base64...", "attributes": {...}}}
-    - Eventarc: {"data": {...}, "type": "google.cloud.storage.object.v1.finalized"}
-    - Direct HTTP: {"input": "your prompt here"}
-    """
-
-    # === BigQuery Remote Function ===
-    # Format: {"calls": [[col1, col2], [col1, col2], ...]}
-    if "calls" in request:
-        async def process_row(row_data: List[Any]) -> str:
-            prompt = f"Analyze: {row_data}"
-            return await run_agent(prompt)
-
-        results = await asyncio.gather(
-            *[process_row(row) for row in request["calls"]]
-        )
-        return BQResponse(replies=results)
-
-    # === Pub/Sub Push Subscription ===
-    # Format: {"message": {"data": "base64...", "attributes": {...}}, "subscription": "..."}
-    if "message" in request:
-        message = request["message"]
-        # Decode base64 data
-        data_b64 = message.get("data", "")
-        try:
-            data = base64.b64decode(data_b64).decode("utf-8")
-            payload = json.loads(data)
-        except Exception:
-            payload = data_b64  # Use raw if not JSON
-
-        attributes = message.get("attributes", {})
-        prompt = f"Process event: {payload}\nAttributes: {attributes}"
-
-        result = await run_agent(prompt)
-
-        # Pub/Sub expects 2xx response to acknowledge
-        return {"status": "success", "result": result}
-
-    # === Eventarc (Cloud Events) ===
-    # Format: {"data": {...}, "type": "google.cloud.storage.object.v1.finalized", ...}
-    if "type" in request and request.get("type", "").startswith("google.cloud."):
-        event_type = request["type"]
-        event_data = request.get("data", {})
-
-        # Example: Cloud Storage event
-        if "storage" in event_type:
-            bucket = event_data.get("bucket", "unknown")
-            name = event_data.get("name", "unknown")
-            prompt = f"Process file event: gs://{bucket}/{name}\nEvent type: {event_type}"
-        else:
-            prompt = f"Process GCP event: {event_type}\nData: {event_data}"
-
-        result = await run_agent(prompt)
-        return {"status": "success", "result": result}
-
-    # === Direct HTTP / Webhook ===
-    # Format: {"input": "your prompt"} or {"prompt": "your prompt"}
-    if "input" in request or "prompt" in request:
-        prompt = request.get("input") or request.get("prompt")
-        result = await run_agent(prompt)
-        return {"status": "success", "result": result}
-
-    # Unknown format
-    return {"status": "error", "message": "Unknown request format", "received_keys": list(request.keys())}
-```
-
-### Local Testing (Before Deployment)
-
-**IMPORTANT:** Always test the `/invoke` endpoint locally before deploying. Unlike interactive chatbots, batch/event processing is harder to debug in production.
-
-```bash
-# Start local backend (default port 8000)
-make local-backend
-
-# Or specify a custom port (useful for parallel development)
-make local-backend PORT=8081
+    if "calls" in request:  # BigQuery: {"calls": [[row1], [row2]]}
+        results = await asyncio.gather(*[run_agent(f"Analyze: {row}") for row in request["calls"]])
+        return {"replies": results}
+    if "message" in request:  # Pub/Sub: {"message": {"data": "base64..."}}
+        payload = base64.b64decode(request["message"]["data"]).decode()
+        return {"status": "success", "result": await run_agent(payload)}
+    if "type" in request:  # Eventarc: {"type": "google.cloud...", "data": {...}}
+        return {"status": "success", "result": await run_agent(str(request["data"]))}
+    if "input" in request:  # Direct HTTP: {"input": "prompt"}
+        return {"status": "success", "result": await run_agent(request["input"])}
 ```
 
-**Test BigQuery batch format:**
+**Test locally** with `make local-backend`, then curl each format:
 ```bash
-curl -X POST http://localhost:8000/invoke \
-  -H "Content-Type: application/json" \
+# BigQuery
+curl -X POST http://localhost:8000/invoke -H "Content-Type: application/json" \
   -d '{"calls": [["test input 1"], ["test input 2"]]}'
+# Direct
+curl -X POST http://localhost:8000/invoke -H "Content-Type: application/json" \
+  -d '{"input": "your prompt here"}'
 ```
 
-**Test Pub/Sub format (with base64 encoding):**
-```bash
-DATA=$(echo -n '{"key": "value"}' | base64)
-curl -X POST http://localhost:8000/invoke \
-  -H "Content-Type: application/json" \
-  -d "{\"message\": {\"data\": \"$DATA\"}}"
-```
-
-**Test Eventarc format:**
-```bash
-curl -X POST http://localhost:8000/invoke \
-  -H "Content-Type: application/json" \
-  -d '{
-    "type": "google.cloud.storage.object.v1.finalized",
-    "data": {"bucket": "my-bucket", "name": "file.pdf"}
-  }'
-```
-
-**What to verify:**
-- Correct format detection (check which branch handles your request)
-- Expected response format (`{"replies": [...]}` for BQ, `{"status": "success"}` for events)
-- Tool calls in logs (for side-effect mode)
-- Error handling for malformed inputs
-
-### Integration Examples
-
-**BigQuery Remote Function:**
-```sql
--- Create connection (one-time setup)
-CREATE EXTERNAL CONNECTION `project.region.bq_connection`
-OPTIONS (cloud_resource_id="//cloudresourcemanager.googleapis.com/projects/PROJECT_ID");
-
--- Create remote function
-CREATE FUNCTION dataset.analyze_customer(data STRING)
-RETURNS STRING
-REMOTE WITH CONNECTION `project.region.bq_connection`
-OPTIONS (endpoint = 'https://{{cookiecutter.project_name}}.run.app/invoke');
-
--- Process millions of rows
-SELECT customer_id, dataset.analyze_customer(customer_data) AS analysis
-FROM customers;
-```
-
-**Pub/Sub Push Subscription:**
+**Connect to GCP services:**
 ```bash
-# Create push subscription pointing to /invoke
-gcloud pubsub subscriptions create my-subscription \
-  --topic=my-topic \
+# Pub/Sub push subscription
+gcloud pubsub subscriptions create my-sub --topic=my-topic \
   --push-endpoint=https://{{cookiecutter.project_name}}.run.app/invoke
-```
-
-**Eventarc Trigger:**
-```bash
-# Trigger on Cloud Storage events
-gcloud eventarc triggers create storage-trigger \
+# Eventarc trigger
+gcloud eventarc triggers create my-trigger \
   --destination-run-service={{cookiecutter.project_name}} \
   --destination-run-path=/invoke \
-  --event-filters="type=google.cloud.storage.object.v1.finalized" \
-  --event-filters="bucket=my-bucket"
+  --event-filters="type=google.cloud.storage.object.v1.finalized"
 ```
 
-### Production Considerations
-
-**Rate Limiting & Retry:**
-- Use semaphores to limit concurrent Gemini calls (avoid 429 errors)
-- Implement exponential backoff for transient failures
-- For BigQuery: Raise `TransientError` on 429s to trigger automatic retries
-
-**Error Handling:**
-- Return per-row errors as JSON objects, don't fail entire batch
-- Log errors with trace IDs for debugging
-- Monitor error rates via Cloud Logging/Monitoring
-
-**Cost Control:**
-- Set Cloud Run `--max-instances` to cap concurrent executions
-- Monitor Gemini API usage and set budget alerts
-- Test with small batches before running on production data
-
-### Reference Implementation
-
-See complete production example with chunking, error handling, and monitoring:
-https://github.com/richardhe-fundamenta/practical-gcp-examples/blob/main/bq-remote-function-agent/customer-advisor/app/fast_api_app.py
-
-**Key patterns from reference:**
-- Async processing with semaphore throttling (`MAX_CONCURRENT_ROWS = 10`)
-- Chunk batching for memory efficiency (`CHUNK_SIZE = 10`)
-- Transient vs permanent error classification
-
-- Structured output extraction from agent responses
+**Production tips:** Use semaphores to limit concurrent Gemini calls (avoid 429s), set Cloud Run `--max-instances`, and return per-row errors instead of failing entire batches. See [reference implementation](https://github.com/richardhe-fundamenta/practical-gcp-examples/blob/main/bq-remote-function-agent/customer-advisor/app/fast_api_app.py) for production patterns.
 
 ---
 
````
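For context when reading the condensed hunk: the `run_agent` helper that the new **Core pattern** paragraph refers to is the one shown in the removed lines. Stitched together from the hunks above, the resulting `/invoke` section of `fast_api_app.py` would read roughly as follows. This is a sketch, not the rendered template verbatim: `my_agent` stands in for `{{cookiecutter.agent_directory}}`, and in the real template `app` already exists in the file.

```python
# Sketch: the condensed /invoke pattern, stitched together from the hunks above.
import asyncio
import base64
import json
from typing import Any, Dict

from fastapi import FastAPI

app = FastAPI()  # in the template, fast_api_app.py already defines this

# Concurrency control: cap concurrent Gemini calls to avoid 429s
MAX_CONCURRENT = 10
semaphore = asyncio.Semaphore(MAX_CONCURRENT)


async def run_agent(prompt: str) -> str:
    """Stateless batch invocation: fresh in-memory session, no history."""
    async with semaphore:
        try:
            from google.adk.runners import Runner
            from google.adk.sessions import InMemorySessionService
            from google.genai import types as genai_types

            from my_agent.agent import root_agent  # {{cookiecutter.agent_directory}}.agent

            session_service = InMemorySessionService()
            await session_service.create_session(
                app_name="app", user_id="invoke_user", session_id="invoke_session"
            )
            runner = Runner(agent=root_agent, app_name="app", session_service=session_service)

            final_response = ""
            async for event in runner.run_async(
                user_id="invoke_user",
                session_id="invoke_session",
                new_message=genai_types.Content(
                    role="user", parts=[genai_types.Part.from_text(text=prompt)]
                ),
            ):
                if event.is_final_response() and event.content and event.content.parts:
                    final_response = event.content.parts[0].text
            return final_response
        except Exception as e:
            # Per-row error as JSON instead of failing the whole batch
            return json.dumps({"error": str(e)})


@app.post("/invoke")
async def invoke(request: Dict[str, Any]):
    if "calls" in request:  # BigQuery: {"calls": [[row1], [row2]]}
        results = await asyncio.gather(
            *[run_agent(f"Analyze: {row}") for row in request["calls"]]
        )
        return {"replies": results}
    if "message" in request:  # Pub/Sub: {"message": {"data": "base64..."}}
        payload = base64.b64decode(request["message"]["data"]).decode()
        return {"status": "success", "result": await run_agent(payload)}
    if "type" in request:  # Eventarc: {"type": "google.cloud...", "data": {...}}
        return {"status": "success", "result": await run_agent(str(request["data"]))}
    if "input" in request:  # Direct HTTP: {"input": "prompt"}
        return {"status": "success", "result": await run_agent(request["input"])}
    # Kept from the pre-condensed version: surface unknown payload shapes
    return {"status": "error", "message": "Unknown request format"}
```

The trailing unknown-format branch comes from the removed version; the condensed docs drop it, but it keeps misrouted payloads from failing silently.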
agent_starter_pack/base_templates/python/Makefile

Lines changed: 2 additions & 3 deletions
````diff
@@ -256,6 +256,7 @@ build-inspector-if-needed:
 	cd tools/a2a-inspector/frontend && npm run build; \
 	fi
 {%- endif %}
+{%- if cookiecutter.deployment_target != 'none' %}
 
 # ==============================================================================
 # Backend Deployment Targets
@@ -296,13 +297,11 @@ deploy:
 		--entrypoint-object=agent_engine \
 		--requirements-file={{cookiecutter.agent_directory}}/app_utils/.requirements.txt \
 		$(if $(AGENT_IDENTITY),--agent-identity)
-{%- elif cookiecutter.deployment_target == 'none' %}
-	@echo "No deployment target configured."
-	@echo "Run 'uvx agent-starter-pack enhance' to add a deployment target."
 {%- endif %}
 
 # Alias for 'make deploy' for backward compatibility
 backend: deploy
+{%- endif %}
 
 {%- if cookiecutter.cicd_runner != 'skip' %}
````
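To exercise the condensed `/invoke` patterns end to end after `make local-backend`, here is a small smoke-test script equivalent to the curl commands in the GEMINI.md diff. The Pub/Sub and Eventarc payloads are reproduced from the removed test examples; `httpx` is an arbitrary client choice, not something the template mandates.

```python
# Local smoke test for the /invoke endpoint -- mirrors the curl examples in
# the GEMINI.md diff above. Assumes `make local-backend` is serving on :8000.
import base64
import json

import httpx

URL = "http://localhost:8000/invoke"

payloads = {
    # BigQuery Remote Function batch
    "bigquery": {"calls": [["test input 1"], ["test input 2"]]},
    # Pub/Sub push (data is base64-encoded, as in the removed test example)
    "pubsub": {"message": {"data": base64.b64encode(b'{"key": "value"}').decode()}},
    # Eventarc Cloud Storage event (from the removed test example)
    "eventarc": {
        "type": "google.cloud.storage.object.v1.finalized",
        "data": {"bucket": "my-bucket", "name": "file.pdf"},
    },
    # Direct HTTP
    "direct": {"input": "your prompt here"},
}

for name, body in payloads.items():
    resp = httpx.post(URL, json=body, timeout=120)
    # Verify each payload hits the intended branch and returns the expected shape
    print(name, resp.status_code, json.dumps(resp.json())[:200])
```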