python/runtime-protect/README.md
# Runtime Protect Examples

This directory contains examples and utilities for working with Galileo Runtime Protect — Galileo’s real-time guardrail system for detecting unsafe content, enforcing policies, and augmenting LLM applications with runtime safety.

The goal of this folder is to provide progressively more advanced examples, starting with isolated Protect functionality (rules, metrics, stages) and eventually moving toward a full end-to-end chatbot example that logs to Galileo, invokes Protect, and demonstrates real application integration.

## Directory Structure

[custom_llm_metric_protect_test](./custom_llm_metric_protect_test/)
A focused, self-contained example demonstrating:

- How to create a custom LLM-based metric (e.g., PII detection) using the Galileo Python SDK

- How to register that metric with your Galileo org

- How to create a Protect stage that uses the custom metric

- How to test Protect rules against sample inputs

💡 Note:
This example tests Runtime Protect behavior, but does not implement a full chatbot or log anything to Galileo Observability.
It uses OpenAI only to evaluate the custom metric.

More details are in the [README.md](./custom_llm_metric_protect_test/README.md).

.env.example
# Environment variables for quickstart-protect (example)
# Copy this file to .env and fill in real values.

# Galileo
GALILEO_API_KEY="your-galileo-api-key"
GALILEO_PROJECT="MyFirstRuntimeProtect"
GALILEO_METRIC="pii_detection"
# OpenAI
OPENAI_API_KEY="your-openai-api-key"
python/runtime-protect/custom_llm_metric_protect_test/README.md
# Custom LLM Metric Creation and Runtime Protect Test

This folder contains three scripts that demonstrate how to create, list, test, and delete a custom LLM-based PII-detection metric for Galileo Runtime Protect.

These scripts are intended to show how to register a metric and how Protect stages evaluate inputs using that metric.
They are not a full chatbot or application integration example.

Files:

- `create_pii_metric_sdk.py` — creates and registers the `pii_detection` metric via the Galileo Python SDK.

- `test_custom_pii_metric.py` — creates a Protect stage with a ruleset that uses `pii_detection` and tests several inputs.

Util Files:

- `list_metric.py` — helper that inspects the SDK client for metric-related methods and prints reminders for manual UI verification.

- `delete_metric.py` — helper to delete the custom LLM metric previously created.

## Setup

### Step 1: Copy `.env.example` to `.env` and fill in real values

```env
GALILEO_API_KEY="<your-galileo-api-key>"
OPENAI_API_KEY="<your-openai-api-key>"
GALILEO_PROJECT="MyFirstRuntimeProtect"
GALILEO_METRIC="pii_detection"
```
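
The scripts in this folder read these values with `python-dotenv` plus `os.getenv`. A minimal, stdlib-only sketch of the lookup-with-fallback pattern they rely on (the helper name `require_env` is hypothetical, not part of the SDK):

```python
import os

def require_env(name: str, default: str = "") -> str:
    """Return an environment variable's value, failing loudly if it is missing."""
    value = os.getenv(name, default)
    if not value:
        raise RuntimeError(f"{name} not set; add it to your .env file")
    return value

# Example: fall back to the default metric name used throughout these scripts
metric_name = require_env("GALILEO_METRIC", "pii_detection")
```

Failing fast like this keeps the error close to its cause, instead of surfacing later as an opaque API error.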

### Step 2: Activate your Python environment and install requirements (if not already installed)

```bash
# from repo root
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

### Step 3: Create metric (registers metric in your Galileo org)

```bash
python create_pii_metric_sdk.py
```

### Step 4: Verify metric creation by listing metrics

```bash
python util/list_metric.py
```

### Step 5: Create stage and run tests against the metric

```bash
python test_custom_pii_metric.py
```
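
Under the hood, the stage's rule fires when the boolean metric's output equals the rule's target value (`RuleOperator.eq`). In plain Python the comparison amounts to the following (an illustrative sketch of the semantics, not the SDK implementation):

```python
def rule_triggered(metric_output: str, target_value: str = "true") -> bool:
    """Sketch of RuleOperator.eq semantics: fire when the metric output equals the target."""
    return metric_output == target_value

# The custom pii_detection metric answers "true" or "false"
assert rule_triggered("true") is True    # PII detected -> override action runs
assert rule_triggered("false") is False  # clean input -> request passes through
```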

## Notes on UV-style script dependencies (for UI code metrics)

If you plan to create a code-based metric in the Galileo UI (rather than an LLM-based registered metric via the SDK), include a dependency header at the top of your metric file to allow additional packages in the sandbox. Example:

```python
# /// script
# dependencies = [
#   "requests<3",
#   "somepackage==1.2.3",
# ]
# ///
```
create_pii_metric_sdk.py
"""
This script creates a custom LLM metric using OpenAI through the SDK instead of the UI.
"""

import os
from dotenv import load_dotenv
from galileo.metrics import create_custom_llm_metric, OutputTypeEnum, StepType

# Load environment variables
load_dotenv()
GALILEO_API_KEY = os.getenv("GALILEO_API_KEY")

# Metric name can be provided via env var GALILEO_METRIC; otherwise use the default
METRIC_NAME = os.getenv("GALILEO_METRIC", "pii_detection")

if not GALILEO_API_KEY:
    print("❌ GALILEO_API_KEY not set. Please add it to .env")
    raise SystemExit(1)

print("🚀 Creating custom PII detection metric via SDK...")

try:
    metric_response = create_custom_llm_metric(
        name=METRIC_NAME,
        user_prompt="""
Analyze the following text and detect if it contains Personally Identifiable Information (PII).

PII includes:
- Email addresses (e.g., [email protected])
- Phone numbers (e.g., 555-123-4567)
- Social Security Numbers (e.g., 123-45-6789)
- Credit card numbers
- Home addresses
- Names with sensitive context

Text to analyze:
{input}

Respond with ONLY "true" if PII is detected, "false" if no PII is detected.
""",
        node_level=StepType.llm,
        cot_enabled=True,  # Enable chain-of-thought reasoning
        model_name="gpt-4.1-mini",
        num_judges=1,  # Can increase for more robust evaluation
        description="Detects Personally Identifiable Information (PII) in text using LLM evaluation",
        tags=["pii", "security", "protect"],
        output_type=OutputTypeEnum.BOOLEAN,
    )

    print("✅ Metric created successfully!")
    print(f"Metric ID: {getattr(metric_response, 'id', 'N/A')}")
    print(f"Metric Name: {METRIC_NAME}")
    # print(f"Response: {metric_response}")

except Exception as e:
    print(f"❌ Error creating metric: {e}")
    raise SystemExit(1)

print("\n📝 Next steps:")
print("1. Use this metric in your Protect rules:")
print(
    """
    from galileo_core.schemas.protect.rule import Rule, RuleOperator

    rule = Rule(
        metric=METRIC_NAME,
        operator=RuleOperator.eq,
        target_value="true",
    )
    """
)
requirements.txt
# Dependencies for quickstart-protect
# Pin versions if you need reproducibility; these are minimal names commonly used
# with the Galileo SDK. If install fails, check the package names used by your org.

galileo
galileo-core
python-dotenv
phonenumbers>=8.12.0
email-validator>=1.1.3
test_custom_pii_metric.py
"""
This script creates a stage with the metric registered by create_pii_metric_sdk.py
and tests it with various inputs to verify it works correctly.
"""

from galileo.stages import create_protect_stage, get_protect_stage
from galileo_core.schemas.protect.rule import Rule, RuleOperator
from galileo_core.schemas.protect.ruleset import Ruleset
from galileo_core.schemas.protect.stage import StageType
from galileo_core.schemas.protect.action import OverrideAction
from galileo_core.schemas.protect.payload import Payload
from galileo.protect import invoke_protect

from dotenv import load_dotenv
import os
import time

# Load environment variables
load_dotenv()

# Verify environment variables are loaded
print(f"GALILEO_PROJECT: {os.getenv('GALILEO_PROJECT')}")
print(f"GALILEO_API_KEY: {'Set' if os.getenv('GALILEO_API_KEY') else 'Not set'}")

# Create action for when PII is detected
action = OverrideAction(
    choices=[
        "This input contains sensitive information and cannot be processed.",
        "Please remove any personal information before resubmitting.",
        "We cannot process personal information. Please rephrase your request.",
    ]
)

# Create rule for PII detection; the metric name comes from env var GALILEO_METRIC
METRIC_NAME = os.getenv("GALILEO_METRIC", "pii_detection")

# Trigger when the custom metric reports PII ("true")
rule = Rule(metric=METRIC_NAME, operator=RuleOperator.eq, target_value="true")

ruleset = Ruleset(rules=[rule])

# Fall back to a default project name if the env var is unset
project_name = os.getenv("GALILEO_PROJECT") or "MyFirstRuntimeProtect"

# Create the stage with the custom PII detection metric
try:
    stage = create_protect_stage(
        name="PII_Detection_Test_Stage",
        stage_type=StageType.central,
        prioritized_rulesets=[ruleset],
        description="Test the custom PII detection metric",
    )
    print(f"✅ Stage created: {stage}")
except Exception as e:
    print(f"❌ Stage creation error: {e}")
    raise SystemExit(1)

print("✅ Stage creation completed")

# Wait a moment for the stage to be available
time.sleep(2)

# Verify by retrieving it
try:
    verified_stage = get_protect_stage(project_name=project_name, stage_name="PII_Detection_Test_Stage")

    if verified_stage:
        print(f"✅ Stage verified: {verified_stage.name}")
        print(f"   Stage ID: {verified_stage.id}")
    else:
        print("❌ Stage not found")
except Exception as e:
    print(f"⚠️ Could not verify stage: {e}")


# Create workflow with Protect
def run_workflow(user_input, description=""):
    """Invoke Protect on a single input; returns the response, or None on error."""
    payload = Payload(input=user_input, output="")
    try:
        response = invoke_protect(
            payload=payload,
            project_name=project_name,
            stage_name="PII_Detection_Test_Stage",
            prioritized_rulesets=[ruleset],
        )
        return response
    except Exception as e:
        print(f"  ❌ ERROR invoking protect: {e}")
        return None


# Test cases with and without PII
test_cases = [
    ("Hello, how can I help you today?", "Normal request - no PII"),
    ("My email is [email protected]", "Contains email - HAS PII"),
    ("Call me at 555-123-4567", "Contains phone - HAS PII"),
    ("My SSN is 123-45-6789", "Contains SSN - HAS PII"),
    ("What's the weather like?", "Question - no PII"),
    ("My name is John Doe and I live at 123 Main St", "Contains name and address - HAS PII"),
    ("Can you help me with coding?", "Generic question - no PII"),
    ("Contact: [email protected] for details", "Contains work email - HAS PII"),
    ("Thanks for your assistance!", "Positive feedback - no PII"),
    ("My credit card is 1234-5678-9012-3456", "Contains card number - HAS PII"),
]

print("\n" + "=" * 80)
print("🧪 Testing Custom PII Detection Metric")
print("=" * 80 + "\n")

for i, (test_input, description) in enumerate(test_cases, 1):
    print(f"Test {i}: {description}")
    print(f"  Input: '{test_input}'")

    response = run_workflow(test_input, description)

    if response:
        print(f"  Status: {response.status}")
        print(f"  Action: {response.action_result.get('type', 'N/A')}")

        if response.status.value == "triggered":
            print("  ✅ PII DETECTED - Rule triggered!")
            if response.action_result.get("value"):
                print(f"  Override: {response.action_result['value']}")
        elif response.status.value == "not_triggered":
            print("  ✅ No PII detected - Rule not triggered")
        elif response.status.value == "skipped":
            print("  ⏳ Status: skipped (metric evaluation may be pending)")
        else:
            print(f"  Status: {response.status.value}")

        if hasattr(response, "metric_results") and response.metric_results:
            print(f"  Metric Results: {response.metric_results}")

    print()

print("=" * 80)
print("✅ PII Detection Tests Completed!")
print("=" * 80)
delete_metric.py
import os
import sys
from galileo.metrics import delete_metric
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Get metric name from environment variable
metric_name = os.getenv("GALILEO_METRIC")
if not metric_name:
    print("❌ GALILEO_METRIC not set in .env")
    sys.exit(1)

# Delete the metric with a simple try/except and user-visible result
try:
    delete_metric(name=metric_name)
    print(f"✅ Deleted metric: {metric_name}")
except Exception as e:
    print(f"❌ Failed to delete metric '{metric_name}': {e}")