Commit 9fc52de

Author: Vikram Shenoy

2.0 Lite Support, Inference Adapter extensibility, SM Inference Adapter support

1 parent bcd64fc · commit 9fc52de

File tree

30 files changed (+4817, -1140 lines)


.gitignore

Lines changed: 4 additions & 1 deletion

```diff
@@ -34,4 +34,7 @@ ENV/
 *.swo

 # Misc
-.DS_Store
+.DS_Store
+
+# Design documentation (not for commit)
+design_docs/
```

README.md

Lines changed: 268 additions & 11 deletions
@@ -1,11 +1,13 @@
# Nova Prompt Optimizer

A Python SDK for optimizing prompts for Amazon Nova and other models deployed on AWS.

## 📚 Table of contents
* [Installation](#installation)
* [Pre-Requisites](#pre-requisites)
* [What's New](#-whats-new)
* [Quick Start: Facility Support Analyzer Dataset](#-quick-start)
* [Quick Start: SageMaker Endpoints](#-quick-start-sagemaker-endpoints)
* [Core Concepts](#core-concepts)
* [Input Adapters](#input-adapters)
* [1. Prompt Adapter](#1-prompt-adapter)
@@ -16,6 +18,9 @@ A Python SDK for optimizing prompts for Nova.
* [Optimizers](#optimizers)
* [NovaPromptOptimizer](#novapromptoptimizer)
* [Evaluator](#evaluator)
* [Advanced Features](#advanced-features)
  * [Separate Inference Adapters](#separate-inference-adapters)
  * [SageMaker Endpoint Support](#sagemaker-endpoint-support)
* [Optimization Recommendations](#optimization-recommendations)
* [Preview Status](#-preview-status)
* [Interaction with AWS Bedrock](#interaction-with-aws-bedrock)
@@ -50,13 +55,133 @@ export AWS_SECRET_ACCESS_KEY="..."
5. Wait for approval (instant in most cases)


## 🎉 What's New

### SageMaker Endpoint Support
Optimize prompts for models deployed on Amazon SageMaker! The optimizer now supports:
- **SageMaker endpoints** as task models for optimization
- **Automatic Bedrock integration** for meta-prompting with Nova 2.0 Lite
- **OpenAI-compatible** payload format for SageMaker endpoints

### Separate Inference Adapters
Use different models for different optimization phases:
- **Meta-prompting**: Automatically uses Bedrock with Nova 2.0 Lite (or specify your own)
- **Task optimization**: Use any supported backend (Bedrock, SageMaker, etc.)

### Example: Optimize for SageMaker
```python
from amzn_nova_prompt_optimizer.core.inference import SageMakerInferenceAdapter
from amzn_nova_prompt_optimizer.core.optimizers import NovaPromptOptimizer

# Your SageMaker endpoint
sagemaker_adapter = SageMakerInferenceAdapter(
    endpoint_name="my-model-endpoint",
    region_name="us-west-2"
)

# The optimizer automatically uses Bedrock Nova 2.0 Lite for meta-prompting
optimizer = NovaPromptOptimizer(
    prompt_adapter=prompt_adapter,
    inference_adapter=sagemaker_adapter,  # Your SageMaker endpoint
    dataset_adapter=dataset_adapter,
    metric_adapter=metric_adapter
)

optimized_prompt = optimizer.optimize(mode="lite")
```

See the [Quick Start for SageMaker](#-quick-start-sagemaker-endpoints) section below for a complete example.
## 🏁 Quick Start
### Facility Support Analyzer Dataset
The Facility Support Analyzer dataset consists of emails to be classified by category, urgency, and sentiment.

Please see the [samples](samples/facility-support-analyzer/) folder for example notebooks covering the scenario where [only a user prompt template is optimized](samples/facility-support-analyzer/user_prompt_only) and the scenario where [a user and system prompt are optimized together](samples/facility-support-analyzer/system_and_user_prompt).

## 🚀 Quick Start: SageMaker Endpoints

Optimize prompts for models deployed on Amazon SageMaker in just a few steps:

### 1. Install and Configure
```bash
pip install nova-prompt-optimizer
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
```

### 2. Prepare Your Dataset
```jsonl
{"input": "What is machine learning?", "answer": "Machine learning is..."}
{"input": "Explain neural networks", "answer": "Neural networks are..."}
```
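Before handing a file like this to the dataset adapter, it can be worth sanity-checking that every row parses and carries the expected fields. A minimal sketch (the file name and the `validate_jsonl` helper are illustrative, not part of the SDK):

```python
import json

# The two example rows from above.
ROWS = [
    {"input": "What is machine learning?", "answer": "Machine learning is..."},
    {"input": "Explain neural networks", "answer": "Neural networks are..."},
]

# Write them out as JSONL: exactly one JSON object per line.
with open("sample_dataset.jsonl", "w") as f:
    for row in ROWS:
        f.write(json.dumps(row) + "\n")

def validate_jsonl(path, required_keys=("input", "answer")):
    """Parse every line and check the expected fields are present."""
    rows = []
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            row = json.loads(line)  # raises on malformed JSON
            missing = set(required_keys) - row.keys()
            if missing:
                raise ValueError(f"line {line_no} is missing keys: {missing}")
            rows.append(row)
    return rows

print(len(validate_jsonl("sample_dataset.jsonl")), "valid rows")  # → 2 valid rows
```

Catching a malformed or mislabeled row here is much cheaper than discovering it mid-optimization.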

### 3. Run Optimization
```python
from amzn_nova_prompt_optimizer.core.inference import SageMakerInferenceAdapter
from amzn_nova_prompt_optimizer.core.input_adapters.dataset_adapter import JSONDatasetAdapter
from amzn_nova_prompt_optimizer.core.input_adapters.prompt_adapter import TextPromptAdapter
from amzn_nova_prompt_optimizer.core.input_adapters.metric_adapter import MetricAdapter
from amzn_nova_prompt_optimizer.core.optimizers import NovaPromptOptimizer

# Define your metric
class ExactMatchMetric(MetricAdapter):
    def apply(self, prediction: str, ground_truth: str) -> float:
        return 1.0 if prediction.strip() == ground_truth.strip() else 0.0

    def batch_apply(self, predictions: list, ground_truths: list) -> float:
        matches = sum(self.apply(p, g) for p, g in zip(predictions, ground_truths))
        return matches / len(predictions) if predictions else 0.0

# Set up the SageMaker adapter for your endpoint
sagemaker_adapter = SageMakerInferenceAdapter(
    endpoint_name="YOUR-ENDPOINT-NAME",
    region_name="us-west-2"
)

# Load the dataset
dataset_adapter = JSONDatasetAdapter(
    input_columns={"input"},
    output_columns={"answer"}
)
dataset_adapter.adapt("your_dataset.jsonl")
train_set, test_set = dataset_adapter.split(0.5)

# Set up the initial prompt
prompt_adapter = TextPromptAdapter()
prompt_adapter.set_system_prompt(content="You are a helpful assistant.")
prompt_adapter.set_user_prompt(
    content="Answer this question: {{input}}",
    variables={"input"}
)
prompt_adapter.adapt()

# Set up the metric
metric_adapter = ExactMatchMetric()

# Create the optimizer (automatically uses Bedrock Nova 2.0 Lite for meta-prompting)
optimizer = NovaPromptOptimizer(
    prompt_adapter=prompt_adapter,
    inference_adapter=sagemaker_adapter,  # Your SageMaker endpoint
    dataset_adapter=dataset_adapter,
    metric_adapter=metric_adapter
)

# Run optimization
optimized_prompt = optimizer.optimize(mode="lite")

# Save results
optimized_prompt.save("optimized_prompts/")
```

**What happens during optimization:**
1. **Meta-Prompting Phase**: Uses Bedrock with Nova 2.0 Lite to generate an improved prompt structure (~30 seconds)
2. **Task Optimization Phase**: Tests multiple prompt variations on your SageMaker endpoint (~5-15 minutes)

For more details, see the [SageMaker Quick Start Guide](docs/QUICK_START_SAGEMAKER.md).

## Core Concepts

### Input Adapters
@@ -88,32 +213,57 @@ prompt_adapter.adapt()
Learn More about the Prompt Adapter [here](docs/PromptAdapter.md)

### 2. Inference Adapter
**Responsibility:** Ability to call an inference backend for the models, e.g. Bedrock, SageMaker, etc.

**Sample use of Bedrock Inference Adapter**

```python
from amzn_nova_prompt_optimizer.core.inference import BedrockInferenceAdapter

inference_adapter = BedrockInferenceAdapter(region_name="us-east-1")
```

**Sample use of SageMaker Inference Adapter**

```python
from amzn_nova_prompt_optimizer.core.inference import SageMakerInferenceAdapter

inference_adapter = SageMakerInferenceAdapter(
    endpoint_name="my-model-endpoint",
    region_name="us-west-2"
)
```

You can pass `rate_limit` to the InferenceAdapter constructor to cap the maximum TPS of calls and avoid throttling. It defaults to 2 if not set.

```python
from amzn_nova_prompt_optimizer.core.inference import BedrockInferenceAdapter

inference_adapter = BedrockInferenceAdapter(region_name="us-east-1", rate_limit=10)  # Max 10 TPS
```
```
108244

109-
**Supported Inference Adapters:** `BedrockInferenceAdapter`
245+
**Supported Inference Adapters:**
246+
- `BedrockInferenceAdapter` - For Amazon Bedrock models
247+
- `SageMakerInferenceAdapter` - For SageMaker endpoints (OpenAI-compatible format)
110248

111249
**Core Functions**
112250

113251
Call the model using the parameters
114252
```python
115-
# Call the model with the passed parametrs
116-
inference_output = inference_adapter.call_model(model_id: str, system_prompt: str, messages: List[Dict[str, str]], inf_config: Dict[str, Any])
253+
# Call the model with the passed parameters
254+
inference_output = inference_adapter.call_model(
255+
model_id: str,
256+
system_prompt: str,
257+
messages: List[Dict[str, str]],
258+
inf_config: Dict[str, Any]
259+
)
260+
```
261+
262+
Test endpoint connectivity (SageMaker)
263+
```python
264+
# Test if endpoint is accessible
265+
if inference_adapter.test_connection():
266+
print("✓ Endpoint is ready")
117267
```
118268

119269
The Inference Adapter accepts the `system_prompt` as a string.
@@ -256,9 +406,23 @@ nova_prompt_optimizer = NovaPromptOptimizer(prompt_adapter=prompt_adapter, infer

optimized_prompt_adapter = nova_prompt_optimizer.optimize(mode="lite")
```
NovaPromptOptimizer uses Nova 2.0 Lite for Meta Prompting and then uses MIPROv2 with 20 candidates and 30 trials; the task model depends on the mode it is set to.

**Automatic Meta-Prompting with Bedrock:**
When you don't provide a `meta_prompt_inference_adapter`, NovaPromptOptimizer automatically creates a BedrockInferenceAdapter with Nova 2.0 Lite for the meta-prompting phase. This means you can use any inference adapter (including SageMaker) for your task model, and the optimizer will handle meta-prompting with Bedrock automatically.
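In other words, the meta-prompting adapter resolution is "use what the caller passed, otherwise build the Bedrock default". A hypothetical sketch of that selection rule (stand-in strings instead of real adapters; not the SDK's actual code):

```python
def resolve_meta_adapter(meta_prompt_inference_adapter=None):
    """Fall back to a Bedrock-backed default only when no adapter is supplied."""
    if meta_prompt_inference_adapter is not None:
        return meta_prompt_inference_adapter
    # Stand-in for a BedrockInferenceAdapter configured with Nova 2.0 Lite.
    return "bedrock-nova-2.0-lite"

print(resolve_meta_adapter())                     # the automatic Bedrock default
print(resolve_meta_adapter("my-custom-adapter"))  # a caller-supplied adapter wins
```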

**Optimization Modes:**

| Mode | Meta-Prompt Model | Task Model | Use Case |
|------|-------------------|------------|----------|
| `micro` | Nova 2.0 Lite | Nova Micro | Fast, cost-effective |
| `lite` | Nova 2.0 Lite | Nova Lite | Balanced (default) |
| `pro` | Nova 2.0 Lite | Nova Pro | High quality |
| `lite-2` | Nova 2.0 Lite | Nova 2.0 Lite | Maximum quality |

You can pass `enable_json_fallback=False` to disable MIPROv2's behavior of [falling back to a JSONAdapter to parse LM model output](https://github.com/stanfordnlp/dspy/blob/main/dspy/adapters/chat_adapter.py#L44-L51). This forces MIPROv2 to use structured output (a pydantic model) to parse LM output.

**Custom Mode:**
You could also define a custom mode and pass your own parameter values to NovaPromptOptimizer

```python
@@ -314,10 +478,103 @@ evaluator.save("eval_results.jsonl")
WARNING amzn_nova_prompt_optimizer.core.inference: Warn: Prompt Variables not found in User Prompt, injecting them at the end of the prompt
```

## Advanced Features

### Separate Inference Adapters

Use different inference adapters for the meta-prompting and task optimization phases. This is particularly useful when optimizing prompts for SageMaker endpoints while using Bedrock for meta-prompting.

**Example:**
```python
from amzn_nova_prompt_optimizer.core.inference import (
    BedrockInferenceAdapter,
    SageMakerInferenceAdapter
)
from amzn_nova_prompt_optimizer.core.optimizers import NovaPromptOptimizer

# Bedrock for meta-prompting (generates optimized prompts)
meta_adapter = BedrockInferenceAdapter(region_name="us-east-1")

# SageMaker for the task model (the model being optimized)
task_adapter = SageMakerInferenceAdapter(
    endpoint_name="my-endpoint",
    region_name="us-west-2"
)

# Create the optimizer with separate adapters
optimizer = NovaPromptOptimizer(
    prompt_adapter=prompt_adapter,
    inference_adapter=task_adapter,  # For task optimization
    dataset_adapter=dataset_adapter,
    metric_adapter=metric_adapter,
    meta_prompt_inference_adapter=meta_adapter  # For meta-prompting
)

optimized_prompt = optimizer.optimize(mode="lite")
```

**Benefits:**
- Use the best model for each optimization phase
- Optimize SageMaker endpoints with Bedrock intelligence
- Cross-region support
- Independent rate limiting per adapter

For more details, see the [Separate Inference Adapters Guide](docs/SeparateInferenceAdapters.md).

### SageMaker Endpoint Support

The SDK now supports optimizing prompts for models deployed on Amazon SageMaker. SageMaker endpoints must accept an OpenAI-compatible message format:

**Required Payload Format:**
```json
{
  "messages": [
    {"role": "system", "content": "You are helpful"},
    {"role": "user", "content": "Hello"}
  ],
  "max_tokens": 1000,
  "temperature": 0.7,
  "top_p": 0.9,
  "top_k": 50
}
```

**Expected Response Format:**
```json
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      }
    }
  ]
}
```
556+
**Testing Your Endpoint:**
557+
```python
558+
from amzn_nova_prompt_optimizer.core.inference import SageMakerInferenceAdapter
559+
560+
adapter = SageMakerInferenceAdapter(
561+
endpoint_name="my-endpoint",
562+
region_name="us-west-2"
563+
)
564+
565+
# Test connectivity
566+
if adapter.test_connection():
567+
print("✓ Endpoint is ready for optimization")
568+
else:
569+
print("✗ Endpoint connection failed")
570+
```
571+
572+
For a complete guide, see the [SageMaker Quick Start](docs/QUICK_START_SAGEMAKER.md).
573+
317574
## Optimization Recommendations
1. Provide representative real-world evaluation sets and split them into training and testing sets. Ensure the dataset is balanced on the output label when splitting train and test sets.
2. For evaluation sets, the ground truth column should be as close to the inference output as possible. For example, if the inference output is {"answer": "POSITIVE"}, the ground truth should be in the same format: {"answer": "POSITIVE"}.
3. For NovaPromptOptimizer, choose the mode (mode = "lite-2" | "pro" | "lite" | "micro") based on your Nova Model of choice. By default, we use "pro".
4. The `apply` function of the evaluation metric should return a numerical value between 0 and 1 for NovaPromptOptimizer or MIPROv2.
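The label-balanced split that recommendation 1 asks for can be sketched as below (illustrative only; in practice the dataset adapter's `split` does the splitting, and `stratified_split` is a hypothetical helper):

```python
import random
from collections import defaultdict

def stratified_split(rows, label_key, train_ratio=0.5, seed=0):
    """Split rows so each output label keeps the same train/test ratio."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for label_rows in by_label.values():
        rng.shuffle(label_rows)
        cut = int(len(label_rows) * train_ratio)
        train.extend(label_rows[:cut])
        test.extend(label_rows[cut:])
    return train, test

# Toy set: 10 POSITIVE and 10 NEGATIVE rows, split 50/50 per label.
rows = [{"answer": "POSITIVE"}] * 10 + [{"answer": "NEGATIVE"}] * 10
train, test = stratified_split(rows, label_key="answer")
```

Grouping by label before cutting guarantees neither split ends up dominated by one class, which would skew the metric the optimizer steers by.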

## ⚠️ Preview Status

build.sh

Lines changed: 24 additions & 0 deletions

@@ -0,0 +1,24 @@
```bash
#!/bin/bash
set -e

echo "🧹 Cleaning previous builds..."
rm -rf dist/ build/ *.egg-info src/*.egg-info

echo "🔨 Building package..."
python3 -m build

echo ""
echo "✅ Build complete!"
echo ""
echo "📦 Generated files:"
ls -lh dist/

echo ""
echo "🚀 To install locally:"
echo "   pip install dist/nova_prompt_optimizer-*-py3-none-any.whl"
echo ""
echo "📤 To distribute:"
echo "   - Share the .whl file directly"
echo "   - Upload to internal PyPI: twine upload --repository-url <url> dist/*"
echo "   - Upload to S3: aws s3 cp dist/*.whl s3://your-bucket/packages/"
echo "   - Attach to GitHub release"
```
