Commit 4fd3cfd

Merge branch 'feature/custom-extraction-prompt' into 'develop'

feat: Implement custom prompt generator Lambda

See merge request genaiic-reusable-assets/engagement-artifacts/genaiic-idp-accelerator!255
2 parents bb3229f + 0561f3d commit 4fd3cfd

16 files changed: +1800 -141 lines changed

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -5,10 +5,21 @@ SPDX-License-Identifier: MIT-0

 ## [Unreleased]

+### Added
+
+
+
 ## [0.3.12]

 ### Added

+- **Custom Prompt Generator Lambda Support for Patterns 2 & 3**
+  - Added `custom_prompt_lambda_arn` configuration field to enable injection of custom business logic into extraction processing
+  - **Key Features**: Lambda interface with all template placeholders (DOCUMENT_TEXT, DOCUMENT_CLASS, ATTRIBUTE_NAMES_AND_DESCRIPTIONS, DOCUMENT_IMAGE), URI-based image handling for JSON serialization, comprehensive error handling with fail-fast behavior, scoped IAM permissions requiring GENAIIDP-* function naming
+  - **Use Cases**: Document type-specific processing rules, integration with external systems for customer configurations, conditional processing based on document content, regulatory compliance and industry-specific requirements
+  - **Demo Resources**: Interactive notebook demonstration (`step3_extraction_with_custom_lambda.ipynb`), SAM deployment template for demo Lambda function, comprehensive documentation and examples in `notebooks/examples/demo-lambda/`
+  - **Benefits**: Custom business logic without core code changes, backward compatible (existing deployments unchanged), robust JSON serialization handling all object types, complete observability with detailed logging
+
 - **Refactored Document Classification Service for Enhanced Boundary Detection**
   - Consolidated `multimodalPageLevelClassification` and the experimental `multimodalPageBoundaryClassification` (from v0.3.11) into a single enhanced `multimodalPageLevelClassification` method
   - Implemented BIO-like sequence segmentation with document boundary indicators: "start" (new document) and "continue" (same document)

README.md

Lines changed: 1 addition & 0 deletions
@@ -33,6 +33,7 @@ White-glove customization, deployment, and integration support for production us
 - **Modular, pluggable patterns**: Pre-built processing patterns using state-of-the-art models and AWS services
 - **Advanced Classification**: Support for page-level and holistic document packet classification
 - **Few Shot Example Support**: Improve accuracy through example-based prompting
+- **Custom Business Logic Integration**: Inject custom prompt generation logic via Lambda functions for specialized document processing
 - **High Throughput Processing**: Handles large volumes of documents through intelligent queuing
 - **Built-in Resilience**: Comprehensive error handling, retries, and throttling management
 - **Cost Optimization**: Pay-per-use pricing model with built-in controls

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.12-wip
+0.3.12-wip2

docs/extraction.md

Lines changed: 206 additions & 0 deletions
@@ -226,6 +226,212 @@ extraction:
4. **Handle OCR Limitations**: Use images to fill gaps where OCR may miss visual-only content
5. **Consider Document Types**: Different document types benefit from different image placement strategies

## Custom Prompt Generator Lambda Functions

The extraction service supports custom Lambda functions for advanced prompt generation, allowing you to inject custom business logic into the extraction process while leveraging the existing IDP infrastructure.

### Overview

Custom prompt generator Lambda functions enable:

- **Document type-specific processing** with specialized extraction logic
- **Integration with external systems** for dynamic configuration retrieval
- **Conditional processing** based on document content analysis
- **Regulatory compliance** with industry-specific prompt requirements
- **Multi-tenant customization** for different customer requirements

### Configuration

Add the `custom_prompt_lambda_arn` field to your extraction configuration:

```yaml
extraction:
  model: us.amazon.nova-pro-v1:0
  temperature: 0.0
  system_prompt: "Your default system prompt..."
  task_prompt: "Your default task prompt..."
  # Custom Lambda function for prompt generation
  custom_prompt_lambda_arn: "arn:aws:lambda:us-east-1:123456789012:function:GENAIIDP-my-extractor"
```

**Lambda Function Requirements:**
- Function name must start with `GENAIIDP-` (required for IAM permissions)
- Must return valid JSON with `system_prompt` and `task_prompt_content` fields
- Available in Patterns 2 and 3 only
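
The naming requirement can be checked before an ARN is placed in the configuration. The following is a minimal sketch, assuming the usual Lambda function ARN layout (`arn:<partition>:lambda:<region>:<account>:function:<name>[:<qualifier>]`). The helper name `has_required_prefix` is hypothetical and not part of the accelerator.

```python
REQUIRED_PREFIX = "GENAIIDP-"


def has_required_prefix(lambda_arn: str) -> bool:
    """Return True if the function name inside a Lambda ARN starts with GENAIIDP-.

    Hypothetical pre-deployment check; the accelerator itself enforces the
    naming rule through IAM permissions, not through this function.
    """
    parts = lambda_arn.split(":")
    # Expected fields: arn, partition, lambda, region, account, function, name[, qualifier]
    if len(parts) < 7 or parts[0] != "arn" or parts[2] != "lambda" or parts[5] != "function":
        return False
    return parts[6].startswith(REQUIRED_PREFIX)


print(has_required_prefix("arn:aws:lambda:us-east-1:123456789012:function:GENAIIDP-my-extractor"))
print(has_required_prefix("arn:aws:lambda:us-east-1:123456789012:function:my-extractor"))
```

The first call accepts the ARN from the configuration example, while the second is rejected because the function name lacks the prefix.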

### Lambda Interface

Your Lambda function receives a comprehensive payload with all context needed for prompt generation:

**Input Payload:**
```json
{
  "config": {
    "extraction": {...},
    "classes": [...],
    "assessment": {...}
  },
  "prompt_placeholders": {
    "DOCUMENT_TEXT": "Full OCR extracted text from all pages",
    "DOCUMENT_CLASS": "Invoice",
    "ATTRIBUTE_NAMES_AND_DESCRIPTIONS": "Invoice Number\t[Unique identifier]...",
    "DOCUMENT_IMAGE": ["s3://bucket/document/pages/1/image.jpg", "s3://bucket/document/pages/2/image.jpg"]
  },
  "default_task_prompt_content": [
    {"text": "Resolved default task prompt with placeholders replaced"},
    {"image_uri": "<image_placeholder>"},
    {"cachePoint": true}
  ],
  "serialized_document": {
    "id": "document-123",
    "input_bucket": "my-bucket",
    "input_key": "documents/invoice.pdf",
    "pages": {...},
    "sections": [...],
    "status": "EXTRACTING"
  }
}
```

**Required Output:**
```json
{
  "system_prompt": "Your custom system prompt based on document analysis",
  "task_prompt_content": [
    {"text": "Your custom task prompt with business logic applied"},
    {"image_uri": "<preserved_placeholder>"},
    {"cachePoint": true}
  ]
}
```
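
The simplest function that satisfies this contract returns the defaults it was given. The sketch below makes two assumptions: that the configured system prompt is reachable at `config.extraction.system_prompt` in the payload (suggested by the configuration shown earlier, but not confirmed for every deployment), and that returning `default_task_prompt_content` unchanged is acceptable to the caller. The helper name `use_default_prompts` is illustrative.

```python
def use_default_prompts(event):
    """Return the prompts supplied by the extraction service, unchanged.

    Assumption: the system prompt lives at config.extraction.system_prompt.
    """
    extraction_config = event.get("config", {}).get("extraction", {})
    return {
        "system_prompt": extraction_config.get("system_prompt", ""),
        # Passing the list through untouched keeps image placeholders and
        # cachePoint entries exactly where the default prompt put them.
        "task_prompt_content": event.get("default_task_prompt_content", []),
    }


def lambda_handler(event, context):
    # Pass-through handler: a starting point before adding business logic.
    return use_default_prompts(event)
```

A pass-through handler like this is a convenient baseline: once it works end to end, custom logic can be added one branch at a time.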

### Implementation Examples

**Document Type Detection:**
```python
# generate_banking_prompts, generate_invoice_prompts, and use_default_prompts
# are placeholders for functions you implement; each must return a dict with
# 'system_prompt' and 'task_prompt_content'.
def lambda_handler(event, context):
    placeholders = event.get('prompt_placeholders', {})
    document_class = placeholders.get('DOCUMENT_CLASS', '')

    # Route on the classified document type (case-insensitive)
    if 'bank statement' in document_class.lower():
        return generate_banking_prompts(event)
    elif 'invoice' in document_class.lower():
        return generate_invoice_prompts(event)
    else:
        return use_default_prompts(event)
```
323+
324+
**Content-Based Analysis:**
325+
```python
326+
def lambda_handler(event, context):
327+
placeholders = event.get('prompt_placeholders', {})
328+
document_text = placeholders.get('DOCUMENT_TEXT', '')
329+
image_uris = placeholders.get('DOCUMENT_IMAGE', [])
330+
331+
# Multi-page processing logic
332+
if len(image_uris) > 3:
333+
return generate_multi_page_prompts(event)
334+
335+
# International document detection
336+
if any(term in document_text.lower() for term in ['vat', 'gst', 'euro']):
337+
return generate_international_prompts(event)
338+
339+
return use_standard_prompts(event)
340+
```

**External System Integration:**
```python
import boto3

def lambda_handler(event, context):
    document = event.get('serialized_document', {})
    customer_id = document.get('customer_id')  # Custom field

    # Without a customer ID there is nothing to look up, and get_item would
    # reject an empty key, so fall back to the standard prompts.
    if not customer_id:
        return use_standard_prompts(event)

    # Retrieve customer-specific rules
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('customer-extraction-rules')

    customer_rules = table.get_item(Key={'customer_id': customer_id}).get('Item', {})

    # Apply customer-specific customization
    if customer_rules.get('enhanced_validation'):
        return generate_enhanced_validation_prompts(event)

    return use_standard_prompts(event)
```

### Error Handling

The system implements **fail-fast error handling** for custom Lambda functions:

- **Lambda invocation failures** cause extraction to fail with detailed error messages
- **Invalid response format** results in extraction failure with validation errors
- **Function errors** propagate with Lambda error details
- **Timeout scenarios** fail with timeout information

**Example Error Messages:**
```
Failed to invoke custom prompt Lambda arn:aws:lambda:...: Connection timeout
Custom prompt Lambda failed: KeyError: 'system_prompt' not found in response
Custom prompt Lambda returned invalid response format: expected dict, got str
```
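
Because an invalid response fails the whole extraction, it can help to check a handler's output locally before deploying it. The validator below is an illustrative sketch only: `validate_prompt_response` is a hypothetical helper, not the accelerator's own validation code, and its error text is not what the service emits. It checks just the two required fields named in the output contract.

```python
def validate_prompt_response(response):
    """Raise ValueError if a handler's return value breaks the output contract."""
    if not isinstance(response, dict):
        raise ValueError(f"expected dict, got {type(response).__name__}")

    required = ("system_prompt", "task_prompt_content")
    missing = [field for field in required if field not in response]
    if missing:
        raise ValueError(f"missing required field(s): {', '.join(missing)}")

    if not isinstance(response["system_prompt"], str):
        raise ValueError("system_prompt must be a string")
    if not isinstance(response["task_prompt_content"], list):
        raise ValueError("task_prompt_content must be a list")
    return response


# Example: a well-formed response passes through unchanged.
ok = validate_prompt_response(
    {"system_prompt": "You extract invoice fields.", "task_prompt_content": [{"text": "Extract..."}]}
)
```

Running a check like this in a unit test catches contract violations earlier than a failed extraction would.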

### Performance Considerations

- **Lambda Overhead**: Adds latency from Lambda cold starts and execution time
- **JSON Serialization**: Optimized with URI-based image handling to minimize payload size
- **Efficient Interface**: Avoids sending large image bytes by passing S3 URIs instead
- **Monitoring**: Comprehensive logging for performance analysis and troubleshooting
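
Because `DOCUMENT_IMAGE` carries S3 URIs rather than image bytes, a Lambda that needs to inspect an image has to resolve the URI itself. The following is a small sketch of splitting a URI into bucket and key. `split_s3_uri` is a hypothetical helper. Downloading the object afterwards would also require the Lambda's execution role to have read access to that bucket, which is an assumption and not something the source states.

```python
def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key/with/slashes' into (bucket, key)."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything up to the first slash is the bucket; the rest is the key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI must contain a bucket and a key: {uri!r}")
    return bucket, key


bucket, key = split_s3_uri("s3://bucket/document/pages/1/image.jpg")
print(bucket, key)
```

Plain string splitting is used instead of a URL parser so that keys containing characters such as `?` or `#` are kept intact.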

### Deployment and Testing

**1. Demo Lambda Function:**
Deploy the provided demo Lambda for testing:
```bash
cd notebooks/examples/demo-lambda
sam deploy --guided
```

**2. Interactive Testing:**
Use the demo notebook for hands-on experimentation:
```bash
jupyter notebook notebooks/examples/step3_extraction_with_custom_lambda.ipynb
```

**3. Production Deployment:**
Create your production Lambda with business-specific logic and deploy with appropriate IAM permissions.

### Use Cases

**Financial Services:**
- Regulatory compliance prompts for different financial products
- Multi-currency transaction handling with exchange rate awareness
- Customer-specific formatting for different banking institutions

**Healthcare:**
- HIPAA compliance with privacy-focused prompts
- Medical terminology enhancement for clinical documents
- Provider-specific templates for different healthcare systems

**Legal:**
- Jurisdiction-specific legal language processing
- Contract type specialization (NDAs, service agreements, etc.)
- Compliance requirements for regulatory documents

**Insurance:**
- Policy type customization for different insurance products
- Claims processing with adjuster-specific requirements
- Risk assessment integration with underwriting systems

### Security and Compliance

- **Scoped IAM Permissions**: Only Lambda functions with `GENAIIDP-*` naming can be invoked
- **Audit Trail**: All Lambda invocations are logged for security monitoring
- **Input Validation**: Lambda response structure is validated before use
- **Fail-Safe Operation**: Lambda failures cause extraction to fail rather than continue with potentially incorrect prompts

For complete examples and deployment instructions, see `notebooks/examples/demo-lambda/README.md`.

## Using CachePoint for Extraction

CachePoint is a feature of select Bedrock models that caches partial computations to improve performance and reduce costs. When used with extraction, it provides:
