aws-solutions-library-samples
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 58 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 7 additions & 5 deletions
diff --git a/VERSION b/VERSION
Lines changed: 1 addition & 1 deletion
diff --git a/lib/idp_common_pkg/README.md b/lib/idp_common_pkg/README.md
Lines changed: 244 additions & 0 deletions
diff --git a/lib/idp_common_pkg/idp_common/__init__.py b/lib/idp_common_pkg/idp_common/__init__.py
Lines changed: 29 additions & 0 deletions
@@ -2,6 +2,7 @@
 build.toml
 model.tar.gz
 .checksum
+.checksums/
 .vscode/
 .DS_Store
 dist/
 
@@ -0,0 +1,58 @@
+# Changelog
+
+
+## [0.2.17]
+
+### Enhanced Textract OCR Features
+- Added support for Textract advanced features (TABLES, FORMS, SIGNATURES, LAYOUT)
+- OCR results now output in rich markdown format for better visualization
+- Configurable OCR feature selection through schema configuration
+- Improved metering and tracking for different Textract feature combinations
+
+## [0.2.16] 
+
+### Additional Model Choices
+- Claude, Nova, Meta, and DeepSeek models are now available for selection
+
+### New Document-Based Architecture
+
+The `idp_common_pkg` introduces a unified Document model approach for consistent document processing:
+
+#### Core Classes
+- **Document**: Central data model that tracks document state through the entire processing pipeline
+- **Page**: Represents individual document pages with OCR results and classification
+- **Section**: Represents logical document sections with classification and extraction results
+
+#### Service Classes
+- **OcrService**: Processes documents with AWS Textract and updates the Document with OCR results
+- **ClassificationService**: Classifies document pages/sections using Bedrock or SageMaker backends
+- **ExtractionService**: Extracts structured information from document sections using Bedrock
+
+### Pattern Implementation Updates
+- Lambda functions refactored and significantly simplified to use Document and Section objects and the new Service classes
+
+### Key Benefits
+
+1. **Simplified Integration**: Consistent interfaces make service integration straightforward
+2. **Improved Maintainability**: Unified data model reduces code duplication and complexity
+3. **Better Error Handling**: Standardized approach to error capture and reporting
+4. **Enhanced Traceability**: Complete document history throughout the processing pipeline
+5. **Flexible Backend Support**: Easy switching between Bedrock and SageMaker backends
+6. **Optimized Resource Usage**: Focused document processing for better performance
+7. **Granular Package Installation**: Install only required components with extras syntax
+
+### Example Notebook
+
+A new comprehensive Jupyter notebook demonstrates the Document-based workflow:
+- Shows complete end-to-end processing (OCR → Classification → Extraction)
+- Uses AWS services (S3, Textract, Bedrock)
+- Demonstrates Document object creation and manipulation
+- Showcases how to access and utilize extraction results
+- Provides a template for custom implementations
+- Includes granular package installation examples (`pip install "idp_common_pkg[ocr,classification,extraction]"`)
+
+This refactoring sets the foundation for more maintainable, extensible document processing workflows with clearer data flow and easier troubleshooting.
+
+### Refactored publish.sh script
+- Improved modularity with functions
+- Improved checksum logic to determine when to rebuild components
@@ -242,15 +242,17 @@ Navigate into the project root directory and, in a bash shell, run:
 
 **This completes the preparation stage of the installation process. The process now proceeds to the Cloudformation stack installation stage.**
 
-When completed, it displays the CloudFormation templates S3 URLs, 1-click URLs for launching the stack creation in CloudFormation console, and a command to deploy from the CLI:
+When completed, it displays the CloudFormation template's S3 URL and a 1-click URL for launching the stack creation in the CloudFormation console:
 ```
 OUTPUTS
-Template URL: https://s3.us-east-1.amazonaws.com/bobs-artifacts-us-east-1/transflo-idp/packaged.yaml
-CF Launch URL: https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://s3.us-east-1.amazonaws.com/bobs-artifacts-us-east-1/transflo-idp/packaged.yaml&stackName=IDP
-CLI Deploy: aws cloudformation deploy --region us-east-1 --template-file /tmp/1132557/packaged.yaml --capabilities CAPABILITY_NAMED_IAM CAPABILITY_AUTO_EXPAND --stack-name <your_stack_name>>
+Template URL: https://s3.<region>.amazonaws.com/<cfn_bucket_basename>-<region>/<cfn_prefix>/packaged.yaml
+1-Click Launch URL: https://<region>.console.aws.amazon.com/cloudformation/home?region=<region>#/stacks/create/review?templateURL=https://s3.<region>.amazonaws.com/<cfn_bucket_basename>-<region>/<cfn_prefix>/packaged.yaml&stackName=IDP
 Done
 ```
 
+**Recommended: Deploy using the AWS CloudFormation console.**  
+For your first deployment, log in to your AWS account and then use the `1-Click Launch URL` to create a new stack with CloudFormation. It's easier to inspect the available parameter options in the console at first. The CLI option below is better suited to scripted or automated deployments, and requires that you already know the right parameter values to use.
+
 ```bash
 # To install from the CLI the `CLI Deploy` command will be similar to the following:
 aws cloudformation deploy \
@@ -267,7 +269,7 @@ aws cloudformation deploy \
 * `<the-pattern-name-here>` should be one of the valid pattern names encased in quotes. (Each pattern may have their own required parameter overrides, see README documentation for details.)
   * `Pattern3 - Packet processing with Textract, SageMaker(UDOP), and Bedrock`
   * `Pattern2 - Packet processing with Textract and Bedrock`
-    * (This is a great pattern to start with to try out the solution because it has not further dependencies.)
+    * (This is a great pattern to start with to try out the solution because it has no further dependencies.)
   * `Pattern1 - Packet or Media processing with Bedrock Data Automation (BDA)`
 
 After you have deployed the stack, check the Outputs tab to inspect names and links to the dashboards, buckets, workflows and other solution resources.
 
@@ -1 +1 @@
-0.2.14
+0.2.17
@@ -0,0 +1,244 @@
+# IDP Common Package
+
+This package contains common utilities and services for the GenAI IDP Accelerator patterns.
+
+## Components
+
+### Core Data Model
+
+- **Document Model**: Central data structure for the entire IDP pipeline ([models.py](idp_common/models.py))
+
+### Core Services
+
+- **OCR**: Document OCR processing with AWS Textract ([README](idp_common/ocr/README.md))
+- **Classification**: Document classification using LLMs and SageMaker/UDOP ([README](idp_common/classification/README.md))
+- **Extraction**: Field extraction from documents using LLMs ([README](idp_common/extraction/README.md))
+
+### AWS Service Clients
+
+- Bedrock client with retry logic
+- S3 client operations
+- CloudWatch metrics
+
+### Configuration
+
+- DynamoDB-based configuration management
+- Support for default and custom configuration merging
+
+### Image Processing
+
+- Image resizing and preparation
+- Support for multimodal inference with Bedrock
+
+### Utils
+
+- Retry/backoff algorithm
+- S3 URI parsing
+- Metering data aggregation
+
+## Unified Document-based Architecture
+
+All core services (OCR, Classification, and Extraction) have been refactored to use a unified Document model approach:
+
+```python
+from idp_common import get_config
+from idp_common.models import Document
+from idp_common import ocr, classification, extraction
+
+# Initialize document
+document = Document(
+    id="doc-123",
+    input_bucket="my-input-bucket",
+    input_key="documents/sample.pdf",
+    output_bucket="my-output-bucket"
+)
+
+# Get configuration
+config = get_config()
+
+# Process with OCR
+ocr_service = ocr.OcrService(config=config)
+document = ocr_service.process_document(document)
+
+# Perform classification (supports both Bedrock and SageMaker/UDOP backends)
+classification_service = classification.ClassificationService(
+    config=config,
+    backend="bedrock"  # or "sagemaker" for SageMaker UDOP model
+)
+document = classification_service.classify_document(document)
+
+# Extract information from a section
+extraction_service = extraction.ExtractionService(config=config)
+document = extraction_service.process_document_section(
+    document=document, 
+    section_id=document.sections[0].section_id
+)
+
+# Access the extraction results URI
+result_uri = document.sections[0].extraction_result_uri
+```
+
+## Service Modules
+
+### Document Model (`models.py`)
+
+The central data model for the IDP processing pipeline:
+- Represents the state of a document as it moves through processing
+- Tracks pages, sections, processing status, and results
+- Common data structure shared between all services
+
+### OCR Service (`ocr`)
+
+Provides OCR processing of documents using AWS Textract:
+- Document-based OCR processing with the `process_document()` method
+- Multi-page document processing with thread concurrency
+- Image extraction and optimization
+- Support for enhanced Textract features (TABLES, FORMS, SIGNATURES, LAYOUT) with granular control
+- Rich markdown output for tables and forms preservation
+- Well-structured results for downstream processing
+
+### Classification Service (`classification`)
+
+Document classification using multimodal LLMs:
+- Document-based classification with the `classify_document()` method
+- Support for both Bedrock and SageMaker backends
+- Page-level and document-level classification
+- Section detection for multi-class documents
+- Configurable document types and descriptions
+- Multimodal classification with both text and images
+
+### Extraction Service (`extraction`)
+
+Field extraction from documents using multimodal LLMs:
+- Document-based extraction with the `process_document_section()` method
+- Extraction of structured data from document sections
+- Support for document class-specific attribute definitions
+- Multimodal extraction using both text and images
+- Flexible prompt templates configurable via the configuration system
+- Results stored in S3 with URIs tracked in the Document model
+
+## Basic Usage
+
+```python
+from idp_common import (
+    bedrock,       # Bedrock client and operations
+    s3,            # S3 operations
+    metrics,       # CloudWatch metrics
+    image,         # Image processing
+    utils,         # General utilities
+    config,        # Configuration module
+    get_config,    # Direct access to the configuration function
+    ocr,           # OCR service and models
+    classification, # Classification service and models
+    extraction     # Extraction service and models
+)
+from idp_common.models import Document, Status
+
+# Get configuration (merged from the Default and Custom records in the DynamoDB configuration table)
+cfg = get_config()
+
+# Create a document object
+document = Document(
+    input_bucket="my-bucket",
+    input_key="my-document.pdf",
+    output_bucket="output-bucket"
+)
+
+# OCR Processing
+ocr_service = ocr.OcrService()  # Basic text detection
+# ocr_service = ocr.OcrService(enhanced_features=["TABLES", "FORMS"])  # Enhanced features
+document = ocr_service.process_document(document)
+
+# Document Classification (choose your backend)
+classification_service = classification.ClassificationService(
+    config=cfg, 
+    backend="bedrock"  # or "sagemaker" for UDOP model
+)
+document = classification_service.classify_document(document)
+
+# Field Extraction for a section
+extraction_service = extraction.ExtractionService(config=cfg)
+document = extraction_service.process_document_section(document, section_id="section-1")
+
+# Publish a metric
+metrics.put_metric("MetricName", 1)
+
+# Invoke Bedrock
+response = bedrock.invoke_model(...)
+
+# Read from S3
+content = s3.get_text_content("s3://bucket/key.json")
+
+# Process an image for model input
+image_bytes = image.prepare_image("s3://bucket/image.jpg")
+
+# Parse S3 URI
+bucket, key = utils.parse_s3_uri("s3://bucket/key")
+```
+
+## Configuration
+
+The configuration module provides a way to retrieve and merge configuration from DynamoDB. It expects:
+
+1. A DynamoDB table with a primary key named 'Configuration'
+2. Two configuration items with keys 'Default' and 'Custom'
+
+The `get_config()` function retrieves both configurations and merges them, with custom values taking precedence over default ones.
+
+```python
+# Get configuration with default table name from CONFIGURATION_TABLE_NAME environment variable
+config = get_config()
+
+# Or specify a table name explicitly
+config = get_config(table_name="my-config-table")
+```
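
The precedence rule described above (custom values override default ones) can be illustrated without DynamoDB. The sketch below is not the package's implementation: `merge_config` is a hypothetical helper, the two dictionaries are made-up stand-ins for the 'Default' and 'Custom' items, and the recursive merge of nested dictionaries is an assumption, since the diff does not show whether `get_config()` merges deeply or only at the top level.

```python
from copy import deepcopy
from typing import Any, Dict


def merge_config(default: Dict[str, Any], custom: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict in which values from `custom` override `default`.

    Nested dicts are merged recursively (an assumption of this sketch).
    Neither input is modified.
    """
    merged = deepcopy(default)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical items standing in for the 'Default' and 'Custom' records.
default_item = {
    "ocr": {"features": ["TABLES"]},
    "classification": {"backend": "bedrock"},
}
custom_item = {"classification": {"backend": "sagemaker"}}

config = merge_config(default_item, custom_item)
print(config["classification"]["backend"])  # sagemaker
print(config["ocr"]["features"])            # ['TABLES']
```

Keys that appear only in the default item survive the merge, so a custom record only needs to list the values it changes.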
+
+## Installation with Granular Dependencies
+
+To minimize Lambda package size, you can install only the specific components you need:
+
+```bash
+# Install core functionality only (minimal dependencies)
+pip install "idp_common[core]"
+
+# Install with OCR support
+pip install "idp_common[ocr]"
+
+# Install with classification support
+pip install "idp_common[classification]"
+
+# Install with extraction support
+pip install "idp_common[extraction]"
+
+# Install with image processing support
+pip install "idp_common[image]"
+
+# Install everything
+pip install "idp_common[all]"
+
+# Install multiple components
+pip install "idp_common[ocr,classification]"
+```
+
+For Lambda functions, specify only the required components in requirements.txt:
+
+```
+../../lib/idp_common_pkg[extraction]
+```
+
+This ensures that only the necessary dependencies are included in your Lambda deployment package.
+
+## Development Notes
+
+This package has been refactored to use a unified Document-based approach across all services:
+
+1. All services now accept and return Document objects
+2. Each service updates the Document with its results
+3. Results are properly encapsulated in the Document model
+4. Large results (like extraction attributes) are stored in S3 with only URIs in the Document
+
+Key benefits:
+- Consistency across all services
+- Simplified data flow in serverless functions
+- Better resource usage with the focused document pattern
+- Improved maintainability with standardized interfaces
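
The four points above can be sketched with plain dataclasses. This is an illustration of the pattern only, not the classes in `models.py`: `SketchDocument`, `SketchSection`, the status strings, the in-memory `fake_s3` dict, and the result path layout are all hypothetical. Only the field names `section_id` and `extraction_result_uri` are taken from the usage examples earlier in this README.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SketchSection:
    section_id: str
    classification: Optional[str] = None
    extraction_result_uri: Optional[str] = None  # URI only; the payload lives elsewhere


@dataclass
class SketchDocument:
    input_key: str
    output_bucket: str
    status: str = "QUEUED"  # hypothetical status value
    sections: List[SketchSection] = field(default_factory=list)


def classify(doc: SketchDocument) -> SketchDocument:
    """Stand-in classification step: adds one section and returns the document."""
    doc.sections.append(SketchSection(section_id="1", classification="invoice"))
    doc.status = "CLASSIFIED"
    return doc


def extract(doc: SketchDocument, section_id: str, store: Dict[str, dict]) -> SketchDocument:
    """Stand-in extraction step: writes the result to `store`, keeps only its URI."""
    section = next(s for s in doc.sections if s.section_id == section_id)
    uri = f"s3://{doc.output_bucket}/{doc.input_key}/sections/{section_id}/result.json"
    store[uri] = {"invoice_number": "INV-001"}  # made-up sample payload
    section.extraction_result_uri = uri
    doc.status = "EXTRACTED"
    return doc


fake_s3: Dict[str, dict] = {}  # in-memory dict standing in for S3
doc = SketchDocument(input_key="sample.pdf", output_bucket="my-output-bucket")
doc = extract(classify(doc), section_id="1", store=fake_s3)
print(doc.sections[0].extraction_result_uri)
```

Because each step takes a document and returns a document, steps can be chained or run in separate Lambda invocations, and the object passed between them stays small even when the extraction payload is large.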
@@ -0,0 +1,29 @@
+# Use true lazy loading for all submodules
+__version__ = "0.1.0"
+
+# Cache for lazy-loaded submodules
+_submodules = {}
+
+def __getattr__(name):
+    """Lazy load submodules only when accessed"""
+    if name in ['bedrock', 's3', 'metrics', 'image', 'utils', 'config', 'ocr', 'classification', 'extraction', 'models']:
+        if name not in _submodules:
+            _submodules[name] = __import__(f"idp_common.{name}", fromlist=['*'])
+        return _submodules[name]
+    
+    # Special handling for directly exposed functions
+    if name == 'get_config':
+        config = __getattr__('config')
+        return config.get_config
+    
+    # Special handling for directly exposed classes
+    if name in ['Document', 'Page', 'Section', 'Status']:
+        models = __getattr__('models')
+        return getattr(models, name)
+    
+    raise AttributeError(f"module 'idp_common' has no attribute '{name}'")
+
+__all__ = [
+    'bedrock', 's3', 'metrics', 'image', 'utils', 'config', 'ocr', 'classification', 'extraction', 'models',
+    'get_config', 'Document', 'Page', 'Section', 'Status'
+]
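
The `__init__.py` above relies on module-level `__getattr__` (PEP 562, Python 3.7+), which is called only when normal attribute lookup on the module fails. The self-contained sketch below shows the same mechanism outside the package. It is an illustration under stated assumptions: `make_lazy_package`, the `lazy_demo` module name, and the mapping of attribute names to standard-library modules are all hypothetical, and it uses `importlib.import_module` rather than the `__import__` call in the diff.

```python
import importlib
import types
from typing import Dict


def make_lazy_package(name: str, lazy_map: Dict[str, str]) -> types.ModuleType:
    """Build a module object whose listed attributes are imported on first access."""
    pkg = types.ModuleType(name)
    cache: Dict[str, types.ModuleType] = {}

    def __getattr__(attr: str):
        # Called only when `attr` is not found in the module's namespace.
        if attr in lazy_map:
            if attr not in cache:
                cache[attr] = importlib.import_module(lazy_map[attr])
            return cache[attr]
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    pkg.__getattr__ = __getattr__  # stored in the module's __dict__, as PEP 562 requires
    pkg._cache = cache             # exposed so the caching can be observed
    return pkg


# Hypothetical attribute names mapped to stdlib modules, standing in for submodules.
demo = make_lazy_package("lazy_demo", {"json_tools": "json", "math_tools": "math"})

print(sorted(demo._cache))         # []
print(demo.math_tools.sqrt(16.0))  # 4.0
print(sorted(demo._cache))         # ['math_tools']
```

Only the submodule that is actually touched gets imported, which is the property that lets a Lambda function installed with a single extra avoid importing dependencies it does not have.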