Commit 4eed2aa (2 parents: dffe70a + e469351)

Merge branch 'feature/modify-class-rerun-extraction' into 'develop'

Edit Sections Feature for Modifying Class/Type and Reprocessing Extraction

See merge request genaiic-reusable-assets/engagement-artifacts/genaiic-idp-accelerator!329

File tree: 25 files changed, +1874 −344 lines


CHANGELOG.md

Lines changed: 8 additions & 0 deletions
@@ -6,6 +6,14 @@ SPDX-License-Identifier: MIT-0
 ## [Unreleased]
 
 ### Added
+
+- **Edit Sections Feature for Modifying Class/Type and Reprocessing Extraction**
+  - Added Edit Sections interface for Pattern-2 and Pattern-3 workflows with reprocessing optimization
+  - **Key Features**: Section management (create, update, delete), classification updates, page reassignment with overlap detection, real-time validation
+  - **Selective Reprocessing**: Only modified sections are reprocessed while preserving existing data for unmodified sections
+  - **Processing Pipeline**: All functions (OCR/Classification/Extraction/Assessment) automatically skip redundant operations based on data presence
+  - **Pattern Compatibility**: Full functionality for Pattern-2/Pattern-3; informative modal for Pattern-1 explaining that BDA is not yet supported
 - **Analytics Agent Schema Optimization for Improved Performance**
   - **Embedded Database Overview**: Complete table listing and guidance embedded directly in system prompt (no tool call needed)
   - **On-Demand Detailed Schemas**: `get_table_info(['specific_tables'])` loads detailed column information only for tables actually needed by the query
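The "skip redundant operations based on data presence" behavior in the changelog entry above can be sketched as follows. This is a minimal illustration, not the accelerator's actual pipeline code; the stage names and field names are assumptions.

```python
# Minimal sketch (assumed, not the accelerator's code) of data-presence-based
# skipping: each stage runs only when its output is missing, so unmodified
# sections pass through the pipeline untouched.
def process_section(section):
    steps_run = []
    if not section.get("ocr_data"):
        section["ocr_data"] = "ocr-result"          # placeholder work
        steps_run.append("ocr")
    if not section.get("classification"):
        section["classification"] = "invoice"       # placeholder work
        steps_run.append("classification")
    if not section.get("extraction"):
        section["extraction"] = {"total": 100}      # placeholder work
        steps_run.append("extraction")
    if not section.get("assessment"):
        section["assessment"] = {"confidence": 0.9} # placeholder work
        steps_run.append("assessment")
    return steps_run

# A modified section had its extraction cleared, so only later stages run.
section = {"ocr_data": "x", "classification": "invoice"}
print(process_section(section))  # ['extraction', 'assessment']
```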

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.3.16
+0.3.17-rc1

docs/web-ui.md

Lines changed: 60 additions & 0 deletions
@@ -26,6 +26,66 @@ The solution includes a responsive web-based user interface built with React tha
 - **Document Process Flow visualization** for detailed workflow execution monitoring and troubleshooting
 - **Document Analytics** for querying and visualizing processed document data
 
+## Edit Sections
+
+The Edit Sections feature provides an intelligent interface for modifying document section classifications and page assignments, with automatic reprocessing optimization for Pattern-2 and Pattern-3 workflows.
+
+### Key Capabilities
+
+- **Section Management**: Create, update, and delete document sections with validation
+- **Classification Updates**: Change section document types with real-time validation
+- **Page Reassignment**: Move pages between sections with overlap detection
+- **Intelligent Reprocessing**: Only modified sections are reprocessed, preserving existing data
+- **Immediate Feedback**: Status updates appear instantly in the UI
+- **Pattern Compatibility**: Available for Pattern-2 and Pattern-3, with informative guidance for Pattern-1
+
+### How to Use
+
+1. Navigate to a completed document's detail page
+2. In the "Document Sections" panel, click the "Edit Sections" button
+3. **For Pattern-2/Pattern-3**: Enter edit mode with inline editing capabilities
+4. **For Pattern-1**: View an informative modal explaining BDA architecture differences
+
+#### Editing Workflow (Pattern-2/Pattern-3)
+
+1. **Edit Section Classifications**: Use dropdowns to change document types
+2. **Modify Page Assignments**: Edit comma-separated page IDs (e.g., "1, 2, 3")
+3. **Add New Sections**: Click "Add Section" for new document boundaries
+4. **Delete Sections**: Use remove buttons to delete unnecessary sections
+5. **Validation**: Real-time validation prevents overlapping pages and invalid configurations
+6. **Submit Changes**: Click "Save & Process Changes" to trigger selective reprocessing
+
+### Processing Optimization
+
+The Edit Sections feature uses a two-phase optimization:
+
+#### Phase 1: Frontend
+- **Selective Payload**: Only sends sections that actually changed
+- **Validation Engine**: Prevents invalid configurations before submission
+
+#### Phase 2: Backend
+- **Pipeline**: Processing functions automatically skip redundant operations
+- **OCR**: Skips pages that already have OCR data
+- **Classification**: Skips pages that are already classified
+- **Extraction**: Skips sections that already have extraction data
+- **Assessment**: Skips when extraction results already contain assessment data
+- **Selective Reprocessing**: Only modified sections lose their data and get reprocessed
+
+### Pattern Compatibility
+
+#### Pattern-2 and Pattern-3 Support
+- **Full Functionality**: Complete edit capabilities with intelligent reprocessing
+- **Performance Optimization**: Automatic selective processing for efficiency
+- **Data Preservation**: Unmodified sections retain all processing results
+
+#### Pattern-1 Information
+Pattern-1 uses **Bedrock Data Automation (BDA)** with automatic section management. When Edit Sections is clicked, users see an informative modal explaining:
+
+- **Architecture Differences**: BDA handles section boundaries automatically
+- **Alternative Workflows**: Available options such as "View/Edit Data", configuration updates, and document reprocessing
+- **Future Considerations**: Guidance on using Pattern-2/Pattern-3 for fine-grained section control
+
 ## Document Analytics
 
 The Document Analytics feature allows users to query their processed documents using natural language and receive results in various formats including charts, tables, and text responses.
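The overlap detection mentioned in the Edit Sections editing workflow (comma-separated page IDs, no page in two sections) could be implemented along these lines. This is a hypothetical sketch; the function name and section representation are illustrative, not the accelerator's actual frontend code.

```python
# Hypothetical sketch of page-overlap validation: parse each section's
# comma-separated page IDs and flag any page claimed by more than one section.
def find_overlapping_pages(sections):
    seen = {}       # page id -> index of the first section that claimed it
    overlaps = []   # (page id, first section index, conflicting section index)
    for idx, page_ids in enumerate(sections):
        for raw in page_ids.split(","):
            page = raw.strip()
            if not page:
                continue  # tolerate trailing commas and extra spaces
            if page in seen:
                overlaps.append((page, seen[page], idx))
            else:
                seen[page] = idx
    return overlaps

# Section 0 claims pages 1-3, section 1 claims 3-4: page "3" overlaps.
print(find_overlapping_pages(["1, 2, 3", "3, 4"]))  # [('3', 0, 1)]
```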

lib/idp_common_pkg/idp_common/appsync/client.py

Lines changed: 1 addition & 1 deletion
@@ -93,7 +93,7 @@ def execute_mutation(
     request = AWSRequest(
         method="POST",
         url=self.api_url,
-        data=json.dumps(data).encode(),
+        data=json.dumps(data, default=str).encode(),
         headers={
             "Content-Type": "application/json",
             "Accept": "application/json",
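The `default=str` change above matters because values read back from DynamoDB arrive as `Decimal`, which the stock JSON encoder rejects. A minimal standalone illustration:

```python
import json
from decimal import Decimal

# DynamoDB returns numeric attributes as Decimal, which json.dumps
# cannot serialize by default.
payload = {"confidence": Decimal("0.95")}

try:
    json.dumps(payload)
except TypeError as e:
    print(f"stock encoder fails: {e}")

# default=str stringifies any value the encoder cannot handle.
print(json.dumps(payload, default=str))  # {"confidence": "0.95"}
```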

lib/idp_common_pkg/idp_common/appsync/service.py

Lines changed: 11 additions & 3 deletions
@@ -149,10 +149,18 @@ def _document_to_update_input(self, document: Document) -> Dict[str, Any]:
         if section.confidence_threshold_alerts:
             alerts_data = []
             for alert in section.confidence_threshold_alerts:
+                # Convert Decimal values to float to avoid serialization issues
+                confidence_value = alert.get("confidence")
+                confidence_threshold_value = alert.get("confidence_threshold")
+
                 alert_data = {
                     "attributeName": alert.get("attribute_name"),
-                    "confidence": alert.get("confidence"),
-                    "confidenceThreshold": alert.get("confidence_threshold"),
+                    "confidence": float(confidence_value)
+                    if confidence_value is not None
+                    else None,
+                    "confidenceThreshold": float(confidence_threshold_value)
+                    if confidence_threshold_value is not None
+                    else None,
                 }
                 alerts_data.append(alert_data)
             section_data["ConfidenceThresholdAlerts"] = alerts_data
@@ -164,7 +172,7 @@ def _document_to_update_input(self, document: Document) -> Dict[str, Any]:
 
     # Add metering data if available
     if document.metering:
-        input_data["Metering"] = json.dumps(document.metering)
+        input_data["Metering"] = json.dumps(document.metering, default=str)
 
     # Add evaluation status & report if available
     if document.evaluation_status:
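The None-preserving Decimal-to-float conversion in the diff above can be factored into a tiny helper. The helper name here is illustrative only, not part of the codebase:

```python
from decimal import Decimal

# Sketch of the None-preserving conversion applied to the confidence alert
# fields above: Decimals become plain floats, missing values stay None.
def to_float_or_none(value):
    return float(value) if value is not None else None

print(to_float_or_none(Decimal("0.87")))  # 0.87
print(to_float_or_none(None))             # None
```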

lib/idp_common_pkg/idp_common/dynamodb/service.py

Lines changed: 22 additions & 7 deletions
@@ -286,7 +286,7 @@ def _dynamodb_item_to_document(self, item: Dict[str, Any]) -> Document:
     doc = Document(
         id=item.get("ObjectKey"),
         input_key=item.get("ObjectKey"),
-        num_pages=item.get("PageCount", 0),
+        num_pages=int(item.get("PageCount", 0)),  # Ensure PageCount is an integer
         queued_time=item.get("QueuedTime"),
         start_time=item.get("WorkflowStartTime"),
         completion_time=item.get("CompletionTime"),
@@ -304,23 +304,38 @@ def _dynamodb_item_to_document(self, item: Dict[str, Any]) -> Document:
         logger.warning(f"Unknown status '{object_status}', using QUEUED")
         doc.status = Status.QUEUED
 
-    # Convert metering data
-    metering_json = item.get("Metering")
-    if metering_json:
+    # Convert metering data - handle both JSON string and native dict formats
+    metering_data = item.get("Metering")
+    if metering_data:
         try:
-            doc.metering = json.loads(metering_json)
+            if isinstance(metering_data, str):
+                # It's a JSON string; only parse non-empty strings
+                if metering_data.strip():
+                    doc.metering = json.loads(metering_data)
+                else:
+                    doc.metering = {}
+            else:
+                # It's already a dict (native DynamoDB format), use it directly
+                doc.metering = metering_data
         except json.JSONDecodeError:
-            logger.warning("Failed to parse metering data")
+            logger.warning("Failed to parse metering JSON string, using empty dict")
+            doc.metering = {}
+        except Exception as e:
+            logger.warning(f"Error processing metering data: {e}, using empty dict")
+            doc.metering = {}
 
     # Convert pages
     pages_data = item.get("Pages", [])
     if pages_data is not None:  # Ensure pages_data is not None before iterating
         for page_data in pages_data:
             page_id = str(page_data.get("Id"))
+            text_uri = page_data.get("TextUri")
             doc.pages[page_id] = Page(
                 page_id=page_id,
                 image_uri=page_data.get("ImageUri"),
-                raw_text_uri=page_data.get("TextUri"),
+                raw_text_uri=text_uri,
+                parsed_text_uri=text_uri,  # Set both raw and parsed to the same URI
+                text_confidence_uri=page_data.get("TextConfidenceUri"),
                 classification=page_data.get("Class"),
             )
lib/idp_common_pkg/idp_common/models.py

Lines changed: 2 additions & 2 deletions
@@ -277,7 +277,7 @@ def from_dict(cls, data: Dict[str, Any]) -> "Document":
     input_bucket=data.get("input_bucket"),
     input_key=data.get("input_key"),
     output_bucket=data.get("output_bucket"),
-    num_pages=data.get("num_pages", 0),
+    num_pages=int(data.get("num_pages", 0)),  # Ensure num_pages is an integer
     initial_event_time=data.get("initial_event_time"),
     queued_time=data.get("queued_time"),
     start_time=data.get("start_time"),
@@ -356,7 +356,7 @@ def from_s3_event(cls, event: Dict[str, Any], output_bucket: str) -> "Document":
 
 def to_json(self) -> str:
     """Convert document to JSON string."""
-    return json.dumps(self.to_dict())
+    return json.dumps(self.to_dict(), default=str)
 
 @classmethod
 def from_json(cls, json_str: str) -> "Document":
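The two changes above work together: `int(...)` guards against string page counts, and `default=str` lets `to_json` handle field types the stock encoder rejects. A standalone illustration with assumed, representative field values (not the actual `Document` model):

```python
import json
from datetime import datetime
from decimal import Decimal

# Representative document fields (assumed for illustration): datetimes and
# Decimals would make a plain json.dumps raise TypeError.
doc_dict = {
    "num_pages": int("3"),  # int() guards against string-typed counts
    "queued_time": datetime(2024, 1, 1, 12, 0),
    "confidence": Decimal("0.9"),
}

# default=str stringifies the non-serializable values instead of raising.
print(json.dumps(doc_dict, default=str))
```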

lib/idp_common_pkg/idp_common/utils/__init__.py

Lines changed: 16 additions & 2 deletions
@@ -89,7 +89,21 @@ def merge_metering_data(existing_metering: Dict[str, Any],
     for unit, value in metrics.items():
         if service_api not in merged:
             merged[service_api] = {}
-        merged[service_api][unit] = merged[service_api].get(unit, 0) + value
+
+        # Convert both values to numbers to handle string vs int mismatch
+        try:
+            existing_value = merged[service_api].get(unit, 0)
+            # Handle both string and numeric values
+            if isinstance(existing_value, str):
+                existing_value = float(existing_value)
+            if isinstance(value, str):
+                value = float(value)
+
+            merged[service_api][unit] = existing_value + value
+        except (ValueError, TypeError) as e:
+            logger.warning(f"Error converting metering values for {service_api}.{unit}: existing={merged[service_api].get(unit)}, new={value}, error={e}")
+            # Fall back to the new value if conversion fails
+            merged[service_api][unit] = value
 else:
     logger.warning(f"Unexpected metering data format for {service_api}: {metrics}")
 
@@ -632,4 +646,4 @@ def check_token_limit(document_text: str, extraction_results: Dict[str, Any], co
 else:
     logger.info(f"This document is configured with {int(configured_max_tokens)} max_tokens, "
                 f" requires approximately {int(estimated_tokens)} tokens.")
-    return None
+    return None
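The string-vs-number coercion in `merge_metering_data` above reduces to a small helper, sketched here with an illustrative name:

```python
# Sketch of the tolerant metering merge: coerce string values to float
# before summing, and fall back to the new value if coercion fails.
def merge_unit(existing, value):
    try:
        if isinstance(existing, str):
            existing = float(existing)
        if isinstance(value, str):
            value = float(value)
        return existing + value
    except (ValueError, TypeError):
        return value  # keep the new value when conversion fails

print(merge_unit("3", 2))    # 5.0
print(merge_unit(3, "2.5"))  # 5.5
print(merge_unit("n/a", 7))  # 7  (conversion failed, keep new value)
```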
