CHANGELOG.md (46 lines changed: 46 additions & 0 deletions)
@@ -5,8 +5,54 @@ SPDX-License-Identifier: MIT-0
## [Unreleased]

## [0.3.13]

### Added
- **External MCP Agent Integration for Custom Tool Extension**
  - Added External MCP (Model Context Protocol) Agent support that enables integration with custom MCP servers to extend IDP capabilities
  - **Cross-Account Integration**: Host MCP servers in separate AWS accounts or external infrastructure with secure OAuth authentication using AWS Cognito
  - **Dynamic Tool Discovery**: Automatically discovers and integrates available tools from MCP servers through the IDP web interface
  - **Secure Authentication Flow**: Uses AWS Cognito User Pools for OAuth bearer token authentication with proper token validation
  - **Configuration Management**: JSON array configuration in AWS Secrets Manager supporting multiple MCP server connections with optional custom agent names and descriptions (a sketch follows this entry)
  - **Real-time Integration**: Tools become immediately available through the IDP web interface after configuration
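To make the Secrets Manager configuration concrete, a secret value holding a JSON array of MCP server entries might look like the sketch below. The key names (`url`, `agent_name`, `description`) and the secret name are illustrative assumptions, not the documented schema.

```python
import json

import boto3

# Hypothetical secret payload: a JSON array with one object per external MCP server.
# The key names below are illustrative assumptions, not the documented schema.
mcp_servers = [
    {
        "url": "https://mcp.example.com/mcp",            # MCP server endpoint
        "agent_name": "custom-tools-agent",              # optional custom agent name
        "description": "Custom tools exposed over MCP",  # optional description
    },
    # ...additional MCP servers are simply more elements in this array
]

# Store the array in AWS Secrets Manager (assumes the secret already exists;
# the secret name is also an assumption).
secrets = boto3.client("secretsmanager")
secrets.put_secret_value(
    SecretId="idp/external-mcp-agents",
    SecretString=json.dumps(mcp_servers),
)
```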
- **AWS GovCloud Support with Automated Template Generation**
  - Added GovCloud compatibility through `scripts/generate_govcloud_template.py` script
  - **ARN Partition Compatibility**: All templates updated to use `arn:${AWS::Partition}:` for both commercial and GovCloud regions (a sketch follows this entry)
  - **Core Functionality Preserved**: All 3 processing patterns and complete 6-step pipeline (OCR, Classification, Extraction, Assessment, Summarization, Evaluation) remain fully functional
  - **Automated Workflow**: Single script orchestrates build + GovCloud template generation + S3 upload with deployment URLs
  - **Enterprise Ready**: Enables headless document processing for government and enterprise environments requiring GovCloud compliance
  - **Documentation**: New `docs/govcloud-deployment.md` with deployment guide, architecture differences, and access methods
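The `arn:${AWS::Partition}:` convention lets CloudFormation resolve the partition to `aws` in commercial regions and `aws-us-gov` in GovCloud, so one template works in both. A rough illustration of that substitution (not the actual logic of `generate_govcloud_template.py`):

```python
import re

def use_partition_variable(template_body: str) -> str:
    """Replace hard-coded commercial ARN prefixes with the partition pseudo-parameter.

    Illustrative sketch only. CloudFormation substitutes ${AWS::Partition} with
    "aws" in commercial regions and "aws-us-gov" in GovCloud.
    """
    return re.sub(r"arn:aws:", "arn:${AWS::Partition}:", template_body)

print(use_partition_variable("arn:aws:s3:::my-bucket/*"))
# -> arn:${AWS::Partition}:s3:::my-bucket/*
```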
- **Pattern-2 and Pattern-3 Assessment now generate geometry (bounding boxes) for visualization in the UI 'Visual Editor' (parity with Pattern-1)**
  - Added comprehensive spatial localization capabilities to both regular and granular assessment services
  - **Automatic Processing**: When the LLM provides bbox coordinates, they are automatically converted to the UI-compatible (Visual Edit) geometry format without any configuration (a sketch follows this entry)
  - **Universal Support**: Works with all attribute types - simple attributes, nested group attributes (e.g., CompanyAddress.State), and list attributes
  - **Enhanced Prompts**: Updated assessment task prompts with spatial-localization-guidelines requesting bbox coordinates on a normalized 0-1000 scale
  - **Demo Notebooks**: Assessment notebooks now showcase automatic bounding box processing
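For intuition only, converting an LLM-reported bbox on the normalized 0-1000 scale into page-relative geometry could look roughly like this. The output field names are assumptions, not the assessment service's actual geometry schema.

```python
def bbox_to_geometry(bbox, page: int):
    """Convert an LLM bbox on the normalized 0-1000 scale to 0-1 page-relative geometry.

    `bbox` is [x1, y1, x2, y2] as requested by the assessment prompt.
    The returned field names are illustrative; the real UI geometry format may differ.
    """
    x1, y1, x2, y2 = (coord / 1000.0 for coord in bbox)
    return {
        "page": page,
        "boundingBox": {
            "left": x1,
            "top": y1,
            "width": x2 - x1,
            "height": y2 - y1,
        },
    }

print(bbox_to_geometry([100, 200, 300, 250], page=1))
# {'page': 1, 'boundingBox': {'left': 0.1, 'top': 0.2, 'width': 0.2, 'height': 0.05}}
```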
- **New Python-Based Publishing System**
  - Replaced `publish.sh` bash script with new `publish.py` Python script
  - Rich console interface with progress bars, spinners, and colored output using the Rich library
  - Multi-threaded artifact building and uploading for significantly improved performance (a sketch follows this entry)
  - Native support for Linux, macOS, and Windows environments
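A minimal sketch of what multi-threaded artifact uploading can look like with the standard library's thread pool. This is not the actual `publish.py` code; the function name, arguments, and the omission of Rich-based progress reporting are assumptions made for brevity.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

import boto3

def upload_artifacts(artifact_dir: str, bucket: str, prefix: str, max_workers: int = 8) -> None:
    """Upload build artifacts to S3 in parallel (illustrative sketch only)."""
    s3 = boto3.client("s3")
    files = [p for p in Path(artifact_dir).rglob("*") if p.is_file()]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(
                s3.upload_file,
                str(p),
                bucket,
                f"{prefix}/{p.relative_to(artifact_dir).as_posix()}",
            ): p
            for p in files
        }
        for future in as_completed(futures):
            future.result()  # re-raise any upload error
            print(f"uploaded {futures[future]}")
```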
- **Windows Development Environment Setup Guide and Helper Script**
  - New `scripts/dev_setup.bat` (570 lines) for complete Windows development environment configuration
- **OCR Service Default Image Sizing for Resource Optimization**
  - Implemented automatic default image size limits (951×1268) when no image sizing configuration is provided (a sketch follows this entry)
  - **Key Benefits**: Reduces vision model token consumption, prevents OutOfMemory errors during concurrent processing, improves processing speed, and reduces bandwidth usage
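The default-sizing behavior amounts to downscaling only when no explicit sizing configuration is supplied. A minimal sketch with Pillow, assuming the 951×1268 defaults; the function signature and configuration check are illustrative, not the OCR service's actual code.

```python
from PIL import Image

DEFAULT_MAX_WIDTH = 951
DEFAULT_MAX_HEIGHT = 1268

def apply_default_sizing(image: Image.Image, sizing_config: dict | None = None) -> Image.Image:
    """Downscale an image to the default limits when no sizing config is provided.

    Illustrative sketch; the OCR service's actual configuration keys and
    resizing code are not reproduced here.
    """
    if sizing_config:  # explicit configuration wins; the defaults are only a fallback
        return image
    image = image.copy()
    # thumbnail() preserves aspect ratio and never enlarges the image
    image.thumbnail((DEFAULT_MAX_WIDTH, DEFAULT_MAX_HEIGHT))
    return image
```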
### Changed

- **Reverted to python3.12 runtime to resolve build package dependency problems**

### Fixed

- **Improved Visual Edit bounding box position when using image zoom or pan**
     You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
 assessment:
   enabled: true
   image:
@@ -383,130 +384,146 @@ assessment:
   max_tokens: '10000'
   top_k: '5'
   temperature: '0.0'
-  model: us.amazon.nova-pro-v1:0
+  model: us.amazon.nova-lite-v1:0
   system_prompt: >-
-    You are a document analysis assessment expert. Your task is to evaluate the confidence of extraction results by analyzing the source document evidence. Respond only with JSON containing confidence scores for each extracted attribute.
+    You are a document analysis assessment expert. Your role is to evaluate the confidence and accuracy of data extraction results by analyzing them against source documents.
+
+    Provide accurate confidence scores for each assessment.
+    When bounding boxes are requested, provide precise coordinate locations where information appears in the document.
   task_prompt: >-
     <background>
-
-    You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS}.
-
+    You are an expert document analysis assessment system. Your task is to evaluate the confidence of extraction results for a document of class {DOCUMENT_CLASS} and provide precise spatial localization for each field.
     </background>
 
     <task>
-    Analyze the extraction results against the source document and provide confidence assessments for each extracted attribute. Consider factors such as:
+    Analyze the extraction results against the source document and provide confidence assessments AND bounding box coordinates for each extracted attribute. Consider factors such as:
     1. Text clarity and OCR quality in the source regions
     2. Alignment between extracted values and document content
     3. Presence of clear evidence supporting the extraction
     4. Potential ambiguity or uncertainty in the source material
     5. Completeness and accuracy of the extracted information
+    6. Precise spatial location of each field in the document
     </task>
 
     <assessment-guidelines>
-    For each attribute, provide:
-    A confidence score between 0.0 and 1.0 where:
+    For each attribute, provide:
+    - A confidence score between 0.0 and 1.0 where:
       - 1.0 = Very high confidence, clear and unambiguous evidence
       - 0.8-0.9 = High confidence, strong evidence with minor uncertainty
       - 0.6-0.7 = Medium confidence, reasonable evidence but some ambiguity
       - 0.4-0.5 = Low confidence, weak or unclear evidence
       - 0.0-0.3 = Very low confidence, little to no supporting evidence
+    - A clear explanation of the confidence reasoning
+    - Precise spatial coordinates where the field appears in the document
 
     Guidelines:
     - Base assessments on actual document content and OCR quality
     - Consider both text-based evidence and visual/layout clues
     - Account for OCR confidence scores when provided
     - Be objective and specific in reasoning
     - If an extraction appears incorrect, score accordingly with explanation
+    - Provide tight, accurate bounding boxes around the actual text
+    - page: Page number where the field appears (starting from 1)
 
+    Coordinate system:
+    - Use normalized scale 0-1000 for both x and y axes
+    - x1, y1 = top-left corner of bounding box
+    - x2, y2 = bottom-right corner of bounding box
+    - Ensure x2 > x1 and y2 > y1
+    - Make bounding boxes tight around the actual text content
+    - If a field spans multiple lines, create a bounding box that encompasses all relevant text
+    </spatial-localization-guidelines>
 
-    Analyze the extraction results against the source document and provide confidence assessments. Return a JSON object with the following structure based on the attribute type:
+    <final-instructions>
+    Analyze the extraction results against the source document and provide confidence assessments with spatial localization. Return a JSON object with the following structure based on the attribute type:
 
     For SIMPLE attributes:
     {
       "simple_attribute_name": {
         "confidence": 0.85,
+        "bbox": [100, 200, 300, 250],
+        "page": 1
       }
     }
 
     For GROUP attributes (nested object structure):
     {
       "group_attribute_name": {
         "sub_attribute_1": {
           "confidence": 0.90,
+          "bbox": [150, 300, 250, 320],
+          "page": 1
         },
         "sub_attribute_2": {
           "confidence": 0.75,
+          "bbox": [150, 325, 280, 345],
+          "page": 1
         }
       }
     }
 
     For LIST attributes (array of assessed items):
     {
       "list_attribute_name": [
         {
           "item_attribute_1": {
             "confidence": 0.95,
+            "bbox": [100, 400, 200, 420],
+            "page": 1
           },
           "item_attribute_2": {
             "confidence": 0.88,
+            "bbox": [250, 400, 350, 420],
+            "page": 1
           }
         },
         {
           "item_attribute_1": {
             "confidence": 0.92,
+            "bbox": [100, 425, 200, 445],
+            "page": 1
           },
           "item_attribute_2": {
             "confidence": 0.70,
+            "bbox": [250, 425, 350, 445],
+            "page": 1
           }
         }
       ]
     }
 
-    IMPORTANT:
-    - For LIST attributes like "Transactions", assess EACH individual item in the list separately
-    - Each transaction should be assessed as a separate object in the array
-    - Do NOT provide aggregate assessments for list items - assess each one individually
-    - Include assessments for ALL attributes present in the extraction results
+    IMPORTANT:
+    - For LIST attributes like "Transactions", assess EACH individual item in the list separately with individual bounding boxes
+    - Each transaction should be assessed as a separate object in the array with its own spatial coordinates
+    - Do NOT provide aggregate assessments for list items - assess each one individually with precise locations
+    - Include assessments AND bounding boxes for ALL attributes present in the extraction results
     - Match the exact structure of the extracted data
+    - Provide page numbers for all bounding boxes (starting from 1)