Commit d9a4a9a

fix merge conflict
2 parents d84f433 + 9aac585 commit d9a4a9a

3 files changed: +20 -18 lines changed

CHANGELOG.md

Lines changed: 8 additions & 18 deletions
@@ -5,27 +5,17 @@ SPDX-License-Identifier: MIT-0
 
 ## [Unreleased]
 
-### Added
-
-- **BIO-like Sequence Segmentation for Multimodal Classification**
-  - Enhanced multimodal page-level classification with document boundary detection using BIO (Begin-Inside-Outside) tagging
-  - Each page receives both document type and boundary indicator ("start"/"continue") for automatic multi-document packet segmentation
-  - Eliminates need for manual document splitting in complex packets containing multiple documents of same type
-
-### Changed
+## [0.3.12]
 
-- **Consolidated Classification Methods**
-  - Merged `multimodalPageBoundaryClassification` into enhanced `multimodalPageLevelClassification`
-  - Removed `MULTIMODAL_PAGE_BOUNDARY` constant and simplified configuration logic
-  - Maintains backward compatibility with existing configurations
-
-### Documentation
+### Added
 
-- **Updated Classification Documentation**
-  - Enhanced service docstrings and README files with sequence segmentation examples
+- **Refactored Document Classification Service for Enhanced Boundary Detection**
+  - Consolidated `multimodalPageLevelClassification` and the experimental `multimodalPageBoundaryClassification` (from v0.3.11) into a single enhanced `multimodalPageLevelClassification` method
+  - Implemented BIO-like sequence segmentation with document boundary indicators: "start" (new document) and "continue" (same document)
+  - Automatically segments multi-document packets, even when they contain multiple documents of the same type
   - Added comprehensive classification guide with method comparisons and best practices
-
-### Added
+  - **Benefits**: Simplified codebase with single multimodal classification method, improved handling of complex document packets, maintains backward compatibility
+  - **No Breaking Changes**: Existing configurations work unchanged, no configuration updates required
 
 - **Enhanced A2I Template and Workflow Management**
   - Enhanced A2I template with improved user interface and clearer instructions for reviewers
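
The changelog entry above describes per-page labels that pair a document type with a "start" or "continue" boundary indicator. A minimal sketch of how such labels could be folded into document segments follows. The names `Segment` and `segment_pages`, the `(doc_type, boundary)` tuple shape, and the rule that a change of document type also opens a new segment are assumptions made for illustration, not the service's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One contiguous document inside a packet (hypothetical structure)."""
    doc_type: str
    pages: list = field(default_factory=list)


def segment_pages(page_results):
    """Group (doc_type, boundary) page labels into document segments.

    A new segment opens at the first page, on a "start" boundary, or
    (assumption) whenever the document type differs from the previous page.
    """
    segments = []
    for page_number, (doc_type, boundary) in enumerate(page_results, start=1):
        opens_new = (
            not segments
            or boundary == "start"
            or segments[-1].doc_type != doc_type
        )
        if opens_new:
            segments.append(Segment(doc_type=doc_type))
        segments[-1].pages.append(page_number)
    return segments


# Two W2 forms back to back, followed by a two-page payslip.
pages = [
    ("W2", "start"),
    ("W2", "continue"),
    ("W2", "start"),
    ("Payslip", "start"),
    ("Payslip", "continue"),
]
for segment in segment_pages(pages):
    print(segment.doc_type, segment.pages)
```

The third page shows the case the changelog calls out: the type is unchanged, so only the "start" indicator separates the second W2 from the first.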

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 12 additions & 0 deletions
@@ -1079,6 +1079,7 @@ extraction:
   system_prompt: >-
     You are a document assistant. Respond only with JSON. Never make up data, only provide data found in the document being provided.
 summarization:
+  enabled: true
   top_p: '0.1'
   max_tokens: '4096'
   top_k: '5'
@@ -1140,6 +1141,7 @@ summarization:
   system_prompt: >-
     You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
 assessment:
+  enabled: true
   image:
     target_height: ''
     target_width: ''
@@ -1422,3 +1424,13 @@ pricing:
         price: '1.5E-6'
       - name: cacheWriteInputTokens
         price: '1.875E-5'
+  - name: bedrock/us.anthropic.claude-opus-4-1-20250805-v1:0
+    units:
+      - name: inputTokens
+        price: '1.5E-5'
+      - name: outputTokens
+        price: '7.5E-5'
+      - name: cacheReadInputTokens
+        price: '1.5E-6'
+      - name: cacheWriteInputTokens
+        price: '1.875E-5'
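
To illustrate how the new pricing entry might be consumed, the sketch below multiplies token counts by the unit prices added in this commit. It assumes each `price` is a per-token amount and that usage arrives as a mapping keyed by the same unit names; `OPUS_4_1_PRICES` and `estimate_cost` are hypothetical names, not part of the repository.

```python
from decimal import Decimal

# Unit prices copied from the claude-opus-4-1 entry in the diff above.
# Assumption: each value is the price of a single token.
OPUS_4_1_PRICES = {
    "inputTokens": Decimal("1.5E-5"),
    "outputTokens": Decimal("7.5E-5"),
    "cacheReadInputTokens": Decimal("1.5E-6"),
    "cacheWriteInputTokens": Decimal("1.875E-5"),
}


def estimate_cost(usage, prices):
    """Sum token_count * unit_price over units that have a configured price."""
    return sum(
        (Decimal(count) * prices[unit] for unit, count in usage.items() if unit in prices),
        Decimal("0"),
    )


usage = {"inputTokens": 10_000, "outputTokens": 2_000}
print(estimate_cost(usage, OPUS_4_1_PRICES))
```

`Decimal` is used because the prices are stored as quoted strings in the YAML, so they can be parsed exactly rather than rounded through binary floats.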

samples/lending_package.pdf

98.9 KB
Binary file not shown.
