Commit e9455ed

Author: Bob Strahan
docs: Reorder classification methods to prioritize multimodal page-level as default
1 parent 1969b23 commit e9455ed

1 file changed: +59 -60 lines changed

docs/classification.md

Lines changed: 59 additions & 60 deletions
@@ -20,66 +20,7 @@ The solution supports multiple classification approaches that vary by pattern:
 
 Pattern 2 offers two main classification approaches, configured through different templates:
 
-#### Text-Based Holistic Classification (Default)
-
-- Analyzes entire document packets to identify logical boundaries
-- Identifies distinct document segments within multi-page documents
-- Determines document type for each segment
-- Better suited for multi-document packets where context spans multiple pages
-- Deployed when you select the default pattern-2 configuration during stack deployment or update
-
-The default configuration in `config_library/pattern-2/default/config.yaml` implements this approach with a task prompt that instructs the model to:
-
-1. Read through the entire document package to understand its contents
-2. Identify page ranges that form complete, distinct documents
-3. Match each document segment to one of the defined document types
-4. Record the start and end pages for each identified segment
-
-Example configuration:
-
-```yaml
-classification:
-  classificationMethod: textbasedHolisticClassification
-  model: us.amazon.nova-pro-v1:0
-  task_prompt: >-
-    <task-description>
-    You are a document classification system. Your task is to analyze a document package
-    containing multiple pages and identify distinct document segments, classifying each
-    segment according to the predefined document types provided below.
-    </task-description>
-
-    <document-types>
-    {CLASS_NAMES_AND_DESCRIPTIONS}
-    </document-types>
-
-    <document-boundary-rules>
-    Rules for determining document boundaries:
-    - Content continuity: Pages with continuing paragraphs, numbered sections, or ongoing narratives belong to the same document
-    - Visual consistency: Similar layouts, headers, footers, and styling indicate pages belong together
-    - Logical structure: Documents typically have clear beginning, middle, and end sections
-    - New document indicators: Title pages, cover sheets, or significantly different subject matter signal a new document
-    </document-boundary-rules>
-
-    <<CACHEPOINT>>
-
-    <document-text>
-    {DOCUMENT_TEXT}
-    </document-text>
-```
-
-## Limitations of Text-Based Holistic Classification
-
-Despite its strengths in handling full-document context, this method has several limitations:
-
-**Context & Model Constraints:**:
-- Long documents can exceed the context window of smaller models, resulting in request failure.
-- Lengthy inputs may dilute the model’s focus, leading to inaccurate or inconsistent classifications.
-- Requires high-context models such as Amazon Nova Premier, which supports up to 1 million tokens. Smaller models are not suitable for this method.
-- For more details on supported models and their context limits, refer to the [Amazon Bedrock Supported Models documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
-
-**Scalability Challenges**: Not ideal for very large or visually complex document sets. In such cases, the Multi-Modal Page-Level Classification method is more appropriate.
-
-#### MultiModal Page-Level Classification with Sequence Segmentation
+#### MultiModal Page-Level Classification with Sequence Segmentation (default)
 
 - Classifies each page independently using both text and image data
 - **Uses sequence segmentation with BIO-like tagging for document boundary detection**
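
Reviewer note: the BIO-like tagging mentioned in the bullet above can be pictured with a small sketch. It is not the project's implementation. The `PageResult` and `Segment` types and the `"start"` / `"continue"` tag values are assumptions made for illustration. The idea is that each page carries its own class plus a boundary tag, and consecutive pages are merged into one segment until a page is tagged as a new beginning or its class changes.

```python
from dataclasses import dataclass


@dataclass
class PageResult:
    """Hypothetical per-page classification output."""
    page: int        # 1-based page number
    doc_type: str    # class predicted for this page
    boundary: str    # assumed tag values: "start" (B) or "continue" (I)


@dataclass
class Segment:
    """A run of consecutive pages that form one logical document."""
    doc_type: str
    start_page: int
    end_page: int


def group_pages(pages: list) -> list:
    """Merge page-level results into document segments.

    A new segment opens when there is no open segment yet, when the
    page is tagged "start", or when the predicted class differs from
    the open segment's class. Otherwise the open segment is extended.
    """
    segments = []
    for p in pages:
        current = segments[-1] if segments else None
        starts_new = (
            current is None
            or p.boundary == "start"
            or p.doc_type != current.doc_type
        )
        if starts_new:
            segments.append(Segment(p.doc_type, p.page, p.page))
        else:
            current.end_page = p.page
    return segments
```

Because the tag is evaluated per page, two back-to-back documents of the same type (for example two invoices) can still be separated, which a class-change rule alone would miss.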
@@ -141,6 +82,64 @@ The boundary detection is automatically included in the classification results.
 }
 }
 ```
+#### Text-Based Holistic Classification
+
+- Analyzes entire document packets to identify logical boundaries
+- Identifies distinct document segments within multi-page documents
+- Determines document type for each segment
+- Better suited for multi-document packets where context spans multiple pages
+- Deployed when you select the default pattern-2 configuration during stack deployment or update
+
+The default configuration in `config_library/pattern-2/default/config.yaml` implements this approach with a task prompt that instructs the model to:
+
+1. Read through the entire document package to understand its contents
+2. Identify page ranges that form complete, distinct documents
+3. Match each document segment to one of the defined document types
+4. Record the start and end pages for each identified segment
+
+Example configuration:
+
+```yaml
+classification:
+  classificationMethod: textbasedHolisticClassification
+  model: us.amazon.nova-pro-v1:0
+  task_prompt: >-
+    <task-description>
+    You are a document classification system. Your task is to analyze a document package
+    containing multiple pages and identify distinct document segments, classifying each
+    segment according to the predefined document types provided below.
+    </task-description>
+
+    <document-types>
+    {CLASS_NAMES_AND_DESCRIPTIONS}
+    </document-types>
+
+    <document-boundary-rules>
+    Rules for determining document boundaries:
+    - Content continuity: Pages with continuing paragraphs, numbered sections, or ongoing narratives belong to the same document
+    - Visual consistency: Similar layouts, headers, footers, and styling indicate pages belong together
+    - Logical structure: Documents typically have clear beginning, middle, and end sections
+    - New document indicators: Title pages, cover sheets, or significantly different subject matter signal a new document
+    </document-boundary-rules>
+
+    <<CACHEPOINT>>
+
+    <document-text>
+    {DOCUMENT_TEXT}
+    </document-text>
+```
+
+## Limitations of Text-Based Holistic Classification
+
+Despite its strengths in handling full-document context, this method has several limitations:
+
+**Context & Model Constraints:**:
+- Long documents can exceed the context window of smaller models, resulting in request failure.
+- Lengthy inputs may dilute the model’s focus, leading to inaccurate or inconsistent classifications.
+- Requires high-context models such as Amazon Nova Premier, which supports up to 1 million tokens. Smaller models are not suitable for this method.
+- For more details on supported models and their context limits, refer to the [Amazon Bedrock Supported Models documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html).
+
+**Scalability Challenges**: Not ideal for very large or visually complex document sets. In such cases, the Multi-Modal Page-Level Classification method is more appropriate.
 
 ### Pattern 3: UDOP-Based Classification
 
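
Reviewer note: the holistic flow described in the moved section (fill the prompt placeholders, then record start and end pages per segment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the solution's code. Only the placeholder names `{CLASS_NAMES_AND_DESCRIPTIONS}` and `{DOCUMENT_TEXT}` come from the diff. The helper names, the `[page N]` text markers, and the segment dictionary keys (`type`, `start_page`, `end_page`) are hypothetical.

```python
def build_prompt(template: str, classes: dict, page_texts: list) -> str:
    """Fill the two placeholders shown in the example configuration.

    `classes` maps a class name to its description. `page_texts`
    holds one string per page. The "[page N]" marker is an assumed
    convention so the model can refer back to page numbers.
    """
    class_block = "\n".join(f"{name}: {desc}" for name, desc in classes.items())
    document_text = "\n".join(
        f"[page {i}]\n{text}" for i, text in enumerate(page_texts, start=1)
    )
    # str.replace is used instead of str.format so that any other
    # braces in the template are left untouched.
    return (
        template
        .replace("{CLASS_NAMES_AND_DESCRIPTIONS}", class_block)
        .replace("{DOCUMENT_TEXT}", document_text)
    )


def validate_segments(segments: list, page_count: int, allowed_types: set) -> list:
    """Return a list of problems found in model-reported segments.

    Each segment is assumed to be a dict with "type", "start_page"
    and "end_page". Segments must use known types and cover pages
    1..page_count with no gaps or overlaps.
    """
    problems = []
    expected_start = 1
    for seg in sorted(segments, key=lambda s: s["start_page"]):
        if seg["type"] not in allowed_types:
            problems.append(f"unknown type: {seg['type']}")
        if seg["end_page"] < seg["start_page"]:
            problems.append(f"inverted range at page {seg['start_page']}")
        if seg["start_page"] != expected_start:
            problems.append(f"gap or overlap before page {seg['start_page']}")
        expected_start = seg["end_page"] + 1
    if expected_start != page_count + 1:
        problems.append("segments do not end on the last page")
    return problems
```

A check like `validate_segments` is one way to catch the inconsistent outputs that the limitations list warns about for long inputs, before the segments are passed to later processing.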
