Commit edb73c3

feat: Integrate service tier into all IDP services and configurations
Update all IDP services to read and pass the service_tier parameter to the Bedrock API, and update all default configuration files with service_tier settings.

Service integration:
- OCR: reads service_tier from ocr.service_tier or the global config
- Classification: includes service_tier in the config dict
- Extraction: reads service_tier with fallback logic
- Assessment: reads and passes service_tier to Bedrock
- Granular assessment: propagates service_tier through parallel/sequential processing
- Summarization: includes service_tier in the config dict

Configuration updates:
- Pattern 1: added global and operation-specific service_tier settings
- Pattern 2: added global and operation-specific service_tier settings
- Pattern 3: added global and operation-specific service_tier settings
- All configs include explanatory comments

CLI updates:
- Added --service-tier parameter to the deploy command
- Added --service-tier parameter to the run-inference command
- Validates against: priority, standard, flex

Quality:
- All ruff lint checks passing
- Type hints complete
- Docstrings updated
- Backward compatible
1 parent 2e5a658 commit edb73c3

File tree

16 files changed: +227 −23 lines


.gitignore

Lines changed: 3 additions & 0 deletions
@@ -25,6 +25,9 @@ notebooks/examples/data
 *tmp-dev-assets*
 scratch/
 
+# Service tier implementation artifacts
+service_tier_*.md
+
 # Node.js / npm
 node_modules/
 package-lock.json

config_library/pattern-1/lending-package-sample/config.yaml

Lines changed: 4 additions & 0 deletions
@@ -2,10 +2,14 @@
 # SPDX-License-Identifier: MIT-0
 
 notes: Processing configuration in BDA project.
+# Global service tier setting (priority, standard, flex)
+service_tier: "standard"
 assessment:
+  service_tier: null # null = use global service_tier
   default_confidence_threshold: '0.8'
 summarization:
   enabled: true
+  service_tier: null # null = use global service_tier
   top_p: "0.0"
   max_tokens: '4096'
   top_k: '5'
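The `service_tier: null` convention above means an operation inherits the global tier unless it sets its own value. A minimal sketch of that resolution over a parsed config dict (the helper name and sample values are illustrative, not from the codebase):

```python
from typing import Any, Dict, Optional

def resolve_service_tier(config: Dict[str, Any], operation: str) -> Optional[str]:
    """Return the operation-level tier if set, else fall back to the global tier."""
    op_tier = (config.get(operation) or {}).get("service_tier")
    # YAML `null` parses to None, which falls through to the global setting
    return op_tier if op_tier is not None else config.get("service_tier")

config = {
    "service_tier": "standard",
    "assessment": {"service_tier": None},      # inherits the global tier
    "summarization": {"service_tier": "flex"}, # operation-level override
}

print(resolve_service_tier(config, "assessment"))
print(resolve_service_tier(config, "summarization"))
```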

config_library/pattern-2/lending-package-sample/config.yaml

Lines changed: 8 additions & 0 deletions
@@ -1,9 +1,13 @@
 # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
 # SPDX-License-Identifier: MIT-0
 notes: Default settings for lending-package-sample configuration
+# Global service tier setting (priority, standard, flex)
+# This applies to all operations unless overridden at operation level
+service_tier: "standard"
 ocr:
   backend: "textract" # Default to Textract for backward compatibility
   model_id: "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
+  service_tier: null # null = use global service_tier
   system_prompt: "You are an expert OCR system. Extract all text from the provided image accurately, preserving layout where possible."
   task_prompt: "Extract all text from this document image. Preserve the layout, including paragraphs, tables, and formatting."
   features:

@@ -1189,6 +1193,7 @@ classification:
   classificationMethod: multimodalPageLevelClassification
   maxPagesForClassification: "ALL"
   sectionSplitting: llm_determined
+  service_tier: null # null = use global service_tier
   image:
     target_height: ""
     target_width: ""

@@ -1250,6 +1255,7 @@ classification:
     4. Outputting in the exact JSON format specified in <output-format>
     </final-instructions>
 extraction:
+  service_tier: null # null = use global service_tier
   agentic:
     enabled: false
     review_agent: false

@@ -1351,6 +1357,7 @@ extraction:
   You are a document assistant. Respond only with JSON. Never make up data, only provide data found in the document being provided.
 summarization:
   enabled: true
+  service_tier: null # null = use global service_tier
   top_p: "0.0"
   max_tokens: "4096"
   top_k: "5"

@@ -1425,6 +1432,7 @@ summarization:
   You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
 assessment:
   enabled: true
+  service_tier: null # null = use global service_tier
   validation_enabled: false
   image:
     target_height: ""

config_library/pattern-3/rvl-cdip-package-sample/config.yaml

Lines changed: 7 additions & 0 deletions
@@ -2,9 +2,12 @@
 # SPDX-License-Identifier: MIT-0
 
 notes: Default settings
+# Global service tier setting (priority, standard, flex)
+service_tier: "standard"
 ocr:
   backend: "textract" # Default to Textract for backward compatibility
   model_id: "us.anthropic.claude-3-7-sonnet-20250219-v1:0"
+  service_tier: null # null = use global service_tier
   system_prompt: "You are an expert OCR system. Extract all text from the provided image accurately, preserving layout where possible."
   task_prompt: "Extract all text from this document image. Preserve the layout, including paragraphs, tables, and formatting."
   features:

@@ -765,7 +768,9 @@ classes:
   labeled 'notes', 'remarks', or 'comments'.
 classification:
   model: Custom fine tuned UDOP model
+  service_tier: null # null = use global service_tier (UDOP doesn't use Bedrock, but kept for consistency)
 extraction:
+  service_tier: null # null = use global service_tier
   image:
     target_width: ""
     target_height: ""

@@ -864,6 +869,7 @@ extraction:
   You are a document assistant. Respond only with JSON. Never make up data, only provide data found in the document being provided.
 summarization:
   enabled: true
+  service_tier: null # null = use global service_tier
   top_p: "0.0"
   max_tokens: "4096"
   top_k: "5"

@@ -926,6 +932,7 @@ summarization:
   You are a document summarization expert who can analyze and summarize documents from various domains including medical, financial, legal, and general business documents. Your task is to create a summary that captures the key information, main points, and important details from the document. Your output must be in valid JSON format. \nSummarization Style: Balanced\\nCreate a balanced summary that provides a moderate level of detail. Include the main points and key supporting information, while maintaining the document's overall structure. Aim for a comprehensive yet concise summary.\n Your output MUST be in valid JSON format with markdown content. You MUST strictly adhere to the output format specified in the instructions.
 assessment:
   enabled: true
+  service_tier: null # null = use global service_tier
   image:
     target_height: ""
     target_width: ""

idp_cli/idp_cli/cli.py

Lines changed: 13 additions & 0 deletions
@@ -198,6 +198,12 @@ def cli():
     "--custom-config",
     help="Path to local config file or S3 URI (e.g., ./config.yaml or s3://bucket/config.yaml)",
 )
+@click.option(
+    "--service-tier",
+    type=click.Choice(["priority", "standard", "flex"]),
+    default="standard",
+    help="Service tier for Bedrock API calls (default: standard)",
+)
 @click.option("--parameters", help="Additional parameters as key=value,key2=value2")
 @click.option("--wait", is_flag=True, help="Wait for stack creation to complete")
 @click.option(

@@ -215,6 +221,7 @@ def deploy(
     enable_hitl: str,
     pattern_config: Optional[str],
     custom_config: Optional[str],
+    service_tier: str,
     parameters: Optional[str],
     wait: bool,
     no_rollback: bool,

@@ -915,6 +922,11 @@ def rerun_inference(
     type=int,
     help="Seconds between status checks (default: 5)",
 )
+@click.option(
+    "--service-tier",
+    type=click.Choice(["priority", "standard", "flex"]),
+    help="Service tier for Bedrock API calls (overrides configuration)",
+)
 @click.option("--region", help="AWS region (optional)")
 def run_inference(
     stack_name: str,

@@ -928,6 +940,7 @@ def run_inference(
     batch_prefix: str,
     monitor: bool,
     refresh_interval: int,
+    service_tier: Optional[str],
     region: Optional[str],
 ):
     """
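The `click.Choice` above rejects anything outside the three tiers at parse time. A dependency-free sketch of the same check (`validate_service_tier` is a hypothetical stand-in, not the actual CLI code):

```python
# Tiers accepted by the (real) --service-tier option
VALID_TIERS = ("priority", "standard", "flex")

def validate_service_tier(value: str) -> str:
    """Reject values outside the allowed set, mimicking click.Choice behavior."""
    if value not in VALID_TIERS:
        raise ValueError(
            f"Invalid value for '--service-tier': {value!r} is not one of {VALID_TIERS}."
        )
    return value

print(validate_service_tier("flex"))
```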

lib/idp_common_pkg/idp_common/assessment/granular_service.py

Lines changed: 12 additions & 0 deletions
@@ -745,6 +745,7 @@ def _process_assessment_task(
     top_k: float,
     top_p: float,
     max_tokens: Optional[int],
+    service_tier: Optional[str] = None,
 ) -> AssessmentResult:
     """
     Process a single assessment task.

@@ -759,6 +760,7 @@ def _process_assessment_task(
         top_k: Top-k parameter
         top_p: Top-p parameter
         max_tokens: Max tokens parameter
+        service_tier: Service tier for Bedrock API
 
     Returns:
         Assessment result

@@ -785,6 +787,7 @@ def _process_assessment_task(
         top_p=top_p,
         max_tokens=max_tokens,
         context="GranularAssessment",
+        service_tier=service_tier,
     )
 
     # Extract text from response

@@ -1584,6 +1587,13 @@ def process_document_section(self, document: Document, section_id: str) -> Docum
         max_tokens = self.config.assessment.max_tokens
         system_prompt = self.config.assessment.system_prompt
 
+        # Get service tier from config (operation-specific or global)
+        service_tier = None
+        if hasattr(self.config.assessment, "service_tier"):
+            service_tier = self.config.assessment.service_tier
+        if not service_tier and hasattr(self.config, "service_tier"):
+            service_tier = self.config.service_tier
+
         # Get schema for this document class
         class_schema = self._get_class_schema(class_label)
         if not class_schema:

@@ -1669,6 +1679,7 @@ def process_document_section(self, document: Document, section_id: str) -> Docum
                 top_k,
                 top_p,
                 max_tokens,
+                service_tier,
             ): task
             for task in tasks_to_process
         }

@@ -1721,6 +1732,7 @@ def process_document_section(self, document: Document, section_id: str) -> Docum
                 top_k,
                 top_p,
                 max_tokens,
+                service_tier,
             )
             all_task_results.append(result)

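The `hasattr` fallback added above can be exercised in isolation. This sketch uses `SimpleNamespace` to stand in for the real config objects, which are not shown in the diff:

```python
from types import SimpleNamespace
from typing import Optional

def get_service_tier(config) -> Optional[str]:
    """Operation-specific tier wins; otherwise fall back to the global setting."""
    service_tier = None
    if hasattr(config.assessment, "service_tier"):
        service_tier = config.assessment.service_tier
    if not service_tier and hasattr(config, "service_tier"):
        service_tier = config.service_tier
    return service_tier

cfg = SimpleNamespace(
    service_tier="standard",
    assessment=SimpleNamespace(service_tier=None),  # YAML null -> None
)
print(get_service_tier(cfg))  # operation tier unset, so the global value wins

cfg.assessment.service_tier = "priority"
print(get_service_tier(cfg))  # operation-level override takes effect
```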
lib/idp_common_pkg/idp_common/assessment/service.py

Lines changed: 8 additions & 0 deletions
@@ -852,6 +852,13 @@ def process_document_section(self, document: Document, section_id: str) -> Docum
         # Time the model invocation
         request_start_time = time.time()
 
+        # Get service tier from config (operation-specific or global)
+        service_tier = None
+        if hasattr(self.config.assessment, "service_tier"):
+            service_tier = self.config.assessment.service_tier
+        if not service_tier and hasattr(self.config, "service_tier"):
+            service_tier = self.config.service_tier
+
         # Invoke Bedrock with the common library
         response_with_metering = bedrock.invoke_model(
             model_id=model_id,

@@ -862,6 +869,7 @@ def process_document_section(self, document: Document, section_id: str) -> Docum
             top_p=top_p,
             max_tokens=max_tokens,
             context="Assessment",
+            service_tier=service_tier,
         )
 
         total_duration = time.time() - request_start_time

lib/idp_common_pkg/idp_common/bedrock/client.py

Lines changed: 36 additions & 11 deletions
@@ -8,21 +8,21 @@
 with built-in retry logic, metrics tracking, and configuration options.
 """
 
-import boto3
+import copy
 import json
-import os
-import time
 import logging
-import copy
+import os
 import random
-import socket
-from typing import Dict, Any, List, Optional, Union, Tuple, Type
+import time
+from typing import Any, Dict, List, Optional, Union
+
+import boto3
 from botocore.config import Config
 from botocore.exceptions import (
     ClientError,
-    ReadTimeoutError,
     ConnectTimeoutError,
     EndpointConnectionError,
+    ReadTimeoutError,
 )
 from urllib3.exceptions import ReadTimeoutError as Urllib3ReadTimeoutError
 
@@ -42,9 +42,11 @@ class _RequestsConnectTimeout(Exception):
 
 try:
     from requests.exceptions import (
-        ReadTimeout as RequestsReadTimeout,
         ConnectTimeout as RequestsConnectTimeout,
     )
+    from requests.exceptions import (
+        ReadTimeout as RequestsReadTimeout,
+    )
 except ImportError:
     # Fallback if requests is not available - use dummy exception classes
     RequestsReadTimeout = _RequestsReadTimeout  # type: ignore[misc,assignment]

@@ -87,6 +89,7 @@ class _RequestsConnectTimeout(Exception):
     "eu.amazon.nova-2-lite-v1:0",
 ]
 
+
 class BedrockClient:
     """Client for interacting with Amazon Bedrock models."""
 
@@ -139,6 +142,7 @@ def __call__(
         max_tokens: Optional[Union[int, str]] = None,
         max_retries: Optional[int] = None,
         context: str = "Unspecified",
+        service_tier: Optional[str] = None,
     ) -> Dict[str, Any]:
         """
         Make the instance callable with the same signature as the original function.

@@ -154,6 +158,7 @@ def __call__(
             top_p: Optional top_p parameter (float or string)
             max_tokens: Optional max_tokens parameter (int or string)
             max_retries: Optional override for the instance's max_retries setting
+            service_tier: Optional service tier (priority, standard, flex)
 
         Returns:
             Bedrock response object with metering information

@@ -173,6 +178,7 @@ def __call__(
             max_tokens=max_tokens,
             max_retries=effective_max_retries,
             context=context,
+            service_tier=service_tier,
         )
 
     def _preprocess_content_for_cachepoint(

@@ -264,6 +270,7 @@ def invoke_model(
         max_tokens: Optional[Union[int, str]] = None,
         max_retries: Optional[int] = None,
         context: str = "Unspecified",
+        service_tier: Optional[str] = None,
     ) -> Dict[str, Any]:
         """
         Invoke a Bedrock model with retry logic.

@@ -277,6 +284,7 @@ def invoke_model(
             top_p: Optional top_p parameter (float or string)
             max_tokens: Optional max_tokens parameter (int or string)
             max_retries: Optional override for the instance's max_retries setting
+            service_tier: Optional service tier (priority, standard, flex)
 
         Returns:
             Bedrock response object with metering information

@@ -368,9 +376,7 @@ def invoke_model(
             inference_config["topP"] = top_p
             # Remove temperature when using top_p to avoid conflicts
             del inference_config["temperature"]
-            logger.debug(
-                f"Using top_p={top_p} for inference (temperature ignored)"
-            )
+            logger.debug(f"Using top_p={top_p} for inference (temperature ignored)")
         else:
             logger.debug(
                 f"Using temperature={temperature} for inference (top_p is 0 or None)"

@@ -438,6 +444,20 @@ def invoke_model(
         if not additional_model_fields:
             additional_model_fields = None
 
+        # Normalize and validate service tier
+        normalized_service_tier = None
+        if service_tier:
+            tier_lower = service_tier.lower().strip()
+            if tier_lower in ["priority", "flex"]:
+                normalized_service_tier = tier_lower
+            elif tier_lower in ["standard", "default"]:
+                normalized_service_tier = "default"
+            else:
+                logger.warning(
+                    f"Invalid service_tier value '{service_tier}'. "
+                    f"Valid values are: priority, standard, flex. Using default tier."
+                )
+
         # Get guardrail configuration if available
         guardrail_config = self.get_guardrail_config()
 
@@ -450,6 +470,11 @@ def invoke_model(
             "additionalModelRequestFields": additional_model_fields,
         }
 
+        # Add service tier if specified
+        if normalized_service_tier:
+            converse_params["serviceTier"] = normalized_service_tier
+            logger.info(f"Using service tier: {normalized_service_tier}")
+
         # Add guardrail config if available
         if guardrail_config:
             converse_params["guardrailConfig"] = guardrail_config
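Note that the client maps `standard` to the Converse API's `default` tier and drops invalid values after a warning rather than failing. A standalone version of that normalization, extracted for illustration (the free function name is ours; in the diff this logic is inline in `invoke_model`):

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)

def normalize_service_tier(service_tier: Optional[str]) -> Optional[str]:
    """Mirror the inline normalization: case-insensitive, whitespace-tolerant,
    'standard' mapped to 'default', unknown values dropped with a warning."""
    if not service_tier:
        return None
    tier_lower = service_tier.lower().strip()
    if tier_lower in ("priority", "flex"):
        return tier_lower
    if tier_lower in ("standard", "default"):
        return "default"
    logger.warning(
        "Invalid service_tier value '%s'. Valid values are: "
        "priority, standard, flex. Using default tier.",
        service_tier,
    )
    return None

print(normalize_service_tier("Standard"))  # -> default
print(normalize_service_tier(" flex "))    # -> flex
```

Because the result is only added to `converse_params` when truthy, an invalid tier silently falls back to the account's default tier instead of raising.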

lib/idp_common_pkg/idp_common/classification/service.py

Lines changed: 10 additions & 0 deletions
@@ -594,6 +594,10 @@ def _get_classification_config(self) -> Dict[str, Any]:
             "max_tokens": self.config.classification.max_tokens,
         }
 
+        # Add service tier (operation-specific or global)
+        if hasattr(self.config.classification, "service_tier"):
+            config["service_tier"] = self.config.classification.service_tier
+
         # Validate system prompt
         system_prompt = self.config.classification.system_prompt
         if not system_prompt:

@@ -1222,6 +1226,11 @@ def _invoke_bedrock_model(
         Returns:
             Dictionary with response and metering data
         """
+        # Get service tier from config (operation-specific or global)
+        service_tier = config.get("service_tier")
+        if not service_tier and hasattr(self.config, "service_tier"):
+            service_tier = self.config.service_tier
+
         return bedrock.invoke_model(
             model_id=config["model_id"],
             system_prompt=config["system_prompt"],

@@ -1231,6 +1240,7 @@ def _invoke_bedrock_model(
             top_p=config["top_p"],
             max_tokens=config["max_tokens"],
             context="Classification",
+            service_tier=service_tier,
         )
 
     def _create_unclassified_result(
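In the classification service the per-operation value comes from a plain dict while the global fallback is attribute-based. One subtlety: the `if not service_tier` check treats an empty string the same as `None`, so both fall back to the global tier. A small illustrative sketch (names are ours, not the service's API):

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

def resolve_tier(op_config: Dict[str, Any], global_config) -> Optional[str]:
    """Dict-level tier first; any falsy value defers to the global attribute."""
    service_tier = op_config.get("service_tier")
    if not service_tier and hasattr(global_config, "service_tier"):
        service_tier = global_config.service_tier
    return service_tier

global_cfg = SimpleNamespace(service_tier="standard")
print(resolve_tier({"service_tier": None}, global_cfg))        # global fallback
print(resolve_tier({"service_tier": ""}, global_cfg))          # empty string also falls back
print(resolve_tier({"service_tier": "priority"}, global_cfg))  # override wins
```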
