Commit 387f694

ybalbert001 and Yuanbo Li authored
[chore] support two types of cross-region inference (global, geographic) (#2146)
Co-authored-by: Yuanbo Li <ybalbert@amazon.com>
1 parent 0a22005 commit 387f694

10 files changed, +78 -64 lines changed
models/bedrock/manifest.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-version: 0.0.53
+version: 0.0.54
 type: plugin
 author: langgenius
 name: bedrock

models/bedrock/models/llm/amazon-nova.yaml

Lines changed: 10 additions & 6 deletions
@@ -29,14 +29,18 @@ parameter_rules:
       - Nova Micro
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: true
+    default: geographic
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: system_cache_checkpoint
     label:
       zh_Hans: 缓存系统提示词
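
The `cross-region` parameter changes here from a boolean to a string with three options. If a configuration saved before this commit still carries `true` or `false` (an assumption about stored data, not something the diff shows), the new check `cross_region in ('geographic', 'global')` would treat both values as disabled. A minimal sketch of one way to map legacy values onto the new options; `normalize_cross_region` is a hypothetical helper and is not part of this commit:

```python
# Hypothetical helper, not part of the commit: maps legacy boolean values
# of the 'cross-region' parameter onto the new string options.
VALID_MODES = ("disabled", "geographic", "global")


def normalize_cross_region(value) -> str:
    """Return one of 'disabled', 'geographic', 'global'."""
    # Legacy boolean: True used to enable cross-region inference.
    # Mapping True to 'geographic' is an assumption made for this sketch.
    if isinstance(value, bool):
        return "geographic" if value else "disabled"
    if value in VALID_MODES:
        return value
    raise ValueError(f"Unsupported cross-region value: {value!r}")


print(normalize_cross_region(True))
print(normalize_cross_region("global"))
```

Choosing `geographic` for a legacy `true` keeps requests inside the caller's geography, which is the more conservative of the two enabled modes.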

models/bedrock/models/llm/anthropic-claude.yaml

Lines changed: 12 additions & 7 deletions
@@ -23,8 +23,9 @@ parameter_rules:
       zh_Hans: 指定模型名称
       en_US: specify model name
     required: true
-    default: Claude 3.7 Sonnet
+    default: Claude 4.5 Sonnet
     options:
+      - Claude 4.5 Opus
       - Claude 4.5 Haiku
       - Claude 4.5 Sonnet
       - Claude 4.0 Sonnet
@@ -39,14 +40,18 @@ parameter_rules:
       - Claude 3 Opus
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: true
+    default: global
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: system_cache_checkpoint
     label:
       zh_Hans: 缓存系统提示词

models/bedrock/models/llm/cohere.yaml

Lines changed: 10 additions & 6 deletions
@@ -28,14 +28,18 @@ parameter_rules:
       - Command Light
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: true
+    default: geographic
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: max_new_tokens
     use_template: max_tokens
     required: true

models/bedrock/models/llm/deepseek.yaml

Lines changed: 10 additions & 6 deletions
@@ -27,14 +27,18 @@ parameter_rules:
       - DeepSeek V3.1
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: false
+    default: disabled
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: max_tokens
     use_template: max_tokens
     required: true

models/bedrock/models/llm/llm.py

Lines changed: 4 additions & 20 deletions
@@ -291,20 +291,12 @@ def _get_model_info(self, model: str, credentials: dict, model_parameters: dict)
         # Get region prefix for model ID construction
         region_name = credentials['aws_region']
         region_prefix = None
+        cross_region = model_parameters.pop('cross-region', 'disabled')
 
-        if model_parameters.pop('cross-region', False):
+        if cross_region in ('geographic', 'global'):
             # Cross-region inference enabled
-            # Check if the model supports global prefix (currently mainly Claude 4 series)
-            supports_global = any(model_id.startswith(prefix) for prefix in [
-                'anthropic.claude-sonnet-4', 'anthropic.claude-sonnet-4-5'
-            ])
-
-            if supports_global:
-                # Prefer using global prefix
-                region_prefix = model_ids.get_region_area(region_name, prefer_global=True)
-            else:
-                # Use traditional regional prefix
-                region_prefix = model_ids.get_region_area(region_name, prefer_global=False)
+            prefer_global = (cross_region == 'global')
+            region_prefix = model_ids.get_region_area(region_name, prefer_global=prefer_global)
 
             if not region_prefix:
                 raise InvokeError(f'Failed to get cross-region inference prefix for region {region_name}')
@@ -313,14 +305,6 @@ def _get_model_info(self, model: str, credentials: dict, model_parameters: dict)
                 raise InvokeError(f"Model {model_id} doesn't support cross-region inference")
 
             model_id = "{}.{}".format(region_prefix, model_id)
-        elif model_ids.is_support_cross_region(model_id):
-            # Cross-region inference not enabled, but still add region prefix for all models
-            region_prefix = model_ids.get_region_area(region_name, prefer_global=False)
-
-            if not region_prefix:
-                raise InvokeError(f'Failed to get region prefix for region {region_name}')
-
-            model_id = "{}.{}".format(region_prefix, model_id)
 
 
         model_info = BedrockLargeLanguageModel._find_model_info(model_id)
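
The effect of this change on the final model ID can be sketched in isolation. The block below is a simplified stand-in, not the plugin's code: `get_region_area` here is a hypothetical replacement for `model_ids.get_region_area` (whose body is not shown in this diff), the region-to-prefix table is assumed, and the `is_support_cross_region` check from the real method is omitted.

```python
from typing import Optional

# Assumed mapping from the first segment of an AWS region name to a
# geographic inference prefix. The real table lives in model_ids.py.
ASSUMED_AREAS = {"us": "us", "eu": "eu", "ap": "apac"}


def get_region_area(region_name: str, prefer_global: bool = False) -> Optional[str]:
    """Hypothetical stand-in for model_ids.get_region_area."""
    if prefer_global:
        return "global"
    return ASSUMED_AREAS.get(region_name.split("-")[0])


def resolve_model_id(model_id: str, region_name: str, cross_region: str = "disabled") -> str:
    """Prefix the model ID according to the selected cross-region mode."""
    if cross_region not in ("geographic", "global"):
        # 'disabled' (or any unknown value): use the plain model ID.
        return model_id
    prefix = get_region_area(region_name, prefer_global=(cross_region == "global"))
    if not prefix:
        raise ValueError(f"No cross-region prefix for region {region_name}")
    return "{}.{}".format(prefix, model_id)


base = "anthropic.claude-sonnet-4-5-20250929-v1:0"
for mode in ("disabled", "geographic", "global"):
    print(mode, "->", resolve_model_id(base, "us-east-1", mode))
```

Under these assumptions, `disabled` leaves the ID untouched, which matches the removal of the old `elif` branch that added a regional prefix even when cross-region inference was off.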

models/bedrock/models/llm/meta-llama.yaml

Lines changed: 10 additions & 6 deletions
@@ -30,14 +30,18 @@ parameter_rules:
       - Llama 3.2 90B Instruct
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: true
+    default: geographic
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: temperature
     use_template: temperature
 

models/bedrock/models/llm/model_ids.py

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
 
 BEDROCK_MODEL_IDS = {
     'anthropic claude': {
+        'Claude 4.5 Opus': 'anthropic.claude-opus-4-5-20251101-v1:0',
         'Claude 4.5 Haiku': 'anthropic.claude-haiku-4-5-20251001-v1:0',
         'Claude 4.5 Sonnet': 'anthropic.claude-sonnet-4-5-20250929-v1:0',
         'Claude 4.0 Sonnet': 'anthropic.claude-sonnet-4-20250514-v1:0',
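
For orientation, the table above maps the display names offered in `anthropic-claude.yaml` to Bedrock model IDs. A minimal sketch of such a lookup, using only the four entries visible in this hunk; `lookup_model_id` is a hypothetical helper name, and the plugin's actual accessor is not shown in the diff:

```python
# Excerpt of the table as it appears in this hunk; the full dictionary
# in model_ids.py contains more families and models.
BEDROCK_MODEL_IDS = {
    'anthropic claude': {
        'Claude 4.5 Opus': 'anthropic.claude-opus-4-5-20251101-v1:0',
        'Claude 4.5 Haiku': 'anthropic.claude-haiku-4-5-20251001-v1:0',
        'Claude 4.5 Sonnet': 'anthropic.claude-sonnet-4-5-20250929-v1:0',
        'Claude 4.0 Sonnet': 'anthropic.claude-sonnet-4-20250514-v1:0',
    },
}


def lookup_model_id(model_type: str, model_name: str):
    """Hypothetical helper: return the model ID, or None if unknown."""
    return BEDROCK_MODEL_IDS.get(model_type, {}).get(model_name)


print(lookup_model_id('anthropic claude', 'Claude 4.5 Opus'))
```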

models/bedrock/models/llm/openai.yaml

Lines changed: 10 additions & 6 deletions
@@ -26,14 +26,18 @@ parameter_rules:
       - GPT OSS 20B
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: false
+    default: disabled
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: max_tokens
     use_template: max_tokens
     required: true

models/bedrock/models/llm/qwen.yaml

Lines changed: 10 additions & 6 deletions
@@ -28,14 +28,18 @@ parameter_rules:
       - Qwen3 Coder 30B
   - name: cross-region
     label:
-      zh_Hans: 使用跨区域推理
-      en_US: Use Cross-Region Inference
-    type: boolean
+      zh_Hans: 跨区域推理
+      en_US: Cross-Region Inference
+    type: string
     required: true
-    default: false
+    default: disabled
+    options:
+      - disabled
+      - geographic
+      - global
     help:
-      zh_Hans: 跨区域推理会自动选择您所在地理区域 AWS 区域 内的最佳位置来处理您的推理请求
-      en_US: Cross-Region inference automatically selects the optimal AWS Region within your geography to process your inference request.
+      zh_Hans: 跨区域推理会自动选择 AWS 区域内的最佳位置来处理您的推理请求。地理跨区域限于您所在地理区域,全球跨区域可跨所有区域
+      en_US: Cross-Region inference automatically selects the optimal AWS Region to process your inference request. Geographic limits to your geography, Global spans all regions.
   - name: max_tokens
     use_template: max_tokens
     required: true
