Commit 87770ef

feat(inferenceprofiles): add inference and cross-region inference pro… (#35048)
### Issue # (if applicable)

Closes #<issue number here>.

### Reason for this change

This PR introduces comprehensive support for Amazon Bedrock Inference Profiles in the AWS CDK Bedrock Alpha construct library, addressing the need for better cost tracking, model usage optimization, and cross-region inference capabilities.

### Description of changes

1. **Application Inference Profiles**: Added support for user-defined inference profiles that enable cost tracking and model usage monitoring
   - Single-region application profiles for basic cost tracking
   - Multi-region application profiles using cross-region inference profiles
2. **Cross-Region Inference Profiles**: Implemented system-defined profiles that enable seamless traffic distribution across multiple AWS regions
   - Support for handling unplanned traffic bursts
   - Enhanced resilience during peak demand periods
   - Geographic region-based routing (US, EU regions)
3. **Prompt Routers**: Added intelligent prompt routing capabilities

### Describe any new or updated permissions being added

Implemented `grantProfileUsage()` method for proper IAM permission handling

- Support for granting inference profile usage to other AWS resources
- Proper IAM policy generation for profile access

### Description of how you validated changes

- Added unit test
- Added integ test
- And tested it with a cdkApp deployment.

### Checklist

- [Y] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
1 parent 30f0326 commit 87770ef

21 files changed, +3885 -6 lines changed

packages/@aws-cdk/aws-bedrock-alpha/README.md

Lines changed: 205 additions & 0 deletions
@@ -43,6 +43,12 @@ This construct library facilitates the deployment of Bedrock Agents, enabling yo
- [Prompt Properties](#prompt-properties)
- [Prompt Version](#prompt-version)
- [Import Methods](#import-methods)
- [Inference Profiles](#inference-profiles)
- [Using Inference Profiles](#using-inference-profiles)
- [Types of Inference Profiles](#types-of-inference-profiles)
- [Prompt Routers](#prompt-routers)
- [Inference Profile Permissions](#inference-profile-permissions)
- [Inference Profiles Import Methods](#inference-profiles-import-methods)

## Agents

@@ -807,3 +813,202 @@ const importedPrompt = bedrock.Prompt.fromPromptAttributes(this, 'ImportedPrompt
  promptVersion: '1', // optional, defaults to 'DRAFT'
});
```

## Inference Profiles

Amazon Bedrock Inference Profiles provide a way to manage and optimize inference configurations for your foundation models. They allow you to define reusable configurations that can be applied across different prompts and agents.

### Using Inference Profiles

Inference profiles can be used with prompts and agents to maintain consistent inference configurations across your application.

#### With Agents

```ts fixture=default
// Create a cross-region inference profile
const crossRegionProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
  geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
  model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0,
});

// Use the cross-region profile with an agent
const agent = new bedrock.Agent(this, 'Agent', {
  foundationModel: crossRegionProfile,
  instruction: 'You are a helpful and friendly agent that answers questions about agriculture.',
});
```

#### With Prompts

```ts fixture=default
// Create a prompt router for intelligent model selection
const promptRouter = bedrock.PromptRouter.fromDefaultId(
  bedrock.DefaultPromptRouterIdentifier.ANTHROPIC_CLAUDE_V1,
  'us-east-1'
);

// Use the prompt router with a prompt variant
const variant = bedrock.PromptVariant.text({
  variantName: 'variant1',
  promptText: 'What is the capital of France?',
  model: promptRouter,
});

new bedrock.Prompt(this, 'Prompt', {
  promptName: 'prompt-router-test',
  variants: [variant],
});
```

### Types of Inference Profiles

Amazon Bedrock offers two types of inference profiles:

#### Application Inference Profiles

Application inference profiles are user-defined profiles that help you track costs and model usage. They can be created for a single region or for multiple regions using a cross-region inference profile.

##### Single Region Application Profile

```ts fixture=default
// Create an application inference profile for one Region
const appProfile = new bedrock.ApplicationInferenceProfile(this, 'MyApplicationProfile', {
  applicationInferenceProfileName: 'claude-3-sonnet-v1',
  modelSource: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_SONNET_V1_0,
  description: 'Application profile for cost tracking',
  tags: {
    Environment: 'Production',
  },
});
```

##### Multi-Region Application Profile

```ts fixture=default
// Create a cross-region inference profile
const crossRegionProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
  geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
  model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0,
});

// Create an application inference profile across regions
const appProfile = new bedrock.ApplicationInferenceProfile(this, 'MyMultiRegionProfile', {
  applicationInferenceProfileName: 'claude-35-sonnet-v2-multi-region',
  modelSource: crossRegionProfile,
  description: 'Multi-region application profile for cost tracking',
});
```

#### System Defined Inference Profiles

Cross-region inference lets you absorb unplanned traffic bursts by using compute across different AWS Regions. Distributing traffic across multiple Regions gives you higher throughput and better resilience during periods of peak demand.

Before using a `CrossRegionInferenceProfile`, ensure that you have access to the models and Regions defined in the inference profile. For instance, if you use the system defined inference profile `us.anthropic.claude-3-5-sonnet-20241022-v2:0`, inference requests will be routed to US East (Virginia) `us-east-1`, US East (Ohio) `us-east-2`, and US West (Oregon) `us-west-2`. You therefore need model access enabled in each of those Regions for the model `anthropic.claude-3-5-sonnet-20241022-v2:0`.

##### System Defined Profile Configuration

```ts fixture=default
const crossRegionProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
  geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
  model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V2_0,
});
```
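
The example profile named in this section, `us.anthropic.claude-3-5-sonnet-20241022-v2:0`, is the model ID with a geographic prefix in front of it. The following is a minimal sketch of that naming pattern in plain TypeScript, assuming the same pattern holds for the `eu` prefix. `crossRegionProfileId` and `baseModelId` are hypothetical helpers for illustration and are not part of the construct library.

```typescript
// Geographic prefixes mentioned in this change (US and EU).
// Treating them as the complete list is an assumption of this sketch.
const GEO_PREFIXES = ['us', 'eu'] as const;
type GeoPrefix = (typeof GEO_PREFIXES)[number];

// Hypothetical helper: compose a system-defined profile ID
// by prepending the geographic prefix to the model ID.
function crossRegionProfileId(geo: GeoPrefix, modelId: string): string {
  return `${geo}.${modelId}`;
}

// Hypothetical helper: strip a known geographic prefix to recover the model ID.
// IDs without a known prefix are returned unchanged.
function baseModelId(profileId: string): string {
  for (const geo of GEO_PREFIXES) {
    const prefix = `${geo}.`;
    if (profileId.indexOf(prefix) === 0) {
      return profileId.slice(prefix.length);
    }
  }
  return profileId;
}

const exampleProfileId = crossRegionProfileId('us', 'anthropic.claude-3-5-sonnet-20241022-v2:0');
// exampleProfileId is 'us.anthropic.claude-3-5-sonnet-20241022-v2:0'
```

Recovering the model ID this way can be handy when checking that model access is enabled for the underlying model in every Region the profile routes to.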

### Prompt Routers

Amazon Bedrock intelligent prompt routing provides a single serverless endpoint for routing requests between different foundation models within the same model family, helping you optimize for response quality and cost. For each request, it predicts the performance of each model and dynamically routes the request to the model it predicts is most likely to give the desired response at the lowest cost.

#### Default and Custom Prompt Routers

```ts fixture=default
// Use a default prompt router
const variant = bedrock.PromptVariant.text({
  variantName: 'variant1',
  promptText: 'What is the capital of France?',
  model: bedrock.PromptRouter.fromDefaultId(
    bedrock.DefaultPromptRouterIdentifier.ANTHROPIC_CLAUDE_V1,
    'us-east-1'
  ),
});

new bedrock.Prompt(this, 'Prompt', {
  promptName: 'prompt-router-test',
  variants: [variant],
});
```
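
To make the routing idea in this section concrete, here is a toy sketch of a "cheapest model that is good enough" decision. It is not Bedrock's routing algorithm, which is not described in this README. The `ModelEstimate` shape, the `tolerance` parameter, the scores, and the model names are all hypothetical and exist only for illustration.

```typescript
// Hypothetical per-request estimate for one candidate model.
interface ModelEstimate {
  modelId: string;
  predictedQuality: number; // assumed score, higher is better
  relativeCost: number;     // assumed cost unit, lower is cheaper
}

// Toy rule: among models whose predicted quality is within `tolerance`
// of the best prediction, pick the cheapest one.
function routePrompt(estimates: ModelEstimate[], tolerance: number): ModelEstimate {
  let best = -Infinity;
  for (const e of estimates) {
    if (e.predictedQuality > best) {
      best = e.predictedQuality;
    }
  }

  let chosen: ModelEstimate | undefined;
  for (const e of estimates) {
    if (e.predictedQuality < best - tolerance) {
      continue; // predicted to be noticeably worse than the best candidate
    }
    if (chosen === undefined || e.relativeCost < chosen.relativeCost) {
      chosen = e;
    }
  }

  if (chosen === undefined) {
    throw new Error('at least one candidate model is required');
  }
  return chosen;
}

const candidates: ModelEstimate[] = [
  { modelId: 'small-model', predictedQuality: 0.8, relativeCost: 1 },
  { modelId: 'large-model', predictedQuality: 0.85, relativeCost: 5 },
];

const relaxed = routePrompt(candidates, 0.1);  // small-model is close enough and cheaper
const strict = routePrompt(candidates, 0.01);  // only large-model meets the quality bar
```

The tolerance captures the trade-off the section describes: a wider tolerance favors cost, a narrower one favors response quality.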
939+
940+
### Inference Profile Permissions
941+
942+
Use the `grantProfileUsage` method to grant appropriate permissions to resources that need to use the inference profile.
943+
944+
#### Granting Profile Usage Permissions
945+
946+
```ts fixture=default
// Create an application inference profile
const profile = new bedrock.ApplicationInferenceProfile(this, 'MyProfile', {
  applicationInferenceProfileName: 'my-profile',
  modelSource: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0,
});

// Create a Lambda function
const lambdaFunction = new lambda.Function(this, 'MyFunction', {
  runtime: lambda.Runtime.PYTHON_3_11,
  handler: 'index.handler',
  code: lambda.Code.fromInline('def handler(event, context): return "Hello"'),
});

// Grant the Lambda function permission to use the inference profile
profile.grantProfileUsage(lambdaFunction);

// Use a system defined inference profile
const crossRegionProfile = bedrock.CrossRegionInferenceProfile.fromConfig({
  geoRegion: bedrock.CrossRegionInferenceProfileRegion.US,
  model: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0,
});

// Grant permissions to use the cross-region inference profile
crossRegionProfile.grantProfileUsage(lambdaFunction);
```

The `grantProfileUsage` method adds the necessary IAM permissions to the resource, allowing it to use the inference profile. This includes permissions for the `bedrock:GetInferenceProfile` and `bedrock:ListInferenceProfiles` actions on the inference profile resource.
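
As a rough illustration, the resulting permission can be pictured as an IAM statement scoped to the profile's ARN. The sketch below builds such a statement as a plain object, assuming only the two actions named above. `PolicyStatementJson` and `profileUsageStatement` are hypothetical names, and the statement the construct actually synthesizes may differ.

```typescript
// Minimal shape of an IAM policy statement in JSON form.
interface PolicyStatementJson {
  Effect: 'Allow' | 'Deny';
  Action: string[];
  Resource: string[];
}

// Hypothetical helper: the statement one might expect for profile usage,
// based solely on the actions listed in this README.
function profileUsageStatement(inferenceProfileArn: string): PolicyStatementJson {
  return {
    Effect: 'Allow',
    Action: ['bedrock:GetInferenceProfile', 'bedrock:ListInferenceProfiles'],
    Resource: [inferenceProfileArn],
  };
}

const usageStatement = profileUsageStatement(
  'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile-id'
);
```

Comparing a statement like this against the synthesized template is one way to confirm which actions and resources a grantee actually receives.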

### Inference Profiles Import Methods

You can import existing application inference profiles using the following methods:

```ts fixture=default
// Import an inference profile through attributes
const importedProfile = bedrock.ApplicationInferenceProfile.fromApplicationInferenceProfileAttributes(
  this,
  'ImportedProfile',
  {
    inferenceProfileArn: 'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile-id',
    inferenceProfileIdentifier: 'my-profile-id',
  }
);
```
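
In the example above, the identifier is the last segment of the ARN. When both values come from configuration, a small check can catch mismatches before they reach the import call. This is a sketch in plain TypeScript that works on concrete strings only (not on unresolved CDK tokens); `parseApplicationInferenceProfileArn` is a hypothetical helper, not part of the library.

```typescript
// Pieces of an application inference profile ARN that this sketch extracts.
interface ProfileArnParts {
  partition: string;
  region: string;
  account: string;
  profileId: string;
}

// Hypothetical helper: split an ARN of the form
// arn:<partition>:bedrock:<region>:<account>:application-inference-profile/<id>
function parseApplicationInferenceProfileArn(arn: string): ProfileArnParts {
  const parts = arn.split(':');
  if (parts.length !== 6 || parts[0] !== 'arn' || parts[2] !== 'bedrock') {
    throw new Error(`not a Bedrock ARN: ${arn}`);
  }
  const resourcePrefix = 'application-inference-profile/';
  const resource = parts[5];
  if (resource.indexOf(resourcePrefix) !== 0) {
    throw new Error(`not an application inference profile ARN: ${arn}`);
  }
  return {
    partition: parts[1],
    region: parts[3],
    account: parts[4],
    profileId: resource.slice(resourcePrefix.length),
  };
}

const parsedArn = parseApplicationInferenceProfileArn(
  'arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile-id'
);
// parsedArn.profileId is 'my-profile-id', matching inferenceProfileIdentifier above
```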

You can also import an application inference profile from an existing L1 CloudFormation construct:

```ts fixture=default
// Create or reference an existing L1 CfnApplicationInferenceProfile
const cfnProfile = new aws_bedrock_cfn.CfnApplicationInferenceProfile(this, 'CfnProfile', {
  inferenceProfileName: 'my-cfn-profile',
  modelSource: {
    copyFrom: bedrock.BedrockFoundationModel.ANTHROPIC_CLAUDE_3_5_SONNET_V1_0.invokableArn,
  },
  description: 'Profile created via L1 construct',
});

// Import the L1 construct as an L2 ApplicationInferenceProfile
const importedFromCfn = bedrock.ApplicationInferenceProfile.fromCfnApplicationInferenceProfile(cfnProfile);

// Grant permissions to use the imported profile
const lambdaFunction = new lambda.Function(this, 'MyFunction', {
  runtime: lambda.Runtime.PYTHON_3_11,
  handler: 'index.handler',
  code: lambda.Code.fromInline('def handler(event, context): return "Hello"'),
});

importedFromCfn.grantProfileUsage(lambdaFunction);
```

packages/@aws-cdk/aws-bedrock-alpha/bedrock/agents/agent.ts

Lines changed: 10 additions & 6 deletions
@@ -408,6 +408,7 @@ export class Agent extends AgentBase implements IAgent {
    * action groups associated with the agent
    */
   public readonly actionGroups: AgentActionGroup[] = [];
+
   // ------------------------------------------------------
   // CDK-only attributes
   // ------------------------------------------------------
@@ -514,7 +515,7 @@ export class Agent extends AgentBase implements IAgent {
     this.agentCollaboration = props.agentCollaboration;
     if (props.agentCollaboration) {
       props.agentCollaboration.collaborators.forEach(ac => {
-        this.addAgentCollaborator(ac);
+        this.grantPermissionToAgent(ac);
       });
     }

@@ -636,11 +637,14 @@ export class Agent extends AgentBase implements IAgent {
   }

   /**
-   * Adds a collaborator to the agent and grants necessary permissions.
-   * @param agentCollaborator - The collaborator to add
-   * @internal This method is used internally by the constructor and should not be called directly.
+   * Grants permissions for an agent collaborator to this agent's role.
+   * This method only grants IAM permissions and does not add the collaborator
+   * to the agent's collaboration configuration. To add collaborators to the
+   * agent configuration, include them in the AgentCollaboration when creating the agent.
+   *
+   * @param agentCollaborator - The collaborator to grant permissions for
    */
-  private addAgentCollaborator(agentCollaborator: AgentCollaborator) {
+  private grantPermissionToAgent(agentCollaborator: AgentCollaborator) {
     agentCollaborator.grant(this.role);
   }

@@ -682,7 +686,7 @@ export class Agent extends AgentBase implements IAgent {
    * @internal This is an internal core function and should not be called directly.
    */
   private renderAgentCollaborators(): bedrock.CfnAgent.AgentCollaboratorProperty[] | undefined {
-    if (!this.agentCollaboration) {
+    if (!this.agentCollaboration || !this.agentCollaboration.collaborators || this.agentCollaboration.collaborators.length === 0) {
       return undefined;
     }
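
The widened guard in this hunk makes the render step return `undefined` not only when no collaboration is configured but also when the collaborator list is missing or empty. Below is a standalone sketch of that behavior using simplified stand-in types. `Collaborator`, `Collaboration`, and `renderCollaborators` are hypothetical and only mirror the shape of the check, not the real `Agent` class.

```typescript
// Simplified stand-ins for the real construct types (assumptions of this sketch).
interface Collaborator {
  name: string;
}

interface Collaboration {
  collaborators?: Collaborator[];
}

interface RenderedCollaborator {
  collaboratorName: string;
}

// Mirrors the guard in the diff: a missing collaboration, a missing list,
// and an empty list are all rendered as undefined.
function renderCollaborators(collaboration?: Collaboration): RenderedCollaborator[] | undefined {
  if (!collaboration || !collaboration.collaborators || collaboration.collaborators.length === 0) {
    return undefined;
  }
  return collaboration.collaborators.map(c => ({ collaboratorName: c.name }));
}

const renderedNone = renderCollaborators(undefined);
const renderedEmpty = renderCollaborators({ collaborators: [] });
const renderedOne = renderCollaborators({ collaborators: [{ name: 'billing-agent' }] });
```

Returning `undefined` rather than an empty array keeps the property out of the rendered output entirely when there is nothing to render.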

packages/@aws-cdk/aws-bedrock-alpha/bedrock/index.ts

Lines changed: 8 additions & 0 deletions
@@ -27,6 +27,14 @@ export * from './prompts/prompt-inference-configuration';
 export * from './prompts/prompt-template-configuration';
 export * from './prompts/prompt-genai-resource';

+// ===================================
+// Inference Profiles
+// ===================================
+export * from './inference-profiles/inference-profile';
+export * from './inference-profiles/application-inference-profile';
+export * from './inference-profiles/cross-region-inference-profile';
+export * from './inference-profiles/prompt-router';
+
 // ===================================
 // Models
 // ===================================
