feat: use inference profiles for on-demand Claude models [] #10711

Merged
Mitch Goudy (mgoudy91) merged 4 commits into master from feature/bedrock-inference-profiles
Mar 18, 2026

@mgoudy91 Mitch Goudy (mgoudy91) commented Mar 18, 2026

Summary

Followup to #10706
Fixes Bedrock ValidationException: "Invocation of model ID ... with on-demand throughput isn't supported. Retry your request with the ID or ARN of an inference profile that contains this model."

See:

Changes

  • Inference profile support: Models that require it (Sonnet 4.6, 4.5, 4, 3.5 Haiku, 3 Haiku, v3 Sonnet) now use inference profile IDs (e.g. us.anthropic.claude-sonnet-4-5-20250929-v1:0, global.anthropic.claude-sonnet-4-6) when calling InvokeModel / InvokeModelWithResponseStream.
  • Backward compatible: Stored config still uses the foundation model ID; only the value sent at invoke time changes. v2.1, Instant, Llama, and Mistral continue to use the foundation model ID.
  • Config screen: Models with getInvokeId are no longer marked NOT_IN_REGION solely because they are missing from ListFoundationModels; availability is determined by the invoke check with the profile ID.
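
For readers who have not opened the diff, the invoke-time routing described in the bullets above could look roughly like the sketch below. It is not the PR's code: `profilePrefixForRegion`, `withGeoProfile`, `resolveInvokeId`, and the region-to-prefix table are hypothetical names and assumptions chosen for illustration, and the table is deliberately incomplete.

```typescript
// Hypothetical sketch of invoke-time ID resolution; names and the prefix
// table are assumptions, not the implementation merged in this PR.
type GeoPrefix = 'us' | 'eu' | 'apac';

// Assumed, incomplete mapping from an AWS region to a profile prefix.
function profilePrefixForRegion(region: string): GeoPrefix | undefined {
  if (region.startsWith('us-gov-')) return undefined; // not covered by this sketch
  if (region.startsWith('us-')) return 'us';
  if (region.startsWith('eu-')) return 'eu';
  if (region.startsWith('ap-')) return 'apac';
  return undefined;
}

interface FeaturedModel {
  id: string; // foundation model ID, as stored in config
  getInvokeId?: (region: string) => string; // present only for profile-routed models
}

// Builds a model whose invoke ID is "<prefix>.<foundation id>" when a prefix
// is known for the region, and the plain foundation ID otherwise.
function withGeoProfile(foundationId: string): FeaturedModel {
  return {
    id: foundationId,
    getInvokeId: (region) => {
      const prefix = profilePrefixForRegion(region);
      return prefix ? `${prefix}.${foundationId}` : foundationId;
    },
  };
}

// The stored ID never changes; only the value sent at invoke time does.
function resolveInvokeId(model: FeaturedModel, region: string): string {
  return model.getInvokeId ? model.getInvokeId(region) : model.id;
}
```

Under these assumptions, a model built with `withGeoProfile('anthropic.claude-sonnet-4-5-20250929-v1:0')` resolves to the `us.`-prefixed ID in `us-east-1`, while a model without `getInvokeId` keeps its foundation ID in every region.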

Testing

  • npm run build and affected unit tests (featuredModels, Model, aiApi) pass.
  • Verified locally: Sonnet 4.6, 4.5, 4, 3.5 Haiku show as available when account has access; 403s are IAM/Marketplace subscription issues, not request format.

… Claude models

- Add getInvokeId(region) and pass region into invokeCommand so InvokeModel
  uses inference profile IDs for Sonnet 4.x, 3.5 Haiku, 3 Haiku, v3 Sonnet
- Keep backward compatibility: stored model id unchanged; v2.1, Instant,
  Llama, Mistral still use foundation model ID
- Treat models with getInvokeId as in-region when not in ListFoundationModels
  and rely on invoke check for availability

Made-with: Cursor
@mgoudy91 Mitch Goudy (mgoudy91) requested a review from a team as a code owner March 18, 2026 21:54
Copilot AI review requested due to automatic review settings March 18, 2026 21:54
@mgoudy91 Mitch Goudy (mgoudy91) changed the title from "feat(bedrock-content-generator): use inference profiles for on-demand Claude models" to "feat: use inference profiles for on-demand Claude models []" Mar 18, 2026
@mgoudy91 Mitch Goudy (mgoudy91) enabled auto-merge (squash) March 18, 2026 21:58

Copilot AI left a comment

Pull request overview

This PR updates the Bedrock content generator to invoke certain on-demand Anthropic Claude models via inference profile IDs (instead of raw foundation model IDs) to avoid Bedrock ValidationException errors and improve model availability detection.

Changes:

  • Add inference profile routing (getInvokeId, region→prefix mapping) for Claude models that require it.
  • Pass region into invoke command construction so the correct invoke-time modelId is used for streaming and availability checks.
  • Adjust the config UI’s region-availability logic and update related unit test expectations.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File | Description
apps/bedrock-content-generator/src/utils/aiApi/index.ts | Pass region into invoke inputs; invoke using computed modelId for streaming and availability checks.
apps/bedrock-content-generator/src/configs/aws/featuredModels.ts | Introduce getInvokeId + inference profile selection helpers; update featured Claude models to use inference profiles.
apps/bedrock-content-generator/src/components/config/model/Model.tsx | Loosen NOT_IN_REGION determination for models that use inference profiles so they still get an invoke-based availability check.
apps/bedrock-content-generator/src/components/config/model/Model.spec.tsx | Update assertion for the NOT_IN_REGION warning text now that fewer models are classified as region-missing from the foundation list.
Comments suppressed due to low confidence (2)

apps/bedrock-content-generator/src/utils/aiApi/index.ts:92

  • Now that more models rely on the invoke-time ID (inference profile), getModelAvailability should classify the common “not available in this region / invalid modelId for region” failures as NOT_IN_REGION instead of returning a raw Error (which becomes OTHER_ERROR). Consider handling ValidationException / ResourceNotFoundException (and/or matching their messages) to return NOT_IN_REGION so the config UI shows the correct availability state.
      const invokeInput = model.invokeCommand('', '', 1, this.region);
      await this.bedrockRuntimeClient.send(new InvokeModelCommand(invokeInput));
    } catch (e: unknown) {
      if (!(e instanceof Error)) {
        return Error('An unexpected error has occurred');
      }
      if (e instanceof AccessDeniedException) {
        if (e.message.includes('is not authorized to perform: bedrock:InvokeModel'))
          return 'FORBIDDEN';
        if (e.message.includes("You don't have access to the model with the specified model ID."))
          return 'NOT_IN_ACCOUNT';
      }
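
One way the suggestion above might be applied is sketched below. This is not code from the PR: `classifyInvokeError` and `REGION_MESSAGE_HINTS` are hypothetical, the message fragment in the hint list is an assumption about what Bedrock returns, and errors are matched on their `name` property rather than with `instanceof` so the sketch has no SDK dependency.

```typescript
// Hypothetical sketch of mapping invoke failures to availability states.
// The two AccessDeniedException message checks come from the quoted snippet;
// the region handling is an assumption, not the merged behaviour.
type Availability = 'FORBIDDEN' | 'NOT_IN_ACCOUNT' | 'NOT_IN_REGION';

// Assumed message fragment that indicates a region/model-ID mismatch.
const REGION_MESSAGE_HINTS: readonly string[] = ['model identifier is invalid'];

function classifyInvokeError(e: unknown): Availability | Error {
  if (!(e instanceof Error)) {
    return new Error('An unexpected error has occurred');
  }
  if (e.name === 'AccessDeniedException') {
    if (e.message.includes('is not authorized to perform: bedrock:InvokeModel')) {
      return 'FORBIDDEN';
    }
    if (e.message.includes("You don't have access to the model with the specified model ID.")) {
      return 'NOT_IN_ACCOUNT';
    }
  }
  // A missing resource is treated as "not offered in this region".
  if (e.name === 'ResourceNotFoundException') {
    return 'NOT_IN_REGION';
  }
  // ValidationException can also mean a malformed request body, so only
  // messages matching a known hint are classified as a region problem.
  if (
    e.name === 'ValidationException' &&
    REGION_MESSAGE_HINTS.some((hint) => e.message.includes(hint))
  ) {
    return 'NOT_IN_REGION';
  }
  return e; // the caller surfaces anything else as a generic error
}
```

Matching on message text is brittle, which is why the hint list is kept separate from the control flow; whether a broader ValidationException match is safe depends on what the availability probe actually sends.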

apps/bedrock-content-generator/src/components/config/model/Model.tsx:86

  • isInRegion is now computed as isInFoundationList || !!featuredModel.getInvokeId, which means models with getInvokeId are treated as “in region” even when they’re absent from ListFoundationModels. This makes the variable name/meaning misleading and can mask genuine NOT_IN_REGION cases unless the invoke error is mapped accordingly. Consider renaming this flag to something like shouldCheckAvailability (or similar) to reflect its purpose and avoid confusion.
          const isInFoundationList = allModels.some((m) => m.modelId === featuredModel.id);
          // Models that use inference profiles may not appear in ListFoundationModels; still run availability check (invoke with profile ID).
          const isInRegion = isInFoundationList || !!featuredModel.getInvokeId;

          return {
            ...featuredModel,
            invokeCommand: featuredModel.invokeCommand,
            availability: isInRegion ? 'AVAILABLE' : 'NOT_IN_REGION',
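
The rename proposed in this comment could be sketched as follows. `initialAvailability` and the `...Like` interfaces are hypothetical stand-ins for the component's real types; only the boolean expression and the two availability strings are taken from the quoted snippet.

```typescript
// Hypothetical sketch of the suggested rename; the types are simplified
// stand-ins, not the ones used in Model.tsx.
interface FoundationModelLike {
  modelId: string;
}

interface FeaturedModelLike {
  id: string;
  getInvokeId?: (region: string) => string;
}

interface InitialState {
  shouldCheckAvailability: boolean;
  availability: 'AVAILABLE' | 'NOT_IN_REGION';
}

function initialAvailability(
  featuredModel: FeaturedModelLike,
  allModels: readonly FoundationModelLike[],
): InitialState {
  const isInFoundationList = allModels.some((m) => m.modelId === featuredModel.id);
  // Profile-routed models may be absent from ListFoundationModels, so the flag
  // records "worth probing with an invoke call", not "known to be in region".
  const shouldCheckAvailability = isInFoundationList || featuredModel.getInvokeId !== undefined;
  return {
    shouldCheckAvailability,
    // Provisional value, as in the quoted code; the invoke check has the final say.
    availability: shouldCheckAvailability ? 'AVAILABLE' : 'NOT_IN_REGION',
  };
}
```

The behaviour is unchanged from the quoted code; the flag's name simply stops implying a region guarantee that only the invoke check can provide.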

Mitch Goudy (mgoudy91) and others added 3 commits March 18, 2026 16:00
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@mgoudy91 Mitch Goudy (mgoudy91) merged commit 4236faa into master Mar 18, 2026
14 checks passed
@mgoudy91 Mitch Goudy (mgoudy91) deleted the feature/bedrock-inference-profiles branch March 18, 2026 22:27

3 participants