@sebsto sebsto commented Nov 24, 2025

Add support for service tier

Amazon Bedrock offers three service tiers for model inference: Priority, Standard, and Flex. The Priority tier delivers the fastest response times at a price premium over standard on-demand pricing. It is best suited for mission-critical applications such as customer-facing chatbots and real-time language translation services. The Standard tier provides consistent performance for everyday AI tasks and is ideal for content generation, text analysis, and routine document processing. For workloads that can tolerate longer processing times, the Flex tier offers cost-effective processing at a discounted price, which suits model evaluations, content summarization, and agentic workflows.
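
To make the guidance above concrete, here is a minimal sketch of how a caller might pick a tier from workload traits. The `Tier` enum, the `Workload` struct, and `suggestedTier(for:)` are hypothetical names invented for this illustration; they are not part of this library or of the Bedrock API.

```swift
// Hypothetical mirror of the three tiers described above (not the library's own type).
enum Tier {
    case priority   // fastest responses, at a price premium
    case standard   // consistent performance for everyday tasks
    case flex       // discounted, for work that tolerates longer processing times
}

// Hypothetical workload traits, used only for this illustration.
struct Workload {
    let isLatencyCritical: Bool
    let toleratesDelay: Bool
}

// Applies the guidance from the description: latency-critical work goes to
// Priority, delay-tolerant work goes to Flex, everything else stays on Standard.
func suggestedTier(for workload: Workload) -> Tier {
    if workload.isLatencyCritical { return .priority }
    if workload.toleratesDelay { return .flex }
    return .standard
}

let chatbot = Workload(isLatencyCritical: true, toleratesDelay: false)
let evaluation = Workload(isLatencyCritical: false, toleratesDelay: true)
print(suggestedTier(for: chatbot))     // priority
print(suggestedTier(for: evaluation))  // flex
```

Latency sensitivity is checked first so that a workload flagged as both critical and delay-tolerant still gets the faster tier. Note that the API examples in this PR use `.default` where this description says Standard, which suggests the library names that tier `default`.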

New API

Converse

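// Create the service client for the target region.
let bedrock = try await BedrockService(
    region: .uswest2
)

// Build a Converse request and ask for the Priority tier.
let builder = try ConverseRequestBuilder(with: .openai_gpt_oss_20b)
    .withPrompt("Who are you?")
    .withServiceTier(.priority)

// Send the request.
let reply = try await bedrock.converse(with: builder)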

InvokeModel (text)

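// Create the service client for the target region.
let bedrock = try await BedrockService(
    region: .uswest2
)

// Text completion through InvokeModel, passing the service tier explicitly.
let textCompletion = try await bedrock.completeText(
    "Who are you?",
    with: .openai_gpt_oss_20b,
    serviceTier: .default
)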

Doc:
https://docs.aws.amazon.com/bedrock/latest/userguide/service-tiers-inference.html

Launch:
https://aws.amazon.com/blogs/aws/new-amazon-bedrock-service-tiers-help-you-match-ai-workload-performance-with-cost/

@sebsto sebsto self-assigned this Nov 24, 2025
@sebsto sebsto added the enhancement New feature or request label Nov 24, 2025

sebsto commented Nov 24, 2025

Changes

  • Updated ConverseRequest and InvokeModelRequest to include a serviceTier parameter
  • Enhanced ConverseRequestBuilder with service tier configuration
  • Modified text generation methods across all model providers (Nova, Titan, Anthropic, DeepSeek, Llama, OpenAI) to support service tiers, according to each model's support status
  • Updated the Parameters class with service tier handling
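
Since tier support varies per model, one way such a check could look is sketched below. This is an assumption-laden illustration, not the code merged in this PR: `ServiceTier`, `ModelInfo`, `TierError`, and `validate(_:for:)` are hypothetical stand-ins, and the supported-tier set is made up rather than taken from real model data.

```swift
// Hypothetical stand-in for the library's tier type; the real definition may differ.
enum ServiceTier: Hashable {
    case priority, `default`, flex
}

// Hypothetical description of a model and the tiers it accepts.
struct ModelInfo {
    let name: String
    let supportedTiers: Set<ServiceTier>
}

// Error raised when a request asks for a tier the model does not accept.
enum TierError: Error, Equatable {
    case unsupported(model: String, tier: ServiceTier)
}

// Throws if the requested tier is not in the model's supported set.
func validate(_ tier: ServiceTier, for model: ModelInfo) throws {
    guard model.supportedTiers.contains(tier) else {
        throw TierError.unsupported(model: model.name, tier: tier)
    }
}

// Made-up model entry, for illustration only.
let exampleModel = ModelInfo(name: "example-model", supportedTiers: [.default, .flex])

do {
    try validate(.flex, for: exampleModel)      // accepted
    try validate(.priority, for: exampleModel)  // throws TierError.unsupported
} catch {
    print("Rejected: \(error)")
}
```

Checking the tier before sending the request lets an unsupported combination fail locally with a descriptive error instead of surfacing as a service-side rejection.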

Dependencies

  • Upgraded aws-sdk-swift from 1.5.51 to 1.6.3
  • Upgraded smithy-swift from 0.158.0 to 0.173.0
  • Upgraded swift-argument-parser from 1.6.1 to 1.6.2
  • Upgraded aws-crt-swift from 0.53.0 to 0.54.2

Examples

  • Added a service tier usage example to the OpenAI Converse example
  • Updated test cases to reflect the new service tier parameter

@sebsto sebsto merged commit 2823045 into main Nov 24, 2025
23 checks passed