@nisranjan nisranjan commented Dec 29, 2025

@droot @noahlwest

The following is covered in this PR:

  1. Added unit tests for the key Client & Chat interface APIs
  2. Fixed problems in the Bedrock client and chat implementation (in bedrock.go)
  • Fixed bedrockClient by adding the controlPlane client and the runtime client, both of which are required
  • AWS configuration supports multiple methods; implemented a precedence order (and error messages) for them (details below)
  • Fixed implementation issues in all the key Client / Chat APIs, e.g. ListModels(), Send(), GenerateCompletion(), and others
  • Modified the SendStreaming() implementation to handle blank or missing Candidates / Parts
  3. Updated the Bedrock documentation

Known issues / not covered in this PR

  • bedrock Initialize() is not tested
  • bedrock SetResponseSchema() is neither implemented nor tested
  • Some models (including all Anthropic models) use Provisioned Throughput, which will require additional configuration (this is open for feedback)
  • I haven't yet built and tested the kubectl-ai tool end to end, but will complete that by tomorrow or EoW

Items for Feedback

Issues around AWS authentication, configuration params, and model config

  • AWS credentials can be short-term keys, long-term keys, an IAM user, or other methods. I have currently implemented auth using an IAM user (which is tested and documented); the Bedrock client currently throws an error if AWS_ACCESS_KEY_ID and/or AWS_SECRET_ACCESS_KEY is not set
  • Support for short-term / long-term keys is WIP. Long-term keys are recommended only for testing or research (i.e. non-production use). How critical is this to kubectl-ai?
  • AWS configuration such as AWS_REGION can come from LLM_CLIENT, an environment variable, or the OS configuration. Currently the region is resolved in this order: the environment variable first, then opts.ModelID (I think), then LLM_CLIENT, and finally a default. I feel this can be simplified
  • Similarly, the model can be picked from the command-line parameter --llm-client or the environment variable BEDROCK_MODEL, and LLM_CLIENT can also carry model information. My suggestion is to keep it simple and reduce the number of options (i.e. maybe only the LLM_CLIENT env variable and the --llm-client command-line flag).
    Currently it picks from the BEDROCK_MODEL env variable, or from opts.ModelID if set

Items around support for models

  • ListModels() queries all AWS foundation models (it doesn't cover custom models and other model types) and caches them on the bedrockClient struct. At this point the implementation doesn't filter models by whether they support streaming, system prompts, or tools. The rationale is that GenerateCompletion() (i.e. single-turn chat) might work with some models for which Send() does not, and likewise for SendStreaming(). When the user / system calls StartChat(), the list then needs to be filtered based on modelID etc.
  • The problem with using StartChat() this way is that it would need to be able to return errors, which it currently cannot
  • There is logic to filter based on tool calling, but it is hard-coded to handle errors when using Send() or SendStreaming(), since the controlPlane API doesn't expose this capability; it will only stay correct if updated frequently in the current fast-moving AI world
  • The current configuration works only for On-Demand models; it doesn't work for Provisioned models (including all Anthropic models and most Amazon models). How critical is this?
  • Another critical piece: currently kubectl-ai multi-turn chat works only with models that support streaming (which reduces the number of usable models). Multi-turn chat could be implemented on top of Send() as well. What are the thoughts on this?
  • Lastly, while on this topic, I have added the following code to handle blank or missing Candidates in the streaming API. I think this might not be specific to Bedrock and could happen with other providers as well, so I'm wondering whether it would be better placed in Conversation.go

    emit := func(resp *bedrockStreamResponse) bool {
        // Skip nil responses and responses with no candidates instead of
        // yielding them downstream.
        if resp == nil {
            return true
        }
        if len(resp.Candidates()) == 0 {
            return true
        }
        return yield(resp, nil)
    }
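If this guard were lifted out of the Bedrock provider, a provider-agnostic version could wrap any yield function. The interface and type names below are hypothetical sketches to illustrate the idea, not the actual Conversation.go types:

```go
package main

import "fmt"

// StreamResponse is a hypothetical minimal interface; the real provider
// response types are assumed to expose their candidates somehow.
type StreamResponse interface {
	Candidates() []string
}

type fakeResponse struct{ cands []string }

func (f *fakeResponse) Candidates() []string { return f.cands }

// guardEmpty wraps a yield function so that nil responses and responses
// with no candidates are skipped (iteration simply continues) rather than
// being passed downstream. This is the same guard as in the Bedrock emit
// closure, made provider-agnostic.
func guardEmpty(yield func(StreamResponse, error) bool) func(StreamResponse) bool {
	return func(resp StreamResponse) bool {
		if resp == nil || len(resp.Candidates()) == 0 {
			return true // keep iterating, emit nothing
		}
		return yield(resp, nil)
	}
}

func main() {
	var emitted []StreamResponse
	emit := guardEmpty(func(r StreamResponse, _ error) bool {
		emitted = append(emitted, r)
		return true
	})
	emit(nil)                                     // nil: skipped
	emit(&fakeResponse{})                         // no candidates: skipped
	emit(&fakeResponse{cands: []string{"hello"}}) // emitted
	fmt.Println(len(emitted)) // 1
}
```

Each provider's stream loop would then only need to adapt its concrete response type to the shared interface.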

Items around test cases

  • A number of tests are failing right now, some pending feedback and some because of priorities. Is it required that all tests run in the CI / CD pipeline?
  • TestBedrockClient tests the configuration params and their precedence ordering, but uses a LIVE integration bedrockClient. I guess using a mock might be better
  • TestListModel runs LIVE network tests, and is needed to understand where auth fails for someone
  • TestGenerateCompletion tests single-turn chat
  • TestSend runs elaborate LIVE network tests for the streaming and non-streaming cases across all models and needs a couple of minutes to complete
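On the mocking point, one option (a sketch only; the interface and names are hypothetical, not the existing test code) is to put the control-plane call behind a small interface so TestBedrockClient can run without the network:

```go
package main

import "fmt"

// modelLister abstracts the single control-plane call the client needs,
// so tests can substitute a fake instead of the live AWS client.
// The interface and types here are illustrative, not the actual bedrock.go API.
type modelLister interface {
	ListFoundationModels() ([]string, error)
}

type fakeLister struct{ models []string }

func (f *fakeLister) ListFoundationModels() ([]string, error) { return f.models, nil }

// client is a stand-in for bedrockClient holding the control-plane dependency.
type client struct{ cp modelLister }

func (c *client) ListModels() ([]string, error) {
	return c.cp.ListFoundationModels()
}

func main() {
	c := &client{cp: &fakeLister{models: []string{"amazon.titan-text-express-v1"}}}
	models, err := c.ListModels()
	fmt.Println(len(models), err) // 1 <nil>
}
```

The LIVE tests (TestListModel, TestSend) could then be kept behind a build tag or an environment-variable guard so the CI pipeline runs only the mocked paths by default.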

nisranjan and others added 16 commits November 12, 2025 10:06
…onDefinition(), StartChat() on Client and Chat interface
WIP : Expanded documentation on AWS Bedrock support and usage.
Expanded usage instructions and supported models for AWS Bedrock. Removed deprecated model information and added setup details for AWS credentials.
Updated usage instructions and model support details in the Bedrock documentation. Changed default model to google.gemma-3-4b-it and updated region to ap-south-1.
Correct typos and improve clarity in the documentation regarding model support for kubectl-ai.
Removed BEDROCK_MODEL environment variable from docker run command.
Updated the documentation for the Bedrock provider, including changes to model support and authentication instructions.
Clarified model usage and requirements for kubectl-ai, including streaming support and limitations on model availability. Updated links and improved wording for better clarity.
Added usage instructions for command line tool, Docker, and bedrockClient API.
…ectl-ai into aws_bedrock_issues, added the modifications to bedrock.md file