feat: add AWS Bedrock support for codebase indexing #8661
- Add bedrock as a new EmbedderProvider type
- Add AWS Bedrock embedding model profiles (titan-embed-text models)
- Create BedrockEmbedder class with support for Titan and Cohere models
- Add Bedrock configuration support to config manager and interfaces
- Update service factory to create BedrockEmbedder instances
- Add comprehensive tests for BedrockEmbedder
- Add localization strings for Bedrock support

Closes #8658
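Supporting both Titan and Cohere models in one embedder mostly comes down to the two families expecting different request bodies. The sketch below is not the PR's code. It assumes the publicly documented Bedrock body shapes (Titan takes a single `inputText`, Cohere takes a `texts` array plus an `input_type`), and the function and type names are hypothetical.

```typescript
// Assumed request body shapes for Bedrock embedding models (not taken from this PR).
type TitanEmbedBody = { inputText: string }
type CohereEmbedBody = { texts: string[]; input_type: "search_document" | "search_query" }

// Hypothetical helper: returns one JSON body per InvokeModel request.
// Cohere accepts a batch of texts in one call; Titan takes one text per call.
function buildEmbedBodies(modelId: string, texts: string[]): string[] {
  if (modelId.startsWith("cohere.embed")) {
    const body: CohereEmbedBody = { texts, input_type: "search_document" }
    return [JSON.stringify(body)]
  }
  return texts.map((text) => {
    const body: TitanEmbedBody = { inputText: text }
    return JSON.stringify(body)
  })
}
```

Under these assumptions, two texts produce two Titan requests but a single Cohere request, which is why batching logic may need to branch on the model family.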
  const qdrantApiKey = this.contextProxy?.getSecret("codeIndexQdrantApiKey") ?? ""
  // Fix: Read OpenAI Compatible settings from the correct location within codebaseIndexConfig
- const openAiCompatibleBaseUrl = codebaseIndexConfig.codebaseIndexOpenAiCompatibleBaseUrl ?? ""
+ const openAiCompatibleBaseUrl = (codebaseIndexConfig as any).codebaseIndexOpenAiCompatibleBaseUrl ?? ""
Consider defining a proper type for the config object instead of using (codebaseIndexConfig as any). This can help avoid runtime errors and improve type safety.
- const openAiCompatibleBaseUrl = (codebaseIndexConfig as any).codebaseIndexOpenAiCompatibleBaseUrl ?? ""
+ const openAiCompatibleBaseUrl = codebaseIndexConfig.codebaseIndexOpenAiCompatibleBaseUrl ?? ""
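One way to follow this suggestion is to declare the field on the config type so that no cast is needed. This is a minimal sketch: the interface name and helper are hypothetical, and only the `codebaseIndexOpenAiCompatibleBaseUrl` field comes from the diff.

```typescript
// Hypothetical config shape; the real type in the repository may carry many more fields.
interface CodebaseIndexConfigSketch {
  codebaseIndexOpenAiCompatibleBaseUrl?: string
}

// Reads the base URL with full type checking, falling back to an empty string.
function readOpenAiCompatibleBaseUrl(config: CodebaseIndexConfigSketch): string {
  return config.codebaseIndexOpenAiCompatibleBaseUrl ?? ""
}
```

With the field declared as optional, a typo in the property name becomes a compile-time error instead of silently yielding `undefined`.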
const itemTokens = Math.ceil(text.length / 4)

if (itemTokens > MAX_ITEM_TOKENS) {
	console.warn(
		t("embeddings:textExceedsTokenLimit", {
			index: i,
			itemTokens,
			maxTokens: MAX_ITEM_TOKENS,
		}),
Token limits for Cohere models through Bedrock are incorrectly enforced. Cohere Embed v3 models have a maximum input token limit of 512, but this code uses OpenAI's MAX_ITEM_TOKENS (8191) for all Bedrock models. This will cause API errors when processing texts between 512 and 8191 tokens with Cohere models, because the batching logic won't filter them out.
Consider adding model-specific token limits, similar to how GEMINI_MAX_ITEM_TOKENS is defined, then checking the model ID to apply the appropriate limit. For example:
const maxTokens = model.startsWith('cohere.embed') ? 512 : 8191;
if (itemTokens > maxTokens) { ... }
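The reviewer's suggestion can be expanded into a small self-contained filter. This is only a sketch: the constant and function names are hypothetical, the limits are the ones quoted in the comment above, and the four-characters-per-token estimate mirrors the heuristic in the diff.

```typescript
// Limits as quoted in the review comment; the constant names are hypothetical.
const DEFAULT_MAX_ITEM_TOKENS = 8191
const COHERE_MAX_ITEM_TOKENS = 512

// Picks the per-item token limit from the Bedrock model ID prefix.
function maxItemTokensFor(modelId: string): number {
  return modelId.startsWith("cohere.embed") ? COHERE_MAX_ITEM_TOKENS : DEFAULT_MAX_ITEM_TOKENS
}

// Rough estimate used in the PR: about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Separates texts that fit the model's limit from the indices of those that do not.
function filterByTokenLimit(texts: string[], modelId: string): { kept: string[]; skipped: number[] } {
  const maxTokens = maxItemTokensFor(modelId)
  const kept: string[] = []
  const skipped: number[] = []
  texts.forEach((text, i) => {
    if (estimateTokens(text) > maxTokens) skipped.push(i)
    else kept.push(text)
  })
  return { kept, skipped }
}
```

For instance, a 4,000-character text is estimated at 1,000 tokens, so it would be skipped for a model ID such as `cohere.embed-english-v3` but kept for `amazon.titan-embed-text-v1`.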
Summary
This PR adds AWS Bedrock support for codebase indexing, addressing issue #8658.
Motivation
Users who utilize AWS Bedrock for their AI workloads previously needed to set up an OpenAI-compatible API gateway (like https://github.com/aws-samples/bedrock-access-gateway) to use codebase indexing. This PR removes that requirement by adding native AWS Bedrock support.
Changes Made
Core Implementation
- bedrock as a new EmbedderProvider type
- BedrockEmbedder class with full IEmbedder interface compliance
Configuration & Integration
Testing & Quality
Security & Best Practices
How to Test
bedrock
Breaking Changes
None - this is a purely additive change.
Checklist
Closes #8658
Important
Adds AWS Bedrock support for codebase indexing with new embedder provider, configurations, and tests.
- bedrock as a new EmbedderProvider type in interfaces/manager.ts and embeddingModels.ts.
- BedrockEmbedder class in bedrock.ts with full IEmbedder interface compliance.
- config-manager.ts to handle Bedrock settings (region, profile).
- service-factory.ts to instantiate BedrockEmbedder.
- embeddingModels.ts.
- __tests__/bedrock.spec.ts with 23 test cases.
- embeddings.json.
This description was generated automatically for 013496e.