fix:(embeddings-bedrock) correct extraction of provider from model_name #20295

luzanikita · 2025-11-21T14:32:49Z

Description

This pull request fixes a bug in the provider extraction logic for Bedrock
embedding models that caused failures with regionally-scoped model names
(e.g., "eu.cohere.embed-v4:0"). The code split the model name on "." and took
the first part, which for these names is the region prefix rather than the
provider, resulting in "Provider not supported" errors.

Bug fix:

- Fixed provider extraction to correctly handle both standard 2-part model names
  (provider.model) and regionally-scoped 3-part model names
  (region.provider.model)
- Added a dedicated _get_provider() method with proper logic to extract the
  provider name from different model name formats
- Replaced all instances of self.model_name.split(".")[0] with calls to the new
  _get_provider() method

Provider extraction improvements:

- Added a new _get_provider() method to BedrockEmbedding that correctly
  extracts the provider name from model names:
  - 2-part format: provider.model (e.g., "amazon.titan-embed-text-v1") →
    returns first part ("amazon")
  - 3-part format: region.provider.model (e.g., "eu.cohere.embed-v4:0") →
    returns middle part ("cohere")
  - Raises ValueError for unexpected formats
- Refactored all provider extraction usages in _get_embedding,
  _get_text_embeddings, and _aget_embedding to use the new _get_provider()
  method for consistency and correctness
- Added comprehensive docstring explaining the different model name formats and
  their handling
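
The format rules above can be sketched as a standalone function. This is a minimal illustration rather than the code from this PR: the free function name get_provider and the error message are assumptions, and the real method lives on BedrockEmbedding and reads self.model_name.

```python
def get_provider(model_name: str) -> str:
    """Hypothetical standalone version of the _get_provider() rules.

    2 parts: provider.model        -> first part
    3 parts: region.provider.model -> middle part
    anything else                  -> ValueError
    """
    parts = model_name.split(".")
    if len(parts) == 2:
        return parts[0]
    if len(parts) == 3:
        return parts[1]
    # The wording of this message is an assumption, not taken from the PR.
    raise ValueError(f"Unexpected model name format: {model_name!r}")


print(get_provider("amazon.titan-embed-text-v1"))  # amazon
print(get_provider("eu.cohere.embed-v4:0"))        # cohere
```

Counting parts keeps the check simple, but it relies on the model segment itself containing no dots, which holds for the names listed in this description.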

Testing enhancements:

- Added test_get_provider_two_part_format() to verify standard model names (e.g.,
  "amazon.titan-embed-text-v1", "cohere.embed-english-v3")
- Added test_get_provider_three_part_format() to verify regionally-scoped model
  names (e.g., "us.amazon.titan-embed-text-v1", "eu.cohere.embed-english-v3",
  "global.amazon.titan-embed-text-v2")
- Added test_get_provider_invalid_format() to verify proper error handling for
  invalid model name formats
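
A rough sketch of what these three tests might look like. It exercises a stand-in function (extract_provider, a hypothetical name) instead of BedrockEmbedding so that it runs without boto3 or a mocked client; the invalid names used in the last test are assumptions, since the description does not list them.

```python
def extract_provider(model_name: str) -> str:
    # Stand-in for BedrockEmbedding._get_provider(); hypothetical helper.
    parts = model_name.split(".")
    if len(parts) == 2:
        return parts[0]
    if len(parts) == 3:
        return parts[1]
    raise ValueError(f"Unexpected model name format: {model_name!r}")


def test_get_provider_two_part_format():
    assert extract_provider("amazon.titan-embed-text-v1") == "amazon"
    assert extract_provider("cohere.embed-english-v3") == "cohere"


def test_get_provider_three_part_format():
    assert extract_provider("us.amazon.titan-embed-text-v1") == "amazon"
    assert extract_provider("eu.cohere.embed-english-v3") == "cohere"
    assert extract_provider("global.amazon.titan-embed-text-v2") == "amazon"


def test_get_provider_invalid_format():
    # Assumed invalid inputs: one part and four parts.
    for name in ("invalid", "a.b.c.d"):
        try:
            extract_provider(name)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {name!r}")
```

The functions follow pytest's test_ naming convention, so a runner would collect them as written; they can also be called directly.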

Code style and import order:

- Reordered imports in base.py for improved readability and consistency
- Minor formatting changes to improve code consistency (line length, string
  formatting)

Root Cause

The original code used self.model_name.split(".")[0] to extract the provider,
which worked for standard model names like "amazon.titan-embed-text-v1" but
failed for regionally-scoped names like "eu.cohere.embed-v4:0" because it would
return "eu" instead of "cohere", causing the error:

{"output": "Error processing request: Provider not supported", "metadata":
{"error": "Provider not supported"}}
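
The difference between the two name formats can be reproduced with plain string operations, without any Bedrock client. The variable names below are illustrative only.

```python
standard = "amazon.titan-embed-text-v1"
regional = "eu.cohere.embed-v4:0"

# The original extraction: always take the first dot-separated part.
naive_standard = standard.split(".")[0]
naive_regional = regional.split(".")[0]

print(naive_standard)  # amazon (a real provider)
print(naive_regional)  # eu (the region prefix, not a provider)
```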

Solution

The new _get_provider() method handles both formats by counting the
dot-separated parts of the model name: with two parts it returns the first,
with three parts it returns the middle one, and any other count raises
ValueError.

Fixes # (issue)

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a
detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating?
(Except for the llama-index-core package)

Yes
No

Type of Change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to
not work as expected)
This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of
impactful unit testing.

I added new unit tests to cover this change
I believe this change is already covered by existing unit tests

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran uv run make format; uv run make lint to appease the lint gods

Added helper method to correctly extract provider from model_name with different formats

2ed5b72

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 21, 2025

luzanikita changed the title ~~Added helper method to correctly extract provider from model_name wit…~~ fix:(embeddings-bedrock)Added helper method to correctly extract provider from model_name Nov 21, 2025

luzanikita changed the title ~~fix:(embeddings-bedrock)Added helper method to correctly extract provider from model_name~~ fix:(embeddings-bedrock) correct extraction of provider from model_name Nov 21, 2025

luzanikita and others added 2 commits November 21, 2025 18:20

Fixed mocking of boto3 client in tests

1eb92db

Merge branch 'main' into bedrock-embedding-provider-fix

1b51493
