@luzanikita
Description

This pull request fixes a bug in the provider extraction logic for Bedrock
embedding models that caused failures with regionally-scoped model names
(e.g., "eu.cohere.embed-v4:0"). The code split the model name on "." and took
the first part, which for these names is the region prefix rather than the
provider name, resulting in "Provider not supported" errors.

Bug fix:

  • Fixed provider extraction to correctly handle both standard 2-part model names
    (provider.model) and regionally-scoped 3-part model names
    (region.provider.model)
  • Added a dedicated _get_provider() method with proper logic to extract the
    provider name from different model name formats
  • Replaced all instances of self.model_name.split(".")[0] with calls to the new
    _get_provider() method

Provider extraction improvements:

  • Added a new _get_provider() method to BedrockEmbedding that correctly
    extracts the provider name from model names:
    • 2-part format: provider.model (e.g., "amazon.titan-embed-text-v1") →
      returns first part ("amazon")
    • 3-part format: region.provider.model (e.g., "eu.cohere.embed-v4:0") →
      returns middle part ("cohere")
    • Raises ValueError for unexpected formats
  • Refactored all provider extraction usages in _get_embedding,
    _get_text_embeddings, and _aget_embedding to use the new _get_provider()
    method for consistency and correctness
  • Added a comprehensive docstring explaining the different model name formats
    and how each is handled
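
A minimal sketch of the parsing rule described in the list above, written as a standalone function so it runs without llama-index or AWS access. The function name get_provider and the wording of the error message are assumptions for illustration; the PR only states that the method is called _get_provider() on BedrockEmbedding and that it raises ValueError for unexpected formats.

```python
def get_provider(model_name: str) -> str:
    """Return the provider segment of a Bedrock model name (illustrative sketch)."""
    parts = model_name.split(".")
    if len(parts) == 2:
        # provider.model, e.g. "amazon.titan-embed-text-v1"
        return parts[0]
    if len(parts) == 3:
        # region.provider.model, e.g. "eu.cohere.embed-v4:0"
        return parts[1]
    # Hypothetical message; the PR only says that ValueError is raised.
    raise ValueError(f"Unexpected model name format: {model_name!r}")


print(get_provider("amazon.titan-embed-text-v1"))
print(get_provider("eu.cohere.embed-v4:0"))
```

Under this rule a name with one part or with four or more parts is rejected rather than guessed at, which matches the "Raises ValueError for unexpected formats" bullet.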

Testing enhancements:

  • Added test_get_provider_two_part_format() to verify standard model names (e.g.,
    "amazon.titan-embed-text-v1", "cohere.embed-english-v3")
  • Added test_get_provider_three_part_format() to verify regionally-scoped model
    names (e.g., "us.amazon.titan-embed-text-v1", "eu.cohere.embed-english-v3",
    "global.amazon.titan-embed-text-v2")
  • Added test_get_provider_invalid_format() to verify proper error handling for
    invalid model name formats
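
The three tests named above could look roughly like the following sketch. It uses the standard-library unittest module and a hypothetical stand-in function, provider_of, in place of BedrockEmbedding._get_provider(), so it runs on its own; the real tests in the PR presumably construct a BedrockEmbedding instance instead. The invalid inputs ("invalid" and "a.b.c.d") are assumptions, since the PR does not list the names it tests.

```python
import unittest


def provider_of(model_name: str) -> str:
    # Hypothetical stand-in for BedrockEmbedding._get_provider().
    parts = model_name.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"Unexpected model name format: {model_name!r}")
    # Second segment from the end: index 0 of 2 parts, index 1 of 3 parts.
    return parts[-2]


class GetProviderTests(unittest.TestCase):
    def test_get_provider_two_part_format(self):
        self.assertEqual(provider_of("amazon.titan-embed-text-v1"), "amazon")
        self.assertEqual(provider_of("cohere.embed-english-v3"), "cohere")

    def test_get_provider_three_part_format(self):
        self.assertEqual(provider_of("us.amazon.titan-embed-text-v1"), "amazon")
        self.assertEqual(provider_of("eu.cohere.embed-english-v3"), "cohere")
        self.assertEqual(provider_of("global.amazon.titan-embed-text-v2"), "amazon")

    def test_get_provider_invalid_format(self):
        # Assumed invalid inputs: too few and too many dot-separated parts.
        for name in ("invalid", "a.b.c.d"):
            with self.assertRaises(ValueError):
                provider_of(name)


# Run the suite in-process and keep the result object for inspection.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GetProviderTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```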

Code style and import order:

  • Reordered imports in base.py for improved readability and consistency
  • Minor formatting changes to improve code consistency (line length, string
    formatting)

Root Cause

The original code used self.model_name.split(".")[0] to extract the provider,
which worked for standard model names like "amazon.titan-embed-text-v1" but
failed for regionally-scoped names like "eu.cohere.embed-v4:0" because it would
return "eu" instead of "cohere", causing the error:

{"output": "Error processing request: Provider not supported", "metadata":
{"error": "Provider not supported"}}
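
The failure can be reproduced with plain string operations. In this small sketch, old_provider is a hypothetical wrapper around the split(".")[0] expression quoted above, applied to the two model names from this description.

```python
def old_provider(model_name: str) -> str:
    # Old behaviour: always take the first dot-separated segment.
    return model_name.split(".")[0]


for name in ("amazon.titan-embed-text-v1", "eu.cohere.embed-v4:0"):
    print(f"{name!r}: parts={name.split('.')}, old provider={old_provider(name)!r}")
```

For the regionally-scoped name the first segment is the region prefix "eu", so a provider lookup keyed on that value fails.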

Solution

The new _get_provider() method handles both formats by counting the
dot-separated parts of the model name: with two parts it returns the first,
with three parts it returns the second, and with any other count it raises
ValueError.

Fixes # (issue)

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a
detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating?
(Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to
    not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of
impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

@dosubot (bot) added the size:M label (This PR changes 30-99 lines, ignoring generated files.) on Nov 21, 2025
@luzanikita changed the title from "Added helper method to correctly extract provider from model_name wit…" to "fix:(embeddings-bedrock)Added helper method to correctly extract provider from model_name" on Nov 21, 2025
@luzanikita changed the title from "fix:(embeddings-bedrock)Added helper method to correctly extract provider from model_name" to "fix:(embeddings-bedrock) correct extraction of provider from model_name" on Nov 21, 2025