Bug: Allow Databricks Service Principal Authentication for VectorSearchRetrieverTool. #178

@tomvmeer

Description

Versions: databricks-ai-bridge==0.7.1 and databricks-langchain==0.7.1.

When instantiating VectorSearchRetrieverTool for LangChain (either with a workspace client or with environment variables), the only supported authentication flow seems to be a personal access token.

See entry point in:
integrations/langchain/src/databricks_langchain/vector_search_retriever_tool.py line 62:
dbvs = DatabricksVectorSearch(**kwargs)

integrations/langchain/src/databricks_langchain/vectorstores.py lines 255-266:

client_args = client_args or {}
client_args.setdefault("disable_notice", True)
if (
    workspace_client is not None
    and workspace_client.config.auth_type == "model_serving_user_credentials"
):
    client_args.setdefault(
        "credential_strategy", CredentialStrategy.MODEL_SERVING_USER_CREDENTIALS
    )
self.index = VectorSearchClient(**client_args).get_index(
    endpoint_name=endpoint, index_name=index_name
)
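
One possible direction for a fix, sketched under assumptions: derive the service principal arguments from the workspace client's config when they are present. The helper name build_client_args is hypothetical, a SimpleNamespace stands in for workspace_client.config, and the keyword names workspace_url, service_principal_client_id and service_principal_client_secret are assumed to match what VectorSearchClient accepts. This is not the library's current behavior.

```python
from types import SimpleNamespace
from typing import Any, Optional


def build_client_args(config: Any, client_args: Optional[dict] = None) -> dict:
    """Hypothetical helper: build VectorSearchClient kwargs from an SDK-style config."""
    args = dict(client_args or {})  # copy so the caller's dict is not mutated
    args.setdefault("disable_notice", True)

    client_id = getattr(config, "client_id", None)
    client_secret = getattr(config, "client_secret", None)
    if client_id and client_secret:
        # Assumed keyword names; values the caller already set are kept.
        args.setdefault("workspace_url", getattr(config, "host", None))
        args.setdefault("service_principal_client_id", client_id)
        args.setdefault("service_principal_client_secret", client_secret)
    return args


# Stand-in for workspace_client.config when a service principal is used.
sp_config = SimpleNamespace(
    host="https://example.cloud.databricks.com",
    client_id="my-sp-id",
    client_secret="my-sp-secret",
)
sp_args = build_client_args(sp_config)
```

With something like this in place, the result could be passed as VectorSearchClient(**sp_args), and explicitly supplied client_args would still take precedence because only setdefault is used.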

When using a service principal, workspace_client.config.auth_type is not "model_serving_user_credentials", so client_args stays essentially empty (only disable_notice is set). The DatabricksVectorSearch constructor suggests you can pass client_args yourself, but the retriever tool does not forward them:
integrations/langchain/src/databricks_langchain/vector_search_retriever_tool.py lines 52-62:

def _validate_tool_inputs(self):
    kwargs = {
        "index_name": self.index_name,
        "embedding": self.embedding,
        "text_column": self.text_column,
        "doc_uri": self.doc_uri,
        "primary_key": self.primary_key,
        "columns": self.columns,
        "workspace_client": self.workspace_client,
        "include_score": self.include_score,
    }
    dbvs = DatabricksVectorSearch(**kwargs)
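
A minimal sketch of the forwarding that seems to be missing, using stand-in classes so it runs without Databricks installed. FakeVectorSearch and RetrieverToolSketch are hypothetical names, and the client_args field on the tool is an assumption about how the tool could expose it, not an existing attribute.

```python
from typing import Any, Optional


class FakeVectorSearch:
    """Stand-in for DatabricksVectorSearch; it only records its inputs."""

    def __init__(
        self,
        index_name: str,
        workspace_client: Any = None,
        client_args: Optional[dict] = None,
    ):
        self.index_name = index_name
        self.workspace_client = workspace_client
        self.client_args = client_args or {}


class RetrieverToolSketch:
    """Stand-in for the retriever tool with an assumed client_args field."""

    def __init__(
        self,
        index_name: str,
        workspace_client: Any = None,
        client_args: Optional[dict] = None,
    ):
        self.index_name = index_name
        self.workspace_client = workspace_client
        self.client_args = client_args

    def _validate_tool_inputs(self) -> FakeVectorSearch:
        kwargs = {
            "index_name": self.index_name,
            "workspace_client": self.workspace_client,
            # The key change: pass client_args through instead of omitting it.
            "client_args": self.client_args,
        }
        return FakeVectorSearch(**kwargs)


tool = RetrieverToolSketch(
    "catalog.schema.index",
    client_args={"service_principal_client_id": "my-sp-id"},
)
store = tool._validate_tool_inputs()
```

Here store.client_args holds the dictionary given to the tool, which is the behavior the issue asks for.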

To reproduce, instantiate a VectorSearchRetrieverTool with service principal credentials.
