Managed Identity Access not working for delta lake access #128

@SebastianSchroeder

I am trying to read a Delta Lake table hosted on an Azure storage account into DuckDB (1.4.1). The code runs inside a Docker container as a container app job with two managed identities configured. The code is as follows:

import os

import duckdb
from polars import DataFrame


def retrieve_data(delta_table_file_path: str) -> DataFrame:
    storage_account = "mystorageaccount"
    container = "mycontainer"
    data_path = f"abfss://{container}@{storage_account}.dfs.core.windows.net/{delta_table_file_path}"
    client_id = os.getenv("SECOND_MANAGED_IDENTITY_ID")

    with duckdb.connect() as connection:
        connection.install_extension("delta")
        connection.load_extension("delta")
        connection.install_extension("azure")
        connection.load_extension("azure")

        connection.execute(f"""
            CREATE SECRET secret1 (
                TYPE AZURE,
                PROVIDER MANAGED_IDENTITY,
                ACCOUNT_NAME '{storage_account}',
                CLIENT_ID '{client_id}'
            );
        """)

        return connection.execute(f"SELECT * FROM delta_scan('{data_path}')").pl()

I am getting the following error on execution (replaced actual paths with their names in the code):

_duckdb.IOException: IO Error: DeltaKernel ObjectStoreError (8): Error interacting with object store: The operation lacked valid authentication credentials for path delta_table_file_path/_delta_log/_last_checkpoint: Error performing GET https://mystorageaccount.blob.core.windows.net/mycontainer/delta_table_file_path/_delta_log/_last_checkpoint in 60.863775ms - Server returned non-2xx status code: 401 Unauthorized: NoAuthenticationInformationServer failed to authenticate the request. Please refer to the information in the www-authenticate header.

It looks like the authentication credentials are not being passed correctly. Am I doing something wrong, or is this a bug? I also tried connecting with the az:// scheme instead of abfss://, with no success.
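
One thing that may be worth ruling out on the calling side: os.getenv returns None when SECOND_MANAGED_IDENTITY_ID is unset, and the f-string would then send the literal text 'None' as CLIENT_ID. The sketch below is only a guard against that case, not a confirmed cause of the 401. The helper names require_env and build_secret_sql are hypothetical and do not come from DuckDB or its extensions. The SQL is the same CREATE SECRET statement used in the function above.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail loudly if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def build_secret_sql(storage_account: str, client_id: str) -> str:
    """Build the CREATE SECRET statement from the function above as a plain string."""
    return (
        "CREATE SECRET secret1 (\n"
        "    TYPE AZURE,\n"
        "    PROVIDER MANAGED_IDENTITY,\n"
        f"    ACCOUNT_NAME '{storage_account}',\n"
        f"    CLIENT_ID '{client_id}'\n"
        ");"
    )


# Intended use inside retrieve_data (hypothetical wiring):
#     client_id = require_env("SECOND_MANAGED_IDENTITY_ID")
#     connection.execute(build_secret_sql(storage_account, client_id))
```

After the secret is created, running SELECT * FROM duckdb_secrets() on the same connection should list it, which at least confirms that the secret is registered before delta_scan is called.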
