Description
I am trying to read a Delta Lake table hosted in an Azure storage account into DuckDB (1.4.1). The code runs inside a Docker container as an Azure Container Apps job with two managed identities configured. The code is as follows:
import os

import duckdb
from polars import DataFrame


def retrieve_data(delta_table_file_path: str) -> DataFrame:
    storage_account = "mystorageaccount"
    container = "mycontainer"
    data_path = f"abfss://{container}@{storage_account}.dfs.core.windows.net/{delta_table_file_path}"
    # Client ID of the second managed identity assigned to the job
    client_id = os.getenv("SECOND_MANAGED_IDENTITY_ID")
    with duckdb.connect() as connection:
        connection.install_extension("delta")
        connection.load_extension("delta")
        connection.install_extension("azure")
        connection.load_extension("azure")
        # Register an Azure secret that authenticates via the managed identity
        connection.execute(f"""
            CREATE SECRET secret1 (
                TYPE AZURE,
                PROVIDER MANAGED_IDENTITY,
                ACCOUNT_NAME '{storage_account}',
                CLIENT_ID '{client_id}'
            );
        """)
        return connection.execute(f"SELECT * FROM delta_scan('{data_path}')").pl()
I am getting the following error on execution (replaced actual paths with their names in the code):
_duckdb.IOException: IO Error: DeltaKernel ObjectStoreError (8): Error interacting with object store: The operation lacked valid authentication credentials for path delta_table_file_path/_delta_log/_last_checkpoint: Error performing GET https://mystorageaccount.blob.core.windows.net/mycontainer/delta_table_file_path/_delta_log/_last_checkpoint in 60.863775ms - Server returned non-2xx status code: 401 Unauthorized:
NoAuthenticationInformation: Server failed to authenticate the request. Please refer to the information in the www-authenticate header.
It looks like the authentication credentials are not being passed correctly. Am I doing something wrong, or is this a bug? I also tried connecting with the az:// scheme instead of abfss://, with no success.
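For reference, the path variants in question can be generated with a small helper like the one below. This is only a sketch: `candidate_paths` and its dictionary keys are hypothetical names introduced here, and which of these URL forms `delta_scan` actually accepts in 1.4.1 is an assumption, not something confirmed by this report.

```python
def candidate_paths(account: str, container: str, table_path: str) -> dict[str, str]:
    """Build several URL spellings for the same Delta table location.

    Hypothetical helper: the set of forms accepted by delta_scan is an
    assumption and should be checked against the extension documentation.
    """
    table_path = table_path.strip("/")  # avoid doubled or trailing slashes
    return {
        # Hadoop-style form used in the code above
        "abfss_hadoop": f"abfss://{container}@{account}.dfs.core.windows.net/{table_path}",
        # Fully qualified ADLS form with the account as the host
        "abfss_qualified": f"abfss://{account}.dfs.core.windows.net/{container}/{table_path}",
        # Short blob form; the account would come from the secret
        "az_short": f"az://{container}/{table_path}",
        # Fully qualified blob form
        "az_qualified": f"az://{account}.blob.core.windows.net/{container}/{table_path}",
    }


paths = candidate_paths("mystorageaccount", "mycontainer", "delta_table_file_path")
for name, path in paths.items():
    print(f"{name}: {path}")
```

Each value could then be passed to `delta_scan` in turn to see whether the 401 depends on the URL form or occurs for all of them.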