---
title: Automatically index Azure Data Lake Storage Changes for DICOM Files
description: Learn how to configure the DICOM service to react to Data Lake Storage events
author: wisuga
ms.service: azure-health-data-services
ms.subservice: dicom-service
ms.topic: how-to
ms.date: 05/31/2025
ms.author: wisuga
---

# Azure Data Lake Storage Indexing

The [DICOM® service](overview.md) automatically uploads DICOM files to Azure Data Lake Storage when they're stored using STOW-RS, so users can query their data using either DICOMweb™ APIs, like WADO-RS, or Azure Blob and Data Lake APIs. Storage indexing extends this relationship in the other direction: DICOM files uploaded directly to the ADLS Gen 2 file system are automatically indexed by the DICOM service. Whether the files were uploaded using STOW-RS, an Azure Blob SDK, or even `AzCopy`, they can be accessed using either DICOMweb™ APIs or directly through the ADLS Gen 2 file system.

## Prerequisites

* An Azure Storage account configured with Hierarchical Namespaces (HNS) enabled
* Optionally, a DICOM service [connected to the Azure Data Lake Storage file system](deploy-dicom-services-in-azure-data-lake.md)

## Configuring Storage Indexing

The DICOM service indexes an ADLS Gen 2 file system by reacting to Blob and Data Lake storage events. These events must be read from an Azure Storage Queue in the Azure Storage account that contains the file system. Once events are in the queue, the DICOM service asynchronously processes each one and updates the index accordingly.

### Create the Destination for Storage Events

First, create an [Azure Storage Queue](../../storage/queues/storage-queues-introduction.md) in the same Azure Storage account connected to the DICOM service. The DICOM service also needs access to the queue: it must be able to dequeue messages as well as enqueue its own messages for errors and complex tasks. So, make sure the managed identity used by the DICOM service, either user-assigned or system-assigned, has the [**Storage Queue Data Contributor**](../../storage/queues/assign-azure-role-data-access.md) role assigned.
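
As an illustrative sketch only, both steps can be performed with the Azure CLI; the queue, account, resource group, subscription, and principal values below are all placeholders:

```azurecli
# Create the queue that will receive storage events
az storage queue create \
  --name <queue-name> \
  --account-name <storage-account-name> \
  --auth-mode login

# Grant the DICOM service's managed identity access to queue messages
az role assignment create \
  --assignee <dicom-service-principal-id> \
  --role "Storage Queue Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
```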

#### [Azure Portal](#tab/queue-portal)

#### [ARM](#tab/queue-arm)

### Publish Storage Events to the Queue

With the Storage Queue in place, events must be published from the Storage account to an Azure Event Grid system topic and routed to the queue using an Azure Event Grid subscription. Before creating the event subscription, be sure to grant the **Storage Queue Data Message Sender** role to the event subscription so that it's authorized to send messages. The event subscription can authenticate its operations with either a user-assigned or system-assigned managed identity from the system topic.

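For example, the role assignment might look like the following Azure CLI sketch, where the principal ID of the system topic's managed identity and the scope values are placeholders:

```azurecli
# Allow the system topic's identity to send messages to the queue
az role assignment create \
  --assignee <system-topic-principal-id> \
  --role "Storage Queue Data Message Sender" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"
```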
By default, event subscriptions send all of the subscribed event types to their designated output. However, while the DICOM service gracefully handles any message, it only processes ones that meet the following criteria:
- The message must be a Base64-encoded `CloudEvent`
- The event type must be `Microsoft.Storage.BlobCreated` or `Microsoft.Storage.BlobDeleted`
- The file system must be the same one configured for the DICOM service
- The file path must be within `AHDS/<workspace-name>/dicom/<dicom-service-name>`
- The file must be a DICOM file as defined in Part 10 of the DICOM standard
- The operation must not have been performed by the DICOM service itself

Thankfully, the event subscription can be configured to filter out irrelevant data to avoid unnecessary processing and billing. Make sure to configure the filters such that:
- The subject begins with `/blobServices/default/containers/<file-system-name>/blobs/AHDS/<workspace-name>/dicom/<dicom-service-name>/`
- Optionally, the subject ends with `.dcm`
- Under advanced filters, the key `data.clientRequestId` does not begin with `tag:<workspace-name>-<dicom-service-name>.dicom.azurehealthcareapis.com,`
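
For reference, these filters correspond to a `filter` object like the following fragment of an event subscription resource; the angle-bracket names are placeholders:

```json
"filter": {
  "includedEventTypes": [
    "Microsoft.Storage.BlobCreated",
    "Microsoft.Storage.BlobDeleted"
  ],
  "subjectBeginsWith": "/blobServices/default/containers/<file-system-name>/blobs/AHDS/<workspace-name>/dicom/<dicom-service-name>/",
  "subjectEndsWith": ".dcm",
  "advancedFilters": [
    {
      "key": "data.clientRequestId",
      "operatorType": "StringNotBeginsWith",
      "values": [
        "tag:<workspace-name>-<dicom-service-name>.dicom.azurehealthcareapis.com,"
      ]
    }
  ]
}
```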

#### [Azure Portal](#tab/events-portal)

#### [ARM](#tab/events-arm)

1. Use the Azure CLI command [`az deployment group create`](../../azure-resource-manager/bicep/deploy-cli.md) to deploy a system topic and event subscription like below:
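
   One possible template, sketched below with assumed parameter names and API versions, creates the system topic and an event subscription that delivers Base64-encoded `CloudEvent` messages to the queue using the topic's system-assigned identity:

   ```json
   {
     "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
     "contentVersion": "1.0.0.0",
     "parameters": {
       "storageAccountResourceId": {
         "type": "String"
       },
       "queueName": {
         "type": "String"
       },
       "systemTopicName": {
         "type": "String"
       }
     },
     "resources": [
       {
         "type": "Microsoft.EventGrid/systemTopics",
         "apiVersion": "2022-06-15",
         "name": "[parameters('systemTopicName')]",
         "location": "[resourceGroup().location]",
         "identity": {
           "type": "SystemAssigned"
         },
         "properties": {
           "source": "[parameters('storageAccountResourceId')]",
           "topicType": "Microsoft.Storage.StorageAccounts"
         }
       },
       {
         "type": "Microsoft.EventGrid/systemTopics/eventSubscriptions",
         "apiVersion": "2022-06-15",
         "name": "[concat(parameters('systemTopicName'), '/storage-events')]",
         "dependsOn": [
           "[resourceId('Microsoft.EventGrid/systemTopics', parameters('systemTopicName'))]"
         ],
         "properties": {
           "deliveryWithResourceIdentity": {
             "identity": {
               "type": "SystemAssigned"
             },
             "destination": {
               "endpointType": "StorageQueue",
               "properties": {
                 "resourceId": "[parameters('storageAccountResourceId')]",
                 "queueName": "[parameters('queueName')]"
               }
             }
           },
           "eventDeliverySchema": "CloudEventSchemaV1_0",
           "filter": {
             "includedEventTypes": [
               "Microsoft.Storage.BlobCreated",
               "Microsoft.Storage.BlobDeleted"
             ]
           }
         }
       }
     ]
   }
   ```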

### Enable Storage Indexing

Once the event grid subscription is successfully configured, the DICOM service must be told where to read the storage events from.

#### [ARM](#tab/dicom-arm)

Storage indexing is available starting with the preview ARM API version `2025-04-01-preview`, which introduces a new property within `storageConfiguration` called `storageIndexingConfiguration.storageEventQueueName`. Deploy, or redeploy, a DICOM service using this new property with the Azure CLI command [`az deployment group create`](../../azure-resource-manager/bicep/deploy-cli.md):

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "workspaceName": {
      "type": "String"
    },
    "dicomServiceName": {
      "type": "String"
    },
    "storageAccountResourceId": {
      "type": "String"
    },
    "fileSystemName": {
      "type": "String"
    },
    "queueName": {
      "type": "String"
    }
  },
  "resources": [
    {
      "type": "Microsoft.HealthcareApis/workspaces/dicomservices",
      "apiVersion": "2025-04-01-preview",
      "name": "[concat(parameters('workspaceName'), '/', parameters('dicomServiceName'))]",
      "location": "[resourceGroup().location]",
      "identity": {
        "type": "SystemAssigned"
      },
      "properties": {
        "storageConfiguration": {
          "fileSystemName": "[parameters('fileSystemName')]",
          "storageResourceId": "[parameters('storageAccountResourceId')]",
          "storageIndexingConfiguration": {
            "storageEventQueueName": "[parameters('queueName')]"
          }
        }
      }
    }
  ]
}
```
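
The template might then be deployed with a command along these lines, where the resource group, file name, and parameter values are placeholders:

```azurecli
az deployment group create \
  --resource-group <resource-group> \
  --template-file dicom-service.json \
  --parameters workspaceName=<workspace-name> \
               dicomServiceName=<dicom-service-name> \
               storageAccountResourceId=<storage-account-resource-id> \
               fileSystemName=<file-system-name> \
               queueName=<queue-name>
```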

## Diagnosing Issues

:::image type="content" source="media/storage-indexing/diagnostic-logs.png" alt-text="A screenshot of the Azure Portal showing a KQL query of the AHDSDicomAuditLogs table. The example query is filtering for all logs where OperationName is the string 'index-storage'. Underneath the KQL query is a table of results." lightbox="media/storage-indexing/diagnostic-logs.png":::

If there's an error when processing an event, the problematic event is enqueued in a "poison queue" called `<queue-name>-poison` in the same Storage account. Details about every processed event can be found in the `AHDSDicomAuditLogs` and `AHDSDicomDiagnosticLogs` tables by filtering for logs where `OperationName == 'index-storage'`. The audit logs only record when each operation started and completed, whereas the diagnostic logs provide details about each operation, including any errors. Operations can be correlated across the tables using `CorrelationId`.
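
For example, a KQL query along these lines joins the two tables for recent storage indexing operations; only `OperationName` and `CorrelationId` are columns named above, while `TimeGenerated` is the standard Log Analytics timestamp:

```kusto
AHDSDicomAuditLogs
| where OperationName == "index-storage"
| join kind=inner (
    AHDSDicomDiagnosticLogs
    | where OperationName == "index-storage"
) on CorrelationId
| order by TimeGenerated desc
```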

Failures are divided into two types: `User` and `Server`. User errors include problems with the uploaded data itself, such as files that aren't valid DICOM files as defined by Part 10 of the DICOM standard.

## Next Steps
* [Interact with data using DICOMweb™](dicomweb-standard-apis-with-dicom-services.md)