| 1 | +--- |
| 2 | +title: Azure Blob Storage event triggers |
| 3 | +--- |
| 4 | + |
| 5 | +You can use Azure Blob Storage events, such as adding new files to—or updating existing files within—Azure Blob Storage containers, to automatically run Unstructured ETL+ workflows |
| 6 | +that rely on those containers as sources. This enables a no-touch approach in which Unstructured automatically processes new and updated files in those containers as they arrive. |
| 7 | + |
| 8 | +This example shows how to automate this process by adding a custom [Azure Function](https://learn.microsoft.com/azure/azure-functions/functions-overview) app to your Azure account. This function app runs |
| 9 | +a function whenever a new or updated file is detected in the specified Azure Blob Storage container. This function then calls the [Unstructured Workflow Endpoint](/api-reference/workflow/overview) to automatically run the |
| 10 | +corresponding Unstructured ETL+ workflow within your Unstructured account. |
| 11 | + |
| 12 | +<Note> |
| 13 | + This example uses a custom Azure function that you create and maintain. |
| 14 | + Any issues with file detection, timing, or function invocation could be related to your custom function, |
| 15 | + rather than with Unstructured. If you are getting unexpected or no results, be sure to check your custom |
| 16 | + function's invocation traces first for any informational and error messages. |
| 17 | +</Note> |
| 18 | + |
| 19 | +## Requirements |
| 20 | + |
| 21 | +import GetStartedSimpleApiOnly from '/snippets/general-shared-text/get-started-simple-api-only.mdx' |
| 22 | + |
| 23 | +To use this example, you will need the following: |
| 24 | + |
| 25 | +- An Unstructured account, and an Unstructured API key for your account, as follows: |
| 26 | + |
| 27 | + <GetStartedSimpleApiOnly /> |
| 28 | + |
| 29 | +- The Unstructured Workflow Endpoint URL for your account, as follows: |
| 30 | + |
| 31 | + 1. In the Unstructured UI, click **API Keys** on the sidebar.<br/> |
| 32 | + 2. Note the value of the **Unstructured Workflow Endpoint** field. |
| 33 | + |
| 34 | +## Step 1: Create an Azure Function App |
| 35 | + |
| 36 | +1. Sign in to your [Azure portal](https://portal.azure.com). |
| 37 | +2. Click **+ Create a resource**. |
| 38 | + |
| 39 | + If **Function App** is not visible, enter **Function App** in the **Search services and marketplace** field. |
| 40 | + |
| 41 | +3. Next to **Function App**, click **Create** or **Create > Function App**. |
| 42 | +4. Under **Select a hosting option**, select the radio button next to **Consumption** to create an app that is most compatible with JavaScript. |
| 43 | +5. Click **Select**. |
| 44 | +6. On the **Basics** tab, set the following function app settings: |
| 45 | + |
| 46 | + | Setting | Suggested value | Description | |
| 47 | + |---|---|---| |
| 48 | + | **Subscription** | Your subscription | The Azure subscription within which to create your new function app. | |
| 49 | + | **Resource Group** | **Create new** | After you click **Create new**, enter some name for the new resource group within which to create your new function app. You should create a new resource group because there are known limitations when creating new function apps in an existing resource group. [Learn more](https://learn.microsoft.com/azure/azure-functions/functions-scale#limitations-for-creating-new-function-apps-in-an-existing-resource-group). | |
| 50 | + | **Function App name** | Some globally unique name | Some name that identifies your new function app. Valid characters are `a`-`z` (case insensitive), `0`-`9`, and `-`. | |
| 51 | + | **Operating System** | **Windows** | Choose the operating system for your function app. This example uses Windows. | |
| 52 | + | **Runtime stack** | **Node.js** | Choose a runtime that supports your favorite function programming language. This example uses JavaScript (Node.js). | |
| 53 | + | **Version** | **20 LTS** | Choose the version of your selected runtime. This example uses Node.js 20 LTS. | |
| 54 | + | **Region** | Your preferred region | Select a region that's near you or near other services that your function can access. | |
| 55 | + |
| 56 | +7. Click **Review + create**. |
| 57 | +8. Click **Create**, and wait for the deployment to complete. |
| 58 | +9. After the deployment is complete, click **Go to resource**. |
| 59 | + |
| 60 | +## Step 2: Create a function |
| 61 | + |
| 62 | +1. With the function app open from the previous step, on the sidebar, click **Overview**. |
| 63 | +2. On the **Functions** tab, under **Create in Azure portal**, click **Create function**. |
| 64 | +3. For **Select a template**, select **Azure Blob Storage trigger**, and then click **Next**. |
| 65 | +4. For **Template details**, review the following values: |
| 66 | + |
| 67 | + | Setting | Suggested value | Description | |
| 68 | + |---|---|---| |
| 69 | + | **Function name** | `BlobTrigger1` | The name of the function to create. You can leave the default function name. | |
| 70 | + | **Path** | `samples-workitems/{name}` | The container name (`samples-workitems`) and blob name pattern (`{name}`) within the storage account that the function will monitor. You can leave the default path. | |
| 71 | + | **Storage account connection** | `AzureWebJobsStorage` | You can leave the default storage account connection name. | |
| 72 | + |
| 73 | +  |
| 74 | + |
| 75 | +5. Click **Create**. The function is created, and the **Code + Test** page appears. |
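
For reference, the template settings above correspond to a trigger binding that is stored alongside `index.js`. The following is a sketch of what the `function.json` file typically looks like for a JavaScript blob trigger created with the default values; the exact contents that the portal generates for your function app may differ.

```json
{
  "bindings": [
    {
      "name": "myBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "samples-workitems/{name}",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
```

Here, `name` is the parameter through which the function receives the blob's contents, `path` selects the container and binds the blob's name to `{name}`, and `connection` names the app setting that holds the storage account's connection string.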
| 76 | + |
| 77 | +## Step 3: Customize the function for your workflow |
| 78 | + |
| 79 | +1. With the **Code + Test** page open from the previous step, on the **Code + Test** tab, replace the contents of the `index.js` file with the following code: |
| 80 | + |
| 81 | + ```javascript |
| 82 | + module.exports = async function (context, myBlob) { |
| 83 | +     context.log("JavaScript blob trigger function processed blob \n Blob:", context.bindingData.blobTrigger, "\n Blob Size:", myBlob.length, "Bytes"); |
| 84 | + |
| 85 | +     const apiKey = process.env.UNSTRUCTURED_API_KEY; |
| 86 | +     const apiUrl = process.env.UNSTRUCTURED_API_URL; |
| 87 | +     const headers = { |
| 88 | +         "accept": "application/json", |
| 89 | +         "unstructured-api-key": apiKey |
| 90 | +     }; |
| 91 | + |
| 92 | +     try { |
| 93 | +         const response = await fetch(apiUrl, { |
| 94 | +             method: "POST", |
| 95 | +             headers: headers |
| 96 | +         }); |
| 97 | + |
| 98 | +         const data = await response.json(); |
| 99 | +         context.log("POST response:", data); |
| 100 | +     } catch (error) { |
| 101 | +         context.log.error("Error calling external API:", error); |
| 102 | +     } |
| 103 | + }; |
| 104 | + ``` |
| 105 | + |
| 106 | +2. Click **Save**. |
| 107 | +3. In the navigation breadcrumb toward the top of the page, click your function app's name. The function app's settings page appears. |
| 108 | +4. In the sidebar, expand **Settings**, and then click **Environment variables**. |
| 109 | +5. Click **+ Add**. |
| 110 | +6. For **Name**, enter `UNSTRUCTURED_API_URL`. |
| 111 | +7. For **Value**, enter `<unstructured-api-url>/workflows/<workflow-id>/run`, replacing the following placeholders: |
| 112 | + |
| 113 | + - Replace `<unstructured-api-url>` with your Unstructured Workflow Endpoint value. |
| 114 | + - Replace `<workflow-id>` with the ID of your Unstructured workflow. For now, because the workflow does not yet exist, enter some fictitious value, such as `1234567890`. You will |
| 115 | + update this value later in Step 6 after you create the workflow. |
| 116 | + |
| 117 | + The **Value** should now look similar to the following: |
| 118 | + |
| 119 | + ```text |
| 120 | + https://platform.unstructuredapp.io/api/v1/workflows/1234567890/run |
| 121 | + ``` |
| 122 | + |
| 123 | +8. Click **Apply**. |
| 124 | +9. Click **+ Add** again. |
| 125 | +10. For **Name**, enter `UNSTRUCTURED_API_KEY`. |
| 126 | +11. For **Value**, enter your Unstructured API key value. |
| 127 | +12. Click **Apply**. |
| 128 | +13. Click **Apply** again, and then click **Confirm**. |
| 129 | + |
| 130 | +## Step 4: Create the Azure Storage container |
| 131 | + |
| 132 | +1. With the function app's settings page open from the previous step, in the sidebar, click **Overview**. |
| 133 | +2. Expand **Essentials**. |
| 134 | +3. Next to **Resource group**, click the resource group link. The resource group's settings page appears. |
| 135 | +4. In the sidebar, click **Overview**. |
| 136 | +5. On the **Resources** tab, click the link next to **Storage account**. The storage account's settings page appears. |
| 137 | +6. In the sidebar, click **Overview**. |
| 138 | +7. On the **Properties** tab, click **Blob service**. |
| 139 | +8. Click **Add container**. |
| 140 | +9. For **Name**, enter `samples-workitems`. This name must match the container name that you specified earlier in the **Path** field (not including `/{name}`) in Step 2. |
| 141 | +10. Click **Create**. |
| 142 | + |
| 143 | +## Step 5: Create the Unstructured workflow |
| 144 | + |
| 145 | +1. Create an Azure Blob Storage source connector in your Unstructured account. [Learn how](/ui/sources/azure-blob-storage). This source connector must reference |
| 146 | + the Azure Storage container that you created earlier in Step 4. |
| 147 | +2. Create a new—or identify an existing—[destination connector](/ui/destinations/overview) in your Unstructured account. |
| 148 | +3. Create a workflow that uses the preceding source and destination connectors. [Learn how](/ui/workflows). |
| 149 | + |
| 150 | +## Step 6: Add the workflow's ID to the function's environment variables |
| 151 | + |
| 152 | +1. Note the ID of the workflow that you created earlier in Step 5. |
| 153 | +2. In the Azure portal, with the storage account's settings page open from Step 4, in the navigation breadcrumb toward the top of the page, click your resource group's name. The resource group's settings page appears. |
| 154 | +3. On the **Resources** tab, click the link next to **Function App**. The function app's settings page appears. |
| 155 | +4. In the sidebar, expand **Settings**, and then click **Environment variables**. |
| 156 | +5. Click `UNSTRUCTURED_API_URL`. |
| 157 | +6. Click the eyeball (**Reveal password**) icon. |
| 158 | +7. Replace the fictitious workflow ID from earlier in Step 3 with the ID of the workflow that you created earlier in Step 5. |
| 159 | +8. Click **Apply**. |
| 160 | +9. Click **Apply** again, and then click **Confirm**. |
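
Editing the full URL by hand makes it easy to introduce a doubled or missing slash. As one possible alternative, the URL could be assembled in code from the Workflow Endpoint and the workflow ID. The following sketch is an assumption and not part of the original example: `buildWorkflowRunUrl` is a hypothetical helper name, and the sample values are the ones used earlier in Step 3.

```javascript
// Hypothetical helper: joins the Workflow Endpoint and a workflow ID into the
// "<unstructured-api-url>/workflows/<workflow-id>/run" form used in Step 3.
function buildWorkflowRunUrl(workflowEndpoint, workflowId) {
    if (!workflowEndpoint || !workflowId) {
        throw new Error("Both the Workflow Endpoint and the workflow ID are required.");
    }

    // Remove any trailing slashes so that the joined URL has exactly one separator.
    const base = workflowEndpoint.replace(/\/+$/, "");
    return `${base}/workflows/${encodeURIComponent(workflowId)}/run`;
}

console.log(buildWorkflowRunUrl("https://platform.unstructuredapp.io/api/v1/", "1234567890"));
// prints "https://platform.unstructuredapp.io/api/v1/workflows/1234567890/run"
```

With a helper like this, the function would read the endpoint and the workflow ID from two separate environment variables of your choosing, so that only the workflow ID needs to change when you switch workflows.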
| 161 | + |
| 162 | +## Step 7: Trigger the function |
| 163 | + |
| 164 | +1. With the function app’s settings page open from the previous step, in the navigation breadcrumb toward the top of the page, click your resource group's name. The resource group's settings page appears. |
| 165 | +2. On the **Resources** tab, click the link next to **Storage account**. The storage account's settings page appears. |
| 166 | +3. In the sidebar, click **Overview**. |
| 167 | +4. On the **Properties** tab, click **Blob service**. |
| 168 | +5. Click the **samples-workitems** link. |
| 169 | +6. Click **Upload**, and follow the on-screen instructions to upload a file to the container. |
| 170 | + |
| 171 | + <Note> |
| 172 | + If you are unable to upload a file to the container, click **Access Control (IAM)** in the sidebar and add an appropriate role assignment that |
| 173 | + enables uploading files to the container, such as **Storage Blob Data Owner**. [Learn how](https://learn.microsoft.com/azure/storage/blobs/assign-azure-role-data-access?tabs=portal). |
| 174 | + </Note> |
| 175 | + |
| 176 | +## Step 8: View trigger results |
| 177 | + |
| 178 | +1. In the Unstructured user interface for your account, click **Jobs** on the sidebar. |
| 179 | +2. In the list of jobs, click the newly running job for your workflow. |
| 180 | +3. After the job status shows **Finished**, go to your destination location to see the results. |
| 181 | + |
| 182 | +## Step 9 (Optional): Delete the Azure resource group |
| 183 | + |
| 184 | +If you are done with this example and do not want to keep the resource group in your account, you can permanently delete it as follows: |
| 185 | + |
| 186 | +1. In the Azure portal, with the storage account's settings page open from Step 7, in the navigation breadcrumb toward the top of the page, click your resource group's name. The resource group's settings page appears. |
| 187 | +2. Click **Delete resource group**. |
| 188 | +3. Enter the resource group's name, and then click **Delete**. The resource group and everything in it, including the function app, the storage account, and other related resources, are permanently deleted. |