
Commit a4229e0

docs: Update service principal steps for Azure (#8131)
1 parent a5c7462 commit a4229e0


1 file changed

+96
-31
lines changed


docs/source/guide/storage.md

Lines changed: 96 additions & 31 deletions
@@ -1183,10 +1183,12 @@ You can also create a storage connection using the Label Studio API.
11831183

11841184
### Azure Blob Storage with Service Principal authentication
11851185

1186-
You can use Azure Service Principal authentication to securely connect Label Studio Enterprise to Azure Blob Storage without using storage account keys. Service Principal authentication provides enhanced security through Azure Active Directory (Azure AD) identity and access management, allowing for fine-grained permissions and audit capabilities.
1186+
You can use Azure Service Principal authentication to securely connect Label Studio Enterprise to Azure Blob Storage without using storage account keys. Service Principal authentication provides enhanced security through Entra ID (formerly "Azure Active Directory") identity and access management, allowing for fine-grained permissions and audit capabilities.
11871187

11881188
Service Principal authentication is a secure method that uses Azure AD identity to authenticate applications. Unlike storage account keys that provide full access to the storage account, Service Principal authentication allows you to grant specific permissions and can be easily revoked or rotated.
11891189

1190+
For more information, see [Microsoft - Application and service principal objects in Microsoft Entra ID](https://learn.microsoft.com/en-us/entra/identity-platform/app-objects-and-service-principals).
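To make the role of the three credentials concrete, here is a minimal sketch of the OAuth 2.0 client-credentials token request that a service principal typically makes against the Microsoft identity platform. It only builds the request and does not send it. The function name `build_token_request` is hypothetical, and this is not necessarily how Label Studio performs authentication internally.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request.

    Illustrative helper only; the name and shape are assumptions,
    not part of Label Studio.
    """
    # The tenant ID selects which directory issues the token.
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    # The client ID and secret identify the App Registration; the scope
    # requests a token for Azure Storage.
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "https://storage.azure.com/.default",
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values for illustration only.
req = build_token_request("TENANT", "CLIENT", "SECRET")
print(req.get_method(), req.full_url)
```

Because access is tied to the App Registration rather than a storage account key, revoking or rotating the client secret invalidates future requests of this kind without touching the storage account itself.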
1191+
11901192
#### Prerequisites
11911193

11921194
- Azure subscription and Storage Account
@@ -1195,42 +1197,105 @@ Service Principal authentication is a secure method that uses Azure AD identity
11951197

11961198
#### Set up a Service Principal in Azure
11971199

1198-
1. Create an App Registration: Azure AD → App registrations → New registration → name it (e.g., "LabelStudio-ServicePrincipal").
1199-
2. Capture IDs: from the app Overview, copy the Directory (tenant) ID and Application (client) ID.
1200-
3. Create a Client Secret: Certificates & secrets → New client secret → copy the Value immediately.
1201-
4. Grant Storage access: Storage Account → Access control (IAM) → Add role assignment → Storage Blob Data Contributor → assign to the App Registration.
1202-
5. Create a container: Data storage → Containers → + Container → set Public access level = Private.
1200+
1. **Add an App Registration:**
1201+
1. From the Azure portal, search for or select **Entra ID**.
1202+
2. Select **Add > App registration**.
1203+
2. **Register the application:**
1204+
1. Provide a name (e.g., "LabelStudio-ServicePrincipal").
1205+
2. Select the account type appropriate for your organization.
1206+
3. Leave the redirect URI blank.
1207+
4. Click **Register**.
1208+
3. **Copy required information:**
1209+
1. From the Overview page, copy the following fields: <br/><br/>
1210+
* **Directory (tenant) ID**
1211+
* **Application (client) ID**
1212+
4. **Create a client secret:**
1213+
1. While still on the overview page for your new app, expand the **Manage** menu on the left. Select **Certificates & secrets**.
1214+
2. Click **New client secret**.
1215+
3. Provide a description and select an expiration date. Click **Add**.
1216+
4. Copy the **Value** field. (You will only have one chance to copy this value and then it will be hidden.)
1217+
5. **Grant Storage access:**
1218+
1. Go to the storage account you created as part of the prerequisites.
1219+
2. On the left, select **Access control (IAM)**.
1220+
3. Select **Add role assignment**.
1221+
4. Use the search field to locate **Storage Blob Data Contributor**. Click the role to highlight it.
1222+
5. Select the **Members** tab above.
1223+
6. With **User, group, or service principal** selected, click **Select members**.
1224+
7. Use the search field provided to locate the name of the app you created earlier.
1225+
8. Click **Select**.
1226+
9. Click **Review + assign**.
1227+
6. **Create a container:**
1228+
1. While still on the page for your storage account, click **Data storage** on the left.
1229+
2. Select **Containers**.
1230+
3. You may already have a container with files, but if you do not, create a new one with private access.
12031231

12041232
!!! warning
1205-
If you plan to use pre-signed URLs, configure CORS on the Storage Account Blob service: methods GET/HEAD/OPTIONS; allowed origins = your Label Studio domain(s); headers = *; exposed headers = *; max age ≈ 3600.
1233+
If you plan to use pre-signed URLs, configure CORS on the Storage Account Blob service. See below.
1234+
1235+
<br/>
1236+
1237+
{% details <b>Configure CORS for the Azure storage account</b> %}
1238+
1239+
If you plan to use pre-signed URLs, configure CORS on the Storage Account Blob service.
1240+
1241+
1. In the Azure portal, navigate to the page for the storage account.
1242+
2. From the menu on the left, scroll down to **Settings > Resource sharing (CORS)**.
1243+
3. Under **Blob service** add the following rule:
1244+
1245+
* **Allowed origins:** `https://app.humansignal.com` (or the domain you are using)
1246+
* **Allowed methods:** `GET, HEAD, OPTIONS`
1247+
* **Allowed headers:** `*`
1248+
* **Exposed headers:** `*`
1249+
* **Max age:** `3600`
1250+
1251+
4. Click **Save**.
1252+
1253+
{% enddetails %}
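The effect of the CORS rule above can be sketched as a small check. This is a simplified model that assumes exact origin matching; it is not Azure's actual rule evaluation, and the `CorsRule` class here is a hypothetical illustration rather than an SDK type.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CorsRule:
    """Simplified stand-in for one Blob service CORS rule (illustrative only)."""

    allowed_origins: Tuple[str, ...]
    allowed_methods: Tuple[str, ...]
    max_age_seconds: int = 3600

    def permits(self, origin: str, method: str) -> bool:
        # Assumption: an origin must match exactly (scheme and host) unless
        # the rule lists the "*" wildcard.
        origin_ok = "*" in self.allowed_origins or origin in self.allowed_origins
        return origin_ok and method.upper() in self.allowed_methods


rule = CorsRule(
    allowed_origins=("https://app.humansignal.com",),
    allowed_methods=("GET", "HEAD", "OPTIONS"),
)

print(rule.permits("https://app.humansignal.com", "GET"))
print(rule.permits("https://example.com", "GET"))
```

If you host Label Studio on your own domain, that domain is the origin the browser sends, so it is the value that needs to appear under **Allowed origins**.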
12061254

12071255
#### Set up connection in the Label Studio UI
12081256

1209-
In the Label Studio UI, do the following to set up the connection:
1257+
From Label Studio, open your project and select **Settings > Cloud Storage** > **Add Source Storage**.
12101258

1211-
1. Open Label Studio in your web browser.
1212-
2. For a specific project, open **Settings > Cloud Storage**.
1213-
3. Click **Add Source Storage**.
1214-
4. In the dialog box that appears, select **Azure Blob Storage with Service Principal** as the storage type.
1215-
5. In the **Storage Name** field, type a name for the storage to appear in the Label Studio UI.
1216-
6. Specify the name of the Azure Storage Account in the **Storage Name** field.
1217-
7. Specify the name of the Azure Blob container, and if relevant, the container prefix to specify an internal folder.
1218-
8. Configure the Service Principal authentication:
1219-
- In the **Tenant ID** field, specify the Directory (tenant) ID from your App Registration.
1220-
- In the **Client ID** field, specify the Application (client) ID from your App Registration.
1221-
- In the **Client Secret** field, specify the client secret value you created.
1222-
9. Adjust the remaining optional parameters:
1223-
- In the **File Filter Regex** field, specify a regular expression to filter bucket objects. Use `.*` to collect all objects.
1224-
- In the **Import method** dropdown, choose how to import your data:
1225-
- **Files** - Automatically creates a task for each storage object (e.g. JPG, MP3, TXT). Use this if your container contains BLOB storage files such as JPG, MP3, or similar file types.
1226-
- **Tasks** - Treat each JSON, JSONL, or Parquet as a task definition (one or more tasks per file). Use this if you have multiple JSON files in the container with one task per JSON file.
1227-
- In the **Use pre-signed URLs (On) / Proxy through Label Studio (Off)** toggle, choose how media is loaded:
1228-
- **ON** (Pre-signed URLs) - All data bypasses the platform and user browsers directly read data from storage.
1229-
- **OFF** (Proxy) - The platform proxies media using its own backend.
1230-
- Set the **Expire pre-signed URLs (minutes)** counter to control how long pre-signed URLs remain valid.
1231-
10. Click **Add Storage**.
1232-
1233-
After adding the storage, click **Sync** to collect tasks from the container, or make an API call to sync import storage.
1259+
Select **Azure Blob Storage with Service Principal** and click **Next**.
1260+
1261+
##### Configure Connection
1262+
1263+
Complete the following fields and then click **Test connection**:
1264+
1265+
<div class="noheader rowheader">
1266+
1267+
| | |
1268+
| --- | --- |
1269+
| Storage Title | Enter a name for the storage connection to appear in Label Studio. |
1270+
| Storage Name | Enter the name of your Azure storage account. |
1271+
| Container Name | Enter the name of a container within the Azure storage account. |
1272+
| Tenant ID | Specify the **Directory (tenant) ID** from your App Registration. |
1273+
| Client ID | Specify the **Application (client) ID** from your App Registration. |
1274+
| Client Secret | Specify the **Value** of the client secret you copied earlier. |
1275+
| **Use pre-signed URLs / Proxy through the platform** | Enable or disable pre-signed URLs. [See more.](#Pre-signed-URLs-vs-Storage-proxies) |
1276+
| Expiration minutes | Adjust the counter for how many minutes the pre-signed URLs are valid. |
1277+
1278+
</div>
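Before clicking **Test connection**, it can help to sanity-check the pasted values. The sketch below assumes that the tenant and client IDs are GUIDs and that a GUID-shaped client secret probably means the **Secret ID** was copied instead of the **Value**. The helper names are hypothetical and are not part of Label Studio.

```python
import uuid
from typing import List


def looks_like_guid(value: str) -> bool:
    """Return True if value is a canonical hyphenated GUID."""
    candidate = value.strip()
    try:
        return str(uuid.UUID(candidate)) == candidate.lower()
    except ValueError:
        return False


def check_fields(tenant_id: str, client_id: str, client_secret: str) -> List[str]:
    """Collect human-readable warnings about suspicious field values."""
    problems = []
    if not looks_like_guid(tenant_id):
        problems.append("Tenant ID does not look like a GUID")
    if not looks_like_guid(client_id):
        problems.append("Client ID does not look like a GUID")
    if looks_like_guid(client_secret):
        # Heuristic only: the portal lists a GUID-shaped Secret ID
        # next to the secret Value, and the two are easy to confuse.
        problems.append("Client Secret looks like a GUID; it may be the Secret ID")
    return problems


# Placeholder values for illustration only.
print(check_fields(
    "11111111-2222-3333-4444-555555555555",
    "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "example~secret.value",
))
```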
1279+
1280+
##### Import Settings & Preview
1281+
1282+
Complete the following fields and then click **Load preview** to ensure you are syncing the correct data:
1283+
1284+
<div class="noheader rowheader">
1285+
1286+
| | |
1287+
| --- | --- |
1288+
| Bucket Prefix | Optionally, enter the folder name within the container that you would like to use. For example, `data-set-1` or `data-set-1/subfolder-2`. |
1289+
| Import Method | Select whether you want to create a task for each file in your container or whether you would like to use a JSON/JSONL/Parquet file to define the data for each task. |
1290+
| File Name Filter | Specify a regular expression to filter bucket objects. Use `.*` to collect all objects. |
1291+
| Scan all sub-folders | Enable this option to perform a recursive scan across subfolders within your container. |
1292+
1293+
</div>
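As a rough illustration of how the prefix and file name filter narrow down what gets imported, the following sketch filters a list of blob names. It assumes the regular expression is matched from the start of the full blob name; Label Studio's exact matching behavior may differ, and `select_blobs` is a hypothetical helper.

```python
import re
from typing import Iterable, List


def select_blobs(names: Iterable[str], prefix: str = "", pattern: str = ".*") -> List[str]:
    """Keep blob names that start with prefix and match pattern.

    Assumption: the pattern is applied with re.match, which anchors
    at the beginning of the name.
    """
    regex = re.compile(pattern)
    return [n for n in names if n.startswith(prefix) and regex.match(n)]


blobs = [
    "data-set-1/a.jpg",
    "data-set-1/notes.txt",
    "data-set-2/b.jpg",
]

# ".*" collects every object under the prefix.
print(select_blobs(blobs, prefix="data-set-1"))
# A narrower pattern keeps only image files.
print(select_blobs(blobs, prefix="data-set-1", pattern=r".*\.(jpe?g|png)$"))
```

Use **Load preview** to confirm that the pattern you enter selects the files you expect.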
1294+
1295+
1296+
##### Review & Confirm
1297+
1298+
If everything looks correct, click **Save & Sync** to sync immediately, or click **Save** to save your settings and sync later.
12341299

12351300
#### Create a target storage connection in the Label Studio UI
12361301
