File: docs/source/guide/storage.md
Lines changed: 104 additions & 0 deletions
@@ -1177,6 +1177,110 @@ You can also create a storage connection using the Label Studio API.
1177
1177
- See [Create new import storage](/api#operation/api_storages_azure_create) then [sync the import storage](/api#operation/api_storages_azure_sync_create).
1178
1178
- See [Create export storage](/api#operation/api_storages_export_azure_create) and after annotating, [sync the export storage](/api#operation/api_storages_export_azure_sync_create).
1179
1179
1180
+
1181
+
<div class="enterprise-only">
1182
+
1183
+
1184
+
### Azure Blob Storage with Service Principal authentication
1185
+
1186
+
You can use Azure Service Principal authentication to securely connect Label Studio Enterprise to Azure Blob Storage without using storage account keys. Service Principal authentication provides enhanced security through Azure Active Directory (Azure AD) identity and access management, allowing for fine-grained permissions and audit capabilities.
1187
+
1188
+
A Service Principal is an Azure AD identity that an application uses to authenticate. Unlike a storage account key, which grants full access to the storage account, a Service Principal can be granted only the permissions it needs, and its credentials can be revoked or rotated easily.
1189
+
1190
+
#### Prerequisites
1191
+
1192
+
- Azure subscription and Storage Account
1193
+
- Permission to create App Registrations and assign roles on the Storage Account
1194
+
- A private container for your data (create one if needed)
1195
+
1196
+
#### Set up a Service Principal in Azure
1197
+
1198
+
1. Create an App Registration: Azure AD → App registrations → New registration → name it (e.g., "LabelStudio-ServicePrincipal").
1199
+
2. Capture IDs: from the app Overview, copy the Directory (tenant) ID and Application (client) ID.
1200
+
3. Create a Client Secret: Certificates & secrets → New client secret → copy the Value immediately.
1201
+
4. Grant Storage access: Storage Account → Access control (IAM) → Add role assignment → Storage Blob Data Contributor → assign to the App Registration.
1202
+
5. Create a container: Data storage → Containers → + Container → set Public access level = Private.
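The tenant ID, client ID, and client secret collected in steps 2 and 3 can be sanity-checked outside Label Studio. The following is a minimal sketch, not Label Studio's own implementation: it builds a standard OAuth 2.0 client-credentials token request against the Microsoft identity platform for the Azure Storage scope. The function name `build_token_request` and the placeholder values are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request for Azure Storage."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # The ".default" scope requests the permissions already granted via role assignment.
            "scope": "https://storage.azure.com/.default",
        }
    )
    return Request(
        url,
        data=form.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values for illustration only; substitute your App Registration values.
request = build_token_request(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    client_secret="example-secret",
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return JSON containing an `access_token` when the credentials are valid. An HTTP 4xx response usually points to a wrong tenant ID, client ID, or secret.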
1203
+
1204
+
!!! warning
1205
+
If you plan to use pre-signed URLs, configure CORS on the Storage Account Blob service: methods GET/HEAD/OPTIONS; allowed origins = your Label Studio domain(s); headers = *; exposed headers = *; max age ≈ 3600 seconds.
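The CORS settings in the warning above can be written down as a single rule. The sketch below only assembles the rule as a Python dictionary whose keys mirror the element names of the Azure Blob service CORS rule (`AllowedOrigins`, `AllowedMethods`, `AllowedHeaders`, `ExposedHeaders`, `MaxAgeInSeconds`); it does not apply the rule to a storage account. The function name `build_cors_rule` and the example origin are hypothetical.

```python
def build_cors_rule(origins: list[str], max_age_seconds: int = 3600) -> dict:
    """Assemble a CORS rule matching the values recommended for pre-signed URLs."""
    if not origins:
        raise ValueError("at least one Label Studio origin is required")
    return {
        "AllowedOrigins": ",".join(origins),   # your Label Studio domain(s)
        "AllowedMethods": "GET,HEAD,OPTIONS",  # read-only access from the browser
        "AllowedHeaders": "*",
        "ExposedHeaders": "*",
        "MaxAgeInSeconds": max_age_seconds,    # how long browsers cache the preflight
    }


# Hypothetical Label Studio origin, for illustration only.
rule = build_cors_rule(["https://labelstudio.example.com"])
print(rule)
```

The same values can be entered by hand under the storage account's CORS settings for the Blob service.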
1206
+
1207
+
#### Set up import storage in the Label Studio UI
1208
+
1209
+
1. Open your project → **Settings > Cloud Storage** → **Add Source Storage** → select **Azure Blob Storage with Service Principal**.
1210
+
2. Fill in the fields exactly as labeled in the UI (the labels match the backend schema):
1211
+
- **Integration Name**: Display name for this connection.
1212
+
- **Storage Name**: Azure Storage Account name (not a URL).
1213
+
- **Container Name** and optional **Container Prefix**.
1214
+
- **Tenant ID**, **Client ID**, **Client Secret**: values from the App Registration.
1215
+
- Optional: **File Filter Regex** to include specific objects.
1216
+
- Import mode: toggle **Treat every items object as an image/src file**
1217
+
- ON = Files (create a task per blob)
1218
+
- OFF = Tasks (JSON/JSONL/Parquet task definitions)
1219
+
- **Use pre-signed URLs** (ON) or proxy (OFF), and **Expiration minutes**.
1220
+
3. Click **Add Storage**, then **Sync** (or use the API) to load tasks.
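When the import mode is set to Tasks, each file in the container is read as a task definition rather than as media. As a rough illustration, the sketch below writes one Label Studio task per line in JSONL form, with each task wrapping its fields in a `data` object. Several details are assumptions: the `image` key depends on your labeling configuration, the helper name `build_task_lines` is hypothetical, and the `azure-blob://` URL scheme is the one used by the key-based Azure connection, so confirm which scheme the Service Principal connection expects before relying on it.

```python
import json


def build_task_lines(container: str, blob_names: list[str], scheme: str, data_key: str = "image") -> str:
    """Return JSONL text with one task definition per blob name."""
    lines = []
    for name in blob_names:
        # Each task stores its input under "data"; the key must match the labeling config.
        task = {"data": {data_key: f"{scheme}://{container}/{name}"}}
        lines.append(json.dumps(task))
    return "\n".join(lines)


# Hypothetical container and blob names; the scheme is an assumption (see above).
jsonl = build_task_lines(
    container="label-data",
    blob_names=["images/cat.jpg", "images/dog.png"],
    scheme="azure-blob",
)
print(jsonl)
```

Uploading the resulting text as a `.jsonl` blob and syncing with the toggle OFF would then create one task per line.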
1221
+
1222
+
Role assignment reference
1223
+
1224
+
Navigate to your Azure Storage Account:
1225
+
- Go to **Access control (IAM)**
1226
+
- Click **Add > Add role assignment**
1227
+
- Select the **Storage Blob Data Contributor** role
1228
+
- In the **Members** tab, select **User, group, or service principal**
1229
+
- Search for and select your App Registration
1230
+
- Click **Review + assign**
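The role assignment steps above apply the role at the Storage Account level. If you script role assignments instead, the scope is expressed as an Azure resource ID, and it can be narrowed to a single container. The sketch below only builds that scope string; the function name `storage_scope` and all example values are hypothetical.

```python
from typing import Optional


def storage_scope(
    subscription_id: str,
    resource_group: str,
    account: str,
    container: Optional[str] = None,
) -> str:
    """Build the resource ID used as the scope of a role assignment."""
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account}"
    )
    if container:
        # Narrow the scope from the whole account to one container.
        scope += f"/blobServices/default/containers/{container}"
    return scope


# Hypothetical identifiers, for illustration only.
account_scope = storage_scope("0000-sub", "ml-data-rg", "mystorageacct")
container_scope = storage_scope("0000-sub", "ml-data-rg", "mystorageacct", "label-data")
print(account_scope)
print(container_scope)
```

A container-level scope limits the Service Principal to that container, which fits the fine-grained permissions described at the start of this section.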
1231
+
1232
+
#### Set up connection in the Label Studio UI
1233
+
1234
+
In the Label Studio UI, do the following to set up the connection:
1235
+
1236
+
1. Open Label Studio in your web browser.
1237
+
2. For a specific project, open **Settings > Cloud Storage**.
1238
+
3. Click **Add Source Storage**.
1239
+
4. In the dialog box that appears, select **Azure Blob Storage with Service Principal** as the storage type.
1240
+
5. In the **Integration Name** field, type a name for the storage to appear in the Label Studio UI.
1241
+
6. Specify the name of the Azure Storage Account in the **Storage Name** field.
1242
+
7. Specify the name of the Azure Blob container, and if relevant, the container prefix to specify an internal folder.
1243
+
8. Configure the Service Principal authentication:
1244
+
- In the **Tenant ID** field, specify the Directory (tenant) ID from your App Registration.
1245
+
- In the **Client ID** field, specify the Application (client) ID from your App Registration.
1246
+
- In the **Client Secret** field, specify the client secret value you created.
1247
+
9. Adjust the remaining optional parameters:
1248
+
- In the **File Filter Regex** field, specify a regular expression to filter objects in the container. Use `.*` to collect all objects.
1249
+
- In the **Import method** dropdown, choose how to import your data:
1250
+
- **Files** - Automatically creates a task for each storage object. Use this if your container holds media files such as JPG, MP3, or TXT.
1251
+
- **Tasks** - Treats each JSON, JSONL, or Parquet file as a task definition (one or more tasks per file). Use this if the files in your container already define tasks.
1252
+
- In the **Use pre-signed URLs (On) / Proxy through Label Studio (Off)** toggle, choose how media is loaded:
1253
+
- **ON** (Pre-signed URLs) - Data bypasses the platform; users' browsers read it directly from storage.
1254
+
- **OFF** (Proxy) - The platform proxies media using its own backend.
1255
+
- Set the **Expire pre-signed URLs (minutes)** counter to control how long pre-signed URLs remain valid.
1256
+
10. Click **Add Storage**.
1257
+
1258
+
After adding the storage, click **Sync** to collect tasks from the container, or make an API call to sync import storage.
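To preview which objects a **File Filter Regex** would pick up before syncing, you can test the pattern locally. This is a minimal sketch that assumes the pattern is matched from the start of each object name (`re.match` semantics); the exact matching behavior inside Label Studio is not stated here, so treat that as an assumption. The helper name and blob names are hypothetical.

```python
import re


def filter_blob_names(names: list[str], pattern: str) -> list[str]:
    """Return the names that the pattern matches from the start of the string."""
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


# Hypothetical blob names, for illustration only.
names = ["images/cat.jpg", "images/dog.png", "notes/readme.txt"]

images_only = filter_blob_names(names, r".*\.(jpg|png)$")
everything = filter_blob_names(names, r".*")
print(images_only)
print(everything)
```

With these sample names, the first pattern keeps only the two image files, while `.*` keeps all three.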
1259
+
1260
+
#### Create a target storage connection in the Label Studio UI
1261
+
1262
+
Repeat the steps from the previous section but using **Add Target Storage**. Use the same fields:
These are included in the built-in **Storage Blob Data Contributor** role.
1273
+
1274
+
#### Validate and troubleshoot
1275
+
1276
+
- After adding the storage, the connection is checked. If it fails, verify:
1277
+
- Tenant ID, Client ID, Client Secret values (no extra spaces; secret not expired)
1278
+
- Storage account and container names (exact spelling; Azure does not allow uppercase letters in either)
1279
+
- Role assignment: App Registration has Storage Blob Data Contributor on the Storage Account
1280
+
- CORS is configured when using pre-signed URLs; to rule out CORS problems while testing, switch to proxy mode
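Several items in the checklist above can be caught before you submit the form. The sketch below is a local pre-flight check, not part of Label Studio: it applies Azure's published naming rules (storage account names are 3 to 24 lowercase letters and digits; container names are 3 to 63 lowercase letters, digits, and single hyphens) and checks that the IDs look like GUIDs and that the secret has no stray whitespace. The function name `check_connection_fields` and the sample values are hypothetical, and special containers such as `$web` are not handled.

```python
import re

ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# 3-63 characters, no leading/trailing hyphen, no consecutive hyphens.
CONTAINER_RE = re.compile(r"(?!.*--)[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")
GUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def check_connection_fields(
    account: str, container: str, tenant_id: str, client_id: str, client_secret: str
) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    if not ACCOUNT_RE.fullmatch(account):
        problems.append("Storage Name must be the account name (3-24 lowercase letters/digits), not a URL")
    if not CONTAINER_RE.fullmatch(container):
        problems.append("Container Name must be 3-63 lowercase letters, digits, or single hyphens")
    for label, value in (("Tenant ID", tenant_id), ("Client ID", client_id)):
        if not GUID_RE.fullmatch(value):
            problems.append(f"{label} does not look like a GUID (check for extra spaces)")
    if not client_secret or client_secret != client_secret.strip():
        problems.append("Client Secret is empty or has leading/trailing whitespace")
    return problems


# Hypothetical values, for illustration only.
guid = "00000000-0000-0000-0000-000000000000"
good = check_connection_fields("mystorageacct", "label-data", guid, guid, "example-secret")
bad = check_connection_fields(
    "https://mystorageacct.blob.core.windows.net", "Label_Data", guid + " ", guid, "example-secret "
)
print(good)
for problem in bad:
    print("-", problem)
```

This check cannot detect an expired secret or a missing role assignment; those still need to be verified in the Azure portal.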
1281
+
1282
+
</div>
1283
+
1180
1284
## Redis database
1181
1285
1182
1286
You can also store your tasks and annotations in a [Redis database](https://redis.io/). You must store the tasks and annotations in different databases. You might want to use a Redis database if you find that relying on a file-based cloud storage connection is slow for your datasets.