
Commit 4b5d455

Merge pull request #302737 from EdB-MSFT/lake-updates-1507-1
Lake updates 1507 1
2 parents f9e0e77 + c4c851a commit 4b5d455

11 files changed with 65 additions and 65 deletions

articles/sentinel/graph/kql-jobs.md

Lines changed: 3 additions & 3 deletions
@@ -19,7 +19,7 @@ ms.collection: ms-security
 # Create KQL jobs in the Microsoft Sentinel data lake (preview)
 
 
-A job is a one-time or scheduled task that runs a KQL (Kusto Query Language) query against the data in the lake tier to promote the results to the analytics tier. Once in the analytics tier, use the Advanced hunting KQL editor to query the data. Promoting data to the analytics tier has the following benefits:
+A job is a one-time or scheduled task that runs a KQL (Kusto Query Language) query against the data in the lake tier to promote the results to the analytics tier. Once in the analytics tier, use the advanced hunting KQL editor to query the data. Promoting data to the analytics tier has the following benefits:
 
 + Combine current and historical data in the analytics tier to run advanced analytics and machine learning models on your data.
 
@@ -30,7 +30,7 @@ A job is a one-time or scheduled task that runs a KQL (Kusto Query Language) que
 > [!NOTE]
 > Storage in the analytics tier incurs higher billing rates than in the data lake tier. To reduce costs, only promote data that you need to analyze further. Use the KQL in your query to project only the columns you need, and filter the data to reduce the amount of data promoted to the analytics tier.
 
-When promoting data to the analytics tier, make sure that the destination workspace is visible in the Advanced hunting query editor. You can only query connected workspaces in the Advanced hunting query editor. You will not be able to see data promoted to workspaces that aren't connected or to the default workspace in Advance hunting. For more information on connected workspaces, see [Connect a workspace](/defender-xdr/advanced-hunting-microsoft-defender#connect-a-workspace). You can promote data to a new table or append the results to an existing table in the analytics tier. When creating a new table, the table name is suffixed with *_KQL_CL* to indicate that the table was created by a KQL job.
+When promoting data to the analytics tier, make sure that the destination workspace is visible in the advanced hunting query editor. You can only query connected workspaces in the advanced hunting query editor. You will not be able to see data promoted to workspaces that aren't connected or to the default workspace in advanced hunting. For more information on connected workspaces, see [Connect a workspace](/defender-xdr/advanced-hunting-microsoft-defender#connect-a-workspace). You can promote data to a new table or append the results to an existing table in the analytics tier. When creating a new table, the table name is suffixed with *_KQL_CL* to indicate that the table was created by a KQL job.
 
 
 You can create a job by selecting the **Create job** button on a KQL query tab or directly from the **Jobs** management page. For more information on the Jobs management page, see [Manage jobs in the Microsoft Sentinel data lake](kql-manage-jobs.md).
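
As a minimal sketch of the cost guidance in the note above, a job query can filter the lake-tier data first and project only the columns the analytics tier needs; the table and column names here are illustrative assumptions, not values taken from this commit.

```kusto
// Illustrative job query (table and columns assumed; adjust to your schema):
// filter the lake-tier data down, then project only the columns you need,
// keeping the volume promoted to the analytics tier (and its cost) small.
SigninLogs
| where TimeGenerated > ago(90d)
| where ResultType != "0"            // keep failed sign-ins only
| project TimeGenerated, UserPrincipalName, IPAddress, ResultType
```
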
@@ -145,7 +145,7 @@ The following standard columns aren't supported for export. These columns are ov
 
 + `TimeGenerated` will be overwritten if it's older than 2 days. To preserve the original event time, we recommend writing the source timestamp to a separate column.
 
-For service limits, see [Microsoft Sentinel data lake (preview) service limits](sentinel-lake-service-limits.md#service-limits-for-kql-jobs).
+For service limits, see [Microsoft Sentinel data lake (preview) service limits](sentinel-lake-service-limits.md#service-parameters-and-limits-for-kql-jobs).
 
 > [!NOTE]
 > Partial results may be promoted if the job's query exceeds the one hour limit.
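
Because `TimeGenerated` is overwritten for events older than two days, a job query along these lines (table and column names assumed) can copy the source timestamp into its own column before promotion, as the bullet above recommends.

```kusto
// Copy the original event time into a dedicated column so it survives the
// TimeGenerated re-stamp when the results land in the analytics tier.
SigninLogs
| where TimeGenerated > ago(365d)
| extend OriginalTimeGenerated = TimeGenerated
| project OriginalTimeGenerated, UserPrincipalName, IPAddress, ResultType
```
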

articles/sentinel/graph/kql-queries.md

Lines changed: 1 addition & 1 deletion
@@ -120,7 +120,7 @@ For sample queries, see [KQL sample queries for the data lake](kql-samples.md).
 + Calling external data via KQL query against the data lake isn't supported.
 + The `Ingestion_time()` function isn't supported on tables in the data lake.
 
-For service limits, see [Microsoft Sentinel data lake (preview) service limits](sentinel-lake-service-limits.md#service-limits-for-kql-queries-in-the-lake-tier).
+For service limits, see [Microsoft Sentinel data lake (preview) service limits](sentinel-lake-service-limits.md#service-parameters-and-limits-for-kql-queries-in-the-lake-tier).
 
 For troubleshooting KQL queries, see [Troubleshoot KQL queries in the Microsoft Sentinel data lake](kql-troubleshoot.md).
 
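As a rough sketch of staying within the interactive lake-tier limits listed in the service-limits include later in this commit (30,000 result rows, 64 MB result data), aggregate and trim before returning results; the table and column names are assumptions for illustration only.

```kusto
// Summarize in the lake tier so the interactive result set stays small,
// then return only the top rows rather than raw events.
SigninLogs
| where TimeGenerated > ago(365d)
| summarize FailedSignIns = countif(ResultType != "0") by UserPrincipalName, bin(TimeGenerated, 30d)
| top 1000 by FailedSignIns desc
```
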

articles/sentinel/graph/notebook-jobs.md

Lines changed: 1 addition & 1 deletion
@@ -108,7 +108,7 @@ The page shows a list of jobs and their types. Select a notebook job to view its
 
 ## Service limits and troubleshooting
 
-For a list of service limits for the Microsoft Sentinel data lake, see [Microsoft Sentinel data lake service limits](sentinel-lake-service-limits.md#service-limits-for-vs-code-notebooks).
+For a list of service limits for the Microsoft Sentinel data lake, see [Microsoft Sentinel data lake service limits](sentinel-lake-service-limits.md#service-parameters-and-limits-for-vs-code-notebooks).
 
 
 For information on troubleshooting, see [Run notebooks on the Microsoft Sentinel data lake (preview)](notebooks.md#service-limits).

articles/sentinel/graph/notebooks.md

Lines changed: 3 additions & 2 deletions
@@ -7,7 +7,7 @@ ms.author: edbaynash
 ms.topic: how-to
 ms.service: microsoft-sentinel
 ms.subservice: sentinel-graph
-ms.date: 07/09/2025
+ms.date: 07/15/2025
 
 
 # Customer intent: As a security engineer or data scientist, I want to explore and analyze security data in the Microsoft Sentinel data lake using Jupyter notebooks, so that I can gain insights and build advanced analytics solutions.
@@ -173,7 +173,7 @@ You can schedule jobs to run at specific times or intervals using the Microsoft
 
 ## Service limits
 
-For a list of service limits for the Microsoft Sentinel data lake, see [Microsoft Sentinel data lake service limits](sentinel-lake-service-limits.md#service-limits-for-vs-code-notebooks).
+For a list of service limits for the Microsoft Sentinel data lake, see [Microsoft Sentinel data lake service limits](sentinel-lake-service-limits.md#service-parameters-and-limits-for-vs-code-notebooks).
 
 ## Troubleshooting
 
@@ -186,6 +186,7 @@ The following table lists common errors you may encounter when working with note
 | Spark compute | Unable to access Spark Pool – 403 Forbidden. | Output channel – “Window”. | Spark pools aren't displayed. | User doesn't have the required roles to run interactive notebook or schedule job. | Check if you have the required role for interactive notebooks or notebook jobs. |
 | Spark compute | Spark Pool – \<name\> – is being upgraded. | Toast alert. | One of the Spark pools isn't available. | Spark pool is being upgraded to the latest version of Microsoft Sentinel Provider. | Wait for ~20-30 minutes for the pool to be available. |
 | Spark compute | An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe. : org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results (4.0 GB) is bigger than spark.driver.maxResultSize (4.0 GB) | Inline. | Driver memory exceeded or executor failure. | Job ran out of driver memory, or one or more executors failed. | View job run logs or optimize your query. Avoid using toPandas() on large datasets. Consider setting `spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")` if needed. |
+| Spark compute | Failed to connect to the remote Jupyter Server 'https://api.securityplatform.microsoft.com/spark-notebook/interactive'. Verify the server is running and reachable. | Toast alert. | Failed to connect to the server. | User stopped the session. | Run the cell again to reconnect the session. |
 | VS Code Runtime | Kernel with id – k1 - has been disposed. | Output channel – “Jupyter”. | Kernel not connected. | VS Code lost connection to the compute kernel. | Reselect the Spark pool and execute a cell. |
 | VS Code Runtime | ModuleNotFoundError: No module named 'MicrosoftSentinelProvider'. | Inline. | Module not found. | Missing import, for example, the Microsoft Sentinel Provider library. | Run the setup/init cell again. |
 | VS Code Runtime | Cell In[{cell number}], line 1 if: ^ SyntaxError: invalid syntax. | Inline. | Invalid syntax. | Python or PySpark syntax error. | Review code syntax; check for missing colons, parentheses, or quotes. |

articles/sentinel/graph/sentinel-lake-overview.md

Lines changed: 1 addition & 0 deletions
@@ -122,3 +122,4 @@ For more information on using the Microsoft Sentinel data lake, see the followin
 + [KQL and the Microsoft Sentinel data lake (preview)](kql-overview.md)
 + [Permissions for the Microsoft Sentinel data lake (preview)](../roles.md#roles-and-permissions-for-the-microsoft-sentinel-data-lake-preview)
 + [Manage data tiers and retention in Microsoft Defender Portal (preview)](https://aka.ms/manage-data-defender-portal-overview)
++ [Manage and monitor costs for Microsoft Sentinel](../billing-monitor-costs.md)

articles/sentinel/graph/sentinel-lake-service-limits.md

Lines changed: 2 additions & 2 deletions
@@ -13,9 +13,9 @@ ms.author: edbaynash
 ---
 
 
-# Microsoft Sentinel data lake (preview) service limits
+# Microsoft Sentinel data lake (preview) service parameters and limits
 
-The following service limits apply to the Microsoft Sentinel data lake (preview) service.
+The following service parameters and limits apply to the Microsoft Sentinel data lake (preview) service.
 
 [!INCLUDE [Service limits for VS Code notebooks](../includes/service-limits-notebooks.md)]
 

articles/sentinel/includes/service-limits-kql-jobs.md

Lines changed: 11 additions & 11 deletions
@@ -2,18 +2,18 @@
 author: EdB-MSFT
 ms.author: edbayansh
 ms.topic: include
-ms.date: 06/30/2025
+ms.date: 07/15/2025
 ---
 
-## Service limits for KQL jobs
+## Service parameters and limits for KQL jobs
 
-The following table lists the service limits for KQL jobs in the Microsoft Sentinel data lake (Preview).
+The following table lists the service parameters and limits for KQL jobs in the Microsoft Sentinel data lake (Preview).
 
-| Description | Limit
-|-------------------------------------|-----------------|
-| Job query execution timeout | 1 hour |
-| Query time range | Up to 12 years |
-| Jobs per tenant (enabled jobs) | 100 (soft limit) |
-| Concurrent job execution per tenant | 3 |
-| Query scope | Single workspace |
-| Number of output tables per job | 1 |
+| Category | Parameter/limit |
+|-------------------------------------|------------------|
+| Concurrent job execution per tenant | 3 |
+| Job query execution timeout | 1 hour |
+| Jobs per tenant (enabled jobs) | 100 |
+| Number of output tables per job | 1 |
+| Query scope | Single workspace |
+| Query time range | Up to 12 years |

articles/sentinel/includes/service-limits-kql-queries.md

Lines changed: 11 additions & 11 deletions
@@ -2,18 +2,18 @@
 author: EdB-MSFT
 ms.author: edbayansh
 ms.topic: include
-ms.date: 06/30/2025
+ms.date: 07/15/2025
 ---
 
-## Service limits for KQL queries in the lake tier.
+## Service parameters and limits for KQL queries in the lake tier
 
-The following limitations apply when writing queries in Microsoft Sentinel data lake (Preview).
+The following service parameters and limits apply when writing queries in the Microsoft Sentinel data lake (Preview).
 
-|Category | Limit|
-|---|---|
-| Query result rows| 30,000 rows |
-| Query result data | 64 MB. |
-| Query timeout | 8 minutes. |
-|Queryable time range | Up to 12 years, depending on data retention. |
-| Concurrent interactive queries| 45 per minute|
-| Query Scope | Single workspace |
+| Category | Parameter/limit |
+|--------------------------------|----------------------------------------------|
+| Concurrent interactive queries | 45 per minute |
+| Query result data | 64 MB |
+| Query result rows | 30,000 rows |
+| Query scope | Single workspace |
+| Query timeout | 8 minutes |
+| Queryable time range | Up to 12 years, depending on data retention. |

articles/sentinel/includes/service-limits-notebooks.md

Lines changed: 15 additions & 19 deletions
@@ -2,27 +2,23 @@
 author: EdB-MSFT
 ms.author: edbayansh
 ms.topic: include
-ms.date: 06/30/2025
+ms.date: 07/15/2025
 ---
 
-## Service limits for VS Code Notebooks
+## Service parameters and limits for VS Code Notebooks
 
 
-The following section lists the service limits for Microsoft Sentinel data lake (Preview) when using VS Code Notebooks.
+The following section lists the service parameters and limits for Microsoft Sentinel data lake (Preview) when using VS Code Notebooks.
 
-+ Spark compute session takes about 5-6 minutes to start. You can view the status of the session at the bottom of your VS Code Notebook.
-+ Only [Azure Synapse libraries 3.4](https://github.com/microsoft/synapse-spark-runtime/tree/main#readme) and the Microsoft Sentinel Provider library for abstracted functions are supported for querying lake. Pip installs or custom libraries aren't supported.
-
-
-| Category | Limit |
-|----------|-------|
-| Session start-up time | 5 minutes |
-| Interactive: Session inactivity timeout | 20 minutes |
-| Interactive: Query timeout | 2 hours |
-| Gateway web socket timeout | 2 hours |
-| VS Code UX limit to display records | 100,000 rows |
-| Supported libraries | Azure Synapse libraries, Microsoft Sentinel Provider. Pip install and custom libraries aren't supported |
-| Language | Python |
-| Max concurrent users on interactive querying | 8-10 on Large pool |
-| Max concurrent notebook jobs | 3, subsequent jobs are queued |
-| Custom table in the analytics tier | Custom tables in analytics tier can't be deleted from a notebook; Use Log Analytics to delete these tables. For more information, see [Add or delete tables and columns in Azure Monitor Logs](/azure/azure-monitor/logs/create-custom-table?tabs=azure-portal-1%2Cazure-portal-2%2Cazure-portal-3#delete-a-table)|
+|Category|Parameter/limit|
+|---|---|
+|Custom table in the analytics tier|Custom tables in the analytics tier can't be deleted from a notebook; use Log Analytics to delete these tables. For more information, see [Add or delete tables and columns in Azure Monitor Logs](/azure/azure-monitor/logs/create-custom-table?tabs=azure-portal-1%2Cazure-portal-2%2Cazure-portal-3#delete-a-table).|
+|Gateway web socket timeout|2 hours|
+|Interactive query timeout|2 hours|
+|Interactive session inactivity timeout|20 minutes|
+|Language|Python|
+|Max concurrent notebook jobs|3, subsequent jobs are queued|
+|Max concurrent users on interactive querying|8-10 on Large pool|
+|Session start-up time|Spark compute session takes about 5-6 minutes to start. You can view the status of the session at the bottom of your VS Code Notebook.|
+|Supported libraries|Only [Azure Synapse libraries 3.4](https://github.com/microsoft/synapse-spark-runtime/tree/main#readme) and the Microsoft Sentinel Provider library for abstracted functions are supported for querying the data lake. Pip installs or custom libraries aren't supported.|
+|VS Code UX limit to display records|100,000 rows|

articles/sentinel/includes/service-limits-table-manaement-ingestion.md

Lines changed: 15 additions & 13 deletions
@@ -2,22 +2,24 @@
 author: EdB-MSFT
 ms.author: edbayansh
 ms.topic: include
-ms.date: 06/30/2025
+ms.date: 07/15/2025
 ---
 
-## Service limits for tables, data management, and ingestion
+## Service parameters and limits for tables, data management, and ingestion
 
 > [!NOTE]
-> During public preview Microsoft Sentinel data lake uses a single region. Your primary and other workspaces must be in the same region as your tenant’s home region. Only workspaces in the same region as your tenant’s home region can be attached to the data lake.
+> During preview, Microsoft Sentinel data lake uses a single region. Your primary and other workspaces must be in the same region as your tenant’s home region. Only workspaces in the same region as your tenant’s home region can be attached to the data lake.
 
-The following table lists the service limits for the Microsoft Sentinel data lake (Preview) service related to table management, data ingestion, and retention. These limits include, but aren't limited to, Azure Resource Graph data, Microsoft 365 data, and data mirroring.
-
-| Description | Limit |
-|-----------------------------------------------------|------------------------------|
-| Lake Retention (Aux) | 12 years |
-| Lake Retention (Asset data) | 12 years |
-| Ingestion requests per minute to a data collection endpoint | 15,000 |
-| Data ingestion per minute to a data collection endpoint | 50 GB |
-| Maximum size for field values (Log Analytics) | 32 KB (truncated above the limit) |
-| Default ingestion volume rate threshold in LALog Analytics workspaces | 6 GB/min uncompressed |
+The following table lists the service parameters and limits for the Microsoft Sentinel data lake (preview) service related to table management, data ingestion, and retention. These limits include, but aren't limited to, Azure Resource Graph data, Microsoft 365 data, and data mirroring.
 
+| Category | Parameter/limit |
+|--------------------------------------------------|----------------------------------------------|
+| Data ingestion per minute to a data collection endpoint | 50 GB |
+| Default ingestion volume rate threshold in Log Analytics workspaces | 6 GB/min uncompressed |
+| Ingestion requests per minute to a data collection endpoint | 15,000 |
+| Lake Retention (Asset data) | 12 years |
+| Lake Retention (Aux) | 12 years |
+| Maximum size for field values (Log Analytics) | 32 KB (truncated above the limit) |
+| Table setup latency during onboarding | 90-120 minutes |
+| New table setup latency | 90-120 minutes |
+| Switching data between tiers latency | 90-120 minutes |