Commit 59950fa

Merge pull request #152027 from v-lanjli/uploadanewdoc
upload a new spark doc
2 parents dc6ecb9 + b4d8ac1 commit 59950fa

7 files changed

+223
-0
lines changed

Lines changed: 221 additions & 0 deletions
@@ -0,0 +1,221 @@
1+
---
2+
title: Use Azure Log Analytics to collect and visualize metrics and logs (Preview)
3+
description: Learn how to enable the Synapse built-in Azure Log Analytics connector for collecting and sending the Apache Spark application metrics and logs to your Azure Log Analytics workspace.
4+
services: synapse-analytics
5+
author: jejiang
6+
ms.author: jejiang
7+
ms.reviewer: jrasnick
8+
ms.service: synapse-analytics
9+
ms.topic: tutorial
10+
ms.subservice: spark
11+
ms.date: 03/25/2021
12+
ms.custom: references_regions
13+
---
14+
# Tutorial: Use Azure Log Analytics to collect and visualize metrics and logs (Preview)
15+
16+
In this tutorial, you learn how to enable the Synapse built-in Azure Log Analytics connector to collect Apache Spark application metrics and logs and send them to your [Azure Log Analytics workspace](/azure/azure-monitor/logs/quick-create-workspace). You can then use an Azure Monitor workbook to visualize the metrics and logs.
17+
18+
## Configure Azure Log Analytics Workspace information in Synapse Studio
19+
20+
### Step 1: Create an Azure Log Analytics workspace
21+
22+
To create a Log Analytics workspace, follow one of these articles:
23+
- [Create a Log Analytics workspace in the Azure portal](https://docs.microsoft.com/azure/azure-monitor/logs/quick-create-workspace)
24+
- [Create a Log Analytics workspace with Azure CLI](https://docs.microsoft.com/azure/azure-monitor/logs/quick-create-workspace-cli)
25+
- [Create and configure a Log Analytics workspace in Azure Monitor using PowerShell](https://docs.microsoft.com/azure/azure-monitor/logs/powershell-workspace-configuration)
26+
27+
### Step 2: Prepare a Spark configuration file
28+
29+
#### Option 1. Configure with Azure Log Analytics Workspace ID and Key
30+
31+
Copy the following Spark configuration, save it as **"spark_loganalytics_conf.txt"**, and fill in the parameters:
32+
33+
- `<LOG_ANALYTICS_WORKSPACE_ID>`: Azure Log Analytics workspace ID.
34+
- `<LOG_ANALYTICS_WORKSPACE_KEY>`: Azure Log Analytics key. To find it, go to **Azure portal > Azure Log Analytics workspace > Agents management > Primary key**.
35+
36+
```properties
37+
spark.synapse.logAnalytics.enabled true
38+
spark.synapse.logAnalytics.workspaceId <LOG_ANALYTICS_WORKSPACE_ID>
39+
spark.synapse.logAnalytics.secret <LOG_ANALYTICS_WORKSPACE_KEY>
40+
```
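
If you would rather generate the file than edit it by hand, the following is a minimal sketch in Python. The helper name `render_spark_conf` is hypothetical and the values passed to it are placeholders; only the three property names come from the configuration above.

```python
def render_spark_conf(workspace_id: str, workspace_key: str) -> str:
    """Return the text of spark_loganalytics_conf.txt for Option 1."""
    settings = {
        "spark.synapse.logAnalytics.enabled": "true",
        "spark.synapse.logAnalytics.workspaceId": workspace_id,
        "spark.synapse.logAnalytics.secret": workspace_key,
    }
    # One "name value" pair per line, matching the format shown above.
    return "\n".join(f"{name} {value}" for name, value in settings.items()) + "\n"


if __name__ == "__main__":
    # Placeholder values; substitute your own workspace ID and key.
    print(render_spark_conf("<LOG_ANALYTICS_WORKSPACE_ID>", "<LOG_ANALYTICS_WORKSPACE_KEY>"), end="")
```

Redirect the output to `spark_loganalytics_conf.txt`, and treat the resulting file as a secret because it contains the workspace key in plain text.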
41+
42+
#### Option 2. Configure with an Azure Key Vault
43+
44+
> [!NOTE]
45+
>
46+
> You need to grant read secret permission to the users who will submit Spark applications. For more information, see [Provide access to Key Vault keys, certificates, and secrets with an Azure role-based access control](https://docs.microsoft.com/azure/key-vault/general/rbac-guide).
47+
48+
To configure an Azure Key Vault to store the workspace key, follow these steps:
49+
50+
1. Create a key vault in the Azure portal and navigate to it.
51+
2. On the Key Vault settings pages, select **Secrets**.
52+
3. Click on **Generate/Import**.
53+
4. On the **Create a secret** screen choose the following values:
54+
- **Name**: Enter a name for the secret. The default is `SparkLogAnalyticsSecret`.
55+
- **Value**: Enter the `<LOG_ANALYTICS_WORKSPACE_KEY>` value for the secret.
56+
- Leave the other values at their defaults. Click **Create**.
57+
5. Copy the following Spark configuration, save it as **"spark_loganalytics_conf.txt"**, and fill in the parameters:
58+
59+
- `<LOG_ANALYTICS_WORKSPACE_ID>`: Azure Log Analytics workspace ID.
60+
- `<AZURE_KEY_VAULT_NAME>`: The Azure Key Vault name you configured.
61+
- `<AZURE_KEY_VAULT_SECRET_KEY_NAME>` (optional): The name of the Azure Key Vault secret that holds the workspace key. The default is `SparkLogAnalyticsSecret`.
62+
63+
```properties
64+
spark.synapse.logAnalytics.enabled true
65+
spark.synapse.logAnalytics.workspaceId <LOG_ANALYTICS_WORKSPACE_ID>
66+
spark.synapse.logAnalytics.keyVault.name <AZURE_KEY_VAULT_NAME>
67+
spark.synapse.logAnalytics.keyVault.key.secret <AZURE_KEY_VAULT_SECRET_KEY_NAME>
68+
```
69+
70+
> [!NOTE]
71+
>
72+
> You can also store the Log Analytics workspace ID in Azure Key Vault. Follow the preceding steps and store the workspace ID with the secret name `SparkLogAnalyticsWorkspaceId`. Alternatively, use the `spark.synapse.logAnalytics.keyVault.key.workspaceId` configuration to specify the name of the workspace ID secret in Azure Key Vault.
73+
74+
#### Option 3. Configure with an Azure Key Vault linked service
75+
76+
> [!NOTE]
77+
>
78+
> You need to grant read secret permission to the Synapse workspace. For more information, see [Provide access to Key Vault keys, certificates, and secrets with an Azure role-based access control](https://docs.microsoft.com/azure/key-vault/general/rbac-guide).
79+
80+
To configure an Azure Key Vault linked service in Synapse Studio to store the workspace key, follow these steps:
81+
82+
1. Follow all the steps in the `Option 2. Configure with an Azure Key Vault` section.
83+
2. Create an Azure Key Vault linked service in Synapse Studio:
84+
85+
a. Navigate to **Synapse Studio > Manage > Linked services**, and then click **New**.
86+
87+
b. Search for **Azure Key Vault** in the search box.
88+
89+
c. Type a name for the linked service.
90+
91+
d. Choose your Azure key vault. Click **Create**.
92+
93+
3. Add a `spark.synapse.logAnalytics.keyVault.linkedServiceName` item to the Spark configuration:
94+
95+
```properties
96+
spark.synapse.logAnalytics.enabled true
97+
spark.synapse.logAnalytics.workspaceId <LOG_ANALYTICS_WORKSPACE_ID>
98+
spark.synapse.logAnalytics.keyVault.name <AZURE_KEY_VAULT_NAME>
99+
spark.synapse.logAnalytics.keyVault.key.secret <AZURE_KEY_VAULT_SECRET_KEY_NAME>
100+
spark.synapse.logAnalytics.keyVault.linkedServiceName <LINKED_SERVICE_NAME>
101+
```
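
Because the three options differ only in which properties are present, a quick check before uploading can catch a file that mixes them up. The following is a minimal sketch, not part of the Synapse tooling: the function names `parse_spark_conf` and `detect_option` are hypothetical, and the detection rule is an assumption inferred from the three configurations shown in this step.

```python
PREFIX = "spark.synapse.logAnalytics."


def parse_spark_conf(text: str) -> dict:
    """Parse whitespace-separated "name value" lines into a dictionary."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts:
            continue  # skip blank lines
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def detect_option(conf: dict) -> str:
    """Guess which option a configuration follows (assumed rule, see lead-in)."""
    if conf.get(PREFIX + "enabled") != "true":
        return "disabled"
    if PREFIX + "keyVault.linkedServiceName" in conf:
        return "Option 3"
    if PREFIX + "keyVault.name" in conf:
        return "Option 2"
    if PREFIX + "secret" in conf:
        return "Option 1"
    return "unknown"
```

For example, `detect_option(parse_spark_conf(open("spark_loganalytics_conf.txt").read()))` returns `"Option 3"` for the configuration above.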
102+
103+
#### Available Spark Configuration
104+
105+
| Configuration Name | Default Value | Description |
106+
| --------------------------------------------------- | ---------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
107+
| spark.synapse.logAnalytics.enabled                  | false                        | Set to true to enable the Azure Log Analytics sink for Spark applications; otherwise, false. |
108+
| spark.synapse.logAnalytics.workspaceId              | -                            | The destination Azure Log Analytics workspace ID. |
109+
| spark.synapse.logAnalytics.secret                   | -                            | The destination Azure Log Analytics workspace secret. |
110+
| spark.synapse.logAnalytics.keyVault.linkedServiceName   | -                            | Azure Key Vault linked service name for the Azure Log Analytics workspace ID and key. |
111+
| spark.synapse.logAnalytics.keyVault.name            | -                            | Azure Key Vault name for the Azure Log Analytics workspace ID and key. |
112+
| spark.synapse.logAnalytics.keyVault.key.workspaceId | SparkLogAnalyticsWorkspaceId | Azure Key Vault secret name for the Azure Log Analytics workspace ID. |
113+
| spark.synapse.logAnalytics.keyVault.key.secret      | SparkLogAnalyticsSecret      | Azure Key Vault secret name for the Azure Log Analytics workspace key. |
114+
| spark.synapse.logAnalytics.keyVault.uriSuffix | ods.opinsights.azure.com | The destination Azure Log Analytics workspace [URI suffix][uri_suffix]. If your Azure Log Analytics Workspace is not in Azure global, you need to update the URI suffix according to the respective cloud. |
115+
116+
> [!NOTE]
117+
> - For Azure China, set the `spark.synapse.logAnalytics.keyVault.uriSuffix` parameter to `ods.opinsights.azure.cn`.
118+
> - For Azure Government, set the `spark.synapse.logAnalytics.keyVault.uriSuffix` parameter to `ods.opinsights.azure.us`.
119+
120+
[uri_suffix]: https://docs.microsoft.com/azure/azure-monitor/logs/data-collector-api#request-uri
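
The URI suffix values from the table and note above can be collected in a small lookup so that the right configuration line is produced for each cloud. This is a minimal sketch: the cloud labels `global`, `china`, and `usgov` and the function name `uri_suffix_line` are hypothetical; the suffix values and the property name are the ones listed above.

```python
# Cloud labels are illustrative; the suffix values come from the table and note above.
URI_SUFFIXES = {
    "global": "ods.opinsights.azure.com",
    "china": "ods.opinsights.azure.cn",
    "usgov": "ods.opinsights.azure.us",
}


def uri_suffix_line(cloud: str) -> str:
    """Return the configuration line that sets the URI suffix for a cloud."""
    if cloud not in URI_SUFFIXES:
        raise ValueError(f"unknown cloud label: {cloud!r}")
    return f"spark.synapse.logAnalytics.keyVault.uriSuffix {URI_SUFFIXES[cloud]}"
```

Append the returned line to `spark_loganalytics_conf.txt` only when your workspace is outside Azure global, because the global suffix is already the default.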
121+
122+
123+
### Step 3: Upload your Spark configuration to a Spark pool
124+
You can upload the configuration file to your Synapse Spark pool in Synapse Studio.
125+
126+
1. Navigate to your Apache Spark pool in Azure Synapse Studio (**Manage > Apache Spark pools**).
127+
2. Click the **"..."** button to the right of your Apache Spark pool.
128+
3. Select **Apache Spark configuration**.
129+
4. Click **Upload** and choose the **"spark_loganalytics_conf.txt"** file you created.
130+
5. Click **Upload** and **Apply**.
131+
132+
> [!div class="mx-imgBorder"]
133+
> ![spark pool configuration](./media/apache-spark-azure-log-analytics/spark-pool-configuration.png)
134+
135+
> [!NOTE]
136+
>
137+
> All Spark applications submitted to this Spark pool use this configuration to push their metrics and logs to the specified Azure Log Analytics workspace.
138+
139+
## Submit a Spark application and view the logs and metrics in Azure Log Analytics
140+
141+
1. Submit a Spark application to the Spark pool configured in the previous step in one of the following ways:
142+
- Run a Synapse Studio notebook.
143+
- Submit a Synapse Apache Spark batch job through Spark job definition.
144+
- Run a pipeline that contains a Spark activity.
145+
146+
2. Go to the specified Azure Log Analytics workspace, and view the application metrics and logs after the Spark application starts to run.
147+
148+
## Use the Sample Azure Log Analytics Workbook to visualize the metrics and logs
149+
150+
1. [Download the workbook](https://aka.ms/SynapseSparkLogAnalyticsWorkbook).
151+
2. Open the workbook file and copy its content.
152+
3. Navigate to the Azure Log Analytics workbooks ([Azure portal](https://portal.azure.com/) > Log Analytics workspace > Workbooks).
153+
4. Open the **"Empty"** Azure Log Analytics workbook in **"Advanced Editor"** mode (select the </> icon).
154+
5. Paste the copied content over any existing JSON.
155+
6. Select **Apply**, and then select **Done Editing**.
156+
157+
> [!div class="mx-imgBorder"]
158+
> ![new workbook](./media/apache-spark-azure-log-analytics/new-workbook.png)
159+
160+
> [!div class="mx-imgBorder"]
161+
> ![import workbook](./media/apache-spark-azure-log-analytics/import-workbook.png)
162+
163+
Then, submit your Apache Spark application to the configured Spark pool. After the application enters the running state, select it in the workbook dropdown list.
164+
165+
> [!div class="mx-imgBorder"]
166+
> ![workbook image](./media/apache-spark-azure-log-analytics/workbook.png)
167+
168+
You can also customize the workbook with Kusto queries and configure alerts.
169+
170+
> [!div class="mx-imgBorder"]
171+
> ![kusto query and alerts](./media/apache-spark-azure-log-analytics/kusto-query-and-alerts.png)
172+
173+
## Sample Kusto queries
174+
175+
1. Example query for Spark events:
176+
177+
```kusto
178+
SparkListenerEvent_CL
179+
| where workspaceName_s == "{SynapseWorkspace}" and clusterName_s == "{SparkPool}" and livyId_s == "{LivyId}"
180+
| order by TimeGenerated desc
181+
| limit 100
182+
```
183+
184+
2. Example query for Spark application driver and executor logs:
185+
186+
```kusto
187+
SparkLoggingEvent_CL
188+
| where workspaceName_s == "{SynapseWorkspace}" and clusterName_s == "{SparkPool}" and livyId_s == "{LivyId}"
189+
| order by TimeGenerated desc
190+
| limit 100
191+
```
192+
193+
3. Example query for Spark metrics:
194+
195+
```kusto
196+
SparkMetrics_CL
197+
| where workspaceName_s == "{SynapseWorkspace}" and clusterName_s == "{SparkPool}" and livyId_s == "{LivyId}"
198+
| where name_s endswith "jvm.total.used"
199+
| summarize max(value_d) by bin(TimeGenerated, 30s), executorId_s
200+
| order by TimeGenerated asc
201+
```
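
The `{SynapseWorkspace}`, `{SparkPool}`, and `{LivyId}` tokens in these queries are placeholders, so to run a query directly in the Log Analytics query editor you have to substitute real values first. The following is a minimal sketch of that substitution in Python. The function name `fill_query` and the example values are hypothetical, and the sketch assumes that escaping backslashes and double quotes is enough to keep a value inside a Kusto string literal.

```python
QUERY_TEMPLATE = """SparkListenerEvent_CL
| where workspaceName_s == "{SynapseWorkspace}" and clusterName_s == "{SparkPool}" and livyId_s == "{LivyId}"
| order by TimeGenerated desc
| limit 100"""


def fill_query(template: str, workspace: str, pool: str, livy_id: str) -> str:
    """Replace the three placeholders in a sample query with concrete values."""
    values = {
        "{SynapseWorkspace}": workspace,
        "{SparkPool}": pool,
        "{LivyId}": livy_id,
    }
    for placeholder, value in values.items():
        # Escape backslashes and double quotes so the value stays inside the string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        template = template.replace(placeholder, escaped)
    return template


if __name__ == "__main__":
    # Example values only; use your own workspace name, pool name, and Livy ID.
    print(fill_query(QUERY_TEMPLATE, "myworkspace", "mysparkpool", "42"))
```

The same function works for the other two sample queries because they use the same three placeholders.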
202+
203+
## Create and manage alerts using Azure Log Analytics
204+
205+
Azure Monitor alerts let you use a Log Analytics query to evaluate metrics and logs at a set frequency and fire an alert based on the results.
206+
207+
For more information, see [Create, view, and manage log alerts using Azure Monitor](https://docs.microsoft.com/azure/azure-monitor/alerts/alerts-log).
208+
209+
## Limitations
210+
211+
- Azure Synapse Analytics workspaces with [managed virtual network](https://docs.microsoft.com/azure/synapse-analytics/security/synapse-workspace-managed-vnet) enabled aren't supported.
212+
- The following regions aren't currently supported:
213+
- East US 2
214+
- Norway East
215+
- UAE North
216+
217+
## Next steps
218+
219+
- Learn how to [Use serverless Apache Spark pool in Synapse Studio](https://docs.microsoft.com/azure/synapse-analytics/quickstart-create-apache-spark-pool-studio).
220+
- Learn how to [Run a Spark application in notebook](https://docs.microsoft.com/azure/synapse-analytics/spark/apache-spark-development-using-notebooks).
221+
- Learn how to [Create Apache Spark job definition in Synapse Studio](https://docs.microsoft.com/azure/synapse-analytics/spark/apache-spark-job-definitions).

articles/synapse-analytics/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -588,6 +588,8 @@ items:
588588
href: ./spark/connect-monitor-azure-synapse-spark-application-level-metrics.md
589589
- name: Monitor Apache Spark Application-level metrics with Prometheus and Grafana
590590
href: ./spark/use-prometheus-grafana-to-monitor-apache-spark-application-level-metrics.md
591+
- name: Use Azure Log Analytics to collect and visualize metrics and logs (Preview)
592+
href: ./spark/apache-spark-azure-log-analytics.md
591593
- name: Concepts
592594
items:
593595
- name: Apache Spark
