articles/synapse-analytics/spark/apache-spark-manage-pool-packages.md (16 additions & 6 deletions)
@@ -24,9 +24,9 @@ After the changes are saved, a Spark job runs the installation and caches the re
> - Altering the PySpark, Python, Scala/Java, .NET, R, or Spark version isn't supported.
> - Installing packages from external repositories like PyPI, Conda-Forge, or the default Conda channels isn't supported in workspaces that have data exfiltration protection enabled.

-## Manage packages from Synapse Studio or the Azure portal
+## Manage packages from Synapse Studio or Azure portal

-You can update or add libraries to a Spark pool from either the Azure portal or Synapse Studio.
+Spark pool libraries can be managed either from Synapse Studio or the Azure portal.

### [Azure portal](#tab/azure-portal)
@@ -38,6 +38,16 @@ You can update or add libraries to a Spark pool from either the Azure portal or
:::image type="content" source="./media/apache-spark-azure-portal-add-libraries/apache-spark-add-library-azure.png" alt-text="Screenshot that highlights the upload environment configuration file button." lightbox="./media/apache-spark-azure-portal-add-libraries/apache-spark-add-library-azure.png":::

+1. For Python feed libraries, upload the environment configuration file using the file selector in the **Packages** section of the page (a sketch of such a file follows these steps).
+
+1. You can also select additional **workspace packages** to add Jar, Wheel, or Tar.gz files to your pool.
+
+1. You can also remove deprecated packages from the **Workspace packages** section; your pool no longer attaches those packages.
+
+1. After you save your changes, a system job is triggered to install and cache the specified libraries. This process helps reduce overall session startup time.
+
+1. After the job successfully completes, all new sessions pick up the updated pool libraries.
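A minimal sketch of what that environment configuration file and a scripted upload could look like, assuming the Az.Synapse PowerShell module's `Update-AzSynapseSparkPool` cmdlet with its `-LibraryRequirementsFilePath` parameter; the workspace, pool, and package names are illustrative, not part of the change above:

```powershell
# A minimal sketch, assuming the Az.Synapse module; all names and versions are illustrative.
Connect-AzAccount

# A pip-style environment configuration file listing Python feed libraries to install.
@"
matplotlib==3.7.1
seaborn==0.12.2
"@ | Set-Content -Path ./requirements.txt

# Applying the file to the pool triggers the same system job that installs and
# caches the libraries for new sessions.
Update-AzSynapseSparkPool -WorkspaceName "contoso-synapse" `
    -Name "sparkpool01" `
    -LibraryRequirementsFilePath "./requirements.txt"
```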
### [Synapse Studio](#tab/synapse-studio)

1. In Synapse Studio, select **Manage** from the main navigation panel and then select **Apache Spark pools**.
@@ -46,8 +56,6 @@ You can update or add libraries to a Spark pool from either the Azure portal or
:::image type="content" source="./media/apache-spark-azure-portal-add-libraries/studio-update-libraries.png" alt-text="Screenshot that highlights the logs of library installation." lightbox="./media/apache-spark-azure-portal-add-libraries/studio-update-libraries.png":::

----
-
1. For Python feed libraries, upload the environment configuration file using the file selector in the **Packages** section of the page.

1. You can also select additional **workspace packages** to add Jar, Wheel, or Tar.gz files to your pool.
@@ -58,6 +66,8 @@ You can update or add libraries to a Spark pool from either the Azure portal or
1. After the job successfully completes, all new sessions pick up the updated pool libraries.

+---
+

> [!IMPORTANT]
> By selecting the option to **Force new settings**, you end all current sessions for the selected Spark pool. Once the sessions are ended, you must wait for the pool to restart.
>
@@ -75,13 +85,13 @@ To view these logs:
1. Select the system Spark application job that corresponds to your pool update. These system jobs run under the *SystemReservedJob-LibraryManagement* title.

-:::image type="content" source="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job.png" alt-text="Screenshot that highlights system reserved library job." lightbox="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job.png":::
+   :::image type="content" source="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job.png" alt-text="Screenshot that highlights system reserved library job." lightbox="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job.png":::

1. Switch to view the **driver** and **stdout** logs.

1. The results contain the logs related to the installation of your dependencies.

-:::image type="content" source="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job-results.png" alt-text="Screenshot that highlights system reserved library job results." lightbox="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job-results.png":::
+   :::image type="content" source="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job-results.png" alt-text="Screenshot that highlights system reserved library job results." lightbox="./media/apache-spark-azure-portal-add-libraries/system-reserved-library-job-results.png":::
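If you prefer to locate that system job from a script rather than the monitoring UI, here is a minimal sketch, assuming the Az.Synapse module's `Get-AzSynapseSparkJob` cmdlet and that its output exposes `Id`, `Name`, and `State` properties; workspace and pool names are illustrative:

```powershell
# A minimal sketch, assuming the Az.Synapse module; names are illustrative.
# List recent Spark applications on the pool and filter for the library-management job.
Get-AzSynapseSparkJob -WorkspaceName "contoso-synapse" -SparkPoolName "sparkpool01" |
    Where-Object { $_.Name -like "SystemReservedJob-LibraryManagement*" } |
    Select-Object Id, Name, State
```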
articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-manage-compute-overview.md (6 additions & 4 deletions)
@@ -1,5 +1,5 @@
---
-title: Manage compute resource for dedicated SQL pool
+title: Manage compute resources for dedicated SQL pool
description: Learn how to scale, pause, or resume compute resources for dedicated SQL pool (formerly SQL DW) in Azure Synapse Analytics.
author: WilliamDAssafMSFT
ms.author: wiassaf
@@ -11,7 +11,7 @@ ms.custom:
- azure-synapse
---

-# Manage compute for dedicated SQL pool
+# Manage compute resources for dedicated SQL pool

This article explains how to manage compute resources for dedicated SQL pool (formerly SQL DW) in Azure Synapse Analytics. You can lower costs by pausing the dedicated SQL pool, or by scaling it to meet performance demands.
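Pausing and resuming can also be scripted; a minimal sketch, assuming the Az.Sql module (which manages dedicated SQL pool created as a standalone SQL DW) and illustrative resource names:

```powershell
# A minimal sketch, assuming the Az.Sql module; resource names are illustrative.
# Pausing releases compute (you pay only for storage); resuming restores compute
# at the previously configured performance level.
Suspend-AzSqlDatabase -ResourceGroupName "myResourceGroup" `
    -ServerName "mysqlserver" `
    -DatabaseName "mySampleDataWarehouse"

Resume-AzSqlDatabase -ResourceGroupName "myResourceGroup" `
    -ServerName "mysqlserver" `
    -DatabaseName "mySampleDataWarehouse"
```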
@@ -27,7 +27,9 @@ You can scale out or scale back compute by adjusting the [data warehouse units (
For scale-out steps, see the quickstarts for the [Azure portal](quickstart-scale-compute-portal.md), [PowerShell](quickstart-scale-compute-powershell.md), or [T-SQL](quickstart-scale-compute-tsql.md). You can also perform scale-out operations using a [REST API](sql-data-warehouse-manage-compute-rest-api.md#scale-compute).

-To perform a scale operation, dedicated SQL pool first kills all incoming queries and then rolls back transactions to ensure a consistent state. Scaling only occurs once the transaction rollback is complete. For a scale operation, the system detaches the storage layer from the compute nodes, adds compute nodes, and then reattaches the storage layer to the compute layer. Each dedicated SQL pool is stored as 60 distributions, which are evenly distributed to the compute nodes. Adding more compute nodes adds more compute power. As the number of compute nodes increases, the number of distributions per compute node decreases, providing more compute power for your queries. Likewise, decreasing DWUs reduces the number of compute nodes, which reduces the compute resources for queries.
+To perform a scale operation, dedicated SQL pool first kills all incoming queries and then rolls back transactions to ensure a consistent state. Scaling only occurs once the transaction rollback is complete. For a scale operation, the system detaches the storage layer from the compute nodes, adds compute nodes, and then reattaches the storage layer to the compute layer.
+
+Each dedicated SQL pool is stored as 60 distributions, which are evenly distributed to the compute nodes. Adding more compute nodes adds more compute power. As the number of compute nodes increases, the number of distributions per compute node decreases, providing more compute power for your queries. Likewise, decreasing DWUs reduces the number of compute nodes, which reduces the compute resources for queries.
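As a concrete sketch of the scale operation described above, assuming the Az.Sql module and illustrative resource names (this is the same operation the PowerShell quickstart covers):

```powershell
# A minimal sketch, assuming the Az.Sql module; resource names are illustrative.
# The requested service objective sets the DWU level for the dedicated SQL pool.
Set-AzSqlDatabase -ResourceGroupName "myResourceGroup" `
    -ServerName "mysqlserver" `
    -DatabaseName "mySampleDataWarehouse" `
    -RequestedServiceObjectiveName "DW300c"

# Confirm the level after the operation completes.
(Get-AzSqlDatabase -ResourceGroupName "myResourceGroup" `
    -ServerName "mysqlserver" `
    -DatabaseName "mySampleDataWarehouse").CurrentServiceObjectiveName
```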
The following table shows how the number of distributions per compute node changes as the DWUs change. DW30000c provides 60 compute nodes and achieves much higher query performance than DW100c.
@@ -50,7 +52,7 @@ The following table shows how the number of distributions per compute node chang
| DW15000c | 30 | 2 |
| DW30000c | 60 | 1 |

-## Find the right size of DWUs
+## Finding the right size of data warehouse units

To see the performance benefits of scaling out, especially for larger data warehouse units, you want to use at least a 1-TB data set. To find the best number of DWUs for your dedicated SQL pool, try scaling up and down. Run a few queries with different numbers of DWUs after loading your data. Since scaling is quick, you can try various performance levels in an hour or less.
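One way to run that experiment is to sweep a few DWU levels and time a representative query at each. A minimal sketch, assuming the Az.Sql and SqlServer modules (`Invoke-Sqlcmd` with `-AccessToken`), assuming `Set-AzSqlDatabase` returns only after the scale operation finishes, and with illustrative names, levels, and query:

```powershell
# A minimal sketch, assuming the Az.Sql and SqlServer modules; names, DWU levels,
# and the test query are illustrative.
$levels = "DW100c", "DW500c", "DW1000c"
$token  = (Get-AzAccessToken -ResourceUrl "https://database.windows.net/").Token

foreach ($level in $levels) {
    # Scale to the next level (assumed to block until the operation finishes).
    Set-AzSqlDatabase -ResourceGroupName "myResourceGroup" `
        -ServerName "mysqlserver" `
        -DatabaseName "mySampleDataWarehouse" `
        -RequestedServiceObjectiveName $level | Out-Null

    # Time a representative query against the warehouse at this level.
    $elapsed = Measure-Command {
        Invoke-Sqlcmd -ServerInstance "mysqlserver.database.windows.net" `
            -Database "mySampleDataWarehouse" `
            -AccessToken $token `
            -Query "SELECT COUNT(*) FROM dbo.FactSales;"  # illustrative table
    }
    "{0}: {1:N1} s" -f $level, $elapsed.TotalSeconds
}
```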