articles/synapse-analytics/database-designer/overview-database-templates.md (5 additions & 5 deletions)
@@ -7,15 +7,15 @@ ms.reviewer: wiassaf
ms.service: azure-synapse-analytics
ms.subservice: database-editor
ms.topic: overview
-ms.date: 10/16/2023
+ms.date: 02/25/2025
ms.custom: template-overview
---
# What are Azure Synapse database templates?
Data takes many forms as it moves from source systems to data warehouses and data marts with the intent to solve business problems. Database templates can help with the transformation of data into insights.
-Database templates are a set of business and technical data definitions that are pre-designed to meet the needs of a particular industry. They act as blueprints that provide common elements derived from best practices, government regulations, and the complex data and analytic needs of an industry-specific organization.
+Database templates are a set of business and technical data definitions that are predesigned to meet the needs of a particular industry. They act as blueprints that provide common elements derived from best practices, government regulations, and the complex data and analytic needs of an industry-specific organization.
These schema blueprints can be used by organizations to plan, architect, and design data solutions for data governance, reporting, business intelligence, and advanced analytics. The data models provide integrated business-wide information architectures that can help you implement, in a timely and predictable way, a proven industry data architecture.
@@ -50,7 +50,7 @@ Currently, you can choose from the following database templates in Azure Synapse
* **Freight & Logistics** - For companies that provide freight and logistics services.
* **Fund Management** - For companies that manage investment funds for investors.
* **Genomics** - For companies acquiring and analyzing genomic data about human beings or other species.
-* **Government** - For organizations controlling, regulating or providing services to a country/region, state or province, or community.
+* **Government** - For organizations controlling, regulating, or providing services to a country/region, state or province, or community.
* **Healthcare Insurance** - For organizations providing insurance to cover healthcare needs (sometimes known as Payors).
* **Healthcare Provider** - For organizations providing healthcare services.
* **Life Insurance & Annuities** - For companies that provide life insurance, sell annuities, or both.
@@ -66,10 +66,10 @@ Currently, you can choose from the following database templates in Azure Synapse
* **Travel Services** - For companies providing booking services for airlines, hotels, car rentals, cruises, and vacation packages.
* **Utilities** - For gas, electric, and water utilities; power generators; and water desalinators.
* **Wireless** - For companies providing a range of wireless telecommunications services.
-
+
Because emission and carbon management is an important discussion in all industries, we've included those components in all the available database templates. These components make it easy for companies that need to track and report their direct and indirect greenhouse gas emissions.
-## Next steps
+## Related content
Continue to explore the capabilities of the database designer using the links below.
articles/synapse-analytics/sql/data-loading-best-practices.md (14 additions & 14 deletions)
@@ -4,10 +4,10 @@ description: Recommendations and performance optimizations for loading data into
author: joannapea
ms.author: joanpo
ms.reviewer: wiassaf
-ms.date: 08/26/2021
+ms.date: 02/25/2025
ms.service: azure-synapse-analytics
ms.subservice: sql
-ms.topic: conceptual
+ms.topic: concept-article
ms.custom: azure-synapse
---
@@ -29,9 +29,9 @@ Split large compressed files into smaller compressed files.
## Run loads with enough compute
-For fastest loading speed, run only one load job at a time. If that is not feasible, run a minimal number of loads concurrently. If you expect a large loading job, consider scaling up your dedicated SQL pool before the load.
+For fastest loading speed, run only one load job at a time. If that isn't feasible, run a minimal number of loads concurrently. If you expect a large loading job, consider scaling up your dedicated SQL pool before the load.
-To run loads with appropriate compute resources, create loading users designated for running loads. Assign each loading user to a specific resource class or workload group. To run a load, sign in as one of the loading users, and then run the load. The load runs with the user's resource class. This method is simpler than trying to change a user's resource class to fit the current resource class need.
+To run loads with appropriate compute resources, create loading users designated for running loads. Assign each loading user to a specific resource class or workload group. To run a load, sign in as one of the loading users, and then run the load. The load runs with the user's resource class. This method is simpler than trying to change a user's resource class to fit the current need.
### Create a loading user
@@ -70,13 +70,13 @@ Connect to the dedicated SQL pool and create a user. The following code assumes
<br><br>
72
72
>[!IMPORTANT]
->This is an extreme example of allocating 100% resources of the SQL pool to a single load. This will give you a maximum concurrency of 1. Be aware that this should be used only for the initial load where you will need to create additional workload groups with their own configurations to balance resources across your workloads.
+>This is an extreme example of allocating 100% of the SQL pool's resources to a single load, which gives you a maximum concurrency of 1. Use this configuration only for the initial load; afterward, create other workload groups with their own configurations to balance resources across your workloads.
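The workload group and classifier this note refers to sit outside the hunk shown here; as a hedged sketch only, a 100%-allocation setup might look like the following, with `DataLoads`, `wgcELTLogin`, and `loader` as assumed names:

```sql
-- Sketch: a workload group that reserves 100% of resources for loading.
CREATE WORKLOAD GROUP DataLoads
WITH (
    MIN_PERCENTAGE_RESOURCE = 100,
    CAP_PERCENTAGE_RESOURCE = 100,
    REQUEST_MIN_RESOURCE_GRANT_PERCENT = 100
);

-- Sketch: route the loader user's requests into that workload group.
CREATE WORKLOAD CLASSIFIER wgcELTLogin
WITH (
    WORKLOAD_GROUP = 'DataLoads',
    MEMBERNAME = 'loader'
);
```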
To run a load with resources for the loading workload group, sign in as loader and run the load.
## Allow multiple users to load
-There is often a need to have multiple users load data into a data warehouse. Loading with the [CREATE TABLE AS SELECT (Transact-SQL)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse?view=azure-sqldw-latest&preserve-view=true) requires CONTROL permissions of the database. The CONTROL permission gives control access to all schemas. You might not want all loading users to have control access on all schemas. To limit permissions, use the DENY CONTROL statement.
+There's often a need to have multiple users load data into a data warehouse. Loading with [CREATE TABLE AS SELECT (Transact-SQL)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse?view=azure-sqldw-latest&preserve-view=true) requires the CONTROL permission on the database. The CONTROL permission gives control access to all schemas. You might not want all loading users to have control access on all schemas. To limit permissions, use the DENY CONTROL statement.
For example, consider database schemas, schema_A for dept A, and schema_B for dept B. Let database users user_A and user_B be users for PolyBase loading in dept A and B, respectively. They both have been granted CONTROL database permissions. The creators of schema A and B now lock down their schemas using DENY:
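The DENY statements themselves fall outside this hunk; assuming the schema and user names above, a minimal sketch might be:

```sql
-- Sketch: deny each department's schema to the other department's loader.
DENY CONTROL ON SCHEMA :: schema_A TO user_B;
DENY CONTROL ON SCHEMA :: schema_B TO user_A;
```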
@@ -89,7 +89,7 @@ User_A and user_B are now locked out from the other dept's schema.
## Load to a staging table
-To achieve the fastest loading speed for moving data into a data warehouse table, load data into a staging table. Define the staging table as a heap and use round-robin for the distribution option.
+To achieve the fastest loading speed for moving data into a data warehouse table, load data into a staging table. Define the staging table as a heap and use round-robin for the distribution option.
Consider that loading is usually a two-step process in which you first load to a staging table and then insert the data into a production data warehouse table. If the production table uses a hash distribution, the total time to load and insert might be faster if you define the staging table with the hash distribution. Loading to the staging table takes longer, but the second step of inserting the rows to the production table does not incur data movement across the distributions.
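As a hedged sketch of this two-step pattern (all table and schema names are hypothetical):

```sql
-- Step 1 (sketch): land raw data in a round-robin heap staging table.
CREATE TABLE dbo.Stage_Sales
WITH (HEAP, DISTRIBUTION = ROUND_ROBIN)
AS
SELECT * FROM ext.Sales_External;  -- e.g., an external table over the source files

-- Step 2 (sketch): move rows into the hash-distributed production table.
INSERT INTO dbo.Sales
SELECT * FROM dbo.Stage_Sales;
```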
@@ -103,26 +103,26 @@ Columnstore indexes require large amounts of memory to compress data into high-q
## Increase batch size when using SQLBulkCopy API or BCP
-Loading with the COPY statement will provide the highest throughput with dedicated SQL pools. If you cannot use the COPY to load and must use the [SqLBulkCopy API](/dotnet/api/system.data.sqlclient.sqlbulkcopy?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json) or [bcp](/sql/tools/bcp-utility?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true), you should consider increasing batch size for better throughput.
+Loading with the COPY statement provides the highest throughput with dedicated SQL pools. If you can't use COPY to load and must use the [SqlBulkCopy API](/dotnet/api/system.data.sqlclient.sqlbulkcopy?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json) or [bcp](/sql/tools/bcp-utility?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true), consider increasing the batch size for better throughput.
> [!TIP]
> A batch size between 100 K and 1 M rows is the recommended baseline for determining optimal batch size capacity.
## Manage loading failures
-A load using an external table can fail with the error *"Query aborted-- the maximum reject threshold was reached while reading from an external source"*. This message indicates that your external data contains dirty records. A data record is considered dirty if the data types and number of columns do not match the column definitions of the external table, or if the data doesn't conform to the specified external file format.
+A load using an external table can fail with the error *"Query aborted-- the maximum reject threshold was reached while reading from an external source"*. This message indicates that your external data contains dirty records. A data record is considered dirty if the data types and number of columns don't match the column definitions of the external table, or if the data doesn't conform to the specified external file format.
To fix the dirty records, ensure that your external table and external file format definitions are correct and your external data conforms to these definitions. If a subset of external data records is dirty, you can choose to reject these records for your queries by using the reject options in ['CREATE EXTERNAL TABLE'](/sql/t-sql/statements/create-external-table-transact-sql?view=azure-sqldw-latest&preserve-view=true).
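A hedged sketch of those reject options; the table definition, location, and threshold are illustrative, and the data source and file format are assumed to exist:

```sql
-- Sketch: tolerate up to 100 dirty rows before the load is aborted.
CREATE EXTERNAL TABLE ext.Sales_External (
    SaleId INT,
    Amount DECIMAL(18, 2)
)
WITH (
    LOCATION = '/sales/',
    DATA_SOURCE = my_data_source,   -- assumed to exist
    FILE_FORMAT = my_file_format,   -- assumed to exist
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 100
);
```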
## Insert data into a production table
-A one-time load to a small table with an [INSERT statement](/sql/t-sql/statements/insert-transact-sql?view=azure-sqldw-latest&preserve-view=true), or even a periodic reload of a look-up might perform good enough with a statement like `INSERT INTO MyLookup VALUES (1, 'Type 1')`. However, singleton inserts are not as efficient as performing a bulk-load.
+A one-time load to a small table with an [INSERT statement](/sql/t-sql/statements/insert-transact-sql?view=azure-sqldw-latest&preserve-view=true), or even a periodic reload of a look-up table, might perform well enough with a statement like `INSERT INTO MyLookup VALUES (1, 'Type 1')`. However, singleton inserts aren't as efficient as a bulk load.
-If you have thousands or more single inserts throughout the day, batch the inserts so you can bulk load them. Develop your processes to append the single inserts to a file, and then create another process that periodically loads the file.
+If you have thousands or more single inserts throughout the day, batch the inserts so you can bulk load them. Develop your processes to append the single inserts to a file, and then create another process that periodically loads the file.
## Create statistics after the load
-To improve query performance, it's important to create statistics on all columns of all tables after the first load, or major changes occur in the data. Create statistics can be done manually or you can enable [auto-create statistics](../sql-data-warehouse/sql-data-warehouse-tables-statistics.md?context=/azure/synapse-analytics/context/context).
+To improve query performance, it's important to create statistics on all columns of all tables after the first load or after major changes occur in the data. You can create statistics manually, or you can enable [autocreate statistics](../sql-data-warehouse/sql-data-warehouse-tables-statistics.md?context=/azure/synapse-analytics/context/context).
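A minimal sketch of enabling autocreate statistics; the pool name is a placeholder:

```sql
-- Sketch: let the engine create single-column statistics automatically.
ALTER DATABASE mySampleDataWarehouse
SET AUTO_CREATE_STATISTICS ON;
```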
For a detailed explanation of statistics, see [Statistics](develop-tables-statistics.md). The following example shows how to manually create statistics on five columns of the Customer_Speed table.
@@ -136,7 +136,7 @@ create statistics [YearMeasured] on [Customer_Speed] ([YearMeasured]);
## Rotate storage keys
-It is good security practice to change the access key to your blob storage on a regular basis. You have two storage keys for your blob storage account, which enables you to transition the keys.
+It's good security practice to change the access key to your blob storage regularly. You have two storage keys for your blob storage account, which enables you to transition the keys.
To rotate Azure Storage account keys:
@@ -158,7 +158,7 @@ ALTER DATABASE SCOPED CREDENTIAL my_credential WITH IDENTITY = 'my_identity', SE
No other changes to underlying external data sources are needed.
-## Next steps
+## Related content
- To learn more about PolyBase and designing an Extract, Load, and Transform (ELT) process, see [Design ELT for Azure Synapse Analytics](../sql-data-warehouse/design-elt-data-loading.md?context=/azure/synapse-analytics/context/context).
- For a loading tutorial, see [Use PolyBase to load data from Azure blob storage to Azure Synapse Analytics](../sql-data-warehouse/load-data-from-azure-blob-storage-using-copy.md?bc=%2fazure%2fsynapse-analytics%2fbreadcrumb%2ftoc.json&toc=%2fazure%2fsynapse-analytics%2ftoc.json).