articles/synapse-analytics/migration-guides/netezza/1-design-performance-migration.md
Lines changed: 12 additions & 12 deletions
Original file line number
Diff line number
Diff line change
@@ -21,7 +21,7 @@ This article is part one of a seven part series that provides guidance on how to
21
21
> [!TIP]
22
22
> More than just a database—the Azure environment includes a comprehensive set of capabilities and tools.
23
23
24
-
Due to end of support from IBM, existing users of Netezza data warehouse systems want to take advantage of the innovations provided by newer environments such as cloud, IaaS, and PaaS, and to delegate things like infrastructure maintenance and platform development to the cloud provider.
24
+
Due to end of support from IBM, many existing users of Netezza data warehouse systems want to take advantage of the innovations provided by newer environments such as cloud, IaaS, and PaaS, and to delegate tasks like infrastructure maintenance and platform development to the cloud provider.
25
25
26
26
Although Netezza and Azure Synapse are both SQL databases designed to use massively parallel processing (MPP) techniques to achieve high query performance on exceptionally large data volumes, there are some basic differences in approach:
27
27
@@ -31,7 +31,7 @@ Although Netezza and Azure Synapse are both SQL databases designed to use massiv
31
31
32
32
- Azure Synapse can be paused or resized as required to reduce resource utilization and cost.
33
33
34
-
Microsoft Azure is a globally available, highly secure, scalable cloud environment that includes Azure Synapse in an ecosystem of supporting tools and capabilities. The next diagram summarizes the Synapse ecosystem.
34
+
Microsoft Azure is a globally available, highly secure, scalable cloud environment, that includes Azure Synapse and an ecosystem of supporting tools and capabilities. The next diagram summarizes the Azure Synapse ecosystem.
35
35
36
36
:::image type="content" source="../media/1-design-performance-migration/azure-synapse-ecosystem.png" border="true" alt-text="Chart showing the Azure Synapse ecosystem of supporting tools and capabilities.":::
37
37
@@ -77,15 +77,15 @@ Legacy Netezza environments have typically evolved over time to encompass multip
77
77
78
78
- Prove the viability of migrating to Azure Synapse by quickly delivering the benefits of the new environment.
79
79
80
-
- Allow the in-house technical staff to gain relevant experience of the processes and tools involved which can be used in migrations to other areas.
80
+
- Allow the in-house technical staff to gain relevant experience of the processes and tools involved, which can be used in migrations to other areas.
81
81
82
82
- Create a template for further migrations specific to the source Netezza environment and the current tools and processes that are already in place.
83
83
84
84
A good candidate for an initial migration from the Netezza environment that would enable the items above, is typically one that implements a BI/Analytics workload (rather than an OLTP workload) with a data model that can be migrated with minimal modifications—normally a start or snowflake schema.
85
85
86
86
The migration data volume for the initial exercise should be large enough to demonstrate the capabilities and benefits of the Azure Synapse environment while quickly demonstrating the value—typically in the 1-10TB range.
87
87
88
-
To minimize the risk and reduce implementation time for the initial migration project, confine the scope of the migration to just the data marts. However, this won't address the broader topics such as ETL migration and historical data migration as part of the initial migration project. Address these topics in later phases of the project, once the migrated data mart layer is back filled with the data and processes required to build them.
88
+
To minimize the risk and reduce implementation time for the initial migration project, confine the scope of the migration to just the data marts. However, this won't address the broader topics such as ETL migration and historical data migration as part of the initial migration project. Address these topics in later phases of the project, once the migrated data mart layer is backfilled with the data and processes required to build them.
89
89
90
90
#### Lift and shift as-is versus a phased approach incorporating changes
91
91
@@ -102,15 +102,15 @@ This is a good fit for existing Netezza environments where a single data mart is
102
102
103
103
##### Phased approach incorporating modifications
104
104
105
-
In cases where a legacy warehouse has evolved over a long time, you may need to reengineer to maintain the required performance levels or to support new data like IoT steams. Migrate to Azure Synapse to get the benefits of a scalable cloud environment as part of the re-engineering process. Migration could include a change in the underlying data model, such as a move from an Inmon model to a data vault.
105
+
In cases where a legacy warehouse has evolved over a long time, you might need to re-engineer to maintain the required performance levels or to support new data, such as Internet of Things (IoT) streams. Migrate to Azure Synapse to get the benefits of a scalable cloud environment as part of the re-engineering process. Migration could include a change in the underlying data model, such as a move from an Inmon model to a data vault.
106
106
107
107
Microsoft recommends moving the existing data model as-is to Azure and using the performance and flexibility of the Azure environment to apply the re-engineering changes, leveraging Azure's capabilities to make the changes without impacting the existing source system.
108
108
109
109
#### Use Azure Data Factory to implement a metadata-driven migration
110
110
111
-
Automate and orchestrate the migration process by making use of the capabilities in the Azure environment. This approach minimizes the impact on the existing Netezza environment, which may already be running close to full capacity.
111
+
Automate and orchestrate the migration process by using the capabilities of the Azure environment. This approach minimizes the impact on the existing Netezza environment, which may already be running close to full capacity.
112
112
113
-
Data Factory is a cloud-based data integration service that allows creation of data-driven workflows in the cloud for orchestrating and automating data movement and data transformation. Using Data Factory, you can create and schedule data-driven workflows—called pipelines—to ingest data from disparate data stores. It can process and transform data by using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Azure Machine Learning.
113
+
Azure Data Factory is a cloud-based data integration service that allows creation of data-driven workflows in the cloud for orchestrating and automating data movement and data transformation. Using Data Factory, you can create and schedule data-driven workflows—called pipelines—to ingest data from disparate data stores. Data Factory can process and transform data by using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Azure Machine Learning.
114
114
115
115
By creating metadata to list the data tables to be migrated and their location, you can use the Data Factory facilities to manage the migration process.
116
116
@@ -126,7 +126,7 @@ In a Netezza environment, there are often multiple separate databases for indivi
126
126
> [!TIP]
127
127
> Replace Netezza-specific features with Azure Synapse features.
128
128
129
-
Querying within the Azure Synapse environment is limited to a single database. Schemas are used to separate the tables into logically separate groups. Therefore, we recommend using a series of schemas within the target Azure Synapse to mimic any separate databases migrated from the Netezza environment. If the Netezza environment already uses schemas, you may need to use a new naming convention to move the existing Netezza tables and views to the new environment—for example, concatenate the existing Netezza schema and table names into the new Azure Synapse table name and use schema names in the new environment to maintain the original separate database names. Schema consolidation naming can have dots—however, Synapse Spark may have issues. You can use SQL views over the underlying tables to maintain the logical structures, but there are some potential downsides to this approach:
129
+
Querying within the Azure Synapse environment is limited to a single database. Schemas are used to separate the tables into logically separate groups. Therefore, we recommend using a series of schemas within the target Azure Synapse to mimic any separate databases migrated from the Netezza environment. If the Netezza environment already uses schemas, you may need to use a new naming convention to move the existing Netezza tables and views to the new environment—for example, concatenate the existing Netezza schema and table names into the new Azure Synapse table name and use schema names in the new environment to maintain the original separate database names. Schema consolidation naming can have dots—however, Azure Synapse Spark may have issues. You can use SQL views over the underlying tables to maintain the logical structures, but there are some potential downsides to this approach:
130
130
131
131
- Views in Azure Synapse are read-only, so any updates to the data must take place on the underlying base tables.
132
132
@@ -232,13 +232,13 @@ Most modern database products allow for procedures to be stored within the datab
232
232
233
233
A stored procedure typically contains SQL statements and some procedural logic, and may return data or a status.
234
234
235
-
Azure Synapse Analytics from Azure SQL Data Warehouse also supports stored procedures using T-SQL. If you must migrate stored procedures, recode these procedures for their new environment.
235
+
Azure Synapse Analytics also supports stored procedures using T-SQL. If you must migrate stored procedures, recode these procedures for their new environment.
236
236
237
237
##### Sequences
238
238
239
239
In Netezza, a sequence is a named database object created via `CREATE SEQUENCE` that can provide the unique value via the `NEXT VALUE FOR` method. Use these to generate unique numbers for use as surrogate key values for primary key values.
240
240
241
-
Within Azure Synapse, there's no `CREATE SEQUENCE`. Sequences are handled via use of `IDENTITY` columns or using SQL code to create the next sequence number in a series.
241
+
Within Azure Synapse, there's no `CREATE SEQUENCE`. Sequences are handled via use of [IDENTITY](/sql/t-sql/statements/create-table-transact-sql-identity-property?msclkid=8ab663accfd311ec87a587f5923eaa7b) columns or using SQL code to create the next sequence number in a series.
242
242
243
243
### Extracting metadata and data from a Netezza environment
244
244
@@ -271,7 +271,7 @@ If sufficient network bandwidth is available, extract data directly from an on-p
271
271
272
272
Recommended data formats for the extracted data include delimited text files (also called Comma Separated Values or CSV), Optimized Row Columnar (ORC), or Parquet files.
273
273
274
-
For more detailed information on the process of migrating data and ETL from a Netezza environment, see Section 2.1. Data Migration ETL and Load from Netezza.
274
+
For more information about the process of migrating data and ETL from a Netezza environment, see [Data migration, ETL, and load for Netezza migration](1-design-performance-migration.md).
275
275
276
276
## Performance recommendations for Netezza migrations
277
277
@@ -331,4 +331,4 @@ Use [Workload management](/azure/synapse-analytics/sql-data-warehouse/sql-data-w
331
331
332
332
## Next steps
333
333
334
-
To learn more about ETL and load for Netezza migration, see the next article in this series: [Data migration, ETL, and load for Netezza migration](2-etl-load-migration-considerations.md)].
334
+
To learn more about ETL and load for Netezza migration, see the next article in this series: [Data migration, ETL, and load for Netezza migration](2-etl-load-migration-considerations.md).
articles/synapse-analytics/migration-guides/netezza/2-etl-load-migration-considerations.md
Lines changed: 4 additions & 4 deletions
@@ -105,7 +105,7 @@ The primary drivers for choosing a virtual data mart implementation over a physi
105
105
106
106
- Lower total cost of ownership—a virtualized implementation requires fewer data stores and copies of data.
107
107
108
-
- Elimination of ETL jobs to migrate and simplify DW architecture in a virtualized environment.
108
+
- Elimination of ETL jobs to migrate and simplify data warehouse architecture in a virtualized environment.
109
109
110
110
- Performance—although physical data marts have historically been more performant, virtualization products now implement intelligent caching techniques to mitigate.
111
111
@@ -210,7 +210,7 @@ In the preceding flowchart, decision 1 relates to a high-level decision about wh
210
210
> [!TIP]
211
211
> Leverage investment in existing third-party tools to reduce cost and risk.
212
212
213
-
If a third-party ETL tool is already in use, and especially if there's a large investment in skills or several existing workflows and schedules use that tool, then decision 3 is whether the tool can efficiently support Azure Synapse as a target environment. Ideally, the tool will include 'native' connectors that can leverage Azure facilities like PolyBase or [COPY INTO](/sql/t-sql/statements/copy-into-transact-sql), for the most efficient parallel data loading. There's a way to call an external process, such as PolyBase or COPY INTO, and pass in the appropriate parameters. In this case, leverage existing skills and workflows, with Azure Synapse as the new target environment.
213
+
If a third-party ETL tool is already in use, and especially if there's a large investment in skills or several existing workflows and schedules use that tool, then decision 3 is whether the tool can efficiently support Azure Synapse as a target environment. Ideally, the tool will include 'native' connectors that can leverage Azure facilities like PolyBase or [COPY INTO](/sql/t-sql/statements/copy-into-transact-sql), for the most efficient parallel data loading. There's a way to call an external process, such as PolyBase or `COPY INTO`, and pass in the appropriate parameters. In this case, leverage existing skills and workflows, with Azure Synapse as the new target environment.
214
214
215
215
If you decide to retain an existing third-party ETL tool, there may be benefits to running that tool within the Azure environment (rather than on an existing on-premises ETL server) and having Azure Data Factory handle the overall orchestration of the existing workflows. One particular benefit is that less data needs to be downloaded from Azure, processed, and then uploaded back into Azure. So, decision 4 is whether to leave the existing tool running as-is or to move it into the Azure environment to achieve cost, performance, and scalability benefits.
216
216
@@ -223,7 +223,7 @@ If some or all the existing Netezza warehouse ETL/ELT processing is handled by c
223
223
224
224
Some elements of the ETL process are easy to migrate. For example, by simple bulk data load into a staging table from an external file. It may even be possible to automate those parts of the process, for example, by using PolyBase instead of nzload. Other parts of the process that contain arbitrary complex SQL and/or stored procedures will take more time to re-engineer.
225
225
226
-
One way of testing Netezza SQL for compatibility with Azure Synapse is to capture some representative SQL statements from Netezza query history, then prefix those queries with 'EXPLAIN', and then (assuming a like-for-like migrated data model in Azure Synapse) run those EXPLAIN statements in Azure Synapse. Any incompatible SQL will generate an error, and the error information can determine the scale of the recoding task.
226
+
One way of testing Netezza SQL for compatibility with Azure Synapse is to capture some representative SQL statements from Netezza query history, then prefix those queries with `EXPLAIN`, and then (assuming a like-for-like migrated data model in Azure Synapse) run those EXPLAIN statements in Azure Synapse. Any incompatible SQL will generate an error, and the error information can determine the scale of the recoding task.
227
227
228
228
[Microsoft partners](/azure/sql-data-warehouse/sql-data-warehouse-partner-data-integration) offer tools and services to migrate Netezza SQL and stored procedures to Azure Synapse.
229
229
@@ -306,4 +306,4 @@ To summarize, our recommendations for migrating data and associated ETL processe
306
306
307
307
## Next steps
308
308
309
-
To learn more about security access operations, see the next article in this series: [Security, access, and operations for Netezza migrations](3-security-access-operations.md)].
309
+
To learn more about security access operations, see the next article in this series: [Security, access, and operations for Netezza migrations](3-security-access-operations.md).
articles/synapse-analytics/migration-guides/netezza/3-security-access-operations.md
Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ This article is part three of a seven part series that provides guidance on how
18
18
19
19
## Security considerations
20
20
21
-
This article discusses the methods of connection for existing legacy Teradata environments and how they can be migrated to Azure Synapse with minimal risk and user impact.
21
+
This article discusses the methods of connection for existing legacy Netezza environments and how they can be migrated to Azure Synapse with minimal risk and user impact.
22
22
23
23
It's assumed that there's a requirement to migrate the existing methods of connection and user/role/permission structure as-is. If this isn't the case, then use Azure utilities such as Azure portal to create and manage a new security regime.
24
24
@@ -313,4 +313,4 @@ Adding more compute nodes adds more compute power and ability to leverage more p
313
313
314
314
## Next steps
315
315
316
-
To learn more about visualization and reporting, see the next article in this series: [Visualization and reporting for Netezza migrations](4-visualization-reporting.md)].
316
+
To learn more about visualization and reporting, see the next article in this series: [Visualization and reporting for Netezza migrations](4-visualization-reporting.md).