title: Azure Synapse Link for Azure Cosmos DB, benefits, and when to use it
3
-
description: Learn about Azure Synapse Link for Azure Cosmos DB. Synapse Link lets you run near real-time analytics using Azure Synapse Analytics over operational data (HTAP) in Azure Cosmos DB.
3
+
description: Learn about Azure Synapse Link for Azure Cosmos DB. Synapse Link lets you run near real-time analytics (HTAP) using Azure Synapse Analytics over operational data in Azure Cosmos DB.
4
4
author: srchi
5
5
ms.author: srchi
6
6
ms.service: cosmos-db
@@ -9,42 +9,50 @@ ms.date: 05/19/2020
9
9
ms.reviewer: sngun
10
10
---
11
11
12
-
# What is Azure Synapse Link for Azure Cosmos DB (preview)?
12
+
# What is Azure Synapse Link for Azure Cosmos DB (Preview)?
13
13
14
14
> [!IMPORTANT]
15
15
> Azure Synapse Link for Azure Cosmos DB is currently in preview. This preview version is provided without a service level agreement, and it's not recommended for production workloads. For more information, see [Supplemental terms of use for Microsoft Azure previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
16
16
17
17
Azure Synapse Link for Azure Cosmos DB is a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables you to run near real-time analytics over operational data in Azure Cosmos DB. Azure Synapse Link creates a tight, seamless integration between Azure Cosmos DB and Azure Synapse Analytics.
18
18
19
-
Using [Azure Cosmos DB analytical store](analytical-store-introduction.md), a fully isolated column store, Azure Synapse Link enables no Extract-Transform-Load (ETL) analytics in [Azure Synapse Analytics](../synapse-analytics/overview-what-is.md) against your operational data at scale. Business analysts, data engineers and data scientists can now use Synapse Spark or Synapse SQL in interchangeably to run near real-time business intelligence, analytics, and machine learning pipelines. You can achieve all without impacting the performance of your transactional workloads on Azure Cosmos DB.
19
+
Using [Azure Cosmos DB analytical store](analytical-store-introduction.md), a fully isolated column store, Azure Synapse Link enables no Extract-Transform-Load (ETL) analytics in [Azure Synapse Analytics](../synapse-analytics/overview-what-is.md) against your operational data at scale. Business analysts, data engineers and data scientists can now use Synapse Spark or Synapse SQL interchangeably to run near real-time business intelligence, analytics, and machine learning pipelines. You can achieve this without impacting the performance of your transactional workloads on Azure Cosmos DB.
20
20
21
21
The following image shows the Azure Synapse Link integration with Azure Cosmos DB and Azure Synapse Analytics:
22
22
23
23

24
24
25
-
## <a id="synapse-link-benefits"></a> Benefits of Synapse Link
25
+
## <a id="synapse-link-benefits"></a> Benefits
26
26
27
-
To analyze large operational datasets without impacting the performance of mission-critical transactional workloads, traditionally, the operational data in Azure Cosmos DB is extracted and processed by Extract-Transform-Load (ETL) pipelines. ETL pipelines require many layers of data movement resulting in much operational complexity, performance impact on your transactional workloads. It also increases the latency to analyze the operational data from the time of origin.
27
+
To analyze large operational datasets while minimizing the impact on the performance of mission-critical transactional workloads, traditionally, the operational data in Azure Cosmos DB is extracted and processed by Extract-Transform-Load (ETL) pipelines. ETL pipelines require many layers of data movement resulting in much operational complexity, and performance impact on your transactional workloads. It also increases the latency to analyze the operational data from the time of origin.
28
28
29
-
When compared to the traditional ETL solutions, Synapse Link for Azure Cosmos DB offers several advantages such as:
29
+
When compared to the traditional ETL-based solutions, Azure Synapse Link for Azure Cosmos DB offers several advantages such as:
30
30
31
-
### Reduced complexity with No ETL analytics
31
+
### Reduced complexity with No ETL jobs to manage
32
32
33
-
Synapse Link allows you to directly access Azure Cosmos DB analytical store in Azure Synapse Analytics without any connectors. Azure Cosmos DB analytical store automatically stores operational data in a query-optimized column format, which you can later analyze in near real-time using Synapse Analytics, without complex data movement. Any updates made to the operational data are visible in the analytical store in near realtime with no ETL or change feed. You can query the analytical data directly using Synapse Analytics.
33
+
Azure Synapse Link allows you to directly access Azure Cosmos DB analytical store using Azure Synapse Analytics without complex data movement. Any updates made to the operational data are visible in the analytical store in near real-time with no ETL or change feed. You can run large scale analytics against analytical store, from Synapse Analytics, without additional data transformation.
34
34
35
-
### Performance isolation from transactional workloads
35
+
### Near real-time insights into your operational data
36
36
37
-
With Synapse Link, you can run analytical queries against an Azure Cosmos DB analytical store (a separate column store) while the transactional operations are processed using provisioned throughput for the transactional workload (a row-based transactional store). The analytical workload traffic is served independent of the transactional workload traffic without any impact on the throughput provisioned for your operational data.
37
+
You can now get rich insights on your operational data in near real-time, using Azure Synapse Link. ETL-based systems tend to have higher latency for analyzing your operational data, due to many layers to extract, transform and load the operational data. With native integration of Azure Cosmos DB analytical store with Azure Synapse Analytics, you can analyze operational data in near real-time enabling new business scenarios.
38
+
39
+
40
+
### No impact on operational workloads
41
+
42
+
With Azure Synapse Link, you can run analytical queries against an Azure Cosmos DB analytical store (a separate column store) while the transactional operations are processed using provisioned throughput for the transactional workload (a row-based transactional store). The analytical workload is served independent of the transactional workload traffic without consuming any of the throughput provisioned for your operational data.
38
43
39
44
### Optimized for large-scale analytics workloads
40
45
41
-
Azure Cosmos DB analytical store is optimized to provide scalability, elasticity, and performance for analytical workloads without any dependency on the compute run-times. The storage technology is self-managed to optimize your analytics workloads without manual efforts. With built-in support into Azure Synapse Analytics, accessing this storage layer provides simplicity and high performance.
46
+
Azure Cosmos DB analytical store is optimized to provide scalability, elasticity, and performance for analytical workloads without any dependency on the compute run-times. The storage technology is self-managed to optimize your analytics workloads. With built-in support into Azure Synapse Analytics, accessing this storage layer provides simplicity and high performance.
42
47
43
48
### Cost effective
44
49
45
-
Synapse Link eliminates the extra layers of storage and compute required in traditional ETL pipelines to analyze the operational data. You can get a cost-optimized, fully managed solution for operational analytics especially with growing data volumes. Azure Cosmos DB analytical store follows a consumption-based pricing model, which is based on data storage and queries executed. It doesn’t require you to provision any throughput, as you do today for the transactional workloads. Accessing your data with highly elastic compute engines from Azure Synapse Analytics makes the overall cost of running storage and compute efficient.
50
+
With Azure Synapse Link, you can get a cost-optimized, fully managed solution for operational analytics. It eliminates the extra layers of storage and compute required in traditional ETL pipelines for analyzing operational data.
51
+
52
+
Azure Cosmos DB analytical store follows a consumption-based pricing model, which is based on data storage, analytical read/write operations, and queries executed. It doesn’t require you to provision any throughput, as you do today for the transactional workloads. Accessing your data with highly elastic compute engines from Azure Synapse Analytics makes the overall cost of running storage and compute very efficient.
46
53
47
-
### Analytics for globally distributed, multi master data
54
+
55
+
### Analytics for locally available, globally distributed, multi master data
48
56
49
57
You can run analytical queries effectively against the nearest regional copy of the data in Azure Cosmos DB. Azure Cosmos DB provides the state-of-the-art capability to run the globally distributed analytical workloads along with transactional workloads in an active-active manner.
50
58
@@ -54,24 +62,24 @@ Synapse Link brings together Azure Cosmos DB analytical store with Azure Synapse
54
62
55
63
### Azure Cosmos DB analytical store
56
64
57
-
Azure Cosmos DB analytical store is a columnar representation of your operational data in Azure Cosmos DB. This analytical store is suitable for fast, cost effective queries on large operational data sets, without copying data and impacting the performance of your transactional workloads.
65
+
Azure Cosmos DB analytical store is a column-oriented representation of your operational data in Azure Cosmos DB. This analytical store is suitable for fast, cost-effective queries on large operational data sets, without copying data or impacting the performance of your transactional workloads.
58
66
59
-
Analytical store automatically picks up high frequency inserts, updates, deletes in your transactional workloads in near real time, as a fully managed capability (“auto-sync”) of Azure Cosmos DB. No change feed or ETL is required. It contains the complete version history of all the transactional updates that occurred in your Azure Cosmos DB container.
67
+
Analytical store automatically picks up high frequency inserts, updates, deletes in your transactional workloads in near real time, as a fully managed capability (“auto-sync”) of Azure Cosmos DB. No change feed or ETL is required.
60
68
61
69
If you have a globally distributed Azure Cosmos DB account, after you enable analytical store for a container, it will be available in all regions for that account. For more information on the analytical store, see [Azure Cosmos DB Analytical store overview](analytical-store-introduction.md) article.
62
70
63
71
### <a id="synapse-link-integration"></a>Integration with Azure Synapse Analytics
64
72
65
73
With Synapse Link, you can now directly connect to your Azure Cosmos DB containers from Azure Synapse Analytics and access the analytical store with no separate connectors. Azure Synapse Analytics currently supports Synapse Link with [Synapse Apache Spark](../synapse-analytics/spark/apache-spark-concepts.md) and [Synapse SQL Serverless](../synapse-analytics/sql/on-demand-workspace-overview.md).
66
74
67
-
You can query the data from Azure Cosmos DB analytical store simultaneously, with interop across different analytics run times supported by Synapse Analytics. You can query and analyze the analytical store using:
75
+
You can query the data from Azure Cosmos DB analytical store simultaneously, with interop across different analytics run times supported by Azure Synapse Analytics. No additional data transformations are required to analyze the operational data. You can query and analyze the analytical store data using:
68
76
69
-
* Synapse Apache Spark with full support for Scala, Python, SparkSQL, and C#. Synapse Spark is central to data engineering and science scenarios
77
+
* Synapse Apache Spark with full support for Scala, Python, SparkSQL, and C#. Synapse Spark is central to data engineering and data science scenarios
70
78
71
79
* SQL serverless with T-SQL language and support for familiar BI tools (for example, Power BI Premium, etc.)
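As a rough illustration of the Synapse Apache Spark path, the following Python sketch shows how a read against the analytical store might be assembled. The format name `cosmos.olap` and the option keys `spark.synapse.linkedService` and `spark.cosmos.container` are assumptions based on the Synapse Spark connector and are not stated in this article. The helper functions, the linked service name, and the container name are hypothetical.

```python
from typing import Any, Dict

# Assumed Spark data source name for the analytical store (not given in this article).
ANALYTICAL_STORE_FORMAT = "cosmos.olap"


def analytical_store_options(linked_service: str, container: str) -> Dict[str, str]:
    """Build the reader options for one Azure Cosmos DB container.

    The option keys are assumptions; both values are supplied by the caller.
    """
    if not linked_service or not container:
        raise ValueError("linked_service and container must be non-empty")
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


def read_analytical_store(spark: Any, linked_service: str, container: str) -> Any:
    """Load a container's analytical store through an existing Spark session.

    `spark` is expected to be the session that a Synapse notebook provides.
    """
    reader = spark.read.format(ANALYTICAL_STORE_FORMAT)
    for key, value in analytical_store_options(linked_service, container).items():
        reader = reader.option(key, value)
    return reader.load()
```

In a Synapse notebook, a call such as `read_analytical_store(spark, "CosmosDbLinkedService", "orders")` would be expected to return a DataFrame served from the analytical store rather than the transactional store, so the scan would not draw on the provisioned throughput of the container.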
72
80
73
81
> [!NOTE]
74
-
> From Azure Synapse Analytics, you can access both analytical and transactional stores in your Azure Cosmos DB container. If you want to run large-scale analytics or scans on your operational data, we recommend that you use analytical store to avoid performance impact on transactional workloads.
82
+
> From Azure Synapse Analytics, you can access both analytical and transactional stores in your Azure Cosmos DB container. However, if you want to run large-scale analytics or scans on your operational data, we recommend that you use analytical store to avoid performance impact on transactional workloads.
75
83
76
84
> [!NOTE]
77
85
> You can run analytics with low latency in an Azure region by connecting your Azure Cosmos DB container to Synapse runtime in that region.
@@ -84,52 +92,49 @@ This integration enables the following HTAP scenarios for different users:
84
92
85
93
* A data scientist who wants to use Synapse Spark to find a feature to improve their model and train that model without doing complex data engineering. They can also write the results of the model post inference into Azure Cosmos DB for real-time scoring on the data through Synapse Spark.
86
94
87
-
* A data engineer who wants to make data accessible for consumers of operational data in Azure Cosmos DB by creating SQL or Spark tables over the containers without manual ETL processes.
95
+
* A data engineer who wants to make data accessible for consumers, by creating SQL or Spark tables over Azure Cosmos DB containers without manual ETL processes.
88
96
89
97
For more information on Azure Synapse Analytics runtime support for Azure Cosmos DB, see [Azure Synapse Analytics for Cosmos DB support]().
90
98
91
99
## When to use Azure Synapse Link for Azure Cosmos DB?
92
100
93
101
Synapse Link is recommended in the following cases:
94
102
95
-
* If you are a new or an existing Azure Cosmos DB customer and you want to run analytics, BI, and machine learning over your operational data. In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput. For example:
103
+
* If you are an Azure Cosmos DB customer and you want to run analytics, BI, and machine learning over your operational data. In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput. For example:
96
104
97
105
* If you are running analytics or BI on your Azure Cosmos DB operational data directly using separate connectors today, or
98
106
99
-
* If you are running manual ETL processes to extract operational data into a separate analytics system.
100
-
101
-
* If you are looking for archival solutions and cost savings for your Azure Cosmos DB data. Instead of using a separate cold storage system by manually running ETL processes, you can use the analytical store, which has a consumption-based pricing model and doesn’t require you to provision any RU/s.
102
-
103
-
Synapse Link is not recommended if you are looking for traditional data warehouse requirements such as high concurrency, workload management, and persistence of aggregates across multiple data sources. For more information, see [common scenarios that can be powered with Synapse Link for Azure Cosmos DB](analytics-usecases.md).
107
+
* If you are running ETL processes to extract operational data into a separate analytics system.
108
+
109
+
In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput.
104
110
105
-
## Supported regions
111
+
Synapse Link is not recommended if you are looking for traditional data warehouse requirements such as high concurrency, workload management, and persistence of aggregates across multiple data sources. For more information, see [common scenarios that can be powered with Azure Synapse Link for Azure Cosmos DB](synapse-link-use-cases.md).
106
112
107
-
Synapse Link for Azure Cosmos DB is currently available in the following Azure regions: US West Central, East US, West US2, North Europe, West Europe, South Central US, Southeast Asia, Australia East, East U2, UK South.
108
113
109
114
## Limitations
110
115
111
-
* During the public preview, Synapse Link is supported only for the Azure Cosmos DB SQL (Core) API. Support for Azure Cosmos DB’s API for MongoDB & Cassandra API are currently under a gated preview. To request access to the gated preview, email the [Azure Cosmos DB team](mailto:[email protected]).
116
+
* During the public preview, Azure Synapse Link is supported only for the Azure Cosmos DB SQL (Core) API. Support for Azure Cosmos DB’s API for MongoDB and the Cassandra API is currently under a gated preview. To request access to the gated preview, email the [Azure Cosmos DB team](mailto:[email protected]).
112
117
113
118
* Currently, the analytical store can only be enabled for new containers (both in new and existing Azure Cosmos DB accounts).
114
119
115
120
* Accessing the Azure Cosmos DB analytical store with Synapse SQL serverless is currently under gated preview. To request access, email the [Azure Cosmos DB team](mailto:[email protected]).
116
121
117
-
* Accessing the Azure Cosmos DB analytics store with Synapse SQL provisioned is currently not available.
122
+
* Accessing the Azure Cosmos DB analytical store with Synapse SQL provisioned is currently not available.
118
123
119
124
## Pricing
120
125
121
-
The billing model of Azure Synapse Link includes the costs incurred by using the Azure Cosmos DB analytical store and the Synapse runtime. To learn more, see the [Azure Cosmos DB analytical store pricing](analytical-store-introduction.md#analytical-store-pricing) and [Azure Synapse Analytics pricing]() articles.
126
+
The billing model of Azure Synapse Link translates to the costs incurred by using the Azure Cosmos DB analytical store and the Synapse runtime. To learn more, see the [Azure Cosmos DB analytical store pricing](analytical-store-introduction.md#analytical-store-pricing) and [Azure Synapse Analytics pricing]() articles.
122
127
123
128
## Next steps
124
129
125
130
To learn more, see the following docs:
126
131
127
-
* [Get started with Azure Synapse Link for Azure Cosmos DB.](configure-synapse-link.md)
132
+
* [Azure Cosmos DB analytical store overview](analytical-store-introduction.md)
128
133
129
-
* [Azure Cosmos DB analytical store overview.](analytical-store-introduction.md)
134
+
* [Get started with Azure Synapse Link for Azure Cosmos DB](configure-synapse-link.md)
130
135
131
-
* [What is supported in Azure Synapse Analytics run time.]()
136
+
* [What is supported in Azure Synapse Analytics run time]()
132
137
133
-
* [Frequently asked questions about Azure Synapse Link for Azure Cosmos DB.](synapse-link-frequently-asked-questions.md)
138
+
* [Frequently asked questions about Azure Synapse Link for Azure Cosmos DB](synapse-link-frequently-asked-questions.md)
134
139
135
-
* [Azure Synapse Link for Azure Cosmos DB Use cases.](synapse-link-use-cases.md)
140
+
* [Azure Synapse Link for Azure Cosmos DB Use cases](synapse-link-use-cases.md)