Commit fcbda40

Updated Synapse Link Overview doc
Per team feedback and other critical edits we noticed during reviews.
1 parent 150faf4 commit fcbda40

1 file changed

+40
-35
lines changed

articles/cosmos-db/synapse-link.md

Lines changed: 40 additions & 35 deletions
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
---
22
title: Azure Synapse Link for Azure Cosmos DB, benefits, and when to use it
3-
description: Learn about Azure Synapse Link for Azure Cosmos DB. Synapse Link lets you run near real-time analytics using Azure Synapse Analytics over operational data (HTAP) in Azure Cosmos DB.
3+
description: Learn about Azure Synapse Link for Azure Cosmos DB. Synapse Link lets you run near real-time analytics (HTAP) using Azure Synapse Analytics over operational data in Azure Cosmos DB.
44
author: srchi
55
ms.author: srchi
66
ms.service: cosmos-db
@@ -9,42 +9,50 @@ ms.date: 05/19/2020
99
ms.reviewer: sngun
1010
---
1111

12-
# What is Azure Synapse Link for Azure Cosmos DB (preview)?
12+
# What is Azure Synapse Link for Azure Cosmos DB (Preview)?
1313

1414
> [!IMPORTANT]
1515
> Azure Synapse Link for Azure Cosmos DB is currently in preview. This preview version is provided without a service level agreement, and it's not recommended for production workloads. For more information, see [Supplemental terms of use for Microsoft Azure previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
1616
1717
Azure Synapse Link for Azure Cosmos DB is a cloud-native hybrid transactional and analytical processing (HTAP) capability that enables you to run near real-time analytics over operational data in Azure Cosmos DB. Azure Synapse Link creates a tight seamless integration between Azure Cosmos DB and Azure Synapse Analytics.
1818

19-
Using [Azure Cosmos DB analytical store](analytical-store-introduction.md), a fully isolated column store, Azure Synapse Link enables no Extract-Transform-Load (ETL) analytics in [Azure Synapse Analytics](../synapse-analytics/overview-what-is.md) against your operational data at scale. Business analysts, data engineers and data scientists can now use Synapse Spark or Synapse SQL in interchangeably to run near real-time business intelligence, analytics, and machine learning pipelines. You can achieve all without impacting the performance of your transactional workloads on Azure Cosmos DB.
19+
Using [Azure Cosmos DB analytical store](analytical-store-introduction.md), a fully isolated column store, Azure Synapse Link enables no Extract-Transform-Load (ETL) analytics in [Azure Synapse Analytics](../synapse-analytics/overview-what-is.md) against your operational data at scale. Business analysts, data engineers and data scientists can now use Synapse Spark or Synapse SQL interchangeably to run near real-time business intelligence, analytics, and machine learning pipelines. You can achieve this without impacting the performance of your transactional workloads on Azure Cosmos DB.
2020

2121
The following image shows the Azure Synapse Link integration with Azure Cosmos DB and Azure Synapse Analytics:
2222

2323
![Architecture diagram for Azure Synapse Analytics integration with Azure Cosmos DB](./media/synapse-link/synapse-analytics-cosmos-db-architecture.png)
2424

25-
## <a id="synapse-link-benefits"></a> Benefits of Synapse Link
25+
## <a id="synapse-link-benefits"></a> Benefits
2626

27-
To analyze large operational datasets without impacting the performance of mission-critical transactional workloads, traditionally, the operational data in Azure Cosmos DB is extracted and processed by Extract-Transform-Load (ETL) pipelines. ETL pipelines require many layers of data movement resulting in much operational complexity, performance impact on your transactional workloads. It also increases the latency to analyze the operational data from the time of origin.
27+
To analyze large operational datasets while minimizing the impact on the performance of mission-critical transactional workloads, traditionally, the operational data in Azure Cosmos DB is extracted and processed by Extract-Transform-Load (ETL) pipelines. ETL pipelines require many layers of data movement resulting in much operational complexity, and performance impact on your transactional workloads. It also increases the latency to analyze the operational data from the time of origin.
2828

29-
When compared to the traditional ETL solutions, Synapse Link for Azure Cosmos DB offers several advantages such as:
29+
When compared to the traditional ETL-based solutions, Azure Synapse Link for Azure Cosmos DB offers several advantages such as:
3030

31-
### Reduced complexity with No ETL analytics
31+
### Reduced complexity with No ETL jobs to manage
3232

33-
Synapse Link allows you to directly access Azure Cosmos DB analytical store in Azure Synapse Analytics without any connectors. Azure Cosmos DB analytical store automatically stores operational data in a query-optimized column format, which you can later analyze in near real-time using Synapse Analytics, without complex data movement. Any updates made to the operational data are visible in the analytical store in near real time with no ETL or change feed. You can query the analytical data directly using Synapse Analytics.
33+
Azure Synapse Link allows you to directly access Azure Cosmos DB analytical store using Azure Synapse Analytics without complex data movement. Any updates made to the operational data are visible in the analytical store in near real-time with no ETL or change feed. You can run large scale analytics against analytical store, from Synapse Analytics, without additional data transformation.
3434

35-
### Performance isolation from transactional workloads
35+
### Near real-time insights into your operational data
3636

37-
With Synapse Link, you can run analytical queries against an Azure Cosmos DB analytical store (a separate column store) while the transactional operations are processed using provisioned throughput for the transactional workload (a row-based transactional store). The analytical workload traffic is served independent of the transactional workload traffic without any impact on the throughput provisioned for your operational data.
37+
You can now get rich insights on your operational data in near real-time, using Azure Synapse Link. ETL-based systems tend to have higher latency for analyzing your operational data, due to many layers to extract, transform and load the operational data. With native integration of Azure Cosmos DB analytical store with Azure Synapse Analytics, you can analyze operational data in near real-time enabling new business scenarios.
38+
39+
40+
### No impact on operational workloads
41+
42+
With Azure Synapse Link, you can run analytical queries against an Azure Cosmos DB analytical store (a separate column store) while the transactional operations are processed using provisioned throughput for the transactional workload (a row-based transactional store). The analytical workload is served independent of the transactional workload traffic without consuming any of the throughput provisioned for your operational data.
3843

3944
### Optimized for large-scale analytics workloads
4045

41-
Azure Cosmos DB analytical store is optimized to provide scalability, elasticity, and performance for analytical workloads without any dependency on the compute run-times. The storage technology is self-managed to optimize your analytics workloads without manual efforts. With built-in support into Azure Synapse Analytics, accessing this storage layer provides simplicity and high performance.
46+
Azure Cosmos DB analytical store is optimized to provide scalability, elasticity, and performance for analytical workloads without any dependency on the compute run-times. The storage technology is self-managed to optimize your analytics workloads. With built-in support into Azure Synapse Analytics, accessing this storage layer provides simplicity and high performance.
4247

4348
### Cost effective
4449

45-
Synapse Link eliminates the extra layers of storage and compute required in traditional ETL pipelines to analyze the operational data. You can get a cost-optimized, fully managed solution for operational analytics especially with growing data volumes. Azure Cosmos DB analytical store follows a consumption-based pricing model, which is based on data storage and queries executed. It doesn’t require you to provision any throughput, as you do today for the transactional workloads. Accessing your data with highly elastic compute engines from Azure Synapse Analytics makes the overall cost of running storage and compute efficient.
50+
With Azure Synapse Link, you can get a cost-optimized, fully managed solution for operational analytics. It eliminates the extra layers of storage and compute required in traditional ETL pipelines for analyzing operational data.
51+
52+
Azure Cosmos DB analytical store follows a consumption-based pricing model, which is based on data storage and analytical read/write operations and queries executed. It doesn’t require you to provision any throughput, as you do today for the transactional workloads. Accessing your data with highly elastic compute engines from Azure Synapse Analytics makes the overall cost of running storage and compute very efficient.
4653

47-
### Analytics for globally distributed, multi master data
54+
55+
### Analytics for locally available, globally distributed, multi master data
4856

4957
You can run analytical queries effectively against the nearest regional copy of the data in Azure Cosmos DB. Azure Cosmos DB provides the state-of-the-art capability to run the globally distributed analytical workloads along with transactional workloads in an active-active manner.
5058

@@ -54,24 +62,24 @@ Synapse Link brings together Azure Cosmos DB analytical store with Azure Synapse
5462

5563
### Azure Cosmos DB analytical store
5664

57-
Azure Cosmos DB analytical store is a columnar representation of your operational data in Azure Cosmos DB. This analytical store is suitable for fast, cost effective queries on large operational data sets, without copying data and impacting the performance of your transactional workloads.
65+
Azure Cosmos DB analytical store is a column-oriented representation of your operational data in Azure Cosmos DB. This analytical store is suitable for fast, cost effective queries on large operational data sets, without copying data and impacting the performance of your transactional workloads.
5866

59-
Analytical store automatically picks up high frequency inserts, updates, deletes in your transactional workloads in near real time, as a fully managed capability (“auto-sync”) of Azure Cosmos DB. No change feed or ETL is required. It contains the complete version history of all the transactional updates that occurred in your Azure Cosmos DB container.
67+
Analytical store automatically picks up high frequency inserts, updates, deletes in your transactional workloads in near real time, as a fully managed capability (“auto-sync”) of Azure Cosmos DB. No change feed or ETL is required.
6068

6169
If you have a globally distributed Azure Cosmos DB account, after you enable analytical store for a container, it will be available in all regions for that account. For more information on the analytical store, see [Azure Cosmos DB Analytical store overview](analytical-store-introduction.md) article.
6270

6371
### <a id="synapse-link-integration"></a>Integration with Azure Synapse Analytics
6472

6573
With Synapse Link, you can now directly connect to your Azure Cosmos DB containers from Azure Synapse Analytics and access the analytical store with no separate connectors. Azure Synapse Analytics currently supports Synapse Link with [Synapse Apache Spark](../synapse-analytics/spark/apache-spark-concepts.md) and [Synapse SQL Serverless](../synapse-analytics/sql/on-demand-workspace-overview.md).
6674

67-
You can query the data from Azure Cosmos DB analytical store simultaneously, with interop across different analytics run times supported by Synapse Analytics. You can query and analyze the analytical store using:
75+
You can query the data from Azure Cosmos DB analytical store simultaneously, with interop across different analytics run times supported by Azure Synapse Analytics. No additional data transformations are required to analyze the operational data. You can query and analyze the analytical store data using:
6876

69-
* Synapse Apache Spark with full support for Scala, Python, SparkSQL, and C#. Synapse Spark is central to data engineering and science scenarios
77+
* Synapse Apache Spark with full support for Scala, Python, SparkSQL, and C#. Synapse Spark is central to data engineering and data science scenarios
7078

7179
* SQL serverless with T-SQL language and support for familiar BI tools (for example, Power BI Premium, etc.)
7280

7381
> [!NOTE]
74-
> From Azure Synapse Analytics, you can access both analytical and transactional stores in your Azure Cosmos DB container. If you want to run large-scale analytics or scans on your operational data, we recommend that you use analytical store to avoid performance impact on transactional workloads.
82+
> From Azure Synapse Analytics, you can access both analytical and transactional stores in your Azure Cosmos DB container. However, if you want to run large-scale analytics or scans on your operational data, we recommend that you use analytical store to avoid performance impact on transactional workloads.
7583
7684
> [!NOTE]
7785
> You can run analytics with low latency in an Azure region by connecting your Azure Cosmos DB container to Synapse runtime in that region.
@@ -84,52 +92,49 @@ This integration enables the following HTAP scenarios for different users:
8492

8593
* A data scientist who wants to use Synapse Spark to find a feature to improve their model and train that model without doing complex data engineering. They can also write the results of the model post inference into Azure Cosmos DB for real-time scoring on the data through Spark Synapse.
8694

87-
* A data engineer who wants to make data accessible for consumers of operational data in Azure Cosmos DB by creating SQL or Spark tables over the containers without manual ETL processes.
95+
* A data engineer who wants to make data accessible for consumers, by creating SQL or Spark tables over Azure Cosmos DB containers without manual ETL processes.
8896

8997
For more information on Azure Synapse Analytics runtime support for Azure Cosmos DB, see [Azure Synapse Analytics for Cosmos DB support]().
9098

9199
## When to use Azure Synapse Link for Azure Cosmos DB?
92100

93101
Synapse Link is recommended in the following cases:
94102

95-
* If you are a new or an existing Azure Cosmos DB customer and you want to run analytics, BI, and machine learning over your operational data. In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput. For example:
103+
* If you are an Azure Cosmos DB customer and you want to run analytics, BI, and machine learning over your operational data. In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput. For example:
96104

97105
* If you are running analytics or BI on your Azure Cosmos DB operational data directly using separate connectors today, or
98106

99-
* If you are running manual ETL processes to extract operational data into a separate analytics system.
100-
101-
* If you are looking for archival solutions and cost savings for your Azure Cosmos DB data. Instead of using a separate cold storage system by manually running ETL processes, you can use the analytical store, which has a consumption-based pricing model and doesn’t require you to provision any RU/s.
102-
103-
Synapse Link is not recommended if you are looking for traditional data warehouse requirements such as high concurrency, workload management, and persistence of aggregates across multiple data sources. For more information, see [common scenarios that can be powered with Synapse Link for Azure Cosmos DB](analytics-usecases.md).
107+
* If you are running ETL processes to extract operational data into a separate analytics system.
108+
109+
In such cases, Synapse Link provides a more integrated analytics experience without impacting your transactional store’s provisioned throughput.
104110

105-
## Supported regions
111+
Synapse Link is not recommended if you are looking for traditional data warehouse requirements such as high concurrency, workload management, and persistence of aggregates across multiple data sources. For more information, see [common scenarios that can be powered with Azure Synapse Link for Azure Cosmos DB](synapse-link-use-cases.md).
106112

107-
Synapse Link for Azure Cosmos DB is currently available in the following Azure regions: US West Central, East US, West US2, North Europe, West Europe, South Central US, Southeast Asia, Australia East, East U2, UK South.
108113

109114
## Limitations
110115

111-
* During the public preview, Synapse Link is supported only for the Azure Cosmos DB SQL (Core) API. Support for Azure Cosmos DB’s API for MongoDB & Cassandra API are currently under a gated preview. To request access to the gated preview, email the [Azure Cosmos DB team](mailto:[email protected]).
116+
* During the public preview, Azure Synapse Link is supported only for the Azure Cosmos DB SQL (Core) API. Support for Azure Cosmos DB’s API for MongoDB & Cassandra API are currently under a gated preview. To request access to the gated preview, email the [Azure Cosmos DB team](mailto:[email protected]).
112117

113118
* Currently, the analytical store can only be enabled for new containers (both in new and existing Azure Cosmos DB accounts).
114119

115120
* Accessing the Azure Cosmos DB analytic store with Synapse SQL serverless is currently under gated preview. To request access, email the [Azure Cosmos DB team](mailto:[email protected]).
116121

117-
* Accessing the Azure Cosmos DB analytics store with Synapse SQL provisioned is currently not available.
122+
* Accessing the Azure Cosmos DB analytics store with Synapse SQL provisioned is currently not available.
118123

119124
## Pricing
120125

121-
The billing model of Azure Synapse Link includes the costs incurred by using the Azure Cosmos DB analytical store and the Synapse runtime. To learn more, see the [Azure Cosmos DB analytical store pricing](analytical-store-introduction.md#analytical-store-pricing) and [Azure Synapse Analytics pricing]() articles.
126+
The billing model of Azure Synapse Link translates to the costs incurred by using the Azure Cosmos DB analytical store and the Synapse runtime. To learn more, see the [Azure Cosmos DB analytical store pricing](analytical-store-introduction.md#analytical-store-pricing) and [Azure Synapse Analytics pricing]() articles.
122127

123128
## Next steps
124129

125130
To learn more, see the following docs:
126131

127-
* [Get started with Azure Synapse Link for Azure Cosmos DB.](configure-synapse-link.md)
132+
* [Azure Cosmos DB analytical store overview](analytical-store-introduction.md)
128133

129-
* [Azure Cosmos DB analytical store overview.](analytical-store-introduction.md)
134+
* [Get started with Azure Synapse Link for Azure Cosmos DB](configure-synapse-link.md)
130135

131-
* [What is supported in Azure Synapse Analytics run time.]()
136+
* [What is supported in Azure Synapse Analytics run time]()
132137

133-
* [Frequently asked questions about Azure Synapse Link for Azure Cosmos DB.](synapse-link-frequently-asked-questions.md)
138+
* [Frequently asked questions about Azure Synapse Link for Azure Cosmos DB](synapse-link-frequently-asked-questions.md)
134139

135-
* [Azure Synapse Link for Azure Cosmos DB Use cases.](synapse-link-use-cases.md)
140+
* [Azure Synapse Link for Azure Cosmos DB Use cases](synapse-link-use-cases.md)
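
The "Integration with Azure Synapse Analytics" section in this diff describes reading the analytical store from Synapse Apache Spark without a separate connector. As a rough illustration only, the sketch below builds the reader options for that access pattern. It assumes the `cosmos.olap` format name and the `spark.synapse.linkedService` and `spark.cosmos.container` option keys used by the Synapse Spark runtime. The helper function, the linked service name `CosmosDbLinkedService`, and the container name `orders` are hypothetical and do not appear in the article.

```python
from typing import Dict

# Format name assumed for reading the Azure Cosmos DB analytical store
# from Synapse Spark (as opposed to the transactional store).
ANALYTICAL_STORE_FORMAT = "cosmos.olap"


def analytical_store_options(linked_service: str, container: str) -> Dict[str, str]:
    """Build Spark reader options for an analytical store read.

    Hypothetical helper: both arguments are placeholders for names
    configured in your own Synapse workspace and Cosmos DB account.
    """
    if not linked_service or not container:
        raise ValueError("linked_service and container must be non-empty")
    return {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }


# Hypothetical names, for illustration only.
options = analytical_store_options("CosmosDbLinkedService", "orders")

# Inside a Synapse Spark notebook, where `spark` is predefined, the read
# would look roughly like this (not executed here):
# df = spark.read.format(ANALYTICAL_STORE_FORMAT).options(**options).load()
```

Keeping the options in a plain dictionary makes the linked service and container names easy to validate before a Spark session is involved; the actual read only works inside a Synapse workspace with Synapse Link enabled.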
