
Commit 766ecea

Merge branch 'release-build-synapse' of https://github.com/MicrosoftDocs/azure-docs-pr into release-build-synapse

2 parents 7f9e16d + a5071d2

File tree: 4 files changed (+184, -11 lines)

Lines changed: 20 additions & 10 deletions
@@ -1,6 +1,6 @@
 ---
-title: Synapse Link for Cosmos DB supported features
-description: Understand the current list of actions supported by Synapse Link for Cosmos DB
+title: Azure Synapse Link for Cosmos DB supported features
+description: Understand the current list of actions supported by Azure Synapse Link for Cosmos DB
 services: synapse-analytics
 author: ArnoMicrosoft
 ms.service: synapse-analytics
@@ -11,27 +11,34 @@ ms.author: acomet
 ms.reviewer: jrasnick
 ---

-# Synapse Link for Azure Cosmos DB supported features
+# Azure Synapse Link for Azure Cosmos DB supported features

-This article describes what functionalities are currently supported in Synapse Link for Azure Cosmos DB.
+This article describes the functionality that is currently supported in Azure Synapse Link for Azure Cosmos DB.

 ## Azure Synapse support

+There are two types of containers in Azure Cosmos DB:
+
+* HTAP container - a container with Synapse Link enabled. This container has both a transactional store and an analytical store.
+* OLTP container - a container with only a transactional store; Synapse Link is not enabled.
+
+You can connect to a Cosmos DB container without enabling Synapse Link, in which case you can only read from and write to the transactional store.
+
 Here is a list of the currently supported features within Synapse Link for Cosmos DB.

-| Category | Description | Spark | SQL serverless |
+| Category | Description | [Spark](https://docs.microsoft.com/azure/synapse-analytics/sql/on-demand-workspace-overview) | [SQL serverless](https://docs.microsoft.com/azure/synapse-analytics/sql/on-demand-workspace-overview) |
 | :--- | :--- | :--- | :--- |
 | **Run-time support** | Support for read or write by the Azure Synapse run-time | | [Contact us](mailto:[email protected]?subject=[Enable%20Preview%20Feature]%20SQL%20serverless%20for%20Cosmos%20DB) |
 | **Cosmos DB API support** | API support as a Synapse Link | SQL / Mongo DB | SQL / Mongo DB |
 | **Object** | Objects such as a table that can be created, pointing directly to an Azure Cosmos DB container | View, Table | View |
 | **Read** | Read data from an Azure Cosmos DB container | OLTP / HTAP | HTAP |
 | **Write** | Write data from the run-time into an Azure Cosmos DB container | OLTP | n/a |

-Writing back into an Azure Cosmos DB container from Spark only happens through the transactional store of Azure Cosmos DB and will impact the transactional performance of Azure Cosmos DB by consuming Request Units. Data will be automatically replicated into the analytical store if the analytical store is enabled at the database level.
+* Writing data into an Azure Cosmos DB container from Spark happens through the transactional store of Azure Cosmos DB and impacts transactional performance by consuming Request Units.
+* SQL pool integration through external tables is currently not supported.

 ## Supported code-generated actions for Spark

-| Gesture | Description | OLTP only container | HTAP container |
+| Gesture | Description | OLTP | HTAP |
 | :--- | :--- | :--- | :--- |
 | **Load to DataFrame** | Load and read data into a Spark DataFrame | X | |
 | **Create Spark table** | Create a table pointing to an Azure Cosmos DB container | X | |
@@ -43,11 +50,14 @@

 ## Supported code-generated actions for SQL serverless

-| Gesture | Description | OLTP only container | HTAP container |
+| Gesture | Description | OLTP | HTAP |
 | :--- | :--- | :--- | :--- |
 | **Select top 100** | Preview the top 100 items from a container | X | |
 | **Create view** | Create a view to directly have BI access in a container through Synapse SQL | X | |

 ## Next steps

-See the [Connect to Synapse Link for Azure Cosmos DB quickstart](../quickstart-connect-synapse-link-cosmos-db.md)
+See how to [connect to Synapse Link for Azure Cosmos DB](./how-to-connect-synapse-link-cosmos-db.md)
Lines changed: 60 additions & 0 deletions (new file)
---
title: Connect to Azure Synapse Link for Cosmos DB
description: How to connect an Azure Cosmos DB database to a Synapse workspace with Azure Synapse Link
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 04/21/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Connect to Azure Synapse Link for Azure Cosmos DB

This article describes how to access an Azure Cosmos DB database from Azure Synapse Analytics Studio with Azure Synapse Link.

## Prerequisites

Before you connect an Azure Cosmos DB account to your workspace, you need:

* An existing Azure Cosmos DB account, or create a new account following this [quickstart](https://docs.microsoft.com/azure/cosmos-db/how-to-manage-database-account)
* An existing Synapse workspace, or create a new workspace following this [quickstart](https://docs.microsoft.com/azure/synapse-analytics/quickstart-create-workspace)

## Enable Azure Cosmos DB analytical store

To run large-scale analytics on Azure Cosmos DB without impacting your operational performance, we recommend enabling Synapse Link for Azure Cosmos DB. Synapse Link brings HTAP capability to a container and built-in support in Azure Synapse.
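As a rough illustration (not part of the original article), a container with the analytical store enabled can be created programmatically. This sketch assumes the `azure-cosmos` Python SDK's `analytical_storage_ttl` parameter (available in recent SDK versions); the account URL, key, and names are placeholders.

```python
# Sketch: create a container with the analytical store enabled.
# Account URL, key, database, and container names are placeholders.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://<your-account>.documents.azure.com:443/",
    credential="<your-key>",
)
database = client.create_database_if_not_exists("SampleDB")

# analytical_storage_ttl=-1 keeps analytical data without expiry;
# omit the parameter to create an OLTP-only container instead.
container = database.create_container_if_not_exists(
    id="SampleHtapContainer",
    partition_key=PartitionKey(path="/id"),
    analytical_storage_ttl=-1,
)
```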
## Connect an Azure Cosmos DB database to a Synapse workspace

You connect an Azure Cosmos DB database as a linked service. A Cosmos DB linked service enables users to browse and explore data, and to read and write from Synapse Spark or SQL into Azure Cosmos DB.

From the Data Object Explorer, you can connect an Azure Cosmos DB database directly by doing the following steps:

1. Select the ***+*** icon near Data
2. Select **Connect to external data**
3. Select the API that you want to connect to: SQL or MongoDB
4. Select ***Continue***
5. Name the linked service. The name will be displayed in the Object Explorer and used by Synapse run-times to connect to the database and containers. We recommend using a friendly name.
6. Select the **Cosmos DB account name** and **database name**
7. (Optional) If no region is specified, Synapse run-time operations will be routed toward the nearest region where the analytical store is enabled. However, you can manually set which region your users access the Cosmos DB analytical store from. Select **Additional connection properties**, and then **New**. Under **Property Name**, write ***PreferredRegions*** and set the **Value** to the region you want (for example, WestUS2; there is no space between the words and the number)
8. Select ***Create***
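The PreferredRegions value in step 7 uses region names with the spaces removed (WestUS2), and multiple regions are comma-separated with no spaces, matching the `spark.cosmos.preferredRegions` option shown in the code comments elsewhere in this commit. As a small illustration (the helper function below is ours, not part of any SDK):

```python
def preferred_regions_value(regions):
    """Build a PreferredRegions property value: region display names with
    spaces removed, joined by commas with no spaces, e.g. 'WestUS2,EastUS2'."""
    return ",".join(region.replace(" ", "") for region in regions)

print(preferred_regions_value(["West US 2", "East US 2"]))  # WestUS2,EastUS2
```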
Azure Cosmos DB databases are visible under the **Linked** tab in the Azure Cosmos DB section. You can differentiate an HTAP-enabled container from an OLTP-only container through the following icons:

**OLTP-only container**:

![OLTP container](../media/quickstart-connect-synapse-link-cosmosdb/oltp-container.png)

**HTAP-enabled container**:

![HTAP container](../media/quickstart-connect-synapse-link-cosmosdb/htap-container.png)

## Quickly interact with code-generated actions

By right-clicking a container, you get a list of gestures that will trigger a Spark or SQL run-time. Writing into a container happens through the transactional store of Azure Cosmos DB and consumes Request Units.
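For example, a write gesture generates Spark code along these lines (a sketch based on the query article included in this same commit; `df`, the linked service name, and the container name are placeholders):

```python
# Write a Spark DataFrame into the container's transactional (OLTP) store.
# This consumes Request Units. "df", "MyCosmosLinkedService", and
# "MyContainer" are placeholder names for illustration only.
df.write.format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "MyCosmosLinkedService")\
    .option("spark.cosmos.container", "MyContainer")\
    .option("spark.cosmos.write.upsertEnabled", "true")\
    .mode("append")\
    .save()
```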
## Next steps

* [Learn what is supported between Synapse and Azure Cosmos DB](./concept-synapse-link-cosmos-db-support.md)
* [Learn how to query the analytical store with Spark](./how-to-query-analytical-store-spark.md)
Lines changed: 97 additions & 0 deletions (new file)
---
title: Query the Cosmos DB analytical store with Synapse Spark
description: How to query the Cosmos DB analytical store with Synapse Spark
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 05/06/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Query the Cosmos DB analytical store with Synapse Spark

This article gives some examples of how you can interact with the analytical store from Synapse gestures. Those gestures are visible when you right-click a container. With gestures, you can quickly generate code and tweak it to your needs. They are also perfect for discovering data with a single click.

## Load to DataFrame

In this step, you will read data from the Azure Cosmos DB analytical store into a Spark DataFrame and display 10 rows from the DataFrame called ***df***. Once your data is in the DataFrame, you can perform additional analysis. This operation does not impact the transactional store.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

df = spark.read.format("cosmos.olap")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .load()

df.show(10)
```

## Create Spark table

In this gesture, you will create a Spark table pointing to the container you selected. That operation does not incur any data movement. If you decide to delete that table, the underlying container (and the corresponding analytical store) won't be impacted. This scenario is convenient for reusing tables through third-party tools and providing accessibility to the data for the run-time.

```sql
%%sql
-- To select a preferred list of regions in a multi-region Cosmos DB account, add spark.cosmos.preferredRegions '<Region1>,<Region2>' in the config options

create table call_center using cosmos.olap options (
    spark.synapse.linkedService 'INFERRED',
    spark.cosmos.container 'INFERRED'
)
```

## Write DataFrame to container

In this gesture, you will write a DataFrame into a container. This operation will impact transactional performance and consume Request Units, since the transactional store of Azure Cosmos DB handles write transactions. Make sure that you replace **YOURDATAFRAME** with the DataFrame that you want to write back.

```python
# Write a Spark DataFrame into a Cosmos DB container
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

YOURDATAFRAME.write.format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.write.upsertEnabled", "true")\
    .mode('append')\
    .save()
```

## Load streaming DataFrame from container

In this gesture, you will use the Spark streaming capability to load data from a container into a DataFrame. The checkpoint data will be stored in the primary data lake account (and file system) that you connected to the workspace. If the folder /localReadCheckpointFolder does not exist, it will be created automatically. This operation will impact the transactional performance of Cosmos DB.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

dfStream = spark.readStream\
    .format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.changeFeed.readEnabled", "true")\
    .option("spark.cosmos.changeFeed.startFromTheBeginning", "true")\
    .option("spark.cosmos.changeFeed.checkpointLocation", "/localReadCheckpointFolder")\
    .option("spark.cosmos.changeFeed.queryName", "streamQuery")\
    .load()
```

## Write streaming DataFrame to container

In this gesture, you will write a streaming DataFrame into the Cosmos DB container you selected. If the folder /localWriteCheckpointFolder does not exist, it will be created automatically. This operation will impact the transactional performance of Cosmos DB.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

streamQuery = dfStream\
    .writeStream\
    .format("cosmos.oltp")\
    .outputMode("append")\
    .option("checkpointLocation", "/localWriteCheckpointFolder")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "trafficSourceColl_sink")\
    .option("spark.cosmos.connection.mode", "gateway")\
    .start()

streamQuery.awaitTermination()
```

articles/synapse-analytics/toc.yml

Lines changed: 7 additions & 1 deletion
@@ -80,7 +80,7 @@
     href: ../data-factory/concepts-data-flow-overview.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
   - name: Maintenance schedule
     href: ./sql-data-warehouse/maintenance-scheduling.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
-  - name: Backup and restore
+  - name: Back up and restore
     href: ./sql-data-warehouse/backup-and-restore.md?toc=/azure/synapse-analytics/toc.json&bc=/azure/synapse-analytics/breadcrumb/toc.json
   - name: Monitoring
     items:
@@ -412,6 +412,12 @@
       href: data-integration/data-integration-data-lake.md
   - name: Transform & analyze data
     items:
+    - name: Synapse Link
+      items:
+      - name: Connect to Synapse Link for Cosmos DB
+        href: ./synapse-link/how-to-connect-synapse-link-cosmos-db.md
+      - name: Query analytical store with Spark
+        href: ./synapse-link/how-to-query-analytical-store-spark.md
     - name: Synapse SQL
       items:
       - name: Query data in storage
