
Commit baadb56

Merge branch 'arnaud-synapse-562020' of https://github.com/ArnoMicrosoft/azure-docs-pr into aron_cosmosdb
2 parents ec8a2ed + c9fd92a commit baadb56

File tree

10 files changed: +231 −121 lines changed


articles/synapse-analytics/cosmos-db-integration/concept-cosmos-db-support.md

Lines changed: 0 additions & 49 deletions
This file was deleted.

articles/synapse-analytics/includes/note-preview.md

Lines changed: 1 addition & 1 deletion
```diff
@@ -10,4 +10,4 @@ ms.author: jrasnick
 > This preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities.
 > For more information, see [Supplemental Terms of Use for Microsoft Azure Previews](https://azure.microsoft.com/support/legal/preview-supplemental-terms/).
 >
-> To access the preview features of Azure Synapse, request access [here](https://aka.ms/synapsepreview). Microsoft will triage all requests and respond as soon as possible.
+
```

articles/synapse-analytics/quickstart-connect-cosmos-db.md

Lines changed: 0 additions & 58 deletions
This file was deleted.

articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-overview-what-is.md

Lines changed: 5 additions & 9 deletions
```diff
@@ -15,23 +15,19 @@ ms.reviewer: igorstan
 # What is Azure Synapse Analytics (formerly SQL DW)?
 
 > [!NOTE]
-> Try the latest Azure Synapse features such as workspaces, Spark, SQL on demand, and the integrated Synapse Studio experience
-> by [requesting access to Azure Synapse (workspaces preview)](https://aka.ms/synapsepreview).
->
 >Explore the [Azure Synapse (workspaces preview) documentation](../overview-what-is.md).
+>
 
 Azure Synapse is an analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources—at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.
 
 Azure Synapse has four components:
 
 - Synapse SQL: Complete T-SQL based analytics – Generally Available
   - SQL pool (pay per DWU provisioned)
-  - SQL on-demand (pay per TB processed) – (Preview)
-- Spark: Deeply integrated Apache Spark (Preview)
-- Synapse Pipelines: Hybrid data integration (Preview)
-- Studio: Unified user experience. (Preview)
-
-
+  - SQL on-demand (pay per TB processed) (preview)
+- Spark: Deeply integrated Apache Spark (preview)
+- Synapse Pipelines: Hybrid data integration (preview)
+- Studio: Unified user experience. (preview)
 
 ## Synapse SQL pool in Azure Synapse
```

Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@

---
title: Azure Synapse Link for Cosmos DB supported features
description: Understand the current list of actions supported by Azure Synapse Link for Cosmos DB
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 04/21/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Azure Synapse Link for Azure Cosmos DB supported features
This article describes the functionality that is currently supported in Azure Synapse Link for Azure Cosmos DB.

## Azure Synapse support

There are two types of containers in Azure Cosmos DB:

* HTAP container - A container with Synapse Link enabled. This container has both a transactional store and an analytical store.
* OLTP container - A container with only a transactional store; Synapse Link is not enabled.

You can connect to an Azure Cosmos DB container without enabling Synapse Link, in which case you can only read and write to the transactional store.

Here is a list of the features currently supported within Synapse Link for Azure Cosmos DB.
| Category | Description | [Spark](https://docs.microsoft.com/azure/synapse-analytics/sql/on-demand-workspace-overview) | [SQL serverless](https://docs.microsoft.com/azure/synapse-analytics/sql/on-demand-workspace-overview) |
| :--- | :--- | :--- | :--- |
| **Run-time support** | Support for read or write by the Azure Synapse run-time | | [Contact Us](mailto:[email protected]?subject=[Enable%20Preview%20Feature]%20SQL%20serverless%20for%20Cosmos%20DB) |
| **Cosmos DB API support** | API support as a Synapse Link | SQL / MongoDB | SQL / MongoDB |
| **Object** | Objects, such as tables, that can be created pointing directly to an Azure Cosmos DB container | View, Table | View |
| **Read** | Read data from an Azure Cosmos DB container | OLTP / HTAP | HTAP |
| **Write** | Write data from the run-time into an Azure Cosmos DB container | OLTP | n/a |

* Writing data into an Azure Cosmos DB container from Spark goes through the transactional store of Azure Cosmos DB and will impact its transactional performance by consuming Request Units.
* SQL pool integration through external tables is currently not supported.
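To make the **Read** row above concrete, here is a minimal sketch of reading the two stores from Synapse Spark; the linked service name `CosmosDbLinkedService` and container name `myContainer` are hypothetical placeholders, not values from this commit.

```python
# Sketch only: hypothetical linked service and container names.
# Reading the analytical store (HTAP) does not consume Request Units.
df_analytical = spark.read.format("cosmos.olap")\
    .option("spark.synapse.linkedService", "CosmosDbLinkedService")\
    .option("spark.cosmos.container", "myContainer")\
    .load()

# Reading the transactional store (OLTP) is served by the container itself and consumes Request Units.
df_transactional = spark.read.format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "CosmosDbLinkedService")\
    .option("spark.cosmos.container", "myContainer")\
    .load()
```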
## Supported code-generated actions for Spark

| Gesture | Description | OLTP | HTAP |
| :--- | :--- | :--- | :--- |
| **Load to DataFrame** | Load and read data into a Spark DataFrame | X | |
| **Create Spark table** | Create a table pointing to an Azure Cosmos DB container | X | |
| **Write DataFrame to container** | Write data into a container | | |
| **Load streaming DataFrame from container** | Stream data using the Azure Cosmos DB change feed | | |
| **Write streaming DataFrame to container** | Stream data using the Azure Cosmos DB change feed | | |
## Supported code-generated actions for SQL serverless

| Gesture | Description | OLTP | HTAP |
| :--- | :--- | :--- | :--- |
| **Select top 100** | Preview the top 100 items from a container | X | |
| **Create view** | Create a view to get direct BI access to a container through Synapse SQL | X | |
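As a rough sketch of what the **Select top 100** and **Create view** gestures generate, SQL serverless queries the container through `OPENROWSET` with the `CosmosDB` provider; the account, database, key, and container names below are hypothetical placeholders.

```sql
-- Sketch only: account, database, key, and container names are placeholders.
-- "Select top 100" gesture: preview items from the container.
SELECT TOP 100 *
FROM OPENROWSET(
        'CosmosDB',
        'account=myCosmosAccount;database=myDatabase;key=myAccountKey',
        myContainer
     ) AS documents
GO

-- "Create view" gesture: run in a user database (not master) so BI tools can query the container.
CREATE VIEW dbo.myContainerView AS
SELECT *
FROM OPENROWSET(
        'CosmosDB',
        'account=myCosmosAccount;database=myDatabase;key=myAccountKey',
        myContainer
     ) AS documents
GO
```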
## Next steps

See how to [connect to Synapse Link for Azure Cosmos DB](./how-to-connect-synapse-link-cosmos-db.md).
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@

---
title: Connect to Azure Synapse Link for Cosmos DB
description: How to connect an Azure Cosmos DB database to a Synapse workspace with Azure Synapse Link
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 04/21/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Connect to Azure Synapse Link for Azure Cosmos DB
This article describes how to access an Azure Cosmos DB database from Azure Synapse Analytics Studio with Azure Synapse Link.

## Prerequisites

Before you connect an Azure Cosmos DB account to your workspace, you need:

* An existing Azure Cosmos DB account, or create a new account by following this [quickstart](https://docs.microsoft.com/azure/cosmos-db/how-to-manage-database-account)
* An existing Synapse workspace, or create a new workspace by following this [quickstart](https://docs.microsoft.com/azure/synapse-analytics/quickstart-create-workspace)
## Enable Azure Cosmos DB analytical store

To run large-scale analytics against Azure Cosmos DB without impacting your operational performance, we recommend enabling Synapse Link for Azure Cosmos DB. Synapse Link brings HTAP capability to a container and built-in support into Azure Synapse.
## Connect an Azure Cosmos DB database to a Synapse workspace

An Azure Cosmos DB database is connected as a linked service. A Cosmos DB linked service enables users to browse and explore data, and to read and write from Synapse Spark or SQL into Azure Cosmos DB.

From the Data Object Explorer, you can directly connect an Azure Cosmos DB database by doing the following steps:

1. Select the ***+*** icon near **Data**
2. Select **Connect to external data**
3. Select the API that you want to connect to: SQL or MongoDB
4. Select ***Continue***
5. Name the linked service. The name will be displayed in the Object Explorer and is used by Synapse run-times to connect to the database and containers, so we recommend using a friendly name (the sketch after these steps shows how the name is referenced from Spark)
6. Select the **Cosmos DB account name** and **database name**
7. (Optional) If no region is specified, Synapse run-time operations will be routed to the nearest region where the analytical store is enabled. You can, however, manually set which region your users access the Cosmos DB analytical store from. Select **Additional connection properties**, and then select **New**. Under **Property Name**, enter ***PreferredRegions*** and set the **Value** to the region you want (for example, WestUS2; there is no space between the words and the number)
8. Select ***Create***
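As an illustrative sketch, this is how the linked service name from step 5 and the optional PreferredRegions property from step 7 surface when you read the container from Synapse Spark; `MyCosmosDbLinkedService` and `MyContainer` are hypothetical names standing in for the ones you chose.

```python
# Sketch only: "MyCosmosDbLinkedService" and "MyContainer" stand in for the names you chose above.
df = spark.read.format("cosmos.olap")\
    .option("spark.synapse.linkedService", "MyCosmosDbLinkedService")\
    .option("spark.cosmos.container", "MyContainer")\
    .option("spark.cosmos.preferredRegions", "WestUS2")\
    .load()

df.show(10)
```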
Azure Cosmos DB databases are visible under the **Linked** tab, in the Azure Cosmos DB section. You can differentiate an HTAP-enabled container from an OLTP-only container through the following icons:

**OLTP only container**:

![OLTP container](../media/quickstart-connect-synapse-link-cosmosdb/oltp-container.png)

**HTAP enabled container**:

![HTAP container](../media/quickstart-connect-synapse-link-cosmosdb/htap-container.png)
## Quickly interact with code-generated actions

By right-clicking a container, you get a list of gestures that will trigger a Spark or SQL run-time. Writing into a container happens through the transactional store of Azure Cosmos DB and consumes Request Units.
## Next steps

* [Learn what is supported between Synapse and Azure Cosmos DB](./concept-synapse-link-cosmos-db-support.md)
* [Learn how to query the analytical store with Spark](./how-to-query-analytical-store-spark.md)
Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@

---
title: Query Cosmos DB analytical store with Synapse Spark
description: How to query the Cosmos DB analytical store with Synapse Spark
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 05/06/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Query Cosmos DB analytical store with Synapse Spark
This article gives some examples of how you can interact with the analytical store from Synapse gestures. These gestures are visible when you right-click a container. With gestures, you can quickly generate code and tweak it to your needs. They are also perfect for discovering data with a single click.
## Load to DataFrame

In this step, you will read data from the Azure Cosmos DB analytical store into a Spark DataFrame and display 10 rows from the DataFrame called ***df***. Once your data is in the DataFrame, you can perform additional analysis. This operation does not impact the transactional store.
```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

df = spark.read.format("cosmos.olap")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .load()

df.show(10)
```
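As a sketch of that additional analysis (the `city` property below is hypothetical; substitute a property that exists in your documents):

```python
# Sketch only: inspect the inferred schema and aggregate on a hypothetical "city" property.
df.printSchema()
df.groupBy("city").count().show()
```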
## Create Spark table

In this gesture, you will create a Spark table pointing to the container you selected. That operation does not incur any data movement. If you decide to delete that table, the underlying container (and corresponding analytical store) won't be impacted. This scenario is convenient for reusing tables through third-party tools and providing data accessibility to the run-time.
```sql
%%sql
-- To select a preferred list of regions in a multi-region Cosmos DB account, add spark.cosmos.preferredRegions '<Region1>,<Region2>' in the config options

create table call_center using cosmos.olap options (
    spark.synapse.linkedService 'INFERRED',
    spark.cosmos.container 'INFERRED'
)
```
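Once the table exists, you can query it from any notebook cell. For example, this assumed follow-up counts the items in the `call_center` table created above:

```sql
%%sql
-- Example follow-up (not generated by the gesture): query the Spark table created above.
-- Reads go through the analytical store, so they don't consume Request Units.
SELECT COUNT(*) FROM call_center
```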
## Write DataFrame to container

In this gesture, you will write a DataFrame into a container. This operation will impact transactional performance and consume Request Units, because writes go through the Azure Cosmos DB transactional store, which is optimized for write transactions. Make sure that you replace **YOURDATAFRAME** with the DataFrame that you want to write back.
```python
# Write a Spark DataFrame into a Cosmos DB container
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

YOURDATAFRAME.write.format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.write.upsertEnabled", "true")\
    .mode('append')\
    .save()
```
## Load streaming DataFrame from container

In this gesture, you will use the Spark Streaming capability to load data from a container into a DataFrame. The data will be stored in the primary data lake account (and file system) that you connected to the workspace. If the folder /localReadCheckpointFolder does not exist, it will be created automatically. This operation will impact the transactional performance of Cosmos DB.
```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

dfStream = spark.readStream\
    .format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.changeFeed.readEnabled", "true")\
    .option("spark.cosmos.changeFeed.startFromTheBeginning", "true")\
    .option("spark.cosmos.changeFeed.checkpointLocation", "/localReadCheckpointFolder")\
    .option("spark.cosmos.changeFeed.queryName", "streamQuery")\
    .load()
```
## Write streaming DataFrame to container

In this gesture, you will write a streaming DataFrame into the Cosmos DB container you selected. If the folder /localWriteCheckpointFolder does not exist, it will be created automatically. This operation will impact the transactional performance of Cosmos DB.
```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

streamQuery = dfStream\
    .writeStream\
    .format("cosmos.oltp")\
    .outputMode("append")\
    .option("checkpointLocation", "/localWriteCheckpointFolder")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "trafficSourceColl_sink")\
    .option("spark.cosmos.connection.mode", "gateway")\
    .start()

streamQuery.awaitTermination()
```
