Commit f52ab0a

New articles and minor changes
1 parent de413b1 commit f52ab0a

4 files changed

+109
-5
lines changed

articles/synapse-analytics/synapse-link/concept-synapse-link-cosmos-db-support.md

Lines changed: 1 addition & 1 deletion
@@ -50,4 +50,4 @@ Writing back into an Azure Cosmos DB container from Spark only happens through t
 
 ## Next steps
 
-See the [Connect to Synapse Link for Azure Cosmos DB quickstart](../quickstart-connect-synapse-link-cosmos-db.md)
+See how to [connect to Synapse Link for Azure Cosmos DB](./how-to-connect-synapse-link-cosmos-db.md)

articles/synapse-analytics/quickstart-connect-synapse-link-cosmos-db.md renamed to articles/synapse-analytics/synapse-link/how-to-connect-synapse-link-cosmos-db.md

Lines changed: 3 additions & 2 deletions
@@ -13,7 +13,7 @@ ms.reviewer: jrasnick
 
 # Connect to Synapse Link for Azure Cosmos DB
 
-This article describes how to access an Azure Cosmos DB database from Azure Synapse Analytics studio with Synapse Link.
+This article describes how to access an Azure Cosmos DB database from Azure Synapse Analytics Studio with Synapse Link.
 
 ## Prerequisites
 
@@ -56,4 +56,5 @@ By right-clicking into a container, you have list of gestures that will trigger
 
 ## Next steps
 
-* [Learn what is supported between Synapse and Azure Cosmos DB](./synapse-link/concept-synapse-link-cosmos-db-support.md)
+* [Learn what is supported between Synapse and Azure Cosmos DB](./concept-synapse-link-cosmos-db-support.md)
+* [Learn how to query the analytical store with Spark](./how-to-query-analytical-store-spark.md)
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
---
title: Query Cosmos DB analytical store with Synapse Spark
description: How to query the Cosmos DB analytical store with Synapse Spark
services: synapse-analytics
author: ArnoMicrosoft
ms.service: synapse-analytics
ms.topic: quickstart
ms.subservice:
ms.date: 05/06/2020
ms.author: acomet
ms.reviewer: jrasnick
---

# Query Cosmos DB analytical store with Synapse Spark

This article gives examples of how to interact with the Azure Cosmos DB analytical store by using Synapse gestures. These gestures appear when you right-click a container.

When you right-click a container, Synapse infers which linked service, database, and container it refers to. Gestures are a quick way to generate code that you can then tweak to your needs, and they are also well suited to discovering data in a single click.

## Load to DataFrame

In this step, you read from the Azure Cosmos DB analytical store into a Spark DataFrame called df and display 10 rows from it. Once your data is in a DataFrame, you can perform additional analysis. This operation does not impact the transactional store.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

df = spark.read.format("cosmos.olap")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .load()

df.show(10)
```
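
The comment at the top of the snippet describes an optional `spark.cosmos.preferredRegions` setting. As a minimal sketch, the options can be collected in a plain dictionary and the region list added only when it is needed. The helper name `cosmos_options` is hypothetical and not part of Synapse or the Cosmos DB connector; only the option keys come from the snippet above.

```python
from typing import Dict, Iterable, Optional


def cosmos_options(linked_service: str,
                   container: str,
                   preferred_regions: Optional[Iterable[str]] = None) -> Dict[str, str]:
    """Build the option map used by the gesture snippets (hypothetical helper)."""
    options = {
        "spark.synapse.linkedService": linked_service,
        "spark.cosmos.container": container,
    }
    # Drop empty entries so the joined value never contains stray commas.
    regions = [r.strip() for r in (preferred_regions or []) if r.strip()]
    if regions:
        # Comma-separated list, in the format shown in the snippet's comment.
        options["spark.cosmos.preferredRegions"] = ",".join(regions)
    return options


opts = cosmos_options("INFERRED", "INFERRED", ["<Region1>", "<Region2>"])
print(opts)

# In a Synapse notebook, where `spark` is already defined:
# df = spark.read.format("cosmos.olap").options(**opts).load()
```

Passing the dictionary through `options(**opts)` is equivalent to chaining one `.option(...)` call per key, so the same map can be reused across several reads.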

## Create Spark table

In this gesture, you create a Spark table that points to the container you selected. The operation does not incur any data movement. If you delete the table, the underlying container (and its analytical store) is not affected. This scenario is convenient for reusing tables through third-party tools and for making the data accessible at run time.

```sql
%%sql
-- To select a preferred list of regions in a multi-region Cosmos DB account, add spark.cosmos.preferredRegions '<Region1>,<Region2>' in the config options

create table call_center using cosmos.olap options (
    spark.synapse.linkedService 'INFERRED',
    spark.cosmos.container 'INFERRED'
)
```
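
If you prefer to create the table from a Python cell rather than a `%%sql` cell, one possible approach is to render the same statement as a string and pass it to `spark.sql`. The sketch below assumes that approach; `create_table_statement` is a hypothetical helper, and the option keys and table name are the ones from the statement above.

```python
import re
from typing import Dict


def create_table_statement(table_name: str, options: Dict[str, str]) -> str:
    """Render a 'create table ... using cosmos.olap' statement (hypothetical helper)."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    lines = []
    for key, value in options.items():
        if "'" in value:
            # Keep the sketch simple: refuse values that would need escaping.
            raise ValueError(f"option value must not contain a quote: {value!r}")
        lines.append(f"    {key} '{value}'")
    body = ",\n".join(lines)
    return f"create table {table_name} using cosmos.olap options (\n{body}\n)"


statement = create_table_statement(
    "call_center",
    {
        "spark.synapse.linkedService": "INFERRED",
        "spark.cosmos.container": "INFERRED",
        "spark.cosmos.preferredRegions": "<Region1>,<Region2>",
    },
)
print(statement)

# In a Synapse notebook, where `spark` is already defined:
# spark.sql(statement)
```

The helper rejects table names and option values it cannot render safely instead of trying to escape them, which keeps the generated statement easy to inspect before it is run.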

## Write DataFrame to container

In this gesture, you write a DataFrame back into a container. This operation impacts transactional performance and consumes Request Units. Going through the Azure Cosmos DB transactional store optimizes the speed and reliability of those writes. Make sure that you replace **YOURDATAFRAME** with the DataFrame that you want to write back.

```python
# Write a Spark DataFrame into a Cosmos DB container
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

YOURDATAFRAME.write.format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.write.upsertEnabled", "true")\
    .mode('append')\
    .save()
```

## Load streaming DataFrame from container

In this gesture, you use the Spark streaming capability with change feed support to load data from a container into a DataFrame, with the data being stored in the primary data lake account that you connected to the workspace. If the folder /localReadCheckpointFolder does not exist, it is created automatically. This operation impacts the transactional performance of Cosmos DB.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

dfStream = spark.readStream\
    .format("cosmos.oltp")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "INFERRED")\
    .option("spark.cosmos.changeFeed.readEnabled", "true")\
    .option("spark.cosmos.changeFeed.startFromTheBeginning", "true")\
    .option("spark.cosmos.changeFeed.checkpointLocation", "/localReadCheckpointFolder")\
    .option("spark.cosmos.changeFeed.queryName", "streamQuery")\
    .load()
```

## Write streaming DataFrame to container

In this gesture, you write a streaming DataFrame into the Cosmos DB container you selected. If the folder /localWriteCheckpointFolder does not exist, it is created automatically. This operation impacts the transactional performance of Cosmos DB.

```python
# To select a preferred list of regions in a multi-region Cosmos DB account, add .option("spark.cosmos.preferredRegions", "<Region1>,<Region2>")

streamQuery = dfStream\
    .writeStream\
    .format("cosmos.oltp")\
    .outputMode("append")\
    .option("checkpointLocation", "/localWriteCheckpointFolder")\
    .option("spark.synapse.linkedService", "INFERRED")\
    .option("spark.cosmos.container", "trafficSourceColl_sink")\
    .option("spark.cosmos.connection.mode", "gateway")\
    .start()

streamQuery.awaitTermination()
```

articles/synapse-analytics/toc.yml

Lines changed: 6 additions & 2 deletions
@@ -28,8 +28,6 @@
 href: quickstart-create-sql-pool.md
 - name: Use SQL on-demand
 href: quickstart-sql-on-demand.md
-- name: Connect to Synapse Link for Cosmos DB
-href: quickstart-connect-synapse-link-cosmos-db.md
 - name: Tutorials
 items:
 # - name: Get started
@@ -408,6 +406,12 @@
 href: data-integration/data-integration-data-lake.md
 - name: Transform & analyze data
 items:
+- name: Synapse Link
+items:
+- name: Connect to Synapse Link for Cosmos DB
+href: how-to-connect-synapse-link-cosmos-db.md
+- name: Query analytical store with Spark
+href: how-to-query-analytical-store-spark.md
 - name: Synapse SQL
 items:
 - name: Query data in storage
