Commit 859c178

updates

1 parent 54cef28 commit 859c178

2 files changed: 25 additions and 16 deletions

articles/synapse-analytics/get-started.md

Lines changed: 13 additions & 4 deletions
@@ -103,7 +103,9 @@ Once your Synapse workspace is created, you have two ways to open Synapse Studio
|---|---|---|
|**Apache Spark pool name**|`Spark1`
|**Node size**| `Small`|
-|**Number of nodes**| Set the minimum to 3 and the maximum to 3
+|**Number of nodes**| Set the minimum to 3 and the maximum to 3|
+|||
+
* Select **Review+create** and then select **Create**.
* Your Apache Spark pool will be ready in a few seconds.

@@ -116,6 +118,7 @@ Once your Synapse workspace is created, you have two ways to open Synapse Studio
> Spark databases are independently created from Spark pools. A workspace always has a Spark DB called **default** and you can create additional Spark databases.

## SQL on-demand pools
+
SQL on-demand is a special kind of SQL pool that is always available with a Synapse workspace. It allows you to work with SQL without having to create or think about managing a Synapse SQL pool.

> [!NOTE]
@@ -208,7 +211,8 @@ We have data available in a SQL pool database. Now we load it into a Spark datab
## Customize data visualization with Spark and notebooks

With Spark notebooks you can control exactly how to render charts. The following
-code shows a simple example using the popular libraries matplotlib and sea-born.
+code shows a simple example using the popular libraries matplotlib and seaborn. It will
+render the same chart you saw when running the SQL queries earlier.

```py
%%pyspark
@@ -223,7 +227,12 @@ seaborn.lineplot(x="PassengerCount", y="AvgTripDistance" , data = df)
matplotlib.pyplot.show()
```

-## Load data from a Spark table into a SQL Pool table
+## Load data from a Spark table into a SQL pool table
+
+Earlier we copied data from a SQL pool database into a Spark DB. Using
+Spark, we aggregated the data into the nyctaxi.passengercountstats table.
+Now run the cell below in a notebook and it will copy the aggregated table back into
+the SQL pool database.

```scala
%%spark
@@ -244,7 +253,7 @@ df.write.sqlanalytics("SQLDB1.dbo.PassengerCountStats", Constants.INTERNAL )
```

* Select **Run**
-* NOTE: THe first time you run this it will take about 10 seconds for SQL on-demand to gather SQL resources needed to run your queries. Every subsequent query will not require this time.
+* NOTE: The first time you run this it will take about 10 seconds for SQL on-demand to gather the SQL resources needed to run your queries. Subsequent queries will not require this time.

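The copy-back cell appears only in fragments here (the `%%spark` magic above and the `df.write.sqlanalytics` call in the hunk header). A minimal sketch of the complete cell, assuming the aggregated Spark table is nyctaxi.passengercountstats as stated in the added prose:

```scala
%%spark
// Read the aggregated Spark table named in the prose above.
val df = spark.sql("SELECT * FROM nyctaxi.passengercountstats")
// Copy it back into the SQL pool database as an internal (managed) table,
// matching the write call shown in the hunk header.
df.write.sqlanalytics("SQLDB1.dbo.PassengerCountStats", Constants.INTERNAL)
```

In the notebook experience the connector's `Constants` import is pre-loaded, per the connector article changed later in this commit.
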
## Use pipeline to orchestrate activities

articles/synapse-analytics/spark/synapse-spark-sql-pool-import-export.md

Lines changed: 12 additions & 12 deletions
@@ -38,14 +38,14 @@ For this reason, there is no need to create credentials or specify them in the c

To create users, connect to the database, and follow these examples:

-```Sql
+```sql
CREATE USER Mary FROM LOGIN Mary;
CREATE USER [mike@contoso.com] FROM EXTERNAL PROVIDER;
```

To assign a role:

-```Sql
+```sql
EXEC sp_addrolemember 'db_exporter', 'Mary';
```

@@ -58,28 +58,28 @@ The import statements are not required, they are pre-imported for the notebook e
> [!NOTE]
> **Imports not needed in notebook experience**

-```Scala
+```scala
import com.microsoft.spark.sqlanalytics.utils.Constants
import org.apache.spark.sql.SqlAnalyticsConnector._
```

#### Read API

-```Scala
+```scala
val df = spark.read.sqlanalytics("[DBName].[Schema].[TableName]")
```

The above API will work for both Internal (Managed) as well as External Tables in the SQL pool.

#### Write API

-```Scala
+```scala
df.write.sqlanalytics("[DBName].[Schema].[TableName]", [TableType])
```

where TableType can be Constants.INTERNAL or Constants.EXTERNAL

-```Scala
+```scala
df.write.sqlanalytics("[DBName].[Schema].[TableName]", Constants.INTERNAL)
df.write.sqlanalytics("[DBName].[Schema].[TableName]", Constants.EXTERNAL)
```
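
To illustrate how the read and write calls combine, a hypothetical notebook cell (the table names below are placeholders, not taken from the article) might look like:

```scala
%%spark
// Read an internal (managed) table from the workspace SQL pool (placeholder names).
val df = spark.read.sqlanalytics("SQLDB1.dbo.Trip")
// Write the same data back to the SQL pool as a new internal table.
df.write.sqlanalytics("SQLDB1.dbo.TripCopy", Constants.INTERNAL)
```
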
@@ -91,22 +91,22 @@ The authentication to Storage and the SQL Server is done
> [!NOTE]
> Imports not needed in notebook experience

-```Scala
+```scala
import com.microsoft.spark.sqlanalytics.utils.Constants
import org.apache.spark.sql.SqlAnalyticsConnector._
```

#### Read API

-```Scala
+```scala
val df = spark.read.
option(Constants.SERVER, "samplews.database.windows.net").
sqlanalytics("<DBName>.<Schema>.<TableName>")
```

#### Write API

-```Scala
+```scala
df.write.
option(Constants.SERVER, "[samplews].[database.windows.net]").
sqlanalytics("[DBName].[Schema].[TableName]", [TableType])
@@ -118,7 +118,7 @@ sqlanalytics("[DBName].[Schema].[TableName]", [TableType])

Currently the connector doesn't support token-based auth to a SQL pool that is outside of the workspace. You'll need to use SQL Auth.

-```Scala
+```scala
val df = spark.read.
option(Constants.SERVER, "samplews.database.windows.net").
option(Constants.USER, [SQLServer Login UserName]).
@@ -128,7 +128,7 @@ sqlanalytics("<DBName>.<Schema>.<TableName>")

#### Write API

-```Scala
+```scala
df.write.
option(Constants.SERVER, "[samplews].[database.windows.net]").
option(Constants.USER, [SQLServer Login UserName]).
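
Both SQL Auth snippets are cut off at the hunk boundary. A sketch of a complete read call, assuming the connector also accepts a Constants.PASSWORD option alongside Constants.USER (that option is not visible in this diff):

```scala
%%spark
// SQL Auth against a SQL pool outside the workspace; the server, user, and
// password values are placeholders.
val df = spark.read.
    option(Constants.SERVER, "samplews.database.windows.net").
    option(Constants.USER, "<SQLServerLoginUserName>").
    option(Constants.PASSWORD, "<SQLServerLoginPassword>").
    sqlanalytics("<DBName>.<Schema>.<TableName>")
```
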
@@ -145,7 +145,7 @@ Assume you have a dataframe "pyspark_df" that you want to write into the DW.

Create a temp table using the dataframe in PySpark:

-```python
+```py
pyspark_df.createOrReplaceTempView("pysparkdftemptable")
```
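
The step that consumes this temp table is not visible in the diff. A minimal sketch of the Scala cell that could pick up the view and write it to the SQL pool (the target table name is hypothetical):

```scala
%%spark
// Read back the temp view registered in the PySpark cell above.
val scala_df = spark.sql("SELECT * FROM pysparkdftemptable")
// Write it into the SQL pool as an internal table (hypothetical target name).
scala_df.write.sqlanalytics("SQLDB1.dbo.PySparkTable", Constants.INTERNAL)
```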
