Commit 0f57cde

Update get-started.md
Author: Saveen Reddy
1 parent b559f15, commit 0f57cde

1 file changed: 10 additions and 14 deletions

articles/synapse-analytics/get-started.md

@@ -83,7 +83,7 @@ NOTE:
 * SQL on-demand also has its own kind of SQL on-demand databases that exist independently from any SQL on-demand pool
 * Currently a workspace always has exactly one SQL on-demand pool named **SQL on-demand**.
 
-## Load the NYC Taxi Sample data into your tge SQLDB1 database
+## Load the NYC Taxi Sample data into the SQLDB1 database
 
 * In Synapse Studio, in the top-most blue menu, click on the **?** icon.
 * Select **Getting started > Getting started hub**
@@ -111,7 +111,7 @@ NOTE:
 * This query shows how the total trip distances and average trip distance relate to the number of passengers
 * In the SQL script result window change the **View** to **Chart** to see a visualization of the results as a line chart
 
-## Load the NYC data into a Spark Database
+## Create a Spark database and load the NYC taxi data into it
 We have data available in a SQL pool DB. Now we load it into a Spark database.
 
 * In Synapse Studio, navigate to the **Develop** hub
@@ -120,22 +120,18 @@ We have data available in a SQL pool DB. Now we load it into a Spark database.
 * Click **Add code** to add a notebook code cell and paste the text below:
 ```
 %% spark
-spark.sql("CREATE DATABASE nyctaxi")
+spark.sql("CREATE DATABASE IF NOT EXISTS nyctaxi")
 val df = spark.read.sqlanalytics("SQLDB1.dbo.Trip")
-df.write.saveAsTable("nyctaxi.trip")
+df.write.mode("overwrite").saveAsTable("nyctaxi.trip")
 ```
 * Navigate to the Data hub, right-click on Databases and select **Refresh**
-* Now you should see a Spark DB called nyxtaxi and inside that DB a table called trip
-
+* Now you should see these databases:
+* SQLDB (SQL pool)
+* nyctaxi (Spark)
+
 ## Analyze the NYC Taxi data using Spark and notebooks
 * Return to your notebook
-* Create a new code cell and run this text
-```
-%%pyspark
-df = spark.sql("SELECT * FROM nyctaxi.trip")
-df.show(10)
-```
-* To show this in a nicer format run this code
+* Create a new code cell, enter the text below, and run the cell
 ```
 %%pyspark
 df = spark.sql("SELECT * FROM nyctaxi.trip")
@@ -158,7 +154,7 @@ We have data available in a SQL pool DB. Now we load it into a Spark database.
 ```
 * In the cell results, click on **Chart** to see the data visualized
 
-## Visualize data with Spark and notebooks
+## Customize data visualization with Spark and notebooks
 
 With Spark notebooks you can control exactly how charts are rendered. The following
 code shows a simple example using the popular libraries matplotlib and seaborn.
