* SQL on-demand also has its own databases that exist independently of any SQL on-demand pool
* Currently a workspace always has exactly one SQL on-demand pool, named **SQL on-demand**.
- ## Load the NYC Taxi Sample data into your tge SQLDB1 database
+ ## Load the NYC Taxi Sample data into the SQLDB1 database
* In Synapse Studio, in the top-most blue menu, click the **?** icon.
* Select **Getting started > Getting started hub**
@@ -111,7 +111,7 @@ NOTE:
* This query shows how total trip distance and average trip distance relate to the number of passengers
* In the SQL script result window change the **View** to **Chart** to see a visualization of the results as a line chart
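The query itself sits above this excerpt and is not visible in the diff. As a sketch of a query along these lines, with the column names (`PassengerCount`, `TripDistanceMiles`) assumed from the NYC Taxi sample schema:

```sql
-- Hypothetical reconstruction: relate trip distance to passenger count
SELECT PassengerCount,
       SUM(TripDistanceMiles) AS SumTripDistance,
       AVG(TripDistanceMiles) AS AvgTripDistance
FROM dbo.Trip
WHERE TripDistanceMiles > 0 AND PassengerCount > 0
GROUP BY PassengerCount
ORDER BY PassengerCount;
```

Filtering out zero-distance and zero-passenger rows keeps the averages from being skewed by bad records.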
- ## Load the NYC data into a Spark Database
+ ## Create a Spark database and load the NYC taxi data into it
We now have data available in a SQL pool database. Next, we load it into a Spark database.
* In Synapse Studio, navigate to the **Develop hub**
@@ -120,22 +120,18 @@ We have data available in a SQL pool DB. Now we load it into a Spark database.
* Click **Add code** to add a notebook code cell and paste the text below:
```
%%spark
- spark.sql("CREATE DATABASE nyctaxi")
+ spark.sql("CREATE DATABASE IF NOT EXISTS nyctaxi")
val df = spark.read.sqlanalytics("SQLDB1.dbo.Trip")
- df.write.saveAsTable("nyctaxi.trip")
+ df.write.mode("overwrite").saveAsTable("nyctaxi.trip")
```
* Navigate to the Data hub, right-click on Databases and select **Refresh**
- * Now you should see a Spark DB called nyxtaxi and inside that DB a table called trip
-
+ * Now you should see these databases:
+ * SQLDB1 (SQL pool)
+ * nyctaxi (Spark)
+
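If the database does not appear after the refresh, a notebook cell can confirm the table was created. This sanity-check cell is not part of the original tutorial:

```
%%pyspark
spark.sql("SHOW TABLES IN nyctaxi").show()
```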
## Analyze the NYC Taxi data using Spark and notebooks
* Return to your notebook
- * Create a new code cell and run this text
- ```
- %%pyspark
- df = spark.sql("SELECT * FROM nyctaxi.trip")
- df.show(10)
- ```
- * To show this in a nicer format run this code
+ * Create a new code cell, enter the text below, and run the cell
```
%%pyspark
df = spark.sql("SELECT * FROM nyctaxi.trip")
@@ -158,7 +154,7 @@ We have data available in a SQL pool DB. Now we load it into a Spark database.
```
* In the cell results, click on **Chart** to see the data visualized
- ## Visualize data with Spark and notebooks
+ ## Customize data visualization with Spark and notebooks
With Spark notebooks you can control exactly how to render charts. The following
code shows a simple example using the popular libraries matplotlib and seaborn.
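The example code itself is cut off at the end of this diff. As a minimal sketch of what such a cell might look like, assuming the trip table's `PassengerCount` and `TripDistanceMiles` columns: in a Synapse notebook the frame would come from `spark.sql(...).toPandas()`, so a small inline frame with made-up values stands in here to keep the snippet self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # headless rendering; unnecessary inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# In a notebook you would pull an aggregate out of Spark, e.g.:
#   df = spark.sql("""SELECT PassengerCount,
#                            AVG(TripDistanceMiles) AS AvgTripDistance
#                     FROM nyctaxi.trip
#                     GROUP BY PassengerCount""").toPandas()
# Stand-in frame with made-up values so the sketch runs anywhere:
df = pd.DataFrame({
    "PassengerCount": [1, 2, 3, 4, 5, 6],
    "AvgTripDistance": [2.9, 3.1, 3.0, 3.3, 3.0, 2.9],
})

# Render a seaborn bar chart onto an explicit matplotlib axes
fig, ax = plt.subplots(figsize=(6, 4))
sns.barplot(x="PassengerCount", y="AvgTripDistance", data=df, ax=ax)
ax.set_title("Average trip distance by passenger count")
fig.savefig("trip_distance.png")
```

Because the chart is drawn on an ordinary matplotlib axes, every detail (title, labels, colors, figure size) is under your control rather than the built-in chart view's.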