
Commit 5659706

Merge branch 'patch-1' of https://github.com/tempacct791/azure-docs into public-prs-feb-2025-2

2 parents fcaaaf8 + df7e839

File tree: 1 file changed (+11 −11 lines)

articles/synapse-analytics/spark/apache-spark-data-visualization-tutorial.md
````diff
@@ -35,16 +35,16 @@ Create an Apache Spark Pool by following the [Create an Apache Spark pool tutori
 3. Because the raw data is in a Parquet format, you can use the Spark context to pull the file into memory as a DataFrame directly. Create a Spark DataFrame by retrieving the data via the Open Datasets API. Here, we use the Spark DataFrame *schema on read* properties to infer the datatypes and schema.
 
 ```python
-from azureml.opendatasets import NycTlcYellow
-
-from datetime import datetime
-from dateutil import parser
-
-end_date = parser.parse('2018-05-08 00:00:00')
-start_date = parser.parse('2018-05-01 00:00:00')
-
-nyc_tlc = NycTlcYellow(start_date=start_date, end_date=end_date)
-filtered_df = spark.createDataFrame(nyc_tlc.to_pandas_dataframe())
+from azureml.opendatasets import NycTlcYellow
+
+from datetime import datetime
+from dateutil import parser
+
+end_date = parser.parse('2018-05-08 00:00:00')
+start_date = parser.parse('2018-05-01 00:00:00')
+
+nyc_tlc = NycTlcYellow(start_date=start_date, end_date=end_date)
+df = spark.createDataFrame(nyc_tlc.to_pandas_dataframe())
 
 ```
 
@@ -174,4 +174,4 @@ After you finish running the application, shut down the notebook to release the
 ## Next steps
 
 - [Azure Synapse Analytics](../index.yml)
-- [Apache Spark official documentation](https://spark.apache.org/docs/latest/)
+- [Apache Spark official documentation](https://spark.apache.org/docs/latest/)
````
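For context, the edited snippet loads one week of the NYC Yellow Taxi open dataset into a Spark DataFrame; the visible content change is the rename of `filtered_df` to `df`, presumably so the variable matches how later cells in the tutorial reference it. Below is a minimal sketch of the resulting code, assuming a Synapse notebook where `spark` (a `SparkSession`) is predefined and the `azureml-opendatasets` package is installed; the trailing `printSchema`/`count` calls are illustrative additions, not part of the tutorial.

```python
# Assumes a Synapse/Spark notebook: `spark` (SparkSession) is predefined
# and the azureml-opendatasets package is installed.
from azureml.opendatasets import NycTlcYellow
from dateutil import parser

end_date = parser.parse('2018-05-08 00:00:00')
start_date = parser.parse('2018-05-01 00:00:00')

# Pull the filtered NYC Yellow Taxi data via the Open Datasets API and
# convert it to a Spark DataFrame; datatypes and schema are inferred on read.
nyc_tlc = NycTlcYellow(start_date=start_date, end_date=end_date)
df = spark.createDataFrame(nyc_tlc.to_pandas_dataframe())

# After the rename, downstream cells can refer to `df` consistently
# (illustrative checks, not in the original snippet):
df.printSchema()   # inspect the inferred schema
print(df.count())  # row count for the selected week
```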
