Commit 40c6d8e

fixed pyspark bugs
1 parent ed8b677 commit 40c6d8e

1 file changed: +5 -5 lines changed

articles/synapse-analytics/get-started.md

Lines changed: 5 additions & 5 deletions
@@ -296,7 +296,7 @@ This will create a temporary view called 'trip_df'.
 SELECT
 *
 FROM
-Trip
+trip_df
 ```
 
 Now you will have the same output as above except the SQL language was used.
@@ -344,14 +344,14 @@ To get a chart like this;
 If you prefer not to use SQL then the same can be achieved with the following PySpark code
 
 ```python
-%%PySpark
+%%pyspark
 from pyspark.sql import functions as F
 
-prepped_df = trip_df.select('TripDistanceMiles', 'PassengerCount')\
+prepped_df = data_path.select('TripDistanceMiles', 'PassengerCount')\
 .filter((F.col("TripDistanceMiles") > 0) & (F.col("PassengerCount") > 0))\
-.groupBy(trip_df.PassengerCount)\
+.groupBy(data_path.PassengerCount)\
 .agg(F.sum(F.col("TripDistanceMiles")).alias("SumTripDistance"),F.avg(F.col("TripDistanceMiles")).alias("AvgTripDistance"))\
-.orderBy(trip_df.PassengerCount)
+.orderBy(data_path.PassengerCount)
 display(prepped_df)
 
 ```
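
For readers checking what the changed PySpark chain computes, here is a minimal plain-Python sketch of the same filter, group, aggregate, and sort steps. It is an illustration only, not the tutorial's code: the `rows` sample data and the `summarize` helper are hypothetical, and only the column names (`TripDistanceMiles`, `PassengerCount`, `SumTripDistance`, `AvgTripDistance`) come from the diff.

```python
from collections import defaultdict

# Hypothetical sample rows (assumption); the real rows come from the
# tutorial's DataFrame, which is not part of this commit.
rows = [
    {"TripDistanceMiles": 2.0, "PassengerCount": 1},
    {"TripDistanceMiles": 4.0, "PassengerCount": 1},
    {"TripDistanceMiles": 3.0, "PassengerCount": 2},
    {"TripDistanceMiles": 0.0, "PassengerCount": 2},  # dropped: distance is not > 0
    {"TripDistanceMiles": 5.0, "PassengerCount": 0},  # dropped: passenger count is not > 0
]


def summarize(rows):
    """Mirror the chain in the diff: filter, group by PassengerCount, sum and average."""
    distances_by_count = defaultdict(list)
    for row in rows:
        # Same predicate as the .filter(...) call in the diff.
        if row["TripDistanceMiles"] > 0 and row["PassengerCount"] > 0:
            distances_by_count[row["PassengerCount"]].append(row["TripDistanceMiles"])
    # sorted(...) plays the role of .orderBy(PassengerCount).
    return [
        {
            "PassengerCount": count,
            "SumTripDistance": sum(distances),
            "AvgTripDistance": sum(distances) / len(distances),
        }
        for count, distances in sorted(distances_by_count.items())
    ]


summary = summarize(rows)
for line in summary:
    print(line)
```

The sketch yields one row per passenger count, sorted ascending, which is the shape `display(prepped_df)` would be expected to render.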

0 commit comments
