Your first step is to **create a database** with a data source that references the [NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) storage account. Then initialize the objects by executing the [setup script](https://github.com/Azure-Samples/Synapse/blob/master/SQL/Samples/LdwSample/SampleDB.sql) on that database. This setup script creates the data sources, database scoped credentials, and external file formats that are used in these samples.
## Dataset
-You can query Parquet files the same way you read CSV files. The only difference is that the FILEFORMAT parameter should be set to PARQUET. Examples in this article show the specifics of reading Parquet files.
-
-> [!NOTE]
-> You do not have to specify columns in the OPENROWSET WITH clause when reading parquet files. SQL on-demand will utilize metadata in the Parquet file and bind columns by name.
-
-You'll use the folder *parquet/taxi* for the sample queries. It contains NYC Taxi - Yellow Taxi Trip Records data from July 2016. to June 2018.
-
-Data is partitioned by year and month and the folder structure is as follows:
-
-- year=2016
-  - month=6
-  - ...
-  - month=12
-- year=2017
-  - month=1
-  - ...
-  - month=12
-- year=2018
-  - month=1
-  - ...
-  - month=6
+[NYC Yellow Taxi](https://azure.microsoft.com/services/open-datasets/catalog/nyc-taxi-limousine-commission-yellow-taxi-trip-records/) dataset is used in this sample. You can query Parquet files the same way you [read CSV files](query-parquet-files.md). The only difference is that the `FILEFORMAT` parameter should be set to `PARQUET`. Examples in this article show the specifics of reading Parquet files.
## Query set of parquet files
You can specify only the columns of interest when you query Parquet files.
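As a minimal sketch of selecting only the columns of interest, the query below lists two columns in the `WITH` clause, so only those columns are exposed to the rest of the query. The data source name `YellowTaxi`, the relative folder pattern, and the column names `tpepPickupDateTime` and `passengerCount` are assumptions made for illustration; substitute the names created by the setup script and present in your files. Note that the `OPENROWSET` argument itself is spelled `FORMAT`.

```sql
-- Sketch: read only two columns from a set of Parquet files.
-- Assumed for illustration (not confirmed by this article):
--   data source 'YellowTaxi', the folder pattern, and both column names.
SELECT
    YEAR(nyc.tpepPickupDateTime) AS pickup_year,
    nyc.passengerCount,
    COUNT(*) AS cnt
FROM
    OPENROWSET(
        BULK 'puYear=2018/puMonth=*/*.parquet',  -- wildcard over month folders
        DATA_SOURCE = 'YellowTaxi',
        FORMAT = 'PARQUET'
    )
    WITH (
        -- Only the columns listed here are returned by OPENROWSET.
        tpepPickupDateTime DATETIME2,
        passengerCount INT
    ) AS nyc
GROUP BY
    YEAR(nyc.tpepPickupDateTime),
    nyc.passengerCount
ORDER BY
    pickup_year,
    nyc.passengerCount;
```

Columns in the `WITH` clause are matched to Parquet columns by name, so the names must agree with those stored in the files.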
@@ -81,13 +59,13 @@ The sample below shows the automatic schema inference capabilities for Parquet f
> You don't have to specify columns in the OPENROWSET WITH clause when reading Parquet files. In that case, SQL on-demand Query service will utilize metadata in the Parquet file and bind columns by name.
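The note above can be illustrated with a query that omits the `WITH` clause entirely. This is a sketch under the same assumptions as any other sample here: the data source name `YellowTaxi` and the folder pattern are hypothetical placeholders, not names confirmed by this article.

```sql
-- Sketch: no WITH clause, so column names and types are taken
-- from the metadata stored in the Parquet files.
-- Assumed for illustration: data source 'YellowTaxi' and the folder pattern.
SELECT TOP 10 *
FROM
    OPENROWSET(
        BULK 'puYear=2018/puMonth=*/*.parquet',
        DATA_SOURCE = 'YellowTaxi',
        FORMAT = 'PARQUET'
    ) AS nyc;
```

Leaving out the `WITH` clause is convenient for exploration; listing columns explicitly gives you control over the returned types.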