@@ -55,20 +55,20 @@ spark.read.parquet_metadata("/path/to/parquet").show()
 
 The Dataframe provides the following per-file information:
 
-| column | type | description |
-| :-----------------| :----:| :----------------------------------------------------------------------------|
-| filename | string| The Parquet file name |
-| blocks | int | Number of blocks / RowGroups in the Parquet file |
-| compressedBytes | long | Number of compressed bytes of all blocks |
-| uncompressedBytes | long | Number of uncompressed bytes of all blocks |
-| rows | long | Number of rows in the file |
-| columns | int | Number of columns in the file |
-| values | long | Number of values in the file |
-| nulls | long | Number of null values in the file |
-| createdBy | string| The createdBy string of the Parquet file, e.g. library used to write the file|
-| schema | string| The schema |
-| encryption | string| The encryption |
-| keyValues | string-to-string map| Key-value data of the file |
+| column | type | description |
+| :-----------------| :----:| :------------------------------------------------------------------------------ |
+| filename | string| The Parquet file name |
+| blocks | int | Number of blocks / RowGroups in the Parquet file |
+| compressedBytes | long | Number of compressed bytes of all blocks |
+| uncompressedBytes | long | Number of uncompressed bytes of all blocks |
+| rows | long | Number of rows in the file |
+| columns | int | Number of columns in the file |
+| values | long | Number of values in the file |
+| nulls | long | Number of null values in the file |
+| createdBy | string| The createdBy string of the Parquet file, e.g. library used to write the file |
+| schema | string| The schema |
+| encryption | string| The encryption (requires org.apache.parquet:parquet-hadoop:1.12.4 and above) |
+| keyValues | string-to-string map| Key-value data of the file |
 
 ## Parquet file schema
 
@@ -96,20 +96,20 @@ spark.read.parquet_schema("/path/to/parquet").show()
 
 The Dataframe provides the following per-file information:
 
-| column | type | description |
-| :-----------------| :----:| :-------------------------------------|
-| filename | string| The Parquet file name |
-| columnName | string| The column name |
-| columnPath | string array| The column path |
-| repetition | string| The repetition |
-| type | string| The data type |
-| length | int | The length of the type |
-| originalType | string | The original type |
-| isPrimitive | boolean| True if type is primitive |
-| primitiveType | string| The primitive type |
-| primitiveOrder | string| The order of the primitive type |
-| maxDefinitionLevel| int | The max definition level |
-| maxRepetitionLevel| int | The max repetition level |
+| column | type | description |
+| :-----------------| :------------:| :--------------------------------------------------------------------------------|
+| filename | string | The Parquet file name |
+| columnName | string | The column name |
+| columnPath | string array | The column path |
+| repetition | string | The repetition |
+| type | string | The data type |
+| length | int | The length of the type |
+| originalType | string | The original type (requires org.apache.parquet:parquet-hadoop:1.11.0 and above) |
+| isPrimitive | boolean | True if type is primitive |
+| primitiveType | string | The primitive type |
+| primitiveOrder | string | The order of the primitive type |
+| maxDefinitionLevel| int | The max definition level |
+| maxRepetitionLevel| int | The max repetition level |
 
 ## Parquet block / RowGroup metadata
 