2 files changed, +12 -7 lines changed
@@ -524,9 +524,11 @@ class PARQUET_EXPORT WriterProperties {
 
  /// Enable writing page index in general for all columns. Default disabled.
  ///
- /// Page index contains statistics for data pages and can be used to skip pages
- /// when scanning data in ordered and unordered columns. Note that it does not
- /// write statistics to the page header once page index is enabled.
+ /// Writing statistics to the page index disables the old method of writing
+ /// statistics to each data page header.
+ /// The page index makes filtering more efficient than the page header, as
+ /// it gathers all the statistics for a Parquet file in a single place,
+ /// avoiding scattered I/O.
  ///
  /// Please check the link below for more details:
  /// https://github.com/apache/parquet-format/blob/master/PageIndex.md
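
The rationale in the new comment (statistics gathered in one place, so a reader can decide which pages to fetch before touching any of them) can be illustrated with a minimal sketch. This is not the Parquet C++ API: `PageEntry` and `PagesToRead` are hypothetical names, and the index is assumed to hold only a min/max pair and a file offset per page.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for one page's entry in a page index:
// min/max statistics plus the location of the page in the file.
struct PageEntry {
  int64_t min_value;
  int64_t max_value;
  int64_t file_offset;
};

// Return the offsets of the pages whose [min, max] range may contain `target`.
// The decision uses only the in-memory index, so pages that cannot match are
// skipped without reading their headers.
std::vector<int64_t> PagesToRead(const std::vector<PageEntry>& index,
                                 int64_t target) {
  std::vector<int64_t> offsets;
  for (const PageEntry& page : index) {
    if (target >= page.min_value && target <= page.max_value) {
      offsets.push_back(page.file_offset);
    }
  }
  return offsets;
}
```

With four pages covering [0, 99], [100, 199], [150, 300] and [301, 400], a lookup for 175 selects only the second and third pages. With statistics stored in page headers instead, each header would have to be read from its own location in the file before the same decision could be made.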
@@ -575,14 +575,17 @@ Miscellaneous
 +--------------------------+----------+----------+---------+
 | Feature                  | Reading  | Writing  | Notes   |
 +==========================+==========+==========+=========+
-| Column Index             | ✓        | ✓        |         |
+| Column Index             | ✓        | ✓        | \(1)    |
 +--------------------------+----------+----------+---------+
-| Offset Index             | ✓        | ✓        |         |
+| Offset Index             | ✓        | ✓        | \(1)    |
 +--------------------------+----------+----------+---------+
-| Bloom Filter             | ✓        | ✓        | \(1)    |
+| Bloom Filter             | ✓        | ✓        | \(2)    |
 +--------------------------+----------+----------+---------+
 | CRC checksums            | ✓        | ✓        |         |
 +--------------------------+----------+----------+---------+
 
-* \(1) APIs are provided for creating, serializing and deserializing Bloom
+* \(1) Access to the Column and Offset Index structures is provided, but
+  data read APIs do not currently make any use of them.
+
+* \(2) APIs are provided for creating, serializing and deserializing Bloom
   Filters, but they are not integrated into data read APIs.