@@ -304,6 +304,8 @@ Statistics are enabled by default for all columns. You can disable statistics fo
 all columns or specific columns using ``disable_statistics`` on the builder.
 There is a ``max_statistics_size`` which limits the maximum number of bytes that
 may be used for min and max values, useful for types like strings or binary blobs.
+If the page index is enabled for a column using ``enable_write_page_index``, page
+header statistics are not written because they are duplicated in the ColumnIndex.
 
 There are also Arrow-specific settings that can be configured with
 :class:`parquet::ArrowWriterProperties`:
@@ -573,20 +575,14 @@ Miscellaneous
 +--------------------------+----------+----------+---------+
 | Feature                  | Reading  | Writing  | Notes   |
 +==========================+==========+==========+=========+
-| Column Index             | ✓        |          | \(1)    |
+| Column Index             | ✓        | ✓        |         |
 +--------------------------+----------+----------+---------+
-| Offset Index             | ✓        |          | \(1)    |
+| Offset Index             | ✓        | ✓        |         |
 +--------------------------+----------+----------+---------+
-| Bloom Filter             | ✓        | ✓        | \(2)    |
+| Bloom Filter             | ✓        | ✓        | \(1)    |
 +--------------------------+----------+----------+---------+
-| CRC checksums            | ✓        | ✓        | \(3)    |
+| CRC checksums            | ✓        | ✓        |         |
 +--------------------------+----------+----------+---------+
 
-* \(1) Access to the Column and Offset Index structures is provided, but
-  data read APIs do not currently make any use of them.
-
-* \(2) APIs are provided for creating, serializing and deserializing Bloom
+* \(1) APIs are provided for creating, serializing and deserializing Bloom
   Filters, but they are not integrated into data read APIs.
-
-* \(3) For now, only the checksums of V1 Data Pages and Dictionary Pages
-  are computed.