@@ -450,38 +450,46 @@ Logical types
450450Specific logical types can override the default Arrow type mapping for a given
451451physical type.
452452
453- +-------------------+-----------------------------+----------------------------+---------+
454- | Logical type      | Physical type               | Mapped Arrow type          | Notes   |
455- +===================+=============================+============================+=========+
456- | NULL              | Any                         | Null                       | \(1)    |
457- +-------------------+-----------------------------+----------------------------+---------+
458- | INT               | INT32                       | Int8 / UInt8 / Int16 /     |         |
459- |                   |                             | UInt16 / Int32 / UInt32    |         |
460- +-------------------+-----------------------------+----------------------------+---------+
461- | INT               | INT64                       | Int64 / UInt64             |         |
462- +-------------------+-----------------------------+----------------------------+---------+
463- | DECIMAL           | INT32 / INT64 / BYTE_ARRAY  | Decimal128 / Decimal256    | \(2)    |
464- |                   | / FIXED_LENGTH_BYTE_ARRAY   |                            |         |
465- +-------------------+-----------------------------+----------------------------+---------+
466- | DATE              | INT32                       | Date32                     | \(3)    |
467- +-------------------+-----------------------------+----------------------------+---------+
468- | TIME              | INT32                       | Time32 (milliseconds)      |         |
469- +-------------------+-----------------------------+----------------------------+---------+
470- | TIME              | INT64                       | Time64 (micro- or          |         |
471- |                   |                             | nanoseconds)               |         |
472- +-------------------+-----------------------------+----------------------------+---------+
473- | TIMESTAMP         | INT64                       | Timestamp (milli-, micro-  |         |
474- |                   |                             | or nanoseconds)            |         |
475- +-------------------+-----------------------------+----------------------------+---------+
476- | STRING            | BYTE_ARRAY                  | String / LargeString /     |         |
477- |                   |                             | StringView                 |         |
478- +-------------------+-----------------------------+----------------------------+---------+
479- | LIST              | Any                         | List                       | \(4)    |
480- +-------------------+-----------------------------+----------------------------+---------+
481- | MAP               | Any                         | Map                        | \(5)    |
482- +-------------------+-----------------------------+----------------------------+---------+
483- | FLOAT16           | FIXED_LENGTH_BYTE_ARRAY     | HalfFloat                  |         |
484- +-------------------+-----------------------------+----------------------------+---------+
453+ +-------------------+-----------------------------+------------------------------+-----------+
454+ | Logical type      | Physical type               | Mapped Arrow type            | Notes     |
455+ +===================+=============================+==============================+===========+
456+ | NULL              | Any                         | Null                         | \(1)      |
457+ +-------------------+-----------------------------+------------------------------+-----------+
458+ | INT               | INT32                       | Int8 / UInt8 / Int16 /       |           |
459+ |                   |                             | UInt16 / Int32 / UInt32      |           |
460+ +-------------------+-----------------------------+------------------------------+-----------+
461+ | INT               | INT64                       | Int64 / UInt64               |           |
462+ +-------------------+-----------------------------+------------------------------+-----------+
463+ | DECIMAL           | INT32 / INT64 / BYTE_ARRAY  | Decimal128 / Decimal256      | \(2)      |
464+ |                   | / FIXED_LENGTH_BYTE_ARRAY   |                              |           |
465+ +-------------------+-----------------------------+------------------------------+-----------+
466+ | DATE              | INT32                       | Date32                       | \(3)      |
467+ +-------------------+-----------------------------+------------------------------+-----------+
468+ | TIME              | INT32                       | Time32 (milliseconds)        |           |
469+ +-------------------+-----------------------------+------------------------------+-----------+
470+ | TIME              | INT64                       | Time64 (micro- or            |           |
471+ |                   |                             | nanoseconds)                 |           |
472+ +-------------------+-----------------------------+------------------------------+-----------+
473+ | TIMESTAMP         | INT64                       | Timestamp (milli-, micro-    |           |
474+ |                   |                             | or nanoseconds)              |           |
475+ +-------------------+-----------------------------+------------------------------+-----------+
476+ | STRING            | BYTE_ARRAY                  | String / LargeString /       |           |
477+ |                   |                             | StringView                   |           |
478+ +-------------------+-----------------------------+------------------------------+-----------+
479+ | LIST              | Any                         | List                         | \(4)      |
480+ +-------------------+-----------------------------+------------------------------+-----------+
481+ | MAP               | Any                         | Map                          | \(5)      |
482+ +-------------------+-----------------------------+------------------------------+-----------+
483+ | FLOAT16           | FIXED_LENGTH_BYTE_ARRAY     | HalfFloat                    |           |
484+ +-------------------+-----------------------------+------------------------------+-----------+
485+ | UUID              | FIXED_LENGTH_BYTE_ARRAY     | Extension (``arrow.uuid``)   | \(6)      |
486+ +-------------------+-----------------------------+------------------------------+-----------+
487+ | JSON              | BYTE_ARRAY                  | Extension (``arrow.json``)   | \(6)      |
488+ +-------------------+-----------------------------+------------------------------+-----------+
489+ | GEOMETRY          | BYTE_ARRAY                  | Extension (``geoarrow.wkb``) | \(6) \(7) |
490+ +-------------------+-----------------------------+------------------------------+-----------+
491+ | GEOGRAPHY         | BYTE_ARRAY                  | Extension (``geoarrow.wkb``) | \(6) \(7) |
492+ +-------------------+-----------------------------+------------------------------+-----------+
485493
486494* \(1) On the write side, the Parquet physical type INT32 is generated.
487495
@@ -496,9 +504,14 @@ physical type.
496504 in contradiction with the
497505 `Parquet specification <https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#maps>`__.
498506
499- *Unsupported logical types:* JSON, BSON, UUID. If such a type is encountered
507+ * \(6) Requires that ``arrow_extensions_enabled`` in ``ArrowReaderProperties`` is ``true``.
508+   When ``false``, the underlying storage type is read.
509+
510+ * \(7) Requires that the ``geoarrow.wkb`` extension type is registered.
511+
512+ *Unsupported logical types:* BSON. If such a type is encountered
500513when reading a Parquet file, the default physical type mapping is used (for
501- example, a Parquet JSON column may be read as Arrow Binary or FixedSizeBinary).
514+ example, a Parquet BSON column may be read as Arrow Binary or FixedSizeBinary).
502515
503516Converted types
504517~~~~~~~~~~~~~~~
@@ -513,7 +526,10 @@ Special cases
513526
514527An Arrow Extension type is written out as its storage type. It can still
515528be recreated at read time using Parquet metadata (see "Roundtripping Arrow
516- types" below).
529+ types" below). Some extension types have Parquet LogicalType equivalents
530+ (e.g., UUID, JSON, GEOMETRY, GEOGRAPHY). These are created automatically
531+ when ``arrow_extensions_enabled`` is set in ``ArrowReaderProperties``, even if
532+ no Arrow schema was stored in the Parquet metadata.
517533
518534An Arrow Dictionary type is written out as its value type. It can still
519535be recreated at read time using Parquet metadata (see "Roundtripping Arrow
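The read-side rules introduced by notes (6) and (7) in this change can be summarized as a small decision function. The sketch below is a standalone model and not part of the Arrow or Parquet API: ``LogicalType``, ``ReaderOptions``, and ``MappedExtension`` are hypothetical names chosen for illustration. It also assumes that an unregistered ``geoarrow.wkb`` type falls back to the storage type, which the documentation text does not state explicitly.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of notes (6) and (7); none of these names are Arrow APIs.
enum class LogicalType { kUuid, kJson, kGeometry, kGeography, kBson };

struct ReaderOptions {
  // Stands in for arrow_extensions_enabled in ArrowReaderProperties.
  bool arrow_extensions_enabled;
  // Stands in for "the geoarrow.wkb extension type is registered".
  bool geoarrow_wkb_registered;
};

// Returns the Arrow extension name a column would be read as, or an empty
// string when the underlying storage type is read instead.
std::string MappedExtension(LogicalType type, const ReaderOptions& opts) {
  // Note (6): without the reader option, only the storage type is produced.
  if (!opts.arrow_extensions_enabled) return "";
  switch (type) {
    case LogicalType::kUuid:
      return "arrow.uuid";
    case LogicalType::kJson:
      return "arrow.json";
    case LogicalType::kGeometry:
    case LogicalType::kGeography:
      // Note (7): both map to geoarrow.wkb, which must be registered.
      // Assumption: fall back to the storage type when it is not.
      return opts.geoarrow_wkb_registered ? "geoarrow.wkb" : "";
    case LogicalType::kBson:
      // BSON remains unsupported and uses the default physical mapping.
      return "";
  }
  return "";
}

// Small usage example exercising the three interesting cases.
inline void DemoMapping() {
  assert(MappedExtension(LogicalType::kJson, ReaderOptions{false, false}).empty());
  assert(MappedExtension(LogicalType::kJson, ReaderOptions{true, false}) == "arrow.json");
  assert(MappedExtension(LogicalType::kGeometry, ReaderOptions{true, true}) == "geoarrow.wkb");
}
```

In real code the first flag corresponds to the ``arrow_extensions_enabled`` setting on ``ArrowReaderProperties``; the model only makes the precedence of the two conditions explicit.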