diff --git a/parquet/THRIFT.md b/parquet/THRIFT.md
new file mode 100644
index 000000000000..06e97709cce3
--- /dev/null
+++ b/parquet/THRIFT.md
@@ -0,0 +1,447 @@

# Thrift serialization in the parquet crate

For both performance and flexibility reasons, this crate uses custom Thrift parsers and
serialization mechanisms. For many of the objects defined by the Parquet specification, macros
are used to generate the objects as well as the code to serialize them. But in certain instances
(performance bottlenecks, additions to the spec, etc.), it becomes necessary to implement the
serialization code manually. This document covers both the standard usage of the Thrift macros
and how to implement custom encoders and decoders.

## Thrift macros

The Parquet specification utilizes Thrift enums, unions, and structs, defined by an Interface
Description Language (IDL). This IDL is usually parsed by a Thrift code generator to produce
language-specific structures and serialization/deserialization code. This crate, however, uses
Rust macros to perform the same function. In addition to avoiding the creation of duplicate
structures, doing so allows for customizations that produce more performant code, as well as the
ability to pick and choose which fields to process.

### Enums

Thrift enums are the simplest structure, and are logically identical to Rust enums with unit
variants. The IDL description will look like

```
enum Type {
  BOOLEAN = 0;
  INT32 = 1;
  INT64 = 2;
  INT96 = 3;
  FLOAT = 4;
  DOUBLE = 5;
  BYTE_ARRAY = 6;
  FIXED_LEN_BYTE_ARRAY = 7;
}
```

The `thrift_enum` macro can be used in this instance.

```rust
thrift_enum!(
enum Type {
  BOOLEAN = 0;
  INT32 = 1;
  INT64 = 2;
  INT96 = 3;
  FLOAT = 4;
  DOUBLE = 5;
  BYTE_ARRAY = 6;
  FIXED_LEN_BYTE_ARRAY = 7;
}
);
```

which will produce a public Rust enum

```rust
pub enum Type {
    BOOLEAN,
    INT32,
    INT64,
    INT96,
    FLOAT,
    DOUBLE,
    BYTE_ARRAY,
    FIXED_LEN_BYTE_ARRAY,
}
```

### Unions

Thrift unions are a special kind of struct in which only a single field is populated. In this
regard they are much like Rust enums, which can have a mix of unit and tuple variants. Because of
this flexibility, specifying unions is a little trickier.

Oftentimes a union will be defined for which all of the variants are typed with empty structs.
For example, the `TimeUnit` union used by `LogicalType`s:

```
struct MilliSeconds {}
struct MicroSeconds {}
struct NanoSeconds {}
union TimeUnit {
  1: MilliSeconds MILLIS
  2: MicroSeconds MICROS
  3: NanoSeconds NANOS
}
```

When serialized, these empty structs become a single `0` byte (which marks the end of the
struct). As an optimization, and to allow for a simpler interface, the `thrift_union_all_empty`
macro can be used.

```rust
thrift_union_all_empty!(
union TimeUnit {
  1: MilliSeconds MILLIS
  2: MicroSeconds MICROS
  3: NanoSeconds NANOS
}
);
```

This macro will ignore the types specified for each variant, and will produce the following Rust
`enum`:

```rust
pub enum TimeUnit {
    MILLIS,
    MICROS,
    NANOS,
}
```
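Because the generated type is a plain Rust enum, call sites can simply match on it. As a small
illustration (this helper is hypothetical, not part of the crate):

```rust
/// Hypothetical helper demonstrating use of the generated `TimeUnit` enum.
fn time_unit_suffix(unit: &TimeUnit) -> &'static str {
    match unit {
        TimeUnit::MILLIS => "ms",
        TimeUnit::MICROS => "us",
        TimeUnit::NANOS => "ns",
    }
}
```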
For unions with mixed variant types, some modifications to the IDL are necessary. Take the
definition of `ColumnCryptoMetaData`.

```
struct EncryptionWithFooterKey {
}

struct EncryptionWithColumnKey {
  /** Column path in schema **/
  1: required list<string> path_in_schema

  /** Retrieval metadata of column encryption key **/
  2: optional binary key_metadata
}

union ColumnCryptoMetaData {
  1: EncryptionWithFooterKey ENCRYPTION_WITH_FOOTER_KEY
  2: EncryptionWithColumnKey ENCRYPTION_WITH_COLUMN_KEY
}
```

The `ENCRYPTION_WITH_FOOTER_KEY` variant is typed with an empty struct, while
`ENCRYPTION_WITH_COLUMN_KEY` has the type of a struct with fields. In this case, the `thrift_union`
macro is used.

```rust
thrift_union!(
union ColumnCryptoMetaData {
  1: ENCRYPTION_WITH_FOOTER_KEY
  2: (EncryptionWithColumnKey) ENCRYPTION_WITH_COLUMN_KEY
}
);
```

Here, the type has been omitted for `ENCRYPTION_WITH_FOOTER_KEY` to indicate it should be a unit
variant, while the type for `ENCRYPTION_WITH_COLUMN_KEY` is enclosed in parens. The parens are
necessary to provide a semantic clue to the macro that the identifier is a type. The above will
produce the Rust enum

```rust
pub enum ColumnCryptoMetaData {
    ENCRYPTION_WITH_FOOTER_KEY,
    ENCRYPTION_WITH_COLUMN_KEY(EncryptionWithColumnKey),
}
```

### Structs

The `thrift_struct` macro is used for structs. This macro is a little more flexible than the
others: it allows the visibility to be specified, and also allows lifetimes to be specified for
the defined structs as well as their fields. An example of this is the `SchemaElement` struct,
defined in this crate as

```rust
thrift_struct!(
pub(crate) struct SchemaElement<'a> {
  1: optional Type r#type;
  2: optional i32 type_length;
  3: optional Repetition repetition_type;
  4: required string<'a> name;
  5: optional i32 num_children;
  6: optional ConvertedType converted_type;
  7: optional i32 scale
  8: optional i32 precision
  9: optional i32 field_id;
  10: optional LogicalType logical_type
}
);
```

Here the `string` field `name` is given a lifetime annotation, which is then propagated to the
struct definition. Without this annotation, the resulting field would be a `String`, rather than
a string slice. The visibility of this struct (and all fields) will be `pub(crate)`. The
resulting Rust struct will be

```rust
pub(crate) struct SchemaElement<'a> {
    pub(crate) r#type: Type, // the name `type` becomes `r#type` to avoid the reserved word
    pub(crate) type_length: i32,
    pub(crate) repetition_type: Repetition,
    pub(crate) name: &'a str,
    ...
}
```

Lifetime annotations can also be added to list elements, as in

```rust
thrift_struct!(
struct FileMetaData<'a> {
  /** Version of this file **/
  1: required i32 version
  2: required list<'a SchemaElement> schema;
  3: required i64 num_rows
  4: required list<'a RowGroup> row_groups
  5: optional list<KeyValue> key_value_metadata
  6: optional string created_by
  7: optional list<ColumnOrder> column_orders;
  8: optional EncryptionAlgorithm encryption_algorithm
  9: optional binary footer_signing_key_metadata
}
);
```

Note that the lifetime annotation precedes the element type specification.

## Serialization traits

Serialization is performed via several Rust traits. On the deserialization side, objects implement
the `ReadThrift` trait. This defines a `read_thrift` function that takes a
`ThriftCompactInputProtocol` I/O object as an argument. The `read_thrift` function performs
all steps necessary to deserialize the object from the input stream, and is usually produced by
one of the macros mentioned above.
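For instance, deserializing a `SchemaElement` from an in-memory buffer amounts to wrapping the
bytes in an input protocol and calling `read_thrift` (a minimal sketch; `buf` is a placeholder
for a `&[u8]` holding the encoded struct):

```rust
// Decode a `SchemaElement` from Thrift-encoded bytes in `buf`.
let mut prot = ThriftSliceInputProtocol::new(buf);
let element = SchemaElement::read_thrift(&mut prot)?;
```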
On the serialization side, the `WriteThrift` and `WriteThriftField` traits are used in conjunction
with a `ThriftCompactOutputProtocol` struct. As above, the Thrift macros produce the
implementations needed to perform serialization.

While the macros can be used in most circumstances, sometimes more control is needed. The following
sections describe how to provide custom implementations of the serialization traits.

### ReadThrift Customization

Thrift enums are serialized as a single `i32` value. The process of reading an enum is
straightforward: read the enum discriminant, and then match on the possible values. For instance,
reading the `ConvertedType` enum becomes:

```rust
impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for ConvertedType {
    fn read_thrift(prot: &mut R) -> Result<Self> {
        let val = prot.read_i32()?;
        Ok(match val {
            0 => Self::UTF8,
            1 => Self::MAP,
            2 => Self::MAP_KEY_VALUE,
            ...
            21 => Self::INTERVAL,
            _ => return Err(general_err!("Unexpected ConvertedType {}", val)),
        })
    }
}
```

The default behavior is to return an error when an unexpected value is encountered. One could,
however, provide an `Unknown` variant if forward compatibility is needed in the case of an
evolving enum.

Deserializing structs is more involved, but still fairly easy. A Thrift struct is serialized as
repeated `(field_id, field_type, field)` tuples. The `field_id` and `field_type` usually occupy a
single byte, followed by the Thrift-encoded field. Because only 4 bits are available for the id,
encoders will usually encode the delta from the preceding field id instead. If the delta would
exceed 15, the `field_id` nibble is set to `0`, and the `field_id` is instead encoded as a
varint following the `field_type`. Fields are generally read in a loop, with the `field_id`
and `field_type` read first, and then the `field_id` used to determine which field to read.
When a field header with type `Stop` (a single `0` byte) is encountered, this marks the end of
the struct and processing ceases. Here is an example of the processing loop:

```rust
    let mut last_field_id = 0i16;
    loop {
        // read the field id and field type. break if we encounter `Stop`
        let field_ident = prot.read_field_begin(last_field_id)?;
        if field_ident.field_type == FieldType::Stop {
            break;
        }
        // match on the field id
        match field_ident.id {
            1 => {
                let val = i32::read_thrift(&mut *prot)?;
                num_values = Some(val);
            }
            2 => {
                let val = Encoding::read_thrift(&mut *prot)?;
                encoding = Some(val);
            }
            3 => {
                let val = Encoding::read_thrift(&mut *prot)?;
                definition_level_encoding = Some(val);
            }
            4 => {
                let val = Encoding::read_thrift(&mut *prot)?;
                repetition_level_encoding = Some(val);
            }
            // Thrift structs are meant to be forward compatible, so do not error
            // here. Instead, simply skip unknown fields.
            _ => {
                prot.skip(field_ident.field_type)?;
            }
        };
        // set the last seen field id to calculate the next field_id
        last_field_id = field_ident.id;
    }
```

Thrift unions are encoded as structs, but only a single field will be encoded. The loop above
can be eliminated, and only the `match` on the id performed. A subsequent call to
`read_field_begin` must return `Stop`, or an error should be returned.
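As a concrete illustration of the wire format (standard Thrift compact protocol behavior, not
specific to this crate), a union whose active variant is field 2 holding the `i32` value `5`
occupies just three bytes:

```rust
// Wire bytes for a union whose single populated field has id 2, type i32, value 5.
let encoded: [u8; 3] = [
    0x25, // field header: id delta 2 in the high nibble, type 5 (i32) in the low nibble
    0x0a, // the value 5, zigzag encoded (5 -> 10) and written as a varint
    0x00, // `Stop`, terminating the enclosing struct
];
```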
Here's an example from the decoding of the `LogicalType` union:

```rust
    // read the discriminant, error if it is `0`
    let field_ident = prot.read_field_begin(0)?;
    if field_ident.field_type == FieldType::Stop {
        return Err(general_err!("received empty union from remote LogicalType"));
    }
    let ret = match field_ident.id {
        1 => {
            prot.skip_empty_struct()?;
            Self::String
        }
        ...
        _ => {
            // LogicalType needs to be forward compatible, so we have defined an `_Unknown`
            // variant for it. This can return an error if forward compatibility is not desired.
            prot.skip(field_ident.field_type)?;
            Self::_Unknown {
                field_id: field_ident.id,
            }
        }
    };
    // test to ensure there is only one field present
    let field_ident = prot.read_field_begin(field_ident.id)?;
    if field_ident.field_type != FieldType::Stop {
        return Err(general_err!(
            "Received multiple fields for union from remote LogicalType"
        ));
    }
```

### WriteThrift Customization

On the serialization side, there are two traits to implement. The first, `WriteThrift`, is used
for actually serializing the object. The other, `WriteThriftField`, handles serializing objects
as struct fields.

Serializing enums is as simple as writing the discriminant as an `i32`. For example, here is the
custom serialization code for `ConvertedType`:

```rust
impl WriteThrift for ConvertedType {
    const ELEMENT_TYPE: ElementType = ElementType::I32;

    fn write_thrift<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>) -> Result<()> {
        // because we've added NONE, the variant values are off by 1, so correct that here
        writer.write_i32(*self as i32 - 1)
    }
}
```

Structs and unions are serialized field by field. When performing the serialization, one needs to
keep track of the last field that has been written, as this is needed to calculate the delta in
the Thrift field header. For required fields this is not strictly necessary, but when writing
optional fields it is. A typical `write_thrift` implementation will look like:

```rust
    fn write_thrift<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>) -> Result<()> {
        // required field f1
        self.f1.write_thrift_field(writer, 1, 0)?; // field_id == 1, last_field_id == 0
        // required field f2
        self.f2.write_thrift_field(writer, 2, 1)?; // field_id == 2, last_field_id == 1
        // final required field f3, we now save the last_field_id, which is returned by write_thrift_field
        let mut last_field_id = self.f3.write_thrift_field(writer, 3, 2)?; // field_id == 3, last_field_id == 2

        // optional field f4
        if let Some(val) = self.f4.as_ref() {
            last_field_id = val.write_thrift_field(writer, 4, last_field_id)?;
        }
        // optional field f5
        if let Some(val) = self.f5.as_ref() {
            last_field_id = val.write_thrift_field(writer, 5, last_field_id)?;
        }
        // write end of struct
        writer.write_struct_end()
    }
```

### Handling for lists

Lists of serialized objects can usually be read using `parquet_thrift::read_thrift_vec` and
written using the `WriteThrift::write_thrift` implementation for vectors of objects that
implement `WriteThrift`.
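A round trip through those helpers might look like the following sketch (the exact signature of
`read_thrift_vec` is assumed here, and `writer`/`prot` are placeholder bindings for the output
and input protocols):

```rust
// Writing: `Vec<T>` provides `WriteThrift` when `T` implements it.
let encodings = vec![Encoding::PLAIN, Encoding::RLE];
encodings.write_thrift(&mut writer)?;

// Reading: assumed to return `Vec<T>` for element types implementing `ReadThrift`.
let decoded: Vec<Encoding> = read_thrift_vec(&mut prot)?;
```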
When reading a list manually, one first reads the list header, which provides the number of
elements that have been encoded, and then reads the elements one at a time.

```rust
    // read the list header
    let list_ident = prot.read_list_begin()?;
    // allocate a vector with enough capacity
    let mut page_locations = Vec::with_capacity(list_ident.size as usize);
    // read elements
    for _ in 0..list_ident.size {
        page_locations.push(read_page_location(prot)?);
    }
```

Writing is simply the reverse: write the list header, and then serialize the elements:

```rust
    // write the list header
    writer.write_list_begin(ElementType::Struct, page_locations.len())?;
    // write the elements
    for page_location in &page_locations {
        page_location.write_thrift(writer)?;
    }
```

## More examples

For more examples, the easiest thing to do is to [expand](https://github.com/dtolnay/cargo-expand)
the thrift macros. For instance, to see the implementations generated in the `basic` module, type:

```sh
% cargo expand -p parquet --lib --all-features basic
```
diff --git a/parquet/src/arrow/arrow_reader/mod.rs b/parquet/src/arrow/arrow_reader/mod.rs
index 19ba93d66642..3fa008335cf6 100644
--- a/parquet/src/arrow/arrow_reader/mod.rs
+++ b/parquet/src/arrow/arrow_reader/mod.rs
@@ -30,6 +30,7 @@ pub use crate::arrow::array_reader::RowGroups;
 use crate::arrow::array_reader::{ArrayReader, ArrayReaderBuilder};
 use crate::arrow::schema::{parquet_to_arrow_schema_and_fields, ParquetField};
 use crate::arrow::{parquet_to_arrow_field_levels, FieldLevels, ProjectionMask};
+use crate::basic::{BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash};
 use crate::bloom_filter::{
     chunk_read_bloom_filter_header_and_offset, Sbbf, SBBF_HEADER_SIZE_ESTIMATE,
 };
@@ -39,7 +40,6 @@ use crate::encryption::decrypt::FileDecryptionProperties;
 use crate::errors::{ParquetError, Result};
 use crate::file::metadata::{PageIndexPolicy, ParquetMetaData, ParquetMetaDataReader};
 use crate::file::reader::{ChunkReader, SerializedPageReader};
-use crate::format::{BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash};
 use crate::schema::types::SchemaDescriptor;
 use crate::arrow::arrow_reader::metrics::ArrowReaderMetrics;
@@ -261,7 +261,7 @@ impl<T> ArrowReaderBuilder<T> {
     /// Skip 1100 (skip the remaining 900 rows in row group 2 and the first 200 rows in row group 3)
     /// ```
     ///
-    /// [`Index`]: crate::file::page_index::index::Index
+    /// [`Index`]: crate::file::page_index::column_index::ColumnIndexMetaData
     pub fn with_row_selection(self, selection: RowSelection) -> Self {
         Self {
             selection: Some(selection),
@@ -819,17 +819,17 @@ impl<T: ChunkReader + 'static> ParquetRecordBatchReaderBuilder<T> {
             chunk_read_bloom_filter_header_and_offset(offset, buffer.clone())?;
         match header.algorithm {
-            BloomFilterAlgorithm::BLOCK(_) => {
+            BloomFilterAlgorithm::BLOCK => {
                 // this match exists to future proof the singleton algorithm enum
             }
         }
         match header.compression {
-            BloomFilterCompression::UNCOMPRESSED(_) => {
+            BloomFilterCompression::UNCOMPRESSED => {
                 // this match exists to future proof the singleton compression enum
             }
         }
         match header.hash {
-            BloomFilterHash::XXHASH(_) => {
+            BloomFilterHash::XXHASH => {
                 // this match exists to future proof the singleton hash enum
             }
         }
@@ -1185,6 +1185,7 @@ mod tests {
         FloatType, Int32Type, Int64Type, Int96, Int96Type,
     };
     use crate::errors::Result;
+    use crate::file::metadata::ParquetMetaData;
     use crate::file::properties::{EnabledStatistics, WriterProperties, WriterVersion};
     use crate::file::writer::SerializedFileWriter;
     use crate::schema::parser::parse_message_type;
@@ -2913,7 +2914,7 @@ mod tests {
         schema: TypePtr,
         field: Option<Field>,
         opts: &TestOptions,
-    ) -> Result<crate::format::FileMetaData> {
+    ) -> Result<ParquetMetaData> {
         let mut writer_props =
opts.writer_props(); if let Some(field) = field { let arrow_schema = Schema::new(vec![field]); diff --git a/parquet/src/arrow/arrow_reader/selection.rs b/parquet/src/arrow/arrow_reader/selection.rs index 229eae4c5bb6..21ed97b8bde1 100644 --- a/parquet/src/arrow/arrow_reader/selection.rs +++ b/parquet/src/arrow/arrow_reader/selection.rs @@ -21,6 +21,8 @@ use std::cmp::Ordering; use std::collections::VecDeque; use std::ops::Range; +use crate::file::page_index::offset_index::PageLocation; + /// [`RowSelection`] is a collection of [`RowSelector`] used to skip rows when /// scanning a parquet file #[derive(Debug, Clone, Copy, Eq, PartialEq)] @@ -95,7 +97,7 @@ impl RowSelector { /// * It contains no [`RowSelector`] of 0 rows /// * Consecutive [`RowSelector`]s alternate skipping or selecting rows /// -/// [`PageIndex`]: crate::file::page_index::index::PageIndex +/// [`PageIndex`]: crate::file::page_index::column_index::ColumnIndexMetaData #[derive(Debug, Clone, Default, Eq, PartialEq)] pub struct RowSelection { selectors: Vec, @@ -162,7 +164,7 @@ impl RowSelection { /// Note: this method does not make any effort to combine consecutive ranges, nor coalesce /// ranges that are close together. This is instead delegated to the IO subsystem to optimise, /// e.g. [`ObjectStore::get_ranges`](object_store::ObjectStore::get_ranges) - pub fn scan_ranges(&self, page_locations: &[crate::format::PageLocation]) -> Vec> { + pub fn scan_ranges(&self, page_locations: &[PageLocation]) -> Vec> { let mut ranges: Vec> = vec![]; let mut row_offset = 0; @@ -693,7 +695,6 @@ fn union_row_selections(left: &[RowSelector], right: &[RowSelector]) -> RowSelec #[cfg(test)] mod tests { use super::*; - use crate::format::PageLocation; use rand::{rng, Rng}; #[test] diff --git a/parquet/src/arrow/arrow_reader/statistics.rs b/parquet/src/arrow/arrow_reader/statistics.rs index eba1f561203c..1613656ab9ae 100644 --- a/parquet/src/arrow/arrow_reader/statistics.rs +++ b/parquet/src/arrow/arrow_reader/statistics.rs @@ -25,7 +25,7 @@ use crate::basic::Type as PhysicalType; use crate::data_type::{ByteArray, FixedLenByteArray}; use crate::errors::{ParquetError, Result}; use crate::file::metadata::{ParquetColumnIndex, ParquetOffsetIndex, RowGroupMetaData}; -use crate::file::page_index::index::{Index, PageIndex}; +use crate::file::page_index::column_index::{ColumnIndexIterators, ColumnIndexMetaData}; use crate::file::statistics::Statistics as ParquetStatistics; use crate::schema::types::SchemaDescriptor; use arrow_array::builder::{ @@ -597,17 +597,17 @@ macro_rules! get_statistics { } macro_rules! make_data_page_stats_iterator { - ($iterator_type: ident, $func: expr, $index_type: path, $stat_value_type: ty) => { + ($iterator_type: ident, $func: ident, $stat_value_type: ty) => { struct $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { iter: I, } impl<'a, I> $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { fn new(iter: I) -> Self { Self { iter } @@ -616,7 +616,7 @@ macro_rules! make_data_page_stats_iterator { impl<'a, I> Iterator for $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { type Item = Vec>; @@ -624,16 +624,14 @@ macro_rules! make_data_page_stats_iterator { let next = self.iter.next(); match next { Some((len, index)) => match index { - $index_type(native_index) => { - Some(native_index.indexes.iter().map($func).collect::>()) - } // No matching `Index` found; // thus no statistics that can be extracted. 
// We return vec![None; len] to effectively // create an arrow null-array with the length // corresponding to the number of entries in // `ParquetOffsetIndex` per row group per column. - _ => Some(vec![None; len]), + ColumnIndexMetaData::NONE => Some(vec![None; len]), + _ => Some(<$stat_value_type>::$func(&index).collect::>()), }, _ => None, } @@ -646,101 +644,45 @@ macro_rules! make_data_page_stats_iterator { }; } -make_data_page_stats_iterator!( - MinBooleanDataPageStatsIterator, - |x: &PageIndex| { x.min }, - Index::BOOLEAN, - bool -); -make_data_page_stats_iterator!( - MaxBooleanDataPageStatsIterator, - |x: &PageIndex| { x.max }, - Index::BOOLEAN, - bool -); -make_data_page_stats_iterator!( - MinInt32DataPageStatsIterator, - |x: &PageIndex| { x.min }, - Index::INT32, - i32 -); -make_data_page_stats_iterator!( - MaxInt32DataPageStatsIterator, - |x: &PageIndex| { x.max }, - Index::INT32, - i32 -); -make_data_page_stats_iterator!( - MinInt64DataPageStatsIterator, - |x: &PageIndex| { x.min }, - Index::INT64, - i64 -); -make_data_page_stats_iterator!( - MaxInt64DataPageStatsIterator, - |x: &PageIndex| { x.max }, - Index::INT64, - i64 -); +make_data_page_stats_iterator!(MinBooleanDataPageStatsIterator, min_values_iter, bool); +make_data_page_stats_iterator!(MaxBooleanDataPageStatsIterator, max_values_iter, bool); +make_data_page_stats_iterator!(MinInt32DataPageStatsIterator, min_values_iter, i32); +make_data_page_stats_iterator!(MaxInt32DataPageStatsIterator, max_values_iter, i32); +make_data_page_stats_iterator!(MinInt64DataPageStatsIterator, min_values_iter, i64); +make_data_page_stats_iterator!(MaxInt64DataPageStatsIterator, max_values_iter, i64); make_data_page_stats_iterator!( MinFloat16DataPageStatsIterator, - |x: &PageIndex| { x.min.clone() }, - Index::FIXED_LEN_BYTE_ARRAY, + min_values_iter, FixedLenByteArray ); make_data_page_stats_iterator!( MaxFloat16DataPageStatsIterator, - |x: &PageIndex| { x.max.clone() }, - Index::FIXED_LEN_BYTE_ARRAY, + max_values_iter, FixedLenByteArray ); -make_data_page_stats_iterator!( - MinFloat32DataPageStatsIterator, - |x: &PageIndex| { x.min }, - Index::FLOAT, - f32 -); -make_data_page_stats_iterator!( - MaxFloat32DataPageStatsIterator, - |x: &PageIndex| { x.max }, - Index::FLOAT, - f32 -); -make_data_page_stats_iterator!( - MinFloat64DataPageStatsIterator, - |x: &PageIndex| { x.min }, - Index::DOUBLE, - f64 -); -make_data_page_stats_iterator!( - MaxFloat64DataPageStatsIterator, - |x: &PageIndex| { x.max }, - Index::DOUBLE, - f64 -); +make_data_page_stats_iterator!(MinFloat32DataPageStatsIterator, min_values_iter, f32); +make_data_page_stats_iterator!(MaxFloat32DataPageStatsIterator, max_values_iter, f32); +make_data_page_stats_iterator!(MinFloat64DataPageStatsIterator, min_values_iter, f64); +make_data_page_stats_iterator!(MaxFloat64DataPageStatsIterator, max_values_iter, f64); make_data_page_stats_iterator!( MinByteArrayDataPageStatsIterator, - |x: &PageIndex| { x.min.clone() }, - Index::BYTE_ARRAY, + min_values_iter, ByteArray ); make_data_page_stats_iterator!( MaxByteArrayDataPageStatsIterator, - |x: &PageIndex| { x.max.clone() }, - Index::BYTE_ARRAY, + max_values_iter, ByteArray ); make_data_page_stats_iterator!( MaxFixedLenByteArrayDataPageStatsIterator, - |x: &PageIndex| { x.max.clone() }, - Index::FIXED_LEN_BYTE_ARRAY, + max_values_iter, FixedLenByteArray ); make_data_page_stats_iterator!( MinFixedLenByteArrayDataPageStatsIterator, - |x: &PageIndex| { x.min.clone() }, - Index::FIXED_LEN_BYTE_ARRAY, + min_values_iter, FixedLenByteArray ); @@ 
-748,14 +690,14 @@ macro_rules! get_decimal_page_stats_iterator { ($iterator_type: ident, $func: ident, $stat_value_type: ident, $convert_func: ident) => { struct $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { iter: I, } impl<'a, I> $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { fn new(iter: I) -> Self { Self { iter } @@ -764,44 +706,37 @@ macro_rules! get_decimal_page_stats_iterator { impl<'a, I> Iterator for $iterator_type<'a, I> where - I: Iterator, + I: Iterator, { type Item = Vec>; + // Some(native_index.$func().map(|v| v.map($conv)).collect::>()) fn next(&mut self) -> Option { let next = self.iter.next(); match next { Some((len, index)) => match index { - Index::INT32(native_index) => Some( + ColumnIndexMetaData::INT32(native_index) => Some( native_index - .indexes - .iter() - .map(|x| x.$func.and_then(|x| Some($stat_value_type::from(x)))) + .$func() + .map(|x| x.map(|x| $stat_value_type::from(*x))) .collect::>(), ), - Index::INT64(native_index) => Some( + ColumnIndexMetaData::INT64(native_index) => Some( native_index - .indexes - .iter() - .map(|x| x.$func.and_then(|x| $stat_value_type::try_from(x).ok())) + .$func() + .map(|x| x.map(|x| $stat_value_type::try_from(*x).unwrap())) .collect::>(), ), - Index::BYTE_ARRAY(native_index) => Some( + ColumnIndexMetaData::BYTE_ARRAY(native_index) => Some( native_index - .indexes - .iter() - .map(|x| { - x.clone().$func.and_then(|x| Some($convert_func(x.data()))) - }) + .$func() + .map(|x| x.map(|x| $convert_func(x))) .collect::>(), ), - Index::FIXED_LEN_BYTE_ARRAY(native_index) => Some( + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(native_index) => Some( native_index - .indexes - .iter() - .map(|x| { - x.clone().$func.and_then(|x| Some($convert_func(x.data()))) - }) + .$func() + .map(|x| x.map(|x| $convert_func(x))) .collect::>(), ), _ => Some(vec![None; len]), @@ -819,56 +754,56 @@ macro_rules! 
get_decimal_page_stats_iterator { get_decimal_page_stats_iterator!( MinDecimal32DataPageStatsIterator, - min, + min_values_iter, i32, from_bytes_to_i32 ); get_decimal_page_stats_iterator!( MaxDecimal32DataPageStatsIterator, - max, + max_values_iter, i32, from_bytes_to_i32 ); get_decimal_page_stats_iterator!( MinDecimal64DataPageStatsIterator, - min, + min_values_iter, i64, from_bytes_to_i64 ); get_decimal_page_stats_iterator!( MaxDecimal64DataPageStatsIterator, - max, + max_values_iter, i64, from_bytes_to_i64 ); get_decimal_page_stats_iterator!( MinDecimal128DataPageStatsIterator, - min, + min_values_iter, i128, from_bytes_to_i128 ); get_decimal_page_stats_iterator!( MaxDecimal128DataPageStatsIterator, - max, + max_values_iter, i128, from_bytes_to_i128 ); get_decimal_page_stats_iterator!( MinDecimal256DataPageStatsIterator, - min, + min_values_iter, i256, from_bytes_to_i256 ); get_decimal_page_stats_iterator!( MaxDecimal256DataPageStatsIterator, - max, + max_values_iter, i256, from_bytes_to_i256 ); @@ -1174,77 +1109,44 @@ fn max_statistics<'a, I: Iterator>>( } /// Extracts the min statistics from an iterator -/// of parquet page [`Index`]'es to an [`ArrayRef`] +/// of parquet page [`ColumnIndexMetaData`]'s to an [`ArrayRef`] pub(crate) fn min_page_statistics<'a, I>( data_type: &DataType, iterator: I, physical_type: Option, ) -> Result where - I: Iterator, + I: Iterator, { get_data_page_statistics!(Min, data_type, iterator, physical_type) } /// Extracts the max statistics from an iterator -/// of parquet page [`Index`]'es to an [`ArrayRef`] +/// of parquet page [`ColumnIndexMetaData`]'s to an [`ArrayRef`] pub(crate) fn max_page_statistics<'a, I>( data_type: &DataType, iterator: I, physical_type: Option, ) -> Result where - I: Iterator, + I: Iterator, { get_data_page_statistics!(Max, data_type, iterator, physical_type) } /// Extracts the null count statistics from an iterator -/// of parquet page [`Index`]'es to an [`ArrayRef`] +/// of parquet page [`ColumnIndexMetaData`]'s to an [`ArrayRef`] /// /// The returned Array is an [`UInt64Array`] pub(crate) fn null_counts_page_statistics<'a, I>(iterator: I) -> Result where - I: Iterator, + I: Iterator, { let iter = iterator.flat_map(|(len, index)| match index { - Index::NONE => vec![None; len], - Index::BOOLEAN(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::INT32(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::INT64(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::FLOAT(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::DOUBLE(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::FIXED_LEN_BYTE_ARRAY(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - Index::BYTE_ARRAY(native_index) => native_index - .indexes - .iter() - .map(|x| x.null_count.map(|x| x as u64)) - .collect::>(), - _ => unimplemented!(), + ColumnIndexMetaData::NONE => vec![None; len], + column_index => column_index.null_counts().map_or(vec![None; len], |v| { + v.iter().map(|i| Some(*i as u64)).collect::>() + }), }); Ok(UInt64Array::from_iter(iter)) @@ -1573,7 +1475,7 @@ impl<'a> StatisticsConverter<'a> { /// page level statistics can prune at a finer granularity. 
/// /// However since they are stored in a separate metadata - /// structure ([`Index`]) there is different code to extract them as + /// structure ([`ColumnIndexMetaData`]) there is different code to extract them as /// compared to arrow statistics. /// /// # Parameters: diff --git a/parquet/src/arrow/arrow_writer/mod.rs b/parquet/src/arrow/arrow_writer/mod.rs index 8d641dc18999..6b4dc87abba4 100644 --- a/parquet/src/arrow/arrow_writer/mod.rs +++ b/parquet/src/arrow/arrow_writer/mod.rs @@ -23,7 +23,6 @@ use std::iter::Peekable; use std::slice::Iter; use std::sync::{Arc, Mutex}; use std::vec::IntoIter; -use thrift::protocol::TCompactOutputProtocol; use arrow_array::cast::AsArray; use arrow_array::types::*; @@ -44,12 +43,12 @@ use crate::data_type::{ByteArray, FixedLenByteArray}; #[cfg(feature = "encryption")] use crate::encryption::encrypt::FileEncryptor; use crate::errors::{ParquetError, Result}; -use crate::file::metadata::{KeyValue, RowGroupMetaData}; +use crate::file::metadata::{KeyValue, ParquetMetaData, RowGroupMetaData}; use crate::file::properties::{WriterProperties, WriterPropertiesPtr}; use crate::file::reader::{ChunkReader, Length}; use crate::file::writer::{SerializedFileWriter, SerializedRowGroupWriter}; +use crate::parquet_thrift::{ThriftCompactOutputProtocol, WriteThrift}; use crate::schema::types::{ColumnDescPtr, SchemaDescriptor}; -use crate::thrift::TSerializable; use levels::{calculate_array_levels, ArrayLevels}; mod byte_array; @@ -398,13 +397,13 @@ impl ArrowWriter { /// Unlike [`Self::close`] this does not consume self /// /// Attempting to write after calling finish will result in an error - pub fn finish(&mut self) -> Result { + pub fn finish(&mut self) -> Result { self.flush()?; self.writer.finish() } /// Close and finalize the underlying Parquet writer - pub fn close(mut self) -> Result { + pub fn close(mut self) -> Result { self.finish() } @@ -594,8 +593,8 @@ impl PageWriter for ArrowPageWriter { } } None => { - let mut protocol = TCompactOutputProtocol::new(&mut header); - page_header.write_to_out_protocol(&mut protocol)?; + let mut protocol = ThriftCompactOutputProtocol::new(&mut header); + page_header.write_thrift(&mut protocol)?; } }; @@ -755,7 +754,7 @@ impl ArrowColumnChunk { /// row_group_writer.close().unwrap(); /// /// let metadata = writer.close().unwrap(); -/// assert_eq!(metadata.num_rows, 3); +/// assert_eq!(metadata.file_metadata().num_rows(), 3); /// ``` pub struct ArrowColumnWriter { writer: ArrowColumnWriterImpl, @@ -1510,11 +1509,11 @@ mod tests { use crate::arrow::arrow_reader::{ParquetRecordBatchReader, ParquetRecordBatchReaderBuilder}; use crate::arrow::ARROW_SCHEMA_META_KEY; use crate::column::page::{Page, PageReader}; - use crate::file::page_encoding_stats::PageEncodingStats; + use crate::file::metadata::thrift_gen::PageHeader; + use crate::file::page_index::column_index::ColumnIndexMetaData; use crate::file::reader::SerializedPageReader; - use crate::format::PageHeader; + use crate::parquet_thrift::{ReadThrift, ThriftSliceInputProtocol}; use crate::schema::types::ColumnPath; - use crate::thrift::TCompactSliceInputProtocol; use arrow::datatypes::ToByteSlice; use arrow::datatypes::{DataType, Schema}; use arrow::error::Result as ArrowResult; @@ -1530,7 +1529,6 @@ mod tests { use crate::basic::Encoding; use crate::data_type::AsBytes; use crate::file::metadata::{ColumnChunkMetaData, ParquetMetaData, ParquetMetaDataReader}; - use crate::file::page_index::index::Index; use crate::file::properties::{ BloomFilterPosition, EnabledStatistics, 
ReaderProperties, WriterVersion, }; @@ -2580,12 +2578,12 @@ mod tests { ArrowWriter::try_new(&mut out, batch.schema(), None).expect("Unable to write file"); writer.write(&batch).unwrap(); let file_meta_data = writer.close().unwrap(); - for row_group in file_meta_data.row_groups { - for column in row_group.columns { - assert!(column.offset_index_offset.is_some()); - assert!(column.offset_index_length.is_some()); - assert!(column.column_index_offset.is_none()); - assert!(column.column_index_length.is_none()); + for row_group in file_meta_data.row_groups() { + for column in row_group.columns() { + assert!(column.offset_index_offset().is_some()); + assert!(column.offset_index_length().is_some()); + assert!(column.column_index_offset().is_none()); + assert!(column.column_index_length().is_none()); } } } @@ -3034,14 +3032,18 @@ mod tests { writer.write(&batch).unwrap(); let file_metadata = writer.close().unwrap(); + let schema = file_metadata.file_metadata().schema(); // Coerced name of "item" should be "element" - assert_eq!(file_metadata.schema[3].name, "element"); + let list_field = &schema.get_fields()[0].get_fields()[0]; + assert_eq!(list_field.get_fields()[0].name(), "element"); + + let map_field = &schema.get_fields()[1].get_fields()[0]; // Coerced name of "entries" should be "key_value" - assert_eq!(file_metadata.schema[5].name, "key_value"); + assert_eq!(map_field.name(), "key_value"); // Coerced name of "keys" should be "key" - assert_eq!(file_metadata.schema[6].name, "key"); + assert_eq!(map_field.get_fields()[0].name(), "key"); // Coerced name of "values" should be "value" - assert_eq!(file_metadata.schema[7].name, "value"); + assert_eq!(map_field.get_fields()[1].name(), "value"); // Double check schema after reading from the file let reader = SerializedFileReader::new(file).unwrap(); @@ -3985,15 +3987,15 @@ mod tests { writer.write(&batch).unwrap(); let metadata = writer.close().unwrap(); - assert_eq!(metadata.row_groups.len(), 1); - let row_group = &metadata.row_groups[0]; - assert_eq!(row_group.columns.len(), 2); + assert_eq!(metadata.num_row_groups(), 1); + let row_group = metadata.row_group(0); + assert_eq!(row_group.num_columns(), 2); // Column "a" has both offset and column index, as requested - assert!(row_group.columns[0].offset_index_offset.is_some()); - assert!(row_group.columns[0].column_index_offset.is_some()); + assert!(row_group.column(0).offset_index_offset().is_some()); + assert!(row_group.column(0).column_index_offset().is_some()); // Column "b" should only have offset index - assert!(row_group.columns[1].offset_index_offset.is_some()); - assert!(row_group.columns[1].column_index_offset.is_none()); + assert!(row_group.column(1).offset_index_offset().is_some()); + assert!(row_group.column(1).column_index_offset().is_none()); let options = ReadOptionsBuilder::new().with_page_index().build(); let reader = SerializedFileReader::new_with_options(Bytes::from(buf), options).unwrap(); @@ -4025,9 +4027,12 @@ mod tests { assert_eq!(column_index[0].len(), 2); // 2 columns let a_idx = &column_index[0][0]; - assert!(matches!(a_idx, Index::BYTE_ARRAY(_)), "{a_idx:?}"); + assert!( + matches!(a_idx, ColumnIndexMetaData::BYTE_ARRAY(_)), + "{a_idx:?}" + ); let b_idx = &column_index[0][1]; - assert!(matches!(b_idx, Index::NONE), "{b_idx:?}"); + assert!(matches!(b_idx, ColumnIndexMetaData::NONE), "{b_idx:?}"); } #[test] @@ -4057,15 +4062,15 @@ mod tests { writer.write(&batch).unwrap(); let metadata = writer.close().unwrap(); - assert_eq!(metadata.row_groups.len(), 1); - let row_group = 
&metadata.row_groups[0]; - assert_eq!(row_group.columns.len(), 2); + assert_eq!(metadata.num_row_groups(), 1); + let row_group = metadata.row_group(0); + assert_eq!(row_group.num_columns(), 2); // Column "a" should only have offset index - assert!(row_group.columns[0].offset_index_offset.is_some()); - assert!(row_group.columns[0].column_index_offset.is_none()); + assert!(row_group.column(0).offset_index_offset().is_some()); + assert!(row_group.column(0).column_index_offset().is_none()); // Column "b" should only have offset index - assert!(row_group.columns[1].offset_index_offset.is_some()); - assert!(row_group.columns[1].column_index_offset.is_none()); + assert!(row_group.column(1).offset_index_offset().is_some()); + assert!(row_group.column(1).column_index_offset().is_none()); let options = ReadOptionsBuilder::new().with_page_index().build(); let reader = SerializedFileReader::new_with_options(Bytes::from(buf), options).unwrap(); @@ -4093,9 +4098,9 @@ mod tests { assert_eq!(column_index[0].len(), 2); // 2 columns let a_idx = &column_index[0][0]; - assert!(matches!(a_idx, Index::NONE), "{a_idx:?}"); + assert!(matches!(a_idx, ColumnIndexMetaData::NONE), "{a_idx:?}"); let b_idx = &column_index[0][1]; - assert!(matches!(b_idx, Index::NONE), "{b_idx:?}"); + assert!(matches!(b_idx, ColumnIndexMetaData::NONE), "{b_idx:?}"); } #[test] @@ -4207,8 +4212,8 @@ mod tests { // decode first page header let first_page = &file[4..]; - let mut prot = TCompactSliceInputProtocol::new(first_page); - let hdr = PageHeader::read_from_in_protocol(&mut prot).unwrap(); + let mut prot = ThriftSliceInputProtocol::new(first_page); + let hdr = PageHeader::read_thrift(&mut prot).unwrap(); let stats = hdr.data_page_header.unwrap().statistics; assert!(stats.is_none()); @@ -4237,8 +4242,8 @@ mod tests { // decode first page header let first_page = &file[4..]; - let mut prot = TCompactSliceInputProtocol::new(first_page); - let hdr = PageHeader::read_from_in_protocol(&mut prot).unwrap(); + let mut prot = ThriftSliceInputProtocol::new(first_page); + let hdr = PageHeader::read_thrift(&mut prot).unwrap(); let stats = hdr.data_page_header.unwrap().statistics; let stats = stats.unwrap(); @@ -4285,8 +4290,8 @@ mod tests { // decode first page header let first_page = &file[4..]; - let mut prot = TCompactSliceInputProtocol::new(first_page); - let hdr = PageHeader::read_from_in_protocol(&mut prot).unwrap(); + let mut prot = ThriftSliceInputProtocol::new(first_page); + let hdr = PageHeader::read_thrift(&mut prot).unwrap(); let stats = hdr.data_page_header.unwrap().statistics; assert!(stats.is_some()); let stats = stats.unwrap(); @@ -4298,8 +4303,8 @@ mod tests { // check second page now let second_page = &prot.as_slice()[hdr.compressed_page_size as usize..]; - let mut prot = TCompactSliceInputProtocol::new(second_page); - let hdr = PageHeader::read_from_in_protocol(&mut prot).unwrap(); + let mut prot = ThriftSliceInputProtocol::new(second_page); + let hdr = PageHeader::read_thrift(&mut prot).unwrap(); let stats = hdr.data_page_header.unwrap().statistics; assert!(stats.is_some()); let stats = stats.unwrap(); @@ -4329,14 +4334,18 @@ mod tests { writer.write(&batch).unwrap(); let file_metadata = writer.close().unwrap(); - assert_eq!(file_metadata.row_groups.len(), 1); - assert_eq!(file_metadata.row_groups[0].columns.len(), 1); - let chunk_meta = file_metadata.row_groups[0].columns[0] - .meta_data - .as_ref() - .expect("column metadata missing"); - assert!(chunk_meta.encoding_stats.is_some()); - let chunk_page_stats = 
chunk_meta.encoding_stats.as_ref().unwrap(); + assert_eq!(file_metadata.num_row_groups(), 1); + assert_eq!(file_metadata.row_group(0).num_columns(), 1); + assert!(file_metadata + .row_group(0) + .column(0) + .page_encoding_stats() + .is_some()); + let chunk_page_stats = file_metadata + .row_group(0) + .column(0) + .page_encoding_stats() + .unwrap(); // check that the read metadata is also correct let options = ReadOptionsBuilder::new().with_page_index().build(); @@ -4347,11 +4356,7 @@ mod tests { let column = rowgroup.metadata().column(0); assert!(column.page_encoding_stats().is_some()); let file_page_stats = column.page_encoding_stats().unwrap(); - let chunk_stats: Vec = chunk_page_stats - .iter() - .map(|x| crate::file::page_encoding_stats::try_from_thrift(x).unwrap()) - .collect(); - assert_eq!(&chunk_stats, file_page_stats); + assert_eq!(chunk_page_stats, file_page_stats); } #[test] diff --git a/parquet/src/arrow/async_reader/mod.rs b/parquet/src/arrow/async_reader/mod.rs index 8d95c9a205ee..98946a608bde 100644 --- a/parquet/src/arrow/async_reader/mod.rs +++ b/parquet/src/arrow/async_reader/mod.rs @@ -47,6 +47,7 @@ use crate::arrow::arrow_reader::{ }; use crate::arrow::ProjectionMask; +use crate::basic::{BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash}; use crate::bloom_filter::{ chunk_read_bloom_filter_header_and_offset, Sbbf, SBBF_HEADER_SIZE_ESTIMATE, }; @@ -55,7 +56,6 @@ use crate::errors::{ParquetError, Result}; use crate::file::metadata::{PageIndexPolicy, ParquetMetaData, ParquetMetaDataReader}; use crate::file::page_index::offset_index::OffsetIndexMetaData; use crate::file::reader::{ChunkReader, Length, SerializedPageReader}; -use crate::format::{BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash}; mod metadata; pub use metadata::*; @@ -450,17 +450,17 @@ impl ParquetRecordBatchStreamBuilder { chunk_read_bloom_filter_header_and_offset(offset, buffer.clone())?; match header.algorithm { - BloomFilterAlgorithm::BLOCK(_) => { + BloomFilterAlgorithm::BLOCK => { // this match exists to future proof the singleton algorithm enum } } match header.compression { - BloomFilterCompression::UNCOMPRESSED(_) => { + BloomFilterCompression::UNCOMPRESSED => { // this match exists to future proof the singleton compression enum } } match header.hash { - BloomFilterHash::XXHASH(_) => { + BloomFilterHash::XXHASH => { // this match exists to future proof the singleton hash enum } } diff --git a/parquet/src/arrow/async_writer/mod.rs b/parquet/src/arrow/async_writer/mod.rs index 66ba6b87fee7..232333a1b486 100644 --- a/parquet/src/arrow/async_writer/mod.rs +++ b/parquet/src/arrow/async_writer/mod.rs @@ -64,8 +64,10 @@ use crate::{ arrow::arrow_writer::ArrowWriterOptions, arrow::ArrowWriter, errors::{ParquetError, Result}, - file::{metadata::RowGroupMetaData, properties::WriterProperties}, - format::{FileMetaData, KeyValue}, + file::{ + metadata::{KeyValue, ParquetMetaData, RowGroupMetaData}, + properties::WriterProperties, + }, }; use arrow_array::RecordBatch; use arrow_schema::SchemaRef; @@ -245,7 +247,7 @@ impl AsyncArrowWriter { /// Unlike [`Self::close`] this does not consume self /// /// Attempting to write after calling finish will result in an error - pub async fn finish(&mut self) -> Result { + pub async fn finish(&mut self) -> Result { let metadata = self.sync_writer.finish()?; // Force to flush the remaining data. @@ -258,7 +260,7 @@ impl AsyncArrowWriter { /// Close and finalize the writer. /// /// All the data in the inner buffer will be force flushed. 
- pub async fn close(mut self) -> Result { + pub async fn close(mut self) -> Result { self.finish().await } diff --git a/parquet/src/arrow/schema/extension.rs b/parquet/src/arrow/schema/extension.rs index c6d04328aca8..bdd3d91eb97f 100644 --- a/parquet/src/arrow/schema/extension.rs +++ b/parquet/src/arrow/schema/extension.rs @@ -44,7 +44,7 @@ pub(crate) fn try_add_extension_type( }; match parquet_logical_type { #[cfg(feature = "variant_experimental")] - LogicalType::Variant => { + LogicalType::Variant { .. } => { arrow_field.try_with_extension_type(parquet_variant_compute::VariantType)?; } #[cfg(feature = "arrow_canonical_extension_types")] @@ -70,7 +70,7 @@ pub(crate) fn has_extension_type(parquet_type: &Type) -> bool { }; match parquet_logical_type { #[cfg(feature = "variant_experimental")] - LogicalType::Variant => true, + LogicalType::Variant { .. } => true, #[cfg(feature = "arrow_canonical_extension_types")] LogicalType::Uuid => true, #[cfg(feature = "arrow_canonical_extension_types")] @@ -89,7 +89,9 @@ pub(crate) fn logical_type_for_struct(field: &Field) -> Option { return None; } match field.try_extension_type::() { - Ok(VariantType) => Some(LogicalType::Variant), + Ok(VariantType) => Some(LogicalType::Variant { + specification_version: None, + }), // Given check above, this should not error, but if it does ignore Err(_e) => None, } diff --git a/parquet/src/arrow/schema/mod.rs b/parquet/src/arrow/schema/mod.rs index 8e689f5eb66e..ea8176e5734d 100644 --- a/parquet/src/arrow/schema/mod.rs +++ b/parquet/src/arrow/schema/mod.rs @@ -534,9 +534,9 @@ fn arrow_to_parquet_type(field: &Field, coerce_types: bool) -> Result { is_adjusted_to_u_t_c: matches!(tz, Some(z) if !z.as_ref().is_empty()), unit: match time_unit { TimeUnit::Second => unreachable!(), - TimeUnit::Millisecond => ParquetTimeUnit::MILLIS(Default::default()), - TimeUnit::Microsecond => ParquetTimeUnit::MICROS(Default::default()), - TimeUnit::Nanosecond => ParquetTimeUnit::NANOS(Default::default()), + TimeUnit::Millisecond => ParquetTimeUnit::MILLIS, + TimeUnit::Microsecond => ParquetTimeUnit::MICROS, + TimeUnit::Nanosecond => ParquetTimeUnit::NANOS, }, })) .with_repetition(repetition) @@ -573,7 +573,7 @@ fn arrow_to_parquet_type(field: &Field, coerce_types: bool) -> Result { .with_logical_type(Some(LogicalType::Time { is_adjusted_to_u_t_c: field.metadata().contains_key("adjusted_to_utc"), unit: match unit { - TimeUnit::Millisecond => ParquetTimeUnit::MILLIS(Default::default()), + TimeUnit::Millisecond => ParquetTimeUnit::MILLIS, u => unreachable!("Invalid unit for Time32: {:?}", u), }, })) @@ -584,8 +584,8 @@ fn arrow_to_parquet_type(field: &Field, coerce_types: bool) -> Result { .with_logical_type(Some(LogicalType::Time { is_adjusted_to_u_t_c: field.metadata().contains_key("adjusted_to_utc"), unit: match unit { - TimeUnit::Microsecond => ParquetTimeUnit::MICROS(Default::default()), - TimeUnit::Nanosecond => ParquetTimeUnit::NANOS(Default::default()), + TimeUnit::Microsecond => ParquetTimeUnit::MICROS, + TimeUnit::Nanosecond => ParquetTimeUnit::NANOS, u => unreachable!("Invalid unit for Time64: {:?}", u), }, })) diff --git a/parquet/src/arrow/schema/primitive.rs b/parquet/src/arrow/schema/primitive.rs index 24b18b39bc18..fb1b83d9fc39 100644 --- a/parquet/src/arrow/schema/primitive.rs +++ b/parquet/src/arrow/schema/primitive.rs @@ -186,7 +186,7 @@ fn from_int32(info: &BasicTypeInfo, scale: i32, precision: i32) -> Result decimal_128_type(scale, precision), (Some(LogicalType::Date), _) => Ok(DataType::Date32), 
(Some(LogicalType::Time { unit, .. }), _) => match unit { - ParquetTimeUnit::MILLIS(_) => Ok(DataType::Time32(TimeUnit::Millisecond)), + ParquetTimeUnit::MILLIS => Ok(DataType::Time32(TimeUnit::Millisecond)), _ => Err(arrow_err!( "Cannot create INT32 physical type from {:?}", unit @@ -225,11 +225,11 @@ fn from_int64(info: &BasicTypeInfo, scale: i32, precision: i32) -> Result Ok(DataType::UInt64), }, (Some(LogicalType::Time { unit, .. }), _) => match unit { - ParquetTimeUnit::MILLIS(_) => { + ParquetTimeUnit::MILLIS => { Err(arrow_err!("Cannot create INT64 from MILLIS time unit",)) } - ParquetTimeUnit::MICROS(_) => Ok(DataType::Time64(TimeUnit::Microsecond)), - ParquetTimeUnit::NANOS(_) => Ok(DataType::Time64(TimeUnit::Nanosecond)), + ParquetTimeUnit::MICROS => Ok(DataType::Time64(TimeUnit::Microsecond)), + ParquetTimeUnit::NANOS => Ok(DataType::Time64(TimeUnit::Nanosecond)), }, ( Some(LogicalType::Timestamp { @@ -239,9 +239,9 @@ fn from_int64(info: &BasicTypeInfo, scale: i32, precision: i32) -> Result Ok(DataType::Timestamp( match unit { - ParquetTimeUnit::MILLIS(_) => TimeUnit::Millisecond, - ParquetTimeUnit::MICROS(_) => TimeUnit::Microsecond, - ParquetTimeUnit::NANOS(_) => TimeUnit::Nanosecond, + ParquetTimeUnit::MILLIS => TimeUnit::Millisecond, + ParquetTimeUnit::MICROS => TimeUnit::Microsecond, + ParquetTimeUnit::NANOS => TimeUnit::Nanosecond, }, if is_adjusted_to_u_t_c { Some("UTC".into()) @@ -276,8 +276,8 @@ fn from_byte_array(info: &BasicTypeInfo, precision: i32, scale: i32) -> Result Ok(DataType::Utf8), (Some(LogicalType::Bson), _) => Ok(DataType::Binary), (Some(LogicalType::Enum), _) => Ok(DataType::Binary), - (Some(LogicalType::Geometry), _) => Ok(DataType::Binary), - (Some(LogicalType::Geography), _) => Ok(DataType::Binary), + (Some(LogicalType::Geometry { .. }), _) => Ok(DataType::Binary), + (Some(LogicalType::Geography { .. }), _) => Ok(DataType::Binary), (None, ConvertedType::NONE) => Ok(DataType::Binary), (None, ConvertedType::JSON) => Ok(DataType::Utf8), (None, ConvertedType::BSON) => Ok(DataType::Binary), diff --git a/parquet/src/basic.rs b/parquet/src/basic.rs index 6353b5f4ee63..68eebaf5080a 100644 --- a/parquet/src/basic.rs +++ b/parquet/src/basic.rs @@ -15,58 +15,57 @@ // specific language governing permissions and limitations // under the License. -//! Contains Rust mappings for Thrift definition. -//! Refer to [`parquet.thrift`](https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift) file to see raw definitions. +//! Contains Rust mappings for Thrift definition. This module contains only mappings for thrift +//! enums and unions. Thrift structs are handled elsewhere. +//! Refer to [`parquet.thrift`](https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift) +//! file to see raw definitions. 
+use std::io::Write; use std::str::FromStr; use std::{fmt, str}; pub use crate::compression::{BrotliLevel, GzipLevel, ZstdLevel}; -use crate::format as parquet; +use crate::parquet_thrift::{ + ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, ThriftCompactOutputProtocol, + WriteThrift, WriteThriftField, +}; +use crate::{thrift_enum, thrift_struct, thrift_union_all_empty}; use crate::errors::{ParquetError, Result}; -// Re-export crate::format types used in this module -pub use crate::format::{ - BsonType, DateType, DecimalType, EnumType, IntType, JsonType, ListType, MapType, NullType, - StringType, TimeType, TimeUnit, TimestampType, UUIDType, -}; - // ---------------------------------------------------------------------- // Types from the Thrift definition // ---------------------------------------------------------------------- -// Mirrors `parquet::Type` +// Mirrors thrift enum `Type` +thrift_enum!( /// Types supported by Parquet. /// /// These physical types are intended to be used in combination with the encodings to /// control the on disk storage format. /// For example INT16 is not included as a type since a good encoding of INT32 /// would handle this. -#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] -#[allow(non_camel_case_types)] -pub enum Type { - /// A boolean value. - BOOLEAN, - /// 32-bit signed integer. - INT32, - /// 64-bit signed integer. - INT64, - /// 96-bit signed integer for timestamps. - INT96, - /// IEEE 754 single-precision floating point value. - FLOAT, - /// IEEE 754 double-precision floating point value. - DOUBLE, - /// Arbitrary length byte array. - BYTE_ARRAY, - /// Fixed length byte array. - FIXED_LEN_BYTE_ARRAY, +enum Type { + BOOLEAN = 0; + INT32 = 1; + INT64 = 2; + INT96 = 3; // deprecated, only used by legacy implementations. + FLOAT = 4; + DOUBLE = 5; + BYTE_ARRAY = 6; + FIXED_LEN_BYTE_ARRAY = 7; } +); // ---------------------------------------------------------------------- -// Mirrors `parquet::ConvertedType` +// Mirrors thrift enum `ConvertedType` +// +// Cannot use macros because of added field `None` + +// TODO(ets): Adding the `NONE` variant to this enum is a bit awkward. We should +// look into removing it and using `Option` instead. Then all of this +// handwritten code could go away. /// Common types (converted types) used by frameworks when using Parquet. 
/// @@ -170,8 +169,123 @@ pub enum ConvertedType { INTERVAL, } +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for ConvertedType { + fn read_thrift(prot: &mut R) -> Result { + let val = prot.read_i32()?; + Ok(match val { + 0 => Self::UTF8, + 1 => Self::MAP, + 2 => Self::MAP_KEY_VALUE, + 3 => Self::LIST, + 4 => Self::ENUM, + 5 => Self::DECIMAL, + 6 => Self::DATE, + 7 => Self::TIME_MILLIS, + 8 => Self::TIME_MICROS, + 9 => Self::TIMESTAMP_MILLIS, + 10 => Self::TIMESTAMP_MICROS, + 11 => Self::UINT_8, + 12 => Self::UINT_16, + 13 => Self::UINT_32, + 14 => Self::UINT_64, + 15 => Self::INT_8, + 16 => Self::INT_16, + 17 => Self::INT_32, + 18 => Self::INT_64, + 19 => Self::JSON, + 20 => Self::BSON, + 21 => Self::INTERVAL, + _ => return Err(general_err!("Unexpected ConvertedType {}", val)), + }) + } +} + +impl WriteThrift for ConvertedType { + const ELEMENT_TYPE: ElementType = ElementType::I32; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + // because we've added NONE, the variant values are off by 1, so correct that here + writer.write_i32(*self as i32 - 1) + } +} + +impl WriteThriftField for ConvertedType { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I32, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + // ---------------------------------------------------------------------- -// Mirrors `parquet::LogicalType` +// Mirrors thrift union `TimeUnit` + +thrift_union_all_empty!( +/// Time unit for `Time` and `Timestamp` logical types. +union TimeUnit { + 1: MilliSeconds MILLIS + 2: MicroSeconds MICROS + 3: NanoSeconds NANOS +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift union `LogicalType` + +// private structs for decoding logical type + +thrift_struct!( +struct DecimalType { + 1: required i32 scale + 2: required i32 precision +} +); + +thrift_struct!( +struct TimestampType { + 1: required bool is_adjusted_to_u_t_c + 2: required TimeUnit unit +} +); + +// they are identical +use TimestampType as TimeType; + +thrift_struct!( +struct IntType { + 1: required i8 bit_width + 2: required bool is_signed +} +); + +thrift_struct!( +struct VariantType { + // The version of the variant specification that the variant was + // written with. + 1: optional i8 specification_version +} +); + +thrift_struct!( +struct GeometryType<'a> { + 1: optional string<'a> crs; +} +); + +thrift_struct!( +struct GeographyType<'a> { + 1: optional string<'a> crs; + 2: optional EdgeInterpolationAlgorithm algorithm; +} +); + +// TODO(ets): should we switch to tuple variants so we can use +// the thrift macros? /// Logical types used by version 2.4.0+ of the Parquet format. /// @@ -229,31 +343,282 @@ pub enum LogicalType { /// A 16-bit floating point number. Float16, /// A Variant value. - Variant, + Variant { + /// The version of the variant specification that the variant was written with. + specification_version: Option, + }, /// A geospatial feature in the Well-Known Binary (WKB) format with linear/planar edges interpolation. - Geometry, + Geometry { + /// A custom CRS. If unset the defaults to `OGC:CRS84`, which means that the geometries + /// must be stored in longitude, latitude based on the WGS84 datum. + crs: Option, + }, /// A geospatial feature in the WKB format with an explicit (non-linear/non-planar) edges interpolation. 
- Geography, + Geography { + /// A custom CRS. If unset the defaults to `OGC:CRS84`. + crs: Option, + /// An optional algorithm can be set to correctly interpret edges interpolation + /// of the geometries. If unset, the algorithm defaults to `SPHERICAL`. + algorithm: Option, + }, + /// For forward compatibility; used when an unknown union value is encountered. + _Unknown { + /// The field id encountered when parsing the unknown logical type. + field_id: i16, + }, +} + +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for LogicalType { + fn read_thrift(prot: &mut R) -> Result { + let field_ident = prot.read_field_begin(0)?; + if field_ident.field_type == FieldType::Stop { + return Err(general_err!("received empty union from remote LogicalType")); + } + let ret = match field_ident.id { + 1 => { + prot.skip_empty_struct()?; + Self::String + } + 2 => { + prot.skip_empty_struct()?; + Self::Map + } + 3 => { + prot.skip_empty_struct()?; + Self::List + } + 4 => { + prot.skip_empty_struct()?; + Self::Enum + } + 5 => { + let val = DecimalType::read_thrift(&mut *prot)?; + Self::Decimal { + scale: val.scale, + precision: val.precision, + } + } + 6 => { + prot.skip_empty_struct()?; + Self::Date + } + 7 => { + let val = TimeType::read_thrift(&mut *prot)?; + Self::Time { + is_adjusted_to_u_t_c: val.is_adjusted_to_u_t_c, + unit: val.unit, + } + } + 8 => { + let val = TimestampType::read_thrift(&mut *prot)?; + Self::Timestamp { + is_adjusted_to_u_t_c: val.is_adjusted_to_u_t_c, + unit: val.unit, + } + } + 10 => { + let val = IntType::read_thrift(&mut *prot)?; + Self::Integer { + is_signed: val.is_signed, + bit_width: val.bit_width, + } + } + 11 => { + prot.skip_empty_struct()?; + Self::Unknown + } + 12 => { + prot.skip_empty_struct()?; + Self::Json + } + 13 => { + prot.skip_empty_struct()?; + Self::Bson + } + 14 => { + prot.skip_empty_struct()?; + Self::Uuid + } + 15 => { + prot.skip_empty_struct()?; + Self::Float16 + } + 16 => { + let val = VariantType::read_thrift(&mut *prot)?; + Self::Variant { + specification_version: val.specification_version, + } + } + 17 => { + let val = GeometryType::read_thrift(&mut *prot)?; + Self::Geometry { + crs: val.crs.map(|s| s.to_owned()), + } + } + 18 => { + let val = GeographyType::read_thrift(&mut *prot)?; + // unset algorithm means SPHERICAL, per the spec: + // https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#geography + let algorithm = val + .algorithm + .unwrap_or(EdgeInterpolationAlgorithm::SPHERICAL); + Self::Geography { + crs: val.crs.map(|s| s.to_owned()), + algorithm: Some(algorithm), + } + } + _ => { + prot.skip(field_ident.field_type)?; + Self::_Unknown { + field_id: field_ident.id, + } + } + }; + let field_ident = prot.read_field_begin(field_ident.id)?; + if field_ident.field_type != FieldType::Stop { + return Err(general_err!( + "Received multiple fields for union from remote LogicalType" + )); + } + Ok(ret) + } +} + +impl WriteThrift for LogicalType { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + match self { + Self::String => { + writer.write_empty_struct(1, 0)?; + } + Self::Map => { + writer.write_empty_struct(2, 0)?; + } + Self::List => { + writer.write_empty_struct(3, 0)?; + } + Self::Enum => { + writer.write_empty_struct(4, 0)?; + } + Self::Decimal { scale, precision } => { + DecimalType { + scale: *scale, + precision: *precision, + } + .write_thrift_field(writer, 5, 0)?; + } + Self::Date => { + writer.write_empty_struct(6, 
0)?; + } + Self::Time { + is_adjusted_to_u_t_c, + unit, + } => { + TimeType { + is_adjusted_to_u_t_c: *is_adjusted_to_u_t_c, + unit: *unit, + } + .write_thrift_field(writer, 7, 0)?; + } + Self::Timestamp { + is_adjusted_to_u_t_c, + unit, + } => { + TimestampType { + is_adjusted_to_u_t_c: *is_adjusted_to_u_t_c, + unit: *unit, + } + .write_thrift_field(writer, 8, 0)?; + } + Self::Integer { + bit_width, + is_signed, + } => { + IntType { + bit_width: *bit_width, + is_signed: *is_signed, + } + .write_thrift_field(writer, 10, 0)?; + } + Self::Unknown => { + writer.write_empty_struct(11, 0)?; + } + Self::Json => { + writer.write_empty_struct(12, 0)?; + } + Self::Bson => { + writer.write_empty_struct(13, 0)?; + } + Self::Uuid => { + writer.write_empty_struct(14, 0)?; + } + Self::Float16 => { + writer.write_empty_struct(15, 0)?; + } + Self::Variant { + specification_version, + } => { + VariantType { + specification_version: *specification_version, + } + .write_thrift_field(writer, 16, 0)?; + } + Self::Geometry { crs } => { + GeometryType { + crs: crs.as_ref().map(|s| s.as_str()), + } + .write_thrift_field(writer, 17, 0)?; + } + Self::Geography { crs, algorithm } => { + GeographyType { + crs: crs.as_ref().map(|s| s.as_str()), + algorithm: *algorithm, + } + .write_thrift_field(writer, 18, 0)?; + } + _ => return Err(nyi_err!("logical type")), + } + writer.write_struct_end() + } +} + +impl WriteThriftField for LogicalType { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } } // ---------------------------------------------------------------------- -// Mirrors `parquet::FieldRepetitionType` +// Mirrors thrift enum `FieldRepetitionType` +// +thrift_enum!( /// Representation of field types in schema. -#[derive(Debug, Clone, Copy, PartialEq, Eq)] -#[allow(non_camel_case_types)] -pub enum Repetition { - /// Field is required (can not be null) and each record has exactly 1 value. - REQUIRED, - /// Field is optional (can be null) and each record has 0 or 1 values. - OPTIONAL, - /// Field is repeated and can contain 0 or more values. - REPEATED, +enum FieldRepetitionType { + /// This field is required (can not be null) and each row has exactly 1 value. + REQUIRED = 0; + /// The field is optional (can be null) and each row has 0 or 1 values. + OPTIONAL = 1; + /// The field is repeated and can contain 0 or more values. + REPEATED = 2; } +); + +/// Type alias for thrift `FieldRepetitionType` +pub type Repetition = FieldRepetitionType; // ---------------------------------------------------------------------- -// Mirrors `parquet::Encoding` +// Mirrors thrift enum `Encoding` +thrift_enum!( /// Encodings supported by Parquet. /// /// Not all encodings are valid for all types. These enums are also used to specify the @@ -270,80 +635,72 @@ pub enum Repetition { /// performance impact when evaluating these encodings. /// /// [WriterVersion]: crate::file::properties::WriterVersion -#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Ord, PartialOrd)] -#[allow(non_camel_case_types)] -pub enum Encoding { - /// Default byte encoding. - /// - BOOLEAN - 1 bit per value, 0 is false; 1 is true. - /// - INT32 - 4 bytes per value, stored as little-endian. - /// - INT64 - 8 bytes per value, stored as little-endian. - /// - FLOAT - 4 bytes per value, stored as little-endian. 
- /// - DOUBLE - 8 bytes per value, stored as little-endian. - /// - BYTE_ARRAY - 4 byte length stored as little endian, followed by bytes. - /// - FIXED_LEN_BYTE_ARRAY - just the bytes are stored. - PLAIN, - - /// **Deprecated** dictionary encoding. - /// - /// The values in the dictionary are encoded using PLAIN encoding. - /// Since it is deprecated, RLE_DICTIONARY encoding is used for a data page, and - /// PLAIN encoding is used for dictionary page. - PLAIN_DICTIONARY, - - /// Group packed run length encoding. - /// - /// Usable for definition/repetition levels encoding and boolean values. - RLE, - - /// **Deprecated** Bit-packed encoding. - /// - /// This can only be used if the data has a known max width. - /// Usable for definition/repetition levels encoding. - /// - /// There are compatibility issues with files using this encoding. - /// The parquet standard specifies the bits to be packed starting from the - /// most-significant bit, several implementations do not follow this bit order. - /// Several other implementations also have issues reading this encoding - /// because of incorrect assumptions about the length of the encoded data. - /// - /// The RLE/bit-packing hybrid is more cpu and memory efficient and should be used instead. - #[deprecated( - since = "51.0.0", - note = "Please see documentation for compatibility issues and use the RLE/bit-packing hybrid encoding instead" - )] - BIT_PACKED, - - /// Delta encoding for integers, either INT32 or INT64. - /// - /// Works best on sorted data. - DELTA_BINARY_PACKED, - - /// Encoding for byte arrays to separate the length values and the data. - /// - /// The lengths are encoded using DELTA_BINARY_PACKED encoding. - DELTA_LENGTH_BYTE_ARRAY, - - /// Incremental encoding for byte arrays. - /// - /// Prefix lengths are encoded using DELTA_BINARY_PACKED encoding. - /// Suffixes are stored using DELTA_LENGTH_BYTE_ARRAY encoding. - DELTA_BYTE_ARRAY, - - /// Dictionary encoding. - /// - /// The ids are encoded using the RLE encoding. - RLE_DICTIONARY, - - /// Encoding for fixed-width data. - /// - /// K byte-streams are created where K is the size in bytes of the data type. - /// The individual bytes of a value are scattered to the corresponding stream and - /// the streams are concatenated. - /// This itself does not reduce the size of the data but can lead to better compression - /// afterwards. Note that the use of this encoding with FIXED_LEN_BYTE_ARRAY(N) data may - /// perform poorly for large values of N. - BYTE_STREAM_SPLIT, +enum Encoding { + /// Default encoding. + /// - BOOLEAN - 1 bit per value. 0 is false; 1 is true. + /// - INT32 - 4 bytes per value. Stored as little-endian. + /// - INT64 - 8 bytes per value. Stored as little-endian. + /// - FLOAT - 4 bytes per value. IEEE. Stored as little-endian. + /// - DOUBLE - 8 bytes per value. IEEE. Stored as little-endian. + /// - BYTE_ARRAY - 4 byte length stored as little endian, followed by bytes. + /// - FIXED_LEN_BYTE_ARRAY - Just the bytes. + PLAIN = 0; + // GROUP_VAR_INT = 1; + /// **Deprecated** dictionary encoding. + /// + /// The values in the dictionary are encoded using PLAIN encoding. + /// Since it is deprecated, RLE_DICTIONARY encoding is used for a data page, and + /// PLAIN encoding is used for dictionary page. + PLAIN_DICTIONARY = 2; + /// Group packed run length encoding. + /// + /// Usable for definition/repetition levels encoding and boolean values. + RLE = 3; + /// **Deprecated** Bit-packed encoding. 
+ /// + /// This can only be used if the data has a known max width. + /// Usable for definition/repetition levels encoding. + /// + /// There are compatibility issues with files using this encoding. + /// The parquet standard specifies the bits to be packed starting from the + /// most-significant bit, several implementations do not follow this bit order. + /// Several other implementations also have issues reading this encoding + /// because of incorrect assumptions about the length of the encoded data. + /// + /// The RLE/bit-packing hybrid is more cpu and memory efficient and should be used instead. + #[deprecated( + since = "51.0.0", + note = "Please see documentation for compatibility issues and use the RLE/bit-packing hybrid encoding instead" + )] + BIT_PACKED = 4; + /// Delta encoding for integers, either INT32 or INT64. + /// + /// Works best on sorted data. + DELTA_BINARY_PACKED = 5; + /// Encoding for byte arrays to separate the length values and the data. + /// + /// The lengths are encoded using DELTA_BINARY_PACKED encoding. + DELTA_LENGTH_BYTE_ARRAY = 6; + /// Incremental encoding for byte arrays. + /// + /// Prefix lengths are encoded using DELTA_BINARY_PACKED encoding. + /// Suffixes are stored using DELTA_LENGTH_BYTE_ARRAY encoding. + DELTA_BYTE_ARRAY = 7; + /// Dictionary encoding. + /// + /// The ids are encoded using the RLE encoding. + RLE_DICTIONARY = 8; + /// Encoding for fixed-width data. + /// + /// K byte-streams are created where K is the size in bytes of the data type. + /// The individual bytes of a value are scattered to the corresponding stream and + /// the streams are concatenated. + /// This itself does not reduce the size of the data but can lead to better compression + /// afterwards. Note that the use of this encoding with FIXED_LEN_BYTE_ARRAY(N) data may + /// perform poorly for large values of N. + BYTE_STREAM_SPLIT = 9; } +); impl FromStr for Encoding { type Err = ParquetError; @@ -368,7 +725,7 @@ impl FromStr for Encoding { } // ---------------------------------------------------------------------- -// Mirrors `parquet::CompressionCodec` +// Mirrors thrift enum `CompressionCodec` /// Supported block compression algorithms. /// @@ -406,6 +763,57 @@ pub enum Compression { LZ4_RAW, } +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for Compression { + fn read_thrift(prot: &mut R) -> Result { + let val = prot.read_i32()?; + Ok(match val { + 0 => Self::UNCOMPRESSED, + 1 => Self::SNAPPY, + 2 => Self::GZIP(Default::default()), + 3 => Self::LZO, + 4 => Self::BROTLI(Default::default()), + 5 => Self::LZ4, + 6 => Self::ZSTD(Default::default()), + 7 => Self::LZ4_RAW, + _ => return Err(general_err!("Unexpected CompressionCodec {}", val)), + }) + } +} + +// TODO(ets): explore replacing this with a thrift_enum!(ThriftCompression) for the serialization +// and then provide `From` impls to convert back and forth. This is necessary due to the addition +// of compression level to some variants. 
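To make the TODO above concrete, here is a hedged sketch (not part of this patch) of what that refactor could look like. The name `ThriftCompression` is taken from the comment; everything else is illustrative. Converting to the wire-level enum necessarily drops the writer-side compression levels, which is why the reverse conversion has to fall back to default levels:

```rust
// Hypothetical sketch only: `ThriftCompression` does not exist in the crate.
// A plain codec enum mirroring the thrift `CompressionCodec` values could be
// generated with `thrift_enum!`, leaving `Compression` free to carry levels.
thrift_enum!(
enum ThriftCompression {
    UNCOMPRESSED = 0;
    SNAPPY = 1;
    GZIP = 2;
    LZO = 3;
    BROTLI = 4;
    LZ4 = 5;
    ZSTD = 6;
    LZ4_RAW = 7;
}
);

impl From<Compression> for ThriftCompression {
    fn from(value: Compression) -> Self {
        match value {
            Compression::UNCOMPRESSED => Self::UNCOMPRESSED,
            Compression::SNAPPY => Self::SNAPPY,
            // compression levels are a write-time setting that is not part of
            // the serialized codec, so they are dropped here
            Compression::GZIP(_) => Self::GZIP,
            Compression::LZO => Self::LZO,
            Compression::BROTLI(_) => Self::BROTLI,
            Compression::LZ4 => Self::LZ4,
            Compression::ZSTD(_) => Self::ZSTD,
            Compression::LZ4_RAW => Self::LZ4_RAW,
        }
    }
}

impl From<ThriftCompression> for Compression {
    fn from(value: ThriftCompression) -> Self {
        match value {
            ThriftCompression::UNCOMPRESSED => Self::UNCOMPRESSED,
            ThriftCompression::SNAPPY => Self::SNAPPY,
            // levels cannot be recovered from the file, so defaults are used
            ThriftCompression::GZIP => Self::GZIP(Default::default()),
            ThriftCompression::LZO => Self::LZO,
            ThriftCompression::BROTLI => Self::BROTLI(Default::default()),
            ThriftCompression::LZ4 => Self::LZ4,
            ThriftCompression::ZSTD => Self::ZSTD(Default::default()),
            ThriftCompression::LZ4_RAW => Self::LZ4_RAW,
        }
    }
}
```

Under this scheme the hand-written `ReadThrift`/`WriteThrift` impls below would become thin wrappers over the generated enum plus these conversions.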
+impl WriteThrift for Compression { + const ELEMENT_TYPE: ElementType = ElementType::I32; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + let id: i32 = match *self { + Self::UNCOMPRESSED => 0, + Self::SNAPPY => 1, + Self::GZIP(_) => 2, + Self::LZO => 3, + Self::BROTLI(_) => 4, + Self::LZ4 => 5, + Self::ZSTD(_) => 6, + Self::LZ4_RAW => 7, + }; + writer.write_i32(id) + } +} + +impl WriteThriftField for Compression { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I32, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + impl Compression { /// Returns the codec type of this compression setting as a string, without the compression /// level. @@ -497,25 +905,144 @@ impl FromStr for Compression { } // ---------------------------------------------------------------------- -/// Mirrors [parquet::PageType] -/// +// Mirrors thrift enum `PageType` + +thrift_enum!( /// Available data pages for Parquet file format. /// Note that some of the page types may not be supported. -#[derive(Debug, Clone, Copy, PartialEq, Eq)] -#[allow(non_camel_case_types)] -pub enum PageType { - /// Data page Parquet 1.0 - DATA_PAGE, - /// Index page - INDEX_PAGE, - /// Dictionary page - DICTIONARY_PAGE, - /// Data page Parquet 2.0 - DATA_PAGE_V2, +enum PageType { + DATA_PAGE = 0; + INDEX_PAGE = 1; + DICTIONARY_PAGE = 2; + DATA_PAGE_V2 = 3; +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift enum `BoundaryOrder` + +thrift_enum!( +/// Enum to annotate whether lists of min/max elements inside ColumnIndex +/// are ordered and if so, in which direction. +enum BoundaryOrder { + UNORDERED = 0; + ASCENDING = 1; + DESCENDING = 2; +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift enum `EdgeInterpolationAlgorithm` + +// this is hand coded to allow for the _Unknown variant (allows this to be forward compatible) + +/// Edge interpolation algorithm for [`LogicalType::Geography`] +#[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)] +#[repr(i32)] +pub enum EdgeInterpolationAlgorithm { + /// Edges are interpolated as geodesics on a sphere. + SPHERICAL = 0, + /// + VINCENTY = 1, + /// Thomas, Paul D. Spheroidal geodesics, reference systems, & local geometry. US Naval Oceanographic Office, 1970 + THOMAS = 2, + /// Thomas, Paul D. Mathematical models for navigation systems. US Naval Oceanographic Office, 1965. + ANDOYER = 3, + /// Karney, Charles FF. "Algorithms for geodesics." 
Journal of Geodesy 87 (2013): 43-55 + KARNEY = 4, + /// Unknown algorithm + _Unknown(i32), +} + +impl fmt::Display for EdgeInterpolationAlgorithm { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { + f.write_fmt(format_args!("{0:?}", self)) + } +} + +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for EdgeInterpolationAlgorithm { + fn read_thrift(prot: &mut R) -> Result { + let val = prot.read_i32()?; + match val { + 0 => Ok(Self::SPHERICAL), + 1 => Ok(Self::VINCENTY), + 2 => Ok(Self::THOMAS), + 3 => Ok(Self::ANDOYER), + 4 => Ok(Self::KARNEY), + _ => Ok(Self::_Unknown(val)), + } + } +} + +impl WriteThrift for EdgeInterpolationAlgorithm { + const ELEMENT_TYPE: ElementType = ElementType::I32; + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + let val: i32 = match *self { + Self::SPHERICAL => 0, + Self::VINCENTY => 1, + Self::THOMAS => 2, + Self::ANDOYER => 3, + Self::KARNEY => 4, + Self::_Unknown(i) => i, + }; + writer.write_i32(val) + } +} + +impl WriteThriftField for EdgeInterpolationAlgorithm { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I32, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + +impl Default for EdgeInterpolationAlgorithm { + fn default() -> Self { + Self::SPHERICAL + } } // ---------------------------------------------------------------------- -// Mirrors `parquet::ColumnOrder` +// Mirrors thrift union `BloomFilterAlgorithm` + +thrift_union_all_empty!( +/// The algorithm used in Bloom filter. +union BloomFilterAlgorithm { + /// Block-based Bloom filter. + 1: SplitBlockAlgorithm BLOCK; +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift union `BloomFilterHash` + +thrift_union_all_empty!( +/// The hash function used in Bloom filter. This function takes the hash of a column value +/// using plain encoding. +union BloomFilterHash { + /// xxHash Strategy. + 1: XxHash XXHASH; +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift union `BloomFilterCompression` + +thrift_union_all_empty!( +/// The compression used in the Bloom filter. +union BloomFilterCompression { + 1: Uncompressed UNCOMPRESSED; +} +); + +// ---------------------------------------------------------------------- +// Mirrors thrift union `ColumnOrder` /// Sort order for page and column statistics. /// @@ -554,9 +1081,13 @@ pub enum ColumnOrder { /// Column uses the order defined by its logical or physical type /// (if there is no logical type), parquet-format 2.4.0+. TYPE_DEFINED_ORDER(SortOrder), + // The following are not defined in the Parquet spec and should always be last. /// Undefined column order, means legacy behaviour before parquet-format 2.4.0. /// Sort order is always SIGNED. UNDEFINED, + /// An unknown but present ColumnOrder. Statistics with an unknown `ColumnOrder` + /// will be ignored. + UNKNOWN, } impl ColumnOrder { @@ -584,9 +1115,10 @@ impl ColumnOrder { LogicalType::Unknown => SortOrder::UNDEFINED, LogicalType::Uuid => SortOrder::UNSIGNED, LogicalType::Float16 => SortOrder::SIGNED, - LogicalType::Variant | LogicalType::Geometry | LogicalType::Geography => { - SortOrder::UNDEFINED - } + LogicalType::Variant { .. } + | LogicalType::Geometry { .. } + | LogicalType::Geography { .. } + | LogicalType::_Unknown { .. 
} => SortOrder::UNDEFINED, }, // Fall back to converted type None => Self::get_converted_sort_order(converted_type, physical_type), @@ -656,41 +1188,64 @@ impl ColumnOrder { match *self { ColumnOrder::TYPE_DEFINED_ORDER(order) => order, ColumnOrder::UNDEFINED => SortOrder::SIGNED, + ColumnOrder::UNKNOWN => SortOrder::UNDEFINED, } } } -impl fmt::Display for Type { - fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { - write!(f, "{self:?}") +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for ColumnOrder { + fn read_thrift(prot: &mut R) -> Result { + let field_ident = prot.read_field_begin(0)?; + if field_ident.field_type == FieldType::Stop { + return Err(general_err!("Received empty union from remote ColumnOrder")); + } + let ret = match field_ident.id { + 1 => { + // NOTE: the sort order needs to be set correctly after parsing. + prot.skip_empty_struct()?; + Self::TYPE_DEFINED_ORDER(SortOrder::SIGNED) + } + _ => { + prot.skip(field_ident.field_type)?; + Self::UNKNOWN + } + }; + let field_ident = prot.read_field_begin(field_ident.id)?; + if field_ident.field_type != FieldType::Stop { + return Err(general_err!( + "Received multiple fields for union from remote ColumnOrder" + )); + } + Ok(ret) } } -impl fmt::Display for ConvertedType { - fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { - write!(f, "{self:?}") - } -} +impl WriteThrift for ColumnOrder { + const ELEMENT_TYPE: ElementType = ElementType::Struct; -impl fmt::Display for Repetition { - fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { - write!(f, "{self:?}") + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + match *self { + Self::TYPE_DEFINED_ORDER(_) => { + writer.write_field_begin(FieldType::Struct, 1, 0)?; + writer.write_struct_end()?; + } + _ => return Err(general_err!("Attempt to write undefined ColumnOrder")), + } + // write end of struct for this union + writer.write_struct_end() } } -impl fmt::Display for Encoding { - fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { - write!(f, "{self:?}") - } -} +// ---------------------------------------------------------------------- +// Display handlers -impl fmt::Display for Compression { +impl fmt::Display for ConvertedType { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{self:?}") } } -impl fmt::Display for PageType { +impl fmt::Display for Compression { fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { write!(f, "{self:?}") } @@ -708,198 +1263,6 @@ impl fmt::Display for ColumnOrder { } } -// ---------------------------------------------------------------------- -// parquet::Type <=> Type conversion - -impl TryFrom for Type { - type Error = ParquetError; - - fn try_from(value: parquet::Type) -> Result { - Ok(match value { - parquet::Type::BOOLEAN => Type::BOOLEAN, - parquet::Type::INT32 => Type::INT32, - parquet::Type::INT64 => Type::INT64, - parquet::Type::INT96 => Type::INT96, - parquet::Type::FLOAT => Type::FLOAT, - parquet::Type::DOUBLE => Type::DOUBLE, - parquet::Type::BYTE_ARRAY => Type::BYTE_ARRAY, - parquet::Type::FIXED_LEN_BYTE_ARRAY => Type::FIXED_LEN_BYTE_ARRAY, - _ => return Err(general_err!("unexpected parquet type: {}", value.0)), - }) - } -} - -impl From for parquet::Type { - fn from(value: Type) -> Self { - match value { - Type::BOOLEAN => parquet::Type::BOOLEAN, - Type::INT32 => parquet::Type::INT32, - Type::INT64 => parquet::Type::INT64, - Type::INT96 => parquet::Type::INT96, - Type::FLOAT => parquet::Type::FLOAT, - Type::DOUBLE => parquet::Type::DOUBLE, - Type::BYTE_ARRAY => 
parquet::Type::BYTE_ARRAY, - Type::FIXED_LEN_BYTE_ARRAY => parquet::Type::FIXED_LEN_BYTE_ARRAY, - } - } -} - -// ---------------------------------------------------------------------- -// parquet::ConvertedType <=> ConvertedType conversion - -impl TryFrom> for ConvertedType { - type Error = ParquetError; - - fn try_from(option: Option) -> Result { - Ok(match option { - None => ConvertedType::NONE, - Some(value) => match value { - parquet::ConvertedType::UTF8 => ConvertedType::UTF8, - parquet::ConvertedType::MAP => ConvertedType::MAP, - parquet::ConvertedType::MAP_KEY_VALUE => ConvertedType::MAP_KEY_VALUE, - parquet::ConvertedType::LIST => ConvertedType::LIST, - parquet::ConvertedType::ENUM => ConvertedType::ENUM, - parquet::ConvertedType::DECIMAL => ConvertedType::DECIMAL, - parquet::ConvertedType::DATE => ConvertedType::DATE, - parquet::ConvertedType::TIME_MILLIS => ConvertedType::TIME_MILLIS, - parquet::ConvertedType::TIME_MICROS => ConvertedType::TIME_MICROS, - parquet::ConvertedType::TIMESTAMP_MILLIS => ConvertedType::TIMESTAMP_MILLIS, - parquet::ConvertedType::TIMESTAMP_MICROS => ConvertedType::TIMESTAMP_MICROS, - parquet::ConvertedType::UINT_8 => ConvertedType::UINT_8, - parquet::ConvertedType::UINT_16 => ConvertedType::UINT_16, - parquet::ConvertedType::UINT_32 => ConvertedType::UINT_32, - parquet::ConvertedType::UINT_64 => ConvertedType::UINT_64, - parquet::ConvertedType::INT_8 => ConvertedType::INT_8, - parquet::ConvertedType::INT_16 => ConvertedType::INT_16, - parquet::ConvertedType::INT_32 => ConvertedType::INT_32, - parquet::ConvertedType::INT_64 => ConvertedType::INT_64, - parquet::ConvertedType::JSON => ConvertedType::JSON, - parquet::ConvertedType::BSON => ConvertedType::BSON, - parquet::ConvertedType::INTERVAL => ConvertedType::INTERVAL, - _ => { - return Err(general_err!( - "unexpected parquet converted type: {}", - value.0 - )) - } - }, - }) - } -} - -impl From for Option { - fn from(value: ConvertedType) -> Self { - match value { - ConvertedType::NONE => None, - ConvertedType::UTF8 => Some(parquet::ConvertedType::UTF8), - ConvertedType::MAP => Some(parquet::ConvertedType::MAP), - ConvertedType::MAP_KEY_VALUE => Some(parquet::ConvertedType::MAP_KEY_VALUE), - ConvertedType::LIST => Some(parquet::ConvertedType::LIST), - ConvertedType::ENUM => Some(parquet::ConvertedType::ENUM), - ConvertedType::DECIMAL => Some(parquet::ConvertedType::DECIMAL), - ConvertedType::DATE => Some(parquet::ConvertedType::DATE), - ConvertedType::TIME_MILLIS => Some(parquet::ConvertedType::TIME_MILLIS), - ConvertedType::TIME_MICROS => Some(parquet::ConvertedType::TIME_MICROS), - ConvertedType::TIMESTAMP_MILLIS => Some(parquet::ConvertedType::TIMESTAMP_MILLIS), - ConvertedType::TIMESTAMP_MICROS => Some(parquet::ConvertedType::TIMESTAMP_MICROS), - ConvertedType::UINT_8 => Some(parquet::ConvertedType::UINT_8), - ConvertedType::UINT_16 => Some(parquet::ConvertedType::UINT_16), - ConvertedType::UINT_32 => Some(parquet::ConvertedType::UINT_32), - ConvertedType::UINT_64 => Some(parquet::ConvertedType::UINT_64), - ConvertedType::INT_8 => Some(parquet::ConvertedType::INT_8), - ConvertedType::INT_16 => Some(parquet::ConvertedType::INT_16), - ConvertedType::INT_32 => Some(parquet::ConvertedType::INT_32), - ConvertedType::INT_64 => Some(parquet::ConvertedType::INT_64), - ConvertedType::JSON => Some(parquet::ConvertedType::JSON), - ConvertedType::BSON => Some(parquet::ConvertedType::BSON), - ConvertedType::INTERVAL => Some(parquet::ConvertedType::INTERVAL), - } - } -} - -// 
---------------------------------------------------------------------- -// parquet::LogicalType <=> LogicalType conversion - -impl From for LogicalType { - fn from(value: parquet::LogicalType) -> Self { - match value { - parquet::LogicalType::STRING(_) => LogicalType::String, - parquet::LogicalType::MAP(_) => LogicalType::Map, - parquet::LogicalType::LIST(_) => LogicalType::List, - parquet::LogicalType::ENUM(_) => LogicalType::Enum, - parquet::LogicalType::DECIMAL(t) => LogicalType::Decimal { - scale: t.scale, - precision: t.precision, - }, - parquet::LogicalType::DATE(_) => LogicalType::Date, - parquet::LogicalType::TIME(t) => LogicalType::Time { - is_adjusted_to_u_t_c: t.is_adjusted_to_u_t_c, - unit: t.unit, - }, - parquet::LogicalType::TIMESTAMP(t) => LogicalType::Timestamp { - is_adjusted_to_u_t_c: t.is_adjusted_to_u_t_c, - unit: t.unit, - }, - parquet::LogicalType::INTEGER(t) => LogicalType::Integer { - bit_width: t.bit_width, - is_signed: t.is_signed, - }, - parquet::LogicalType::UNKNOWN(_) => LogicalType::Unknown, - parquet::LogicalType::JSON(_) => LogicalType::Json, - parquet::LogicalType::BSON(_) => LogicalType::Bson, - parquet::LogicalType::UUID(_) => LogicalType::Uuid, - parquet::LogicalType::FLOAT16(_) => LogicalType::Float16, - parquet::LogicalType::VARIANT(_) => LogicalType::Variant, - parquet::LogicalType::GEOMETRY(_) => LogicalType::Geometry, - parquet::LogicalType::GEOGRAPHY(_) => LogicalType::Geography, - } - } -} - -impl From for parquet::LogicalType { - fn from(value: LogicalType) -> Self { - match value { - LogicalType::String => parquet::LogicalType::STRING(Default::default()), - LogicalType::Map => parquet::LogicalType::MAP(Default::default()), - LogicalType::List => parquet::LogicalType::LIST(Default::default()), - LogicalType::Enum => parquet::LogicalType::ENUM(Default::default()), - LogicalType::Decimal { scale, precision } => { - parquet::LogicalType::DECIMAL(DecimalType { scale, precision }) - } - LogicalType::Date => parquet::LogicalType::DATE(Default::default()), - LogicalType::Time { - is_adjusted_to_u_t_c, - unit, - } => parquet::LogicalType::TIME(TimeType { - is_adjusted_to_u_t_c, - unit, - }), - LogicalType::Timestamp { - is_adjusted_to_u_t_c, - unit, - } => parquet::LogicalType::TIMESTAMP(TimestampType { - is_adjusted_to_u_t_c, - unit, - }), - LogicalType::Integer { - bit_width, - is_signed, - } => parquet::LogicalType::INTEGER(IntType { - bit_width, - is_signed, - }), - LogicalType::Unknown => parquet::LogicalType::UNKNOWN(Default::default()), - LogicalType::Json => parquet::LogicalType::JSON(Default::default()), - LogicalType::Bson => parquet::LogicalType::BSON(Default::default()), - LogicalType::Uuid => parquet::LogicalType::UUID(Default::default()), - LogicalType::Float16 => parquet::LogicalType::FLOAT16(Default::default()), - LogicalType::Variant => parquet::LogicalType::VARIANT(Default::default()), - LogicalType::Geometry => parquet::LogicalType::GEOMETRY(Default::default()), - LogicalType::Geography => parquet::LogicalType::GEOGRAPHY(Default::default()), - } - } -} - // ---------------------------------------------------------------------- // LogicalType <=> ConvertedType conversion @@ -920,14 +1283,14 @@ impl From> for ConvertedType { LogicalType::Decimal { .. } => ConvertedType::DECIMAL, LogicalType::Date => ConvertedType::DATE, LogicalType::Time { unit, .. 
} => match unit { - TimeUnit::MILLIS(_) => ConvertedType::TIME_MILLIS, - TimeUnit::MICROS(_) => ConvertedType::TIME_MICROS, - TimeUnit::NANOS(_) => ConvertedType::NONE, + TimeUnit::MILLIS => ConvertedType::TIME_MILLIS, + TimeUnit::MICROS => ConvertedType::TIME_MICROS, + TimeUnit::NANOS => ConvertedType::NONE, }, LogicalType::Timestamp { unit, .. } => match unit { - TimeUnit::MILLIS(_) => ConvertedType::TIMESTAMP_MILLIS, - TimeUnit::MICROS(_) => ConvertedType::TIMESTAMP_MICROS, - TimeUnit::NANOS(_) => ConvertedType::NONE, + TimeUnit::MILLIS => ConvertedType::TIMESTAMP_MILLIS, + TimeUnit::MICROS => ConvertedType::TIMESTAMP_MICROS, + TimeUnit::NANOS => ConvertedType::NONE, }, LogicalType::Integer { bit_width, @@ -949,9 +1312,10 @@ impl From> for ConvertedType { LogicalType::Bson => ConvertedType::BSON, LogicalType::Uuid | LogicalType::Float16 - | LogicalType::Variant - | LogicalType::Geometry - | LogicalType::Geography + | LogicalType::Variant { .. } + | LogicalType::Geometry { .. } + | LogicalType::Geography { .. } + | LogicalType::_Unknown { .. } | LogicalType::Unknown => ConvertedType::NONE, }, None => ConvertedType::NONE, @@ -959,146 +1323,6 @@ impl From> for ConvertedType { } } -// ---------------------------------------------------------------------- -// parquet::FieldRepetitionType <=> Repetition conversion - -impl TryFrom for Repetition { - type Error = ParquetError; - - fn try_from(value: parquet::FieldRepetitionType) -> Result { - Ok(match value { - parquet::FieldRepetitionType::REQUIRED => Repetition::REQUIRED, - parquet::FieldRepetitionType::OPTIONAL => Repetition::OPTIONAL, - parquet::FieldRepetitionType::REPEATED => Repetition::REPEATED, - _ => { - return Err(general_err!( - "unexpected parquet repetition type: {}", - value.0 - )) - } - }) - } -} - -impl From for parquet::FieldRepetitionType { - fn from(value: Repetition) -> Self { - match value { - Repetition::REQUIRED => parquet::FieldRepetitionType::REQUIRED, - Repetition::OPTIONAL => parquet::FieldRepetitionType::OPTIONAL, - Repetition::REPEATED => parquet::FieldRepetitionType::REPEATED, - } - } -} - -// ---------------------------------------------------------------------- -// parquet::Encoding <=> Encoding conversion - -impl TryFrom for Encoding { - type Error = ParquetError; - - fn try_from(value: parquet::Encoding) -> Result { - Ok(match value { - parquet::Encoding::PLAIN => Encoding::PLAIN, - parquet::Encoding::PLAIN_DICTIONARY => Encoding::PLAIN_DICTIONARY, - parquet::Encoding::RLE => Encoding::RLE, - #[allow(deprecated)] - parquet::Encoding::BIT_PACKED => Encoding::BIT_PACKED, - parquet::Encoding::DELTA_BINARY_PACKED => Encoding::DELTA_BINARY_PACKED, - parquet::Encoding::DELTA_LENGTH_BYTE_ARRAY => Encoding::DELTA_LENGTH_BYTE_ARRAY, - parquet::Encoding::DELTA_BYTE_ARRAY => Encoding::DELTA_BYTE_ARRAY, - parquet::Encoding::RLE_DICTIONARY => Encoding::RLE_DICTIONARY, - parquet::Encoding::BYTE_STREAM_SPLIT => Encoding::BYTE_STREAM_SPLIT, - _ => return Err(general_err!("unexpected parquet encoding: {}", value.0)), - }) - } -} - -impl From for parquet::Encoding { - fn from(value: Encoding) -> Self { - match value { - Encoding::PLAIN => parquet::Encoding::PLAIN, - Encoding::PLAIN_DICTIONARY => parquet::Encoding::PLAIN_DICTIONARY, - Encoding::RLE => parquet::Encoding::RLE, - #[allow(deprecated)] - Encoding::BIT_PACKED => parquet::Encoding::BIT_PACKED, - Encoding::DELTA_BINARY_PACKED => parquet::Encoding::DELTA_BINARY_PACKED, - Encoding::DELTA_LENGTH_BYTE_ARRAY => parquet::Encoding::DELTA_LENGTH_BYTE_ARRAY, - 
Encoding::DELTA_BYTE_ARRAY => parquet::Encoding::DELTA_BYTE_ARRAY, - Encoding::RLE_DICTIONARY => parquet::Encoding::RLE_DICTIONARY, - Encoding::BYTE_STREAM_SPLIT => parquet::Encoding::BYTE_STREAM_SPLIT, - } - } -} - -// ---------------------------------------------------------------------- -// parquet::CompressionCodec <=> Compression conversion - -impl TryFrom for Compression { - type Error = ParquetError; - - fn try_from(value: parquet::CompressionCodec) -> Result { - Ok(match value { - parquet::CompressionCodec::UNCOMPRESSED => Compression::UNCOMPRESSED, - parquet::CompressionCodec::SNAPPY => Compression::SNAPPY, - parquet::CompressionCodec::GZIP => Compression::GZIP(Default::default()), - parquet::CompressionCodec::LZO => Compression::LZO, - parquet::CompressionCodec::BROTLI => Compression::BROTLI(Default::default()), - parquet::CompressionCodec::LZ4 => Compression::LZ4, - parquet::CompressionCodec::ZSTD => Compression::ZSTD(Default::default()), - parquet::CompressionCodec::LZ4_RAW => Compression::LZ4_RAW, - _ => { - return Err(general_err!( - "unexpected parquet compression codec: {}", - value.0 - )) - } - }) - } -} - -impl From for parquet::CompressionCodec { - fn from(value: Compression) -> Self { - match value { - Compression::UNCOMPRESSED => parquet::CompressionCodec::UNCOMPRESSED, - Compression::SNAPPY => parquet::CompressionCodec::SNAPPY, - Compression::GZIP(_) => parquet::CompressionCodec::GZIP, - Compression::LZO => parquet::CompressionCodec::LZO, - Compression::BROTLI(_) => parquet::CompressionCodec::BROTLI, - Compression::LZ4 => parquet::CompressionCodec::LZ4, - Compression::ZSTD(_) => parquet::CompressionCodec::ZSTD, - Compression::LZ4_RAW => parquet::CompressionCodec::LZ4_RAW, - } - } -} - -// ---------------------------------------------------------------------- -// parquet::PageType <=> PageType conversion - -impl TryFrom for PageType { - type Error = ParquetError; - - fn try_from(value: parquet::PageType) -> Result { - Ok(match value { - parquet::PageType::DATA_PAGE => PageType::DATA_PAGE, - parquet::PageType::INDEX_PAGE => PageType::INDEX_PAGE, - parquet::PageType::DICTIONARY_PAGE => PageType::DICTIONARY_PAGE, - parquet::PageType::DATA_PAGE_V2 => PageType::DATA_PAGE_V2, - _ => return Err(general_err!("unexpected parquet page type: {}", value.0)), - }) - } -} - -impl From for parquet::PageType { - fn from(value: PageType) -> Self { - match value { - PageType::DATA_PAGE => parquet::PageType::DATA_PAGE, - PageType::INDEX_PAGE => parquet::PageType::INDEX_PAGE, - PageType::DICTIONARY_PAGE => parquet::PageType::DICTIONARY_PAGE, - PageType::DATA_PAGE_V2 => parquet::PageType::DATA_PAGE_V2, - } - } -} - // ---------------------------------------------------------------------- // String conversions for schema parsing. 
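Before the next hunk, a brief usage sketch of these string conversions. The expected values mirror the `FromStr` arms changed below; the `unwrap` calls and surrounding scaffolding are example-only:

```rust
use std::str::FromStr;

// Textual logical types carry no parameters, so `from_str` fills in
// defaults: MILLIS for the time-like types, and an unset CRS plus the
// spec-default SPHERICAL algorithm for GEOGRAPHY.
let ts = LogicalType::from_str("TIMESTAMP").unwrap();
assert_eq!(
    ts,
    LogicalType::Timestamp {
        is_adjusted_to_u_t_c: false,
        unit: TimeUnit::MILLIS,
    }
);

let geog = LogicalType::from_str("GEOGRAPHY").unwrap();
assert_eq!(
    geog,
    LogicalType::Geography {
        crs: None,
        algorithm: Some(EdgeInterpolationAlgorithm::SPHERICAL),
    }
);
```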
@@ -1186,11 +1410,11 @@ impl str::FromStr for LogicalType { "DATE" => Ok(LogicalType::Date), "TIME" => Ok(LogicalType::Time { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MILLIS(parquet::MilliSeconds {}), + unit: TimeUnit::MILLIS, }), "TIMESTAMP" => Ok(LogicalType::Timestamp { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MILLIS(parquet::MilliSeconds {}), + unit: TimeUnit::MILLIS, }), "STRING" => Ok(LogicalType::String), "JSON" => Ok(LogicalType::Json), @@ -1201,8 +1425,11 @@ impl str::FromStr for LogicalType { "Interval parquet logical type not yet supported" )), "FLOAT16" => Ok(LogicalType::Float16), - "GEOMETRY" => Ok(LogicalType::Geometry), - "GEOGRAPHY" => Ok(LogicalType::Geography), + "GEOMETRY" => Ok(LogicalType::Geometry { crs: None }), + "GEOGRAPHY" => Ok(LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::SPHERICAL), + }), other => Err(general_err!("Invalid parquet logical type {}", other)), } } @@ -1212,6 +1439,7 @@ impl str::FromStr for LogicalType { #[allow(deprecated)] // allow BIT_PACKED encoding for the whole test module mod tests { use super::*; + use crate::parquet_thrift::{tests::test_roundtrip, ThriftSliceInputProtocol}; #[test] fn test_display_type() { @@ -1220,47 +1448,11 @@ mod tests { assert_eq!(Type::INT64.to_string(), "INT64"); assert_eq!(Type::INT96.to_string(), "INT96"); assert_eq!(Type::FLOAT.to_string(), "FLOAT"); - assert_eq!(Type::DOUBLE.to_string(), "DOUBLE"); - assert_eq!(Type::BYTE_ARRAY.to_string(), "BYTE_ARRAY"); - assert_eq!( - Type::FIXED_LEN_BYTE_ARRAY.to_string(), - "FIXED_LEN_BYTE_ARRAY" - ); - } - - #[test] - fn test_from_type() { - assert_eq!( - Type::try_from(parquet::Type::BOOLEAN).unwrap(), - Type::BOOLEAN - ); - assert_eq!(Type::try_from(parquet::Type::INT32).unwrap(), Type::INT32); - assert_eq!(Type::try_from(parquet::Type::INT64).unwrap(), Type::INT64); - assert_eq!(Type::try_from(parquet::Type::INT96).unwrap(), Type::INT96); - assert_eq!(Type::try_from(parquet::Type::FLOAT).unwrap(), Type::FLOAT); - assert_eq!(Type::try_from(parquet::Type::DOUBLE).unwrap(), Type::DOUBLE); - assert_eq!( - Type::try_from(parquet::Type::BYTE_ARRAY).unwrap(), - Type::BYTE_ARRAY - ); - assert_eq!( - Type::try_from(parquet::Type::FIXED_LEN_BYTE_ARRAY).unwrap(), - Type::FIXED_LEN_BYTE_ARRAY - ); - } - - #[test] - fn test_into_type() { - assert_eq!(parquet::Type::BOOLEAN, Type::BOOLEAN.into()); - assert_eq!(parquet::Type::INT32, Type::INT32.into()); - assert_eq!(parquet::Type::INT64, Type::INT64.into()); - assert_eq!(parquet::Type::INT96, Type::INT96.into()); - assert_eq!(parquet::Type::FLOAT, Type::FLOAT.into()); - assert_eq!(parquet::Type::DOUBLE, Type::DOUBLE.into()); - assert_eq!(parquet::Type::BYTE_ARRAY, Type::BYTE_ARRAY.into()); - assert_eq!( - parquet::Type::FIXED_LEN_BYTE_ARRAY, - Type::FIXED_LEN_BYTE_ARRAY.into() + assert_eq!(Type::DOUBLE.to_string(), "DOUBLE"); + assert_eq!(Type::BYTE_ARRAY.to_string(), "BYTE_ARRAY"); + assert_eq!( + Type::FIXED_LEN_BYTE_ARRAY.to_string(), + "FIXED_LEN_BYTE_ARRAY" ); } @@ -1304,6 +1496,43 @@ mod tests { ); } + #[test] + fn test_converted_type_roundtrip() { + test_roundtrip(ConvertedType::UTF8); + test_roundtrip(ConvertedType::MAP); + test_roundtrip(ConvertedType::MAP_KEY_VALUE); + test_roundtrip(ConvertedType::LIST); + test_roundtrip(ConvertedType::ENUM); + test_roundtrip(ConvertedType::DECIMAL); + test_roundtrip(ConvertedType::DATE); + test_roundtrip(ConvertedType::TIME_MILLIS); + test_roundtrip(ConvertedType::TIME_MICROS); + test_roundtrip(ConvertedType::TIMESTAMP_MILLIS); + 
test_roundtrip(ConvertedType::TIMESTAMP_MICROS); + test_roundtrip(ConvertedType::UINT_8); + test_roundtrip(ConvertedType::UINT_16); + test_roundtrip(ConvertedType::UINT_32); + test_roundtrip(ConvertedType::UINT_64); + test_roundtrip(ConvertedType::INT_8); + test_roundtrip(ConvertedType::INT_16); + test_roundtrip(ConvertedType::INT_32); + test_roundtrip(ConvertedType::INT_64); + test_roundtrip(ConvertedType::JSON); + test_roundtrip(ConvertedType::BSON); + test_roundtrip(ConvertedType::INTERVAL); + } + + #[test] + fn test_read_invalid_converted_type() { + let mut prot = ThriftSliceInputProtocol::new(&[0x7eu8]); + let res = ConvertedType::read_thrift(&mut prot); + assert!(res.is_err()); + assert_eq!( + res.unwrap_err().to_string(), + "Parquet error: Unexpected ConvertedType 63" + ); + } + #[test] fn test_display_converted_type() { assert_eq!(ConvertedType::NONE.to_string(), "NONE"); @@ -1339,202 +1568,6 @@ mod tests { assert_eq!(ConvertedType::DECIMAL.to_string(), "DECIMAL") } - #[test] - fn test_from_converted_type() { - let parquet_conv_none: Option = None; - assert_eq!( - ConvertedType::try_from(parquet_conv_none).unwrap(), - ConvertedType::NONE - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::UTF8)).unwrap(), - ConvertedType::UTF8 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::MAP)).unwrap(), - ConvertedType::MAP - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::MAP_KEY_VALUE)).unwrap(), - ConvertedType::MAP_KEY_VALUE - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::LIST)).unwrap(), - ConvertedType::LIST - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::ENUM)).unwrap(), - ConvertedType::ENUM - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::DECIMAL)).unwrap(), - ConvertedType::DECIMAL - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::DATE)).unwrap(), - ConvertedType::DATE - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::TIME_MILLIS)).unwrap(), - ConvertedType::TIME_MILLIS - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::TIME_MICROS)).unwrap(), - ConvertedType::TIME_MICROS - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::TIMESTAMP_MILLIS)).unwrap(), - ConvertedType::TIMESTAMP_MILLIS - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::TIMESTAMP_MICROS)).unwrap(), - ConvertedType::TIMESTAMP_MICROS - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::UINT_8)).unwrap(), - ConvertedType::UINT_8 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::UINT_16)).unwrap(), - ConvertedType::UINT_16 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::UINT_32)).unwrap(), - ConvertedType::UINT_32 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::UINT_64)).unwrap(), - ConvertedType::UINT_64 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::INT_8)).unwrap(), - ConvertedType::INT_8 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::INT_16)).unwrap(), - ConvertedType::INT_16 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::INT_32)).unwrap(), - ConvertedType::INT_32 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::INT_64)).unwrap(), - ConvertedType::INT_64 - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::JSON)).unwrap(), - 
ConvertedType::JSON - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::BSON)).unwrap(), - ConvertedType::BSON - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::INTERVAL)).unwrap(), - ConvertedType::INTERVAL - ); - assert_eq!( - ConvertedType::try_from(Some(parquet::ConvertedType::DECIMAL)).unwrap(), - ConvertedType::DECIMAL - ) - } - - #[test] - fn test_into_converted_type() { - let converted_type: Option = None; - assert_eq!(converted_type, ConvertedType::NONE.into()); - assert_eq!( - Some(parquet::ConvertedType::UTF8), - ConvertedType::UTF8.into() - ); - assert_eq!(Some(parquet::ConvertedType::MAP), ConvertedType::MAP.into()); - assert_eq!( - Some(parquet::ConvertedType::MAP_KEY_VALUE), - ConvertedType::MAP_KEY_VALUE.into() - ); - assert_eq!( - Some(parquet::ConvertedType::LIST), - ConvertedType::LIST.into() - ); - assert_eq!( - Some(parquet::ConvertedType::ENUM), - ConvertedType::ENUM.into() - ); - assert_eq!( - Some(parquet::ConvertedType::DECIMAL), - ConvertedType::DECIMAL.into() - ); - assert_eq!( - Some(parquet::ConvertedType::DATE), - ConvertedType::DATE.into() - ); - assert_eq!( - Some(parquet::ConvertedType::TIME_MILLIS), - ConvertedType::TIME_MILLIS.into() - ); - assert_eq!( - Some(parquet::ConvertedType::TIME_MICROS), - ConvertedType::TIME_MICROS.into() - ); - assert_eq!( - Some(parquet::ConvertedType::TIMESTAMP_MILLIS), - ConvertedType::TIMESTAMP_MILLIS.into() - ); - assert_eq!( - Some(parquet::ConvertedType::TIMESTAMP_MICROS), - ConvertedType::TIMESTAMP_MICROS.into() - ); - assert_eq!( - Some(parquet::ConvertedType::UINT_8), - ConvertedType::UINT_8.into() - ); - assert_eq!( - Some(parquet::ConvertedType::UINT_16), - ConvertedType::UINT_16.into() - ); - assert_eq!( - Some(parquet::ConvertedType::UINT_32), - ConvertedType::UINT_32.into() - ); - assert_eq!( - Some(parquet::ConvertedType::UINT_64), - ConvertedType::UINT_64.into() - ); - assert_eq!( - Some(parquet::ConvertedType::INT_8), - ConvertedType::INT_8.into() - ); - assert_eq!( - Some(parquet::ConvertedType::INT_16), - ConvertedType::INT_16.into() - ); - assert_eq!( - Some(parquet::ConvertedType::INT_32), - ConvertedType::INT_32.into() - ); - assert_eq!( - Some(parquet::ConvertedType::INT_64), - ConvertedType::INT_64.into() - ); - assert_eq!( - Some(parquet::ConvertedType::JSON), - ConvertedType::JSON.into() - ); - assert_eq!( - Some(parquet::ConvertedType::BSON), - ConvertedType::BSON.into() - ); - assert_eq!( - Some(parquet::ConvertedType::INTERVAL), - ConvertedType::INTERVAL.into() - ); - assert_eq!( - Some(parquet::ConvertedType::DECIMAL), - ConvertedType::DECIMAL.into() - ) - } - #[test] fn test_from_string_into_converted_type() { assert_eq!( @@ -1736,42 +1769,42 @@ mod tests { ); assert_eq!( ConvertedType::from(Some(LogicalType::Time { - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: true, })), ConvertedType::TIME_MILLIS ); assert_eq!( ConvertedType::from(Some(LogicalType::Time { - unit: TimeUnit::MICROS(Default::default()), + unit: TimeUnit::MICROS, is_adjusted_to_u_t_c: true, })), ConvertedType::TIME_MICROS ); assert_eq!( ConvertedType::from(Some(LogicalType::Time { - unit: TimeUnit::NANOS(Default::default()), + unit: TimeUnit::NANOS, is_adjusted_to_u_t_c: false, })), ConvertedType::NONE ); assert_eq!( ConvertedType::from(Some(LogicalType::Timestamp { - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: true, })), ConvertedType::TIMESTAMP_MILLIS ); assert_eq!( 
ConvertedType::from(Some(LogicalType::Timestamp { - unit: TimeUnit::MICROS(Default::default()), + unit: TimeUnit::MICROS, is_adjusted_to_u_t_c: false, })), ConvertedType::TIMESTAMP_MICROS ); assert_eq!( ConvertedType::from(Some(LogicalType::Timestamp { - unit: TimeUnit::NANOS(Default::default()), + unit: TimeUnit::NANOS, is_adjusted_to_u_t_c: false, })), ConvertedType::NONE @@ -1852,12 +1885,106 @@ mod tests { ConvertedType::from(Some(LogicalType::Float16)), ConvertedType::NONE ); + assert_eq!( + ConvertedType::from(Some(LogicalType::Geometry { crs: None })), + ConvertedType::NONE + ); + assert_eq!( + ConvertedType::from(Some(LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::default()), + })), + ConvertedType::NONE + ); assert_eq!( ConvertedType::from(Some(LogicalType::Unknown)), ConvertedType::NONE ); } + #[test] + fn test_logical_type_roundtrip() { + test_roundtrip(LogicalType::String); + test_roundtrip(LogicalType::Map); + test_roundtrip(LogicalType::List); + test_roundtrip(LogicalType::Enum); + test_roundtrip(LogicalType::Decimal { + scale: 0, + precision: 20, + }); + test_roundtrip(LogicalType::Date); + test_roundtrip(LogicalType::Time { + is_adjusted_to_u_t_c: true, + unit: TimeUnit::MICROS, + }); + test_roundtrip(LogicalType::Time { + is_adjusted_to_u_t_c: false, + unit: TimeUnit::MILLIS, + }); + test_roundtrip(LogicalType::Time { + is_adjusted_to_u_t_c: false, + unit: TimeUnit::NANOS, + }); + test_roundtrip(LogicalType::Timestamp { + is_adjusted_to_u_t_c: false, + unit: TimeUnit::MICROS, + }); + test_roundtrip(LogicalType::Timestamp { + is_adjusted_to_u_t_c: true, + unit: TimeUnit::MILLIS, + }); + test_roundtrip(LogicalType::Timestamp { + is_adjusted_to_u_t_c: true, + unit: TimeUnit::NANOS, + }); + test_roundtrip(LogicalType::Integer { + bit_width: 8, + is_signed: true, + }); + test_roundtrip(LogicalType::Integer { + bit_width: 16, + is_signed: false, + }); + test_roundtrip(LogicalType::Integer { + bit_width: 32, + is_signed: true, + }); + test_roundtrip(LogicalType::Integer { + bit_width: 64, + is_signed: false, + }); + test_roundtrip(LogicalType::Json); + test_roundtrip(LogicalType::Bson); + test_roundtrip(LogicalType::Uuid); + test_roundtrip(LogicalType::Float16); + test_roundtrip(LogicalType::Variant { + specification_version: Some(1), + }); + test_roundtrip(LogicalType::Variant { + specification_version: None, + }); + test_roundtrip(LogicalType::Geometry { + crs: Some("foo".to_owned()), + }); + test_roundtrip(LogicalType::Geometry { crs: None }); + test_roundtrip(LogicalType::Geography { + crs: Some("foo".to_owned()), + algorithm: Some(EdgeInterpolationAlgorithm::ANDOYER), + }); + test_roundtrip(LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::KARNEY), + }); + test_roundtrip(LogicalType::Geography { + crs: Some("foo".to_owned()), + algorithm: Some(EdgeInterpolationAlgorithm::SPHERICAL), + }); + test_roundtrip(LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::SPHERICAL), + }); + } + #[test] fn test_display_repetition() { assert_eq!(Repetition::REQUIRED.to_string(), "REQUIRED"); @@ -1865,38 +1992,6 @@ mod tests { assert_eq!(Repetition::REPEATED.to_string(), "REPEATED"); } - #[test] - fn test_from_repetition() { - assert_eq!( - Repetition::try_from(parquet::FieldRepetitionType::REQUIRED).unwrap(), - Repetition::REQUIRED - ); - assert_eq!( - Repetition::try_from(parquet::FieldRepetitionType::OPTIONAL).unwrap(), - Repetition::OPTIONAL - ); - assert_eq!( - 
Repetition::try_from(parquet::FieldRepetitionType::REPEATED).unwrap(), - Repetition::REPEATED - ); - } - - #[test] - fn test_into_repetition() { - assert_eq!( - parquet::FieldRepetitionType::REQUIRED, - Repetition::REQUIRED.into() - ); - assert_eq!( - parquet::FieldRepetitionType::OPTIONAL, - Repetition::OPTIONAL.into() - ); - assert_eq!( - parquet::FieldRepetitionType::REPEATED, - Repetition::REPEATED.into() - ); - } - #[test] fn test_from_string_into_repetition() { assert_eq!( @@ -1940,61 +2035,6 @@ mod tests { assert_eq!(Encoding::RLE_DICTIONARY.to_string(), "RLE_DICTIONARY"); } - #[test] - fn test_from_encoding() { - assert_eq!( - Encoding::try_from(parquet::Encoding::PLAIN).unwrap(), - Encoding::PLAIN - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::PLAIN_DICTIONARY).unwrap(), - Encoding::PLAIN_DICTIONARY - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::RLE).unwrap(), - Encoding::RLE - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::BIT_PACKED).unwrap(), - Encoding::BIT_PACKED - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::DELTA_BINARY_PACKED).unwrap(), - Encoding::DELTA_BINARY_PACKED - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::DELTA_LENGTH_BYTE_ARRAY).unwrap(), - Encoding::DELTA_LENGTH_BYTE_ARRAY - ); - assert_eq!( - Encoding::try_from(parquet::Encoding::DELTA_BYTE_ARRAY).unwrap(), - Encoding::DELTA_BYTE_ARRAY - ); - } - - #[test] - fn test_into_encoding() { - assert_eq!(parquet::Encoding::PLAIN, Encoding::PLAIN.into()); - assert_eq!( - parquet::Encoding::PLAIN_DICTIONARY, - Encoding::PLAIN_DICTIONARY.into() - ); - assert_eq!(parquet::Encoding::RLE, Encoding::RLE.into()); - assert_eq!(parquet::Encoding::BIT_PACKED, Encoding::BIT_PACKED.into()); - assert_eq!( - parquet::Encoding::DELTA_BINARY_PACKED, - Encoding::DELTA_BINARY_PACKED.into() - ); - assert_eq!( - parquet::Encoding::DELTA_LENGTH_BYTE_ARRAY, - Encoding::DELTA_LENGTH_BYTE_ARRAY.into() - ); - assert_eq!( - parquet::Encoding::DELTA_BYTE_ARRAY, - Encoding::DELTA_BYTE_ARRAY.into() - ); - } - #[test] fn test_compression_codec_to_string() { assert_eq!(Compression::UNCOMPRESSED.codec_to_string(), "UNCOMPRESSED"); @@ -2024,64 +2064,6 @@ mod tests { ); } - #[test] - fn test_from_compression() { - assert_eq!( - Compression::try_from(parquet::CompressionCodec::UNCOMPRESSED).unwrap(), - Compression::UNCOMPRESSED - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::SNAPPY).unwrap(), - Compression::SNAPPY - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::GZIP).unwrap(), - Compression::GZIP(Default::default()) - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::LZO).unwrap(), - Compression::LZO - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::BROTLI).unwrap(), - Compression::BROTLI(Default::default()) - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::LZ4).unwrap(), - Compression::LZ4 - ); - assert_eq!( - Compression::try_from(parquet::CompressionCodec::ZSTD).unwrap(), - Compression::ZSTD(Default::default()) - ); - } - - #[test] - fn test_into_compression() { - assert_eq!( - parquet::CompressionCodec::UNCOMPRESSED, - Compression::UNCOMPRESSED.into() - ); - assert_eq!( - parquet::CompressionCodec::SNAPPY, - Compression::SNAPPY.into() - ); - assert_eq!( - parquet::CompressionCodec::GZIP, - Compression::GZIP(Default::default()).into() - ); - assert_eq!(parquet::CompressionCodec::LZO, Compression::LZO.into()); - assert_eq!( - parquet::CompressionCodec::BROTLI, - 
Compression::BROTLI(Default::default()).into() - ); - assert_eq!(parquet::CompressionCodec::LZ4, Compression::LZ4.into()); - assert_eq!( - parquet::CompressionCodec::ZSTD, - Compression::ZSTD(Default::default()).into() - ); - } - #[test] fn test_display_page_type() { assert_eq!(PageType::DATA_PAGE.to_string(), "DATA_PAGE"); @@ -2090,40 +2072,6 @@ mod tests { assert_eq!(PageType::DATA_PAGE_V2.to_string(), "DATA_PAGE_V2"); } - #[test] - fn test_from_page_type() { - assert_eq!( - PageType::try_from(parquet::PageType::DATA_PAGE).unwrap(), - PageType::DATA_PAGE - ); - assert_eq!( - PageType::try_from(parquet::PageType::INDEX_PAGE).unwrap(), - PageType::INDEX_PAGE - ); - assert_eq!( - PageType::try_from(parquet::PageType::DICTIONARY_PAGE).unwrap(), - PageType::DICTIONARY_PAGE - ); - assert_eq!( - PageType::try_from(parquet::PageType::DATA_PAGE_V2).unwrap(), - PageType::DATA_PAGE_V2 - ); - } - - #[test] - fn test_into_page_type() { - assert_eq!(parquet::PageType::DATA_PAGE, PageType::DATA_PAGE.into()); - assert_eq!(parquet::PageType::INDEX_PAGE, PageType::INDEX_PAGE.into()); - assert_eq!( - parquet::PageType::DICTIONARY_PAGE, - PageType::DICTIONARY_PAGE.into() - ); - assert_eq!( - parquet::PageType::DATA_PAGE_V2, - PageType::DATA_PAGE_V2.into() - ); - } - #[test] fn test_display_sort_order() { assert_eq!(SortOrder::SIGNED.to_string(), "SIGNED"); @@ -2148,6 +2096,12 @@ mod tests { assert_eq!(ColumnOrder::UNDEFINED.to_string(), "UNDEFINED"); } + #[test] + fn test_column_order_roundtrip() { + // SortOrder::SIGNED is the default on read. + test_roundtrip(ColumnOrder::TYPE_DEFINED_ORDER(SortOrder::SIGNED)) + } + #[test] fn test_column_order_get_logical_type_sort_order() { // Helper to check the order in a list of values. @@ -2212,34 +2166,42 @@ mod tests { LogicalType::Date, LogicalType::Time { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, }, LogicalType::Time { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MICROS(Default::default()), + unit: TimeUnit::MICROS, }, LogicalType::Time { is_adjusted_to_u_t_c: true, - unit: TimeUnit::NANOS(Default::default()), + unit: TimeUnit::NANOS, }, LogicalType::Timestamp { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, }, LogicalType::Timestamp { is_adjusted_to_u_t_c: false, - unit: TimeUnit::MICROS(Default::default()), + unit: TimeUnit::MICROS, }, LogicalType::Timestamp { is_adjusted_to_u_t_c: true, - unit: TimeUnit::NANOS(Default::default()), + unit: TimeUnit::NANOS, }, LogicalType::Float16, ]; check_sort_order(signed, SortOrder::SIGNED); // Undefined comparison - let undefined = vec![LogicalType::List, LogicalType::Map]; + let undefined = vec![ + LogicalType::List, + LogicalType::Map, + LogicalType::Geometry { crs: None }, + LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::default()), + }, + ]; check_sort_order(undefined, SortOrder::UNDEFINED); } @@ -2428,4 +2390,23 @@ mod tests { "Parquet error: unknown encoding: gzip(-10)" ); } + + #[test] + fn test_display_boundary_order() { + assert_eq!(BoundaryOrder::ASCENDING.to_string(), "ASCENDING"); + assert_eq!(BoundaryOrder::DESCENDING.to_string(), "DESCENDING"); + assert_eq!(BoundaryOrder::UNORDERED.to_string(), "UNORDERED"); + } + + #[test] + fn test_display_edge_algo() { + assert_eq!( + EdgeInterpolationAlgorithm::SPHERICAL.to_string(), + "SPHERICAL" + ); + assert_eq!(EdgeInterpolationAlgorithm::VINCENTY.to_string(), "VINCENTY"); + 
assert_eq!(EdgeInterpolationAlgorithm::THOMAS.to_string(), "THOMAS"); + assert_eq!(EdgeInterpolationAlgorithm::ANDOYER.to_string(), "ANDOYER"); + assert_eq!(EdgeInterpolationAlgorithm::KARNEY.to_string(), "KARNEY"); + } } diff --git a/parquet/src/bin/parquet-index.rs b/parquet/src/bin/parquet-index.rs index 1a9b74dd78fb..397a75c76ae4 100644 --- a/parquet/src/bin/parquet-index.rs +++ b/parquet/src/bin/parquet-index.rs @@ -35,12 +35,14 @@ //! [page index]: https://github.com/apache/parquet-format/blob/master/PageIndex.md use clap::Parser; +use parquet::data_type::ByteArray; use parquet::errors::{ParquetError, Result}; -use parquet::file::page_index::index::{Index, PageIndex}; -use parquet::file::page_index::offset_index::OffsetIndexMetaData; +use parquet::file::page_index::column_index::{ + ByteArrayColumnIndex, ColumnIndexMetaData, PrimitiveColumnIndex, +}; +use parquet::file::page_index::offset_index::{OffsetIndexMetaData, PageLocation}; use parquet::file::reader::{FileReader, SerializedFileReader}; use parquet::file::serialized_reader::ReadOptionsBuilder; -use parquet::format::PageLocation; use std::fs::File; #[derive(Debug, Parser)] @@ -97,16 +99,20 @@ impl Args { let row_counts = compute_row_counts(offset_index.page_locations.as_slice(), row_group.num_rows()); match &column_indices[column_idx] { - Index::NONE => println!("NO INDEX"), - Index::BOOLEAN(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::INT32(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::INT64(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::INT96(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::FLOAT(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::DOUBLE(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::BYTE_ARRAY(v) => print_index(&v.indexes, offset_index, &row_counts)?, - Index::FIXED_LEN_BYTE_ARRAY(v) => { - print_index(&v.indexes, offset_index, &row_counts)? + ColumnIndexMetaData::NONE => println!("NO INDEX"), + ColumnIndexMetaData::BOOLEAN(v) => { + print_index::(v, offset_index, &row_counts)? + } + ColumnIndexMetaData::INT32(v) => print_index(v, offset_index, &row_counts)?, + ColumnIndexMetaData::INT64(v) => print_index(v, offset_index, &row_counts)?, + ColumnIndexMetaData::INT96(v) => print_index(v, offset_index, &row_counts)?, + ColumnIndexMetaData::FLOAT(v) => print_index(v, offset_index, &row_counts)?, + ColumnIndexMetaData::DOUBLE(v) => print_index(v, offset_index, &row_counts)?, + ColumnIndexMetaData::BYTE_ARRAY(v) => { + print_bytes_index(v, offset_index, &row_counts)? + } + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(v) => { + print_bytes_index(v, offset_index, &row_counts)? 
+            }
         }
     }
 }
@@ -132,20 +138,21 @@ fn compute_row_counts(offset_index: &[PageLocation], rows: i64) -> Vec<i64> {
 /// Prints index information for a single column chunk
 fn print_index<T: std::fmt::Display>(
-    column_index: &[PageIndex<T>],
+    column_index: &PrimitiveColumnIndex<T>,
     offset_index: &OffsetIndexMetaData,
     row_counts: &[i64],
 ) -> Result<()> {
-    if column_index.len() != offset_index.page_locations.len() {
+    if column_index.num_pages() as usize != offset_index.page_locations.len() {
         return Err(ParquetError::General(format!(
             "Index length mismatch, got {} and {}",
-            column_index.len(),
+            column_index.num_pages(),
             offset_index.page_locations.len()
         )));
     }

-    for (idx, ((c, o), row_count)) in column_index
-        .iter()
+    for (idx, (((min, max), o), row_count)) in column_index
+        .min_values_iter()
+        .zip(column_index.max_values_iter())
         .zip(offset_index.page_locations())
         .zip(row_counts)
         .enumerate()
@@ -154,12 +161,12 @@ fn print_index<T: std::fmt::Display>(
             "Page {:>5} at offset {:#010x} with length {:>10} and row count {:>10}",
             idx, o.offset, o.compressed_page_size, row_count
         );
-        match &c.min {
+        match min {
             Some(m) => print!(", min {m:>10}"),
             None => print!(", min {:>10}", "NONE"),
         }

-        match &c.max {
+        match max {
             Some(m) => print!(", max {m:>10}"),
             None => print!(", max {:>10}", "NONE"),
         }
@@ -169,6 +176,51 @@ fn print_index<T: std::fmt::Display>(
     Ok(())
 }

+fn print_bytes_index(
+    column_index: &ByteArrayColumnIndex,
+    offset_index: &OffsetIndexMetaData,
+    row_counts: &[i64],
+) -> Result<()> {
+    if column_index.num_pages() as usize != offset_index.page_locations.len() {
+        return Err(ParquetError::General(format!(
+            "Index length mismatch, got {} and {}",
+            column_index.num_pages(),
+            offset_index.page_locations.len()
+        )));
+    }
+
+    for (idx, (((min, max), o), row_count)) in column_index
+        .min_values_iter()
+        .zip(column_index.max_values_iter())
+        .zip(offset_index.page_locations())
+        .zip(row_counts)
+        .enumerate()
+    {
+        print!(
+            "Page {:>5} at offset {:#010x} with length {:>10} and row count {:>10}",
+            idx, o.offset, o.compressed_page_size, row_count
+        );
+        match min {
+            Some(m) => match String::from_utf8(m.to_vec()) {
+                Ok(s) => print!(", min {s:>10}"),
+                Err(_) => print!(", min {:>10}", ByteArray::from(m)),
+            },
+            None => print!(", min {:>10}", "NONE"),
+        }
+
+        match max {
+            Some(m) => match String::from_utf8(m.to_vec()) {
+                Ok(s) => print!(", max {s:>10}"),
+                Err(_) => print!(", max {:>10}", ByteArray::from(m)),
+            },
+            None => print!(", max {:>10}", "NONE"),
+        }
+        println!()
+    }
+
+    Ok(())
+}
+
 fn main() -> Result<()> {
     Args::parse().run()
 }
diff --git a/parquet/src/bin/parquet-layout.rs b/parquet/src/bin/parquet-layout.rs
index 46a231a7d02b..6f589fab66ed 100644
--- a/parquet/src/bin/parquet-layout.rs
+++ b/parquet/src/bin/parquet-layout.rs
@@ -41,7 +41,7 @@ use parquet::file::metadata::ParquetMetaDataReader;
 use serde::Serialize;
 use thrift::protocol::TCompactInputProtocol;

-use parquet::basic::{Compression, Encoding};
+use parquet::basic::Compression;
 use parquet::errors::Result;
 use parquet::file::reader::ChunkReader;
 use parquet::format::PageHeader;
@@ -105,7 +105,7 @@ fn do_layout(reader: &C) -> Result {
         if let Some(dictionary) = header.dictionary_page_header {
             pages.push(Page {
                 compression,
-                encoding: encoding(dictionary.encoding),
+                encoding: encoding(dictionary.encoding.0),
                 page_type: "dictionary",
                 offset: start,
                 compressed_bytes: header.compressed_page_size,
@@ -116,7 +116,7 @@
         } else if let Some(data_page) = header.data_page_header {
             pages.push(Page {
                 compression,
-                encoding: encoding(data_page.encoding),
+
encoding: encoding(data_page.encoding.0), page_type: "data_page_v1", offset: start, compressed_bytes: header.compressed_page_size, @@ -129,7 +129,7 @@ fn do_layout(reader: &C) -> Result { pages.push(Page { compression: compression.filter(|_| is_compressed), - encoding: encoding(data_page.encoding), + encoding: encoding(data_page.encoding.0), page_type: "data_page_v2", offset: start, compressed_bytes: header.compressed_page_size, @@ -196,19 +196,19 @@ fn compression(compression: Compression) -> Option<&'static str> { } /// Returns a string representation for a given encoding -fn encoding(encoding: parquet::format::Encoding) -> &'static str { - match Encoding::try_from(encoding) { - Ok(Encoding::PLAIN) => "plain", - Ok(Encoding::PLAIN_DICTIONARY) => "plain_dictionary", - Ok(Encoding::RLE) => "rle", +fn encoding(encoding: i32) -> &'static str { + match encoding { + 0 => "plain", + 2 => "plain_dictionary", + 3 => "rle", #[allow(deprecated)] - Ok(Encoding::BIT_PACKED) => "bit_packed", - Ok(Encoding::DELTA_BINARY_PACKED) => "delta_binary_packed", - Ok(Encoding::DELTA_LENGTH_BYTE_ARRAY) => "delta_length_byte_array", - Ok(Encoding::DELTA_BYTE_ARRAY) => "delta_byte_array", - Ok(Encoding::RLE_DICTIONARY) => "rle_dictionary", - Ok(Encoding::BYTE_STREAM_SPLIT) => "byte_stream_split", - Err(_) => "unknown", + 4 => "bit_packed", + 5 => "delta_binary_packed", + 6 => "delta_length_byte_array", + 7 => "delta_byte_array", + 8 => "rle_dictionary", + 9 => "byte_stream_split", + _ => "unknown", } } diff --git a/parquet/src/bloom_filter/mod.rs b/parquet/src/bloom_filter/mod.rs index 09302bab8fec..290a887b2960 100644 --- a/parquet/src/bloom_filter/mod.rs +++ b/parquet/src/bloom_filter/mod.rs @@ -72,18 +72,18 @@ //! [sbbf-paper]: https://arxiv.org/pdf/2101.01719 //! [bf-formulae]: http://tfk.mit.edu/pdf/bloom.pdf +use crate::basic::{BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash}; use crate::data_type::AsBytes; -use crate::errors::ParquetError; +use crate::errors::{ParquetError, Result}; use crate::file::metadata::ColumnChunkMetaData; use crate::file::reader::ChunkReader; -use crate::format::{ - BloomFilterAlgorithm, BloomFilterCompression, BloomFilterHash, BloomFilterHeader, - SplitBlockAlgorithm, Uncompressed, XxHash, +use crate::parquet_thrift::{ + ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, ThriftCompactOutputProtocol, + ThriftSliceInputProtocol, WriteThrift, WriteThriftField, }; -use crate::thrift::{TCompactSliceInputProtocol, TSerializable}; +use crate::thrift_struct; use bytes::Bytes; use std::io::Write; -use thrift::protocol::{TCompactOutputProtocol, TOutputProtocol}; use twox_hash::XxHash64; /// Salt as defined in the [spec](https://github.com/apache/parquet-format/blob/master/BloomFilter.md#technical-approach). @@ -98,6 +98,22 @@ const SALT: [u32; 8] = [ 0x5c6bfb31_u32, ]; +thrift_struct!( +/// Bloom filter header is stored at beginning of Bloom filter data of each column +/// and followed by its bitset. +/// +pub struct BloomFilterHeader { + /// The size of bitset in bytes + 1: required i32 num_bytes; + /// The algorithm for setting bits. + 2: required BloomFilterAlgorithm algorithm; + /// The hash function used for Bloom filter + 3: required BloomFilterHash hash; + /// The compression used in the Bloom filter + 4: required BloomFilterCompression compression; +} +); + /// Each block is 256 bits, broken up into eight contiguous "words", each consisting of 32 bits. /// Each word is thought of as an array of bits; each bit is either "set" or "not set". 
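///
/// Illustrative sketch (an editor's assumption for clarity, not text from this patch):
/// inserting a 32-bit hash `x` into a block sets one bit in each of the eight words,
/// with word `i` using bit position `(x.wrapping_mul(SALT[i])) >> 27`, i.e. the top
/// five bits of the salted product select one of the word's 32 bit positions.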
#[derive(Debug, Copy, Clone)]
@@ -201,8 +217,8 @@ pub(crate) fn read_bloom_filter_header_and_length(
buffer: Bytes,
) -> Result<(BloomFilterHeader, u64), ParquetError> {
let total_length = buffer.len();
- let mut prot = TCompactSliceInputProtocol::new(buffer.as_ref());
- let header = BloomFilterHeader::read_from_in_protocol(&mut prot)
+ let mut prot = ThriftSliceInputProtocol::new(buffer.as_ref());
+ let header = BloomFilterHeader::read_thrift(&mut prot)
.map_err(|e| ParquetError::General(format!("Could not read bloom filter header: {e}")))?;
Ok((header, (total_length - prot.as_slice().len()) as u64))
}
@@ -268,12 +284,10 @@ impl Sbbf {
/// flush the writer in order to boost performance of bulk writing all blocks. Caller
/// must remember to flush the writer.
pub(crate) fn write<W: Write>(&self, mut writer: W) -> Result<(), ParquetError> {
- let mut protocol = TCompactOutputProtocol::new(&mut writer);
- let header = self.header();
- header.write_to_out_protocol(&mut protocol).map_err(|e| {
+ let mut protocol = ThriftCompactOutputProtocol::new(&mut writer);
+ self.header().write_thrift(&mut protocol).map_err(|e| {
ParquetError::General(format!("Could not write bloom filter header: {e}"))
})?;
- protocol.flush()?;
self.write_bitset(&mut writer)?;
Ok(())
}
@@ -312,9 +326,9 @@ impl Sbbf {
BloomFilterHeader {
// 8 i32 per block, 4 bytes per i32
num_bytes: self.0.len() as i32 * 4 * 8,
- algorithm: BloomFilterAlgorithm::BLOCK(SplitBlockAlgorithm {}),
- hash: BloomFilterHash::XXHASH(XxHash {}),
- compression: BloomFilterCompression::UNCOMPRESSED(Uncompressed {}),
+ algorithm: BloomFilterAlgorithm::BLOCK,
+ hash: BloomFilterHash::XXHASH,
+ compression: BloomFilterCompression::UNCOMPRESSED,
}
}
@@ -340,17 +354,17 @@ impl Sbbf {
chunk_read_bloom_filter_header_and_offset(offset, buffer.clone())?;
match header.algorithm {
- BloomFilterAlgorithm::BLOCK(_) => {
+ BloomFilterAlgorithm::BLOCK => {
// this match exists to future proof the singleton algorithm enum
}
}
match header.compression {
- BloomFilterCompression::UNCOMPRESSED(_) => {
+ BloomFilterCompression::UNCOMPRESSED => {
// this match exists to future proof the singleton compression enum
}
}
match header.hash {
- BloomFilterHash::XXHASH(_) => {
+ BloomFilterHash::XXHASH => {
// this match exists to future proof the singleton hash enum
}
}
@@ -478,15 +492,9 @@ mod tests {
read_length,
) = read_bloom_filter_header_and_length(Bytes::copy_from_slice(buffer)).unwrap();
assert_eq!(read_length, 15);
- assert_eq!(
- algorithm,
- BloomFilterAlgorithm::BLOCK(SplitBlockAlgorithm {})
- );
- assert_eq!(
- compression,
- BloomFilterCompression::UNCOMPRESSED(Uncompressed {})
- );
- assert_eq!(hash, BloomFilterHash::XXHASH(XxHash {}));
+ assert_eq!(algorithm, BloomFilterAlgorithm::BLOCK);
+ assert_eq!(compression, BloomFilterCompression::UNCOMPRESSED);
+ assert_eq!(hash, BloomFilterHash::XXHASH);
assert_eq!(num_bytes, 32_i32);
assert_eq!(20, SBBF_HEADER_SIZE_ESTIMATE);
}
diff --git a/parquet/src/column/page.rs b/parquet/src/column/page.rs
index a2f683d71f4e..09125eaabf02 100644
--- a/parquet/src/column/page.rs
+++ b/parquet/src/column/page.rs
@@ -21,8 +21,10 @@ use bytes::Bytes;
use crate::basic::{Encoding, PageType};
use crate::errors::{ParquetError, Result};
-use crate::file::statistics::Statistics;
-use crate::format::PageHeader;
+use crate::file::metadata::thrift_gen::{
+ DataPageHeader, DataPageHeaderV2, DictionaryPageHeader, PageHeader,
+};
+use crate::file::statistics::{page_stats_to_thrift, Statistics};
/// Parquet Page definition.
///
@@ -216,7 +218,7 @@ impl CompressedPage {
let page_type = self.page_type();
let mut page_header = PageHeader {
- type_: page_type.into(),
+ r#type: page_type,
uncompressed_page_size: uncompressed_size as i32,
compressed_page_size: compressed_size as i32,
// TODO: Add support for crc checksum
@@ -234,12 +236,12 @@ impl CompressedPage {
ref statistics,
..
} => {
- let data_page_header = crate::format::DataPageHeader {
+ let data_page_header = DataPageHeader {
num_values: num_values as i32,
- encoding: encoding.into(),
- definition_level_encoding: def_level_encoding.into(),
- repetition_level_encoding: rep_level_encoding.into(),
- statistics: crate::file::statistics::to_thrift(statistics.as_ref()),
+ encoding,
+ definition_level_encoding: def_level_encoding,
+ repetition_level_encoding: rep_level_encoding,
+ statistics: page_stats_to_thrift(statistics.as_ref()),
};
page_header.data_page_header = Some(data_page_header);
}
@@ -252,22 +254,22 @@ impl CompressedPage {
ref statistics,
..
} => {
- let data_page_header_v2 = crate::format::DataPageHeaderV2 {
+ let data_page_header_v2 = DataPageHeaderV2 {
num_values: num_values as i32,
num_nulls: num_nulls as i32,
num_rows: num_rows as i32,
- encoding: encoding.into(),
+ encoding,
definition_levels_byte_length: def_levels_byte_len as i32,
repetition_levels_byte_length: rep_levels_byte_len as i32,
is_compressed: Some(is_compressed),
- statistics: crate::file::statistics::to_thrift(statistics.as_ref()),
+ statistics: page_stats_to_thrift(statistics.as_ref()),
};
page_header.data_page_header_v2 = Some(data_page_header_v2);
}
Page::DictionaryPage { is_sorted, .. } => {
- let dictionary_page_header = crate::format::DictionaryPageHeader {
+ let dictionary_page_header = DictionaryPageHeader {
num_values: num_values as i32,
- encoding: encoding.into(),
+ encoding,
is_sorted: Some(is_sorted),
};
page_header.dictionary_page_header = Some(dictionary_page_header);
@@ -343,12 +345,14 @@ pub struct PageMetadata {
pub is_dict: bool,
}
-impl TryFrom<&PageHeader> for PageMetadata {
+impl TryFrom<&crate::file::metadata::thrift_gen::PageHeader> for PageMetadata {
type Error = ParquetError;
- fn try_from(value: &PageHeader) -> std::result::Result<Self, Self::Error> {
- match value.type_ {
- crate::format::PageType::DATA_PAGE => {
+ fn try_from(
+ value: &crate::file::metadata::thrift_gen::PageHeader,
+ ) -> std::result::Result<Self, Self::Error> {
+ match value.r#type {
+ PageType::DATA_PAGE => {
let header = value.data_page_header.as_ref().unwrap();
Ok(PageMetadata {
num_rows: None,
@@ -356,12 +360,12 @@ impl TryFrom<&PageHeader> for PageMetadata {
is_dict: false,
})
}
- crate::format::PageType::DICTIONARY_PAGE => Ok(PageMetadata {
+ PageType::DICTIONARY_PAGE => Ok(PageMetadata {
num_rows: None,
num_levels: None,
is_dict: true,
}),
- crate::format::PageType::DATA_PAGE_V2 => {
+ PageType::DATA_PAGE_V2 => {
let header = value.data_page_header_v2.as_ref().unwrap();
Ok(PageMetadata {
num_rows: Some(header.num_rows as _),
diff --git a/parquet/src/column/page_encryption.rs b/parquet/src/column/page_encryption.rs
index 0fb7c8942675..2486c2c289c4 100644
--- a/parquet/src/column/page_encryption.rs
+++ b/parquet/src/column/page_encryption.rs
@@ -15,14 +15,14 @@
// specific language governing permissions and limitations
// under the License.
+use crate::basic::PageType;
use crate::column::page::CompressedPage;
use crate::encryption::ciphers::BlockEncryptor;
-use crate::encryption::encrypt::{encrypt_object, FileEncryptor};
+use crate::encryption::encrypt::{encrypt_thrift_object, FileEncryptor};
use crate::encryption::modules::{create_module_aad, ModuleType};
use crate::errors::ParquetError;
use crate::errors::Result;
-use crate::format::PageHeader;
-use crate::format::PageType;
+use crate::file::metadata::thrift_gen::PageHeader;
use bytes::Bytes;
use std::io::Write;
use std::sync::Arc;
@@ -95,14 +95,14 @@ impl PageEncryptor {
page_header: &PageHeader,
sink: &mut W,
) -> Result<()> {
- let module_type = match page_header.type_ {
+ let module_type = match page_header.r#type {
PageType::DATA_PAGE => ModuleType::DataPageHeader,
PageType::DATA_PAGE_V2 => ModuleType::DataPageHeader,
PageType::DICTIONARY_PAGE => ModuleType::DictionaryPageHeader,
_ => {
return Err(general_err!(
"Unsupported page type for page header encryption: {:?}",
- page_header.type_
+ page_header.r#type
))
}
};
@@ -114,6 +114,6 @@ impl PageEncryptor {
Some(self.page_index),
)?;
- encrypt_object(page_header, &mut self.block_encryptor, sink, &aad)
+ encrypt_thrift_object(page_header, &mut self.block_encryptor, sink, &aad)
}
}
diff --git a/parquet/src/column/page_encryption_disabled.rs b/parquet/src/column/page_encryption_disabled.rs
index e85b0281168a..347024f7f21f 100644
--- a/parquet/src/column/page_encryption_disabled.rs
+++ b/parquet/src/column/page_encryption_disabled.rs
@@ -17,7 +17,7 @@
use crate::column::page::CompressedPage;
use crate::errors::Result;
-use crate::format::PageHeader;
+use crate::file::metadata::thrift_gen::PageHeader;
use std::io::Write;
#[derive(Debug)]
diff --git a/parquet/src/column/writer/mod.rs b/parquet/src/column/writer/mod.rs
index 9eb5fb3b7131..2ef2d236e7f7 100644
--- a/parquet/src/column/writer/mod.rs
+++ b/parquet/src/column/writer/mod.rs
@@ -21,11 +21,14 @@ use bytes::Bytes;
use half::f16;
use crate::bloom_filter::Sbbf;
-use crate::format::{BoundaryOrder, ColumnIndex, OffsetIndex};
+use crate::file::page_index::column_index::ColumnIndexMetaData;
+use crate::file::page_index::offset_index::OffsetIndexMetaData;
use std::collections::{BTreeSet, VecDeque};
use std::str;
-use crate::basic::{Compression, ConvertedType, Encoding, LogicalType, PageType, Type};
+use crate::basic::{
+ BoundaryOrder, Compression, ConvertedType, Encoding, LogicalType, PageType, Type,
+};
use crate::column::page::{CompressedPage, Page, PageWriteSpec, PageWriter};
use crate::column::writer::encoder::{ColumnValueEncoder, ColumnValueEncoderImpl, ColumnValues};
use crate::compression::{create_codec, Codec, CodecOptionsBuilder};
@@ -37,9 +40,8 @@ use crate::encryption::encrypt::get_column_crypto_metadata;
use crate::errors::{ParquetError, Result};
use crate::file::metadata::{
ColumnChunkMetaData, ColumnChunkMetaDataBuilder, ColumnIndexBuilder, LevelHistogram,
- OffsetIndexBuilder,
+ OffsetIndexBuilder, PageEncodingStats,
};
-use crate::file::page_encoding_stats::PageEncodingStats;
use crate::file::properties::{
EnabledStatistics, WriterProperties, WriterPropertiesPtr, WriterVersion,
};
@@ -189,9 +191,9 @@ pub struct ColumnCloseResult {
/// Optional bloom filter for this column
pub bloom_filter: Option<Sbbf>,
/// Optional column index, for filtering
- pub column_index: Option<ColumnIndex>,
+ pub column_index: Option<ColumnIndexMetaData>,
/// Optional offset index, identifying page locations
- pub offset_index: Option<OffsetIndex>,
+ pub offset_index: Option<OffsetIndexMetaData>,
}
// Metrics per page
@@ -388,7 +390,7 @@ impl<'a, E:
ColumnValueEncoder> GenericColumnWriter<'a, E> { } // Disable column_index_builder if not collecting page statistics. - let mut column_index_builder = ColumnIndexBuilder::new(); + let mut column_index_builder = ColumnIndexBuilder::new(descr.physical_type()); if statistics_enabled != EnabledStatistics::Page { column_index_builder.to_invalid() } @@ -619,12 +621,12 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E> { }; self.column_index_builder.set_boundary_order(boundary_order); - let column_index = self - .column_index_builder - .valid() - .then(|| self.column_index_builder.build_to_thrift()); + let column_index = match self.column_index_builder.valid() { + true => Some(self.column_index_builder.build()?), + false => None, + }; - let offset_index = self.offset_index_builder.map(|b| b.build_to_thrift()); + let offset_index = self.offset_index_builder.map(|b| b.build()); Ok(ColumnCloseResult { bytes_written: self.column_metrics.total_bytes_written, @@ -2265,6 +2267,7 @@ mod tests { let props = ReaderProperties::builder() .set_backward_compatible_lz4(false) + .set_read_page_statistics(true) .build(); let reader = SerializedPageReader::new_with_properties( Arc::new(Bytes::from(buf)), @@ -2954,19 +2957,23 @@ mod tests { let r = writer.close().unwrap(); assert!(r.column_index.is_some()); let col_idx = r.column_index.unwrap(); + let col_idx = match col_idx { + ColumnIndexMetaData::INT32(col_idx) => col_idx, + _ => panic!("wrong stats type"), + }; // null_pages should be true for page 0 - assert!(col_idx.null_pages[0]); + assert!(col_idx.is_null_page(0)); // min and max should be empty byte arrays - assert_eq!(col_idx.min_values[0].len(), 0); - assert_eq!(col_idx.max_values[0].len(), 0); + assert!(col_idx.min_value(0).is_none()); + assert!(col_idx.max_value(0).is_none()); // null_counts should be defined and be 4 for page 0 - assert!(col_idx.null_counts.is_some()); - assert_eq!(col_idx.null_counts.as_ref().unwrap()[0], 4); + assert!(col_idx.null_count(0).is_some()); + assert_eq!(col_idx.null_count(0), Some(4)); // there is no repetition so rep histogram should be absent - assert!(col_idx.repetition_level_histograms.is_none()); + assert!(col_idx.repetition_level_histogram(0).is_none()); // definition_level_histogram should be present and should be 0:4, 1:0 - assert!(col_idx.definition_level_histograms.is_some()); - assert_eq!(col_idx.definition_level_histograms.unwrap(), &[4, 0]); + assert!(col_idx.definition_level_histogram(0).is_some()); + assert_eq!(col_idx.definition_level_histogram(0).unwrap(), &[4, 0]); } #[test] @@ -2989,12 +2996,16 @@ mod tests { assert_eq!(8, r.rows_written); // column index - assert_eq!(2, column_index.null_pages.len()); + let column_index = match column_index { + ColumnIndexMetaData::INT32(column_index) => column_index, + _ => panic!("wrong stats type"), + }; + assert_eq!(2, column_index.num_pages()); assert_eq!(2, offset_index.page_locations.len()); assert_eq!(BoundaryOrder::UNORDERED, column_index.boundary_order); for idx in 0..2 { - assert!(!column_index.null_pages[idx]); - assert_eq!(0, column_index.null_counts.as_ref().unwrap()[idx]); + assert!(!column_index.is_null_page(idx)); + assert_eq!(0, column_index.null_count(0).unwrap()); } if let Some(stats) = r.metadata.statistics() { @@ -3004,14 +3015,8 @@ mod tests { // first page is [1,2,3,4] // second page is [-5,2,4,8] // note that we don't increment here, as this is a non BinaryArray type. 
- assert_eq!( - stats.min_bytes_opt(), - Some(column_index.min_values[1].as_slice()) - ); - assert_eq!( - stats.max_bytes_opt(), - column_index.max_values.get(1).map(Vec::as_slice) - ); + assert_eq!(stats.min_opt(), column_index.min_value(1)); + assert_eq!(stats.max_opt(), column_index.max_value(1)); } else { panic!("expecting Statistics::Int32"); } @@ -3051,37 +3056,36 @@ mod tests { let column_index = r.column_index.unwrap(); let offset_index = r.offset_index.unwrap(); + let column_index = match column_index { + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(column_index) => column_index, + _ => panic!("wrong stats type"), + }; + assert_eq!(3, r.rows_written); // column index - assert_eq!(1, column_index.null_pages.len()); + assert_eq!(1, column_index.num_pages()); assert_eq!(1, offset_index.page_locations.len()); assert_eq!(BoundaryOrder::ASCENDING, column_index.boundary_order); - assert!(!column_index.null_pages[0]); - assert_eq!(0, column_index.null_counts.as_ref().unwrap()[0]); + assert!(!column_index.is_null_page(0)); + assert_eq!(Some(0), column_index.null_count(0)); if let Some(stats) = r.metadata.statistics() { assert_eq!(stats.null_count_opt(), Some(0)); assert_eq!(stats.distinct_count_opt(), None); if let Statistics::FixedLenByteArray(stats) = stats { - let column_index_min_value = &column_index.min_values[0]; - let column_index_max_value = &column_index.max_values[0]; + let column_index_min_value = column_index.min_value(0).unwrap(); + let column_index_max_value = column_index.max_value(0).unwrap(); // Column index stats are truncated, while the column chunk's aren't. - assert_ne!( - stats.min_bytes_opt(), - Some(column_index_min_value.as_slice()) - ); - assert_ne!( - stats.max_bytes_opt(), - Some(column_index_max_value.as_slice()) - ); + assert_ne!(stats.min_bytes_opt().unwrap(), column_index_min_value); + assert_ne!(stats.max_bytes_opt().unwrap(), column_index_max_value); assert_eq!( column_index_min_value.len(), DEFAULT_COLUMN_INDEX_TRUNCATE_LENGTH.unwrap() ); - assert_eq!(column_index_min_value.as_slice(), &[97_u8; 64]); + assert_eq!(column_index_min_value, &[97_u8; 64]); assert_eq!( column_index_max_value.len(), DEFAULT_COLUMN_INDEX_TRUNCATE_LENGTH.unwrap() @@ -3123,27 +3127,32 @@ mod tests { let column_index = r.column_index.unwrap(); let offset_index = r.offset_index.unwrap(); + let column_index = match column_index { + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(column_index) => column_index, + _ => panic!("wrong stats type"), + }; + assert_eq!(1, r.rows_written); // column index - assert_eq!(1, column_index.null_pages.len()); + assert_eq!(1, column_index.num_pages()); assert_eq!(1, offset_index.page_locations.len()); assert_eq!(BoundaryOrder::ASCENDING, column_index.boundary_order); - assert!(!column_index.null_pages[0]); - assert_eq!(0, column_index.null_counts.as_ref().unwrap()[0]); + assert!(!column_index.is_null_page(0)); + assert_eq!(Some(0), column_index.null_count(0)); if let Some(stats) = r.metadata.statistics() { assert_eq!(stats.null_count_opt(), Some(0)); assert_eq!(stats.distinct_count_opt(), None); if let Statistics::FixedLenByteArray(_stats) = stats { - let column_index_min_value = &column_index.min_values[0]; - let column_index_max_value = &column_index.max_values[0]; + let column_index_min_value = column_index.min_value(0).unwrap(); + let column_index_max_value = column_index.max_value(0).unwrap(); assert_eq!(column_index_min_value.len(), 1); assert_eq!(column_index_max_value.len(), 1); - assert_eq!("B".as_bytes(), column_index_min_value.as_slice()); - 
assert_eq!("C".as_bytes(), column_index_max_value.as_slice()); + assert_eq!("B".as_bytes(), column_index_min_value); + assert_eq!("C".as_bytes(), column_index_max_value); assert_ne!(column_index_min_value, stats.min_bytes_opt().unwrap()); assert_ne!(column_index_max_value, stats.max_bytes_opt().unwrap()); @@ -3173,8 +3182,12 @@ mod tests { // stats should still be written // ensure bytes weren't truncated for column index let column_index = r.column_index.unwrap(); - let column_index_min_bytes = column_index.min_values[0].as_slice(); - let column_index_max_bytes = column_index.max_values[0].as_slice(); + let column_index = match column_index { + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(column_index) => column_index, + _ => panic!("wrong stats type"), + }; + let column_index_min_bytes = column_index.min_value(0).unwrap(); + let column_index_max_bytes = column_index.max_value(0).unwrap(); assert_eq!(expected_value, column_index_min_bytes); assert_eq!(expected_value, column_index_max_bytes); @@ -3212,8 +3225,12 @@ mod tests { // stats should still be written // ensure bytes weren't truncated for column index let column_index = r.column_index.unwrap(); - let column_index_min_bytes = column_index.min_values[0].as_slice(); - let column_index_max_bytes = column_index.max_values[0].as_slice(); + let column_index = match column_index { + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(column_index) => column_index, + _ => panic!("wrong stats type"), + }; + let column_index_min_bytes = column_index.min_value(0).unwrap(); + let column_index_max_bytes = column_index.max_value(0).unwrap(); assert_eq!(expected_value, column_index_min_bytes); assert_eq!(expected_value, column_index_max_bytes); @@ -3584,19 +3601,12 @@ mod tests { col_writer.close().unwrap(); row_group_writer.close().unwrap(); let file_metadata = writer.close().unwrap(); - assert!(file_metadata.row_groups[0].columns[0].meta_data.is_some()); - let stats = file_metadata.row_groups[0].columns[0] - .meta_data - .as_ref() - .unwrap() - .statistics - .as_ref() - .unwrap(); - assert!(!stats.is_max_value_exact.unwrap()); + let stats = file_metadata.row_group(0).column(0).statistics().unwrap(); + assert!(!stats.max_is_exact()); // Truncation of invalid UTF-8 should fall back to binary truncation, so last byte should // be incremented by 1. 
assert_eq!( - stats.max_value, + stats.max_bytes_opt().map(|v| v.to_vec()), Some([128, 128, 128, 128, 128, 128, 128, 129].to_vec()) ); } @@ -3693,8 +3703,11 @@ mod tests { &[Some(-5), Some(11)], ], )?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::ASCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::ASCENDING)); // min max both descending let column_close_result = write_multiple_pages::( @@ -3706,34 +3719,49 @@ mod tests { &[Some(-5), Some(0)], ], )?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::DESCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::DESCENDING)); // min max both equal let column_close_result = write_multiple_pages::( &descr, &[&[Some(10), Some(11)], &[None], &[Some(10), Some(11)]], )?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::ASCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::ASCENDING)); // only nulls let column_close_result = write_multiple_pages::(&descr, &[&[None], &[None], &[None]])?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::ASCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::ASCENDING)); // one page let column_close_result = write_multiple_pages::(&descr, &[&[Some(-10), Some(10)]])?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::ASCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::ASCENDING)); // one non-null page let column_close_result = write_multiple_pages::(&descr, &[&[Some(-10), Some(10)], &[None]])?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::ASCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::ASCENDING)); // min max both unordered let column_close_result = write_multiple_pages::( @@ -3745,8 +3773,11 @@ mod tests { &[Some(-5), Some(0)], ], )?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::UNORDERED); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::UNORDERED)); // min max both ordered in different orders let column_close_result = write_multiple_pages::( @@ -3758,8 +3789,11 @@ mod tests { &[Some(3), Some(7)], ], )?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::UNORDERED); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::UNORDERED)); Ok(()) } @@ -3796,14 +3830,20 @@ mod tests { // f16 descending let column_close_result = 
write_multiple_pages::(&f16_descr, values)?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::DESCENDING); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::DESCENDING)); // same bytes, but fba unordered let column_close_result = write_multiple_pages::(&fba_descr, values)?; - let boundary_order = column_close_result.column_index.unwrap().boundary_order; - assert_eq!(boundary_order, BoundaryOrder::UNORDERED); + let boundary_order = column_close_result + .column_index + .unwrap() + .get_boundary_order(); + assert_eq!(boundary_order, Some(BoundaryOrder::UNORDERED)); Ok(()) } diff --git a/parquet/src/encryption/decrypt.rs b/parquet/src/encryption/decrypt.rs index d9b9ff0326b4..d285f6a1237c 100644 --- a/parquet/src/encryption/decrypt.rs +++ b/parquet/src/encryption/decrypt.rs @@ -142,13 +142,13 @@ impl CryptoContext { column_ordinal: usize, ) -> Result { let (data_decryptor, metadata_decryptor) = match column_crypto_metadata { - ColumnCryptoMetaData::EncryptionWithFooterKey => { + ColumnCryptoMetaData::ENCRYPTION_WITH_FOOTER_KEY => { // TODO: In GCM-CTR mode will this need to be a non-GCM decryptor? let data_decryptor = file_decryptor.get_footer_decryptor()?; let metadata_decryptor = file_decryptor.get_footer_decryptor()?; (data_decryptor, metadata_decryptor) } - ColumnCryptoMetaData::EncryptionWithColumnKey(column_key_encryption) => { + ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY(column_key_encryption) => { let key_metadata = &column_key_encryption.key_metadata; let full_column_name; let column_name = if column_key_encryption.path_in_schema.len() == 1 { diff --git a/parquet/src/encryption/encrypt.rs b/parquet/src/encryption/encrypt.rs index c8d3ffc0eef4..1a22abff56fa 100644 --- a/parquet/src/encryption/encrypt.rs +++ b/parquet/src/encryption/encrypt.rs @@ -22,12 +22,11 @@ use crate::encryption::ciphers::{ }; use crate::errors::{ParquetError, Result}; use crate::file::column_crypto_metadata::{ColumnCryptoMetaData, EncryptionWithColumnKey}; +use crate::parquet_thrift::{ThriftCompactOutputProtocol, WriteThrift}; use crate::schema::types::{ColumnDescPtr, SchemaDescriptor}; -use crate::thrift::TSerializable; use ring::rand::{SecureRandom, SystemRandom}; use std::collections::{HashMap, HashSet}; use std::io::Write; -use thrift::protocol::TCompactOutputProtocol; #[derive(Debug, Clone, PartialEq)] struct EncryptionKey { @@ -365,18 +364,18 @@ impl FileEncryptor { } /// Write an encrypted Thrift serializable object -pub(crate) fn encrypt_object( +pub(crate) fn encrypt_thrift_object( object: &T, encryptor: &mut Box, sink: &mut W, module_aad: &[u8], ) -> Result<()> { - let encrypted_buffer = encrypt_object_to_vec(object, encryptor, module_aad)?; + let encrypted_buffer = encrypt_thrift_object_to_vec(object, encryptor, module_aad)?; sink.write_all(&encrypted_buffer)?; Ok(()) } -pub(crate) fn write_signed_plaintext_object( +pub(crate) fn write_signed_plaintext_thrift_object( object: &T, encryptor: &mut Box, sink: &mut W, @@ -384,8 +383,8 @@ pub(crate) fn write_signed_plaintext_object( ) -> Result<()> { let mut buffer: Vec = vec![]; { - let mut protocol = TCompactOutputProtocol::new(&mut buffer); - object.write_to_out_protocol(&mut protocol)?; + let mut protocol = ThriftCompactOutputProtocol::new(&mut buffer); + object.write_thrift(&mut protocol)?; } sink.write_all(&buffer)?; buffer = encryptor.encrypt(buffer.as_ref(), module_aad)?; @@ 
-400,15 +399,15 @@ pub(crate) fn write_signed_plaintext_object(
}
/// Encrypt a Thrift serializable object to a byte vector
-pub(crate) fn encrypt_object_to_vec<T: TSerializable>(
+pub(crate) fn encrypt_thrift_object_to_vec<T: WriteThrift>(
object: &T,
encryptor: &mut Box<dyn BlockEncryptor>,
module_aad: &[u8],
) -> Result<Vec<u8>> {
let mut buffer: Vec<u8> = vec![];
{
- let mut unencrypted_protocol = TCompactOutputProtocol::new(&mut buffer);
- object.write_to_out_protocol(&mut unencrypted_protocol)?;
+ let mut unencrypted_protocol = ThriftCompactOutputProtocol::new(&mut buffer);
+ object.write_thrift(&mut unencrypted_protocol)?;
}
encryptor.encrypt(buffer.as_ref(), module_aad)
@@ -421,14 +420,14 @@ pub(crate) fn get_column_crypto_metadata(
) -> Option<ColumnCryptoMetaData> {
if properties.column_keys.is_empty() {
// Uniform encryption
- Some(ColumnCryptoMetaData::EncryptionWithFooterKey)
+ Some(ColumnCryptoMetaData::ENCRYPTION_WITH_FOOTER_KEY)
} else {
properties
.column_keys
.get(&column.path().string())
.map(|encryption_key| {
// Column is encrypted with a column specific key
- ColumnCryptoMetaData::EncryptionWithColumnKey(EncryptionWithColumnKey {
+ ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY(EncryptionWithColumnKey {
path_in_schema: column.path().parts().to_vec(),
key_metadata: encryption_key.key_metadata.clone(),
})
diff --git a/parquet/src/errors.rs b/parquet/src/errors.rs
index be08245e956c..dab444a28f4f 100644
--- a/parquet/src/errors.rs
+++ b/parquet/src/errors.rs
@@ -19,6 +19,7 @@
use core::num::TryFromIntError;
use std::error::Error;
+use std::string::FromUtf8Error;
use std::{cell, io, result, str};
#[cfg(feature = "arrow")]
@@ -124,6 +125,13 @@ impl From<TryFromIntError> for ParquetError {
ParquetError::External(Box::new(e))
}
}
+
+impl From<FromUtf8Error> for ParquetError {
+ fn from(e: FromUtf8Error) -> ParquetError {
+ ParquetError::External(Box::new(e))
+ }
+}
+
#[cfg(feature = "arrow")]
impl From<ArrowError> for ParquetError {
fn from(e: ArrowError) -> ParquetError {
diff --git a/parquet/src/file/column_crypto_metadata.rs b/parquet/src/file/column_crypto_metadata.rs
index af670e675fcd..7628fb615a9d 100644
--- a/parquet/src/file/column_crypto_metadata.rs
+++ b/parquet/src/file/column_crypto_metadata.rs
@@ -17,60 +17,49 @@
//!
Column chunk encryption metadata
-use crate::errors::Result;
-use crate::format::{
- ColumnCryptoMetaData as TColumnCryptoMetaData,
- EncryptionWithColumnKey as TEncryptionWithColumnKey,
- EncryptionWithFooterKey as TEncryptionWithFooterKey,
+use std::io::Write;
+
+use crate::errors::{ParquetError, Result};
+use crate::file::metadata::HeapSize;
+use crate::parquet_thrift::{
+ read_thrift_vec, ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol,
+ ThriftCompactOutputProtocol, WriteThrift, WriteThriftField,
};
+use crate::{thrift_struct, thrift_union};
-/// ColumnCryptoMetadata for a column chunk
-#[derive(Clone, Debug, PartialEq, Eq)]
-pub enum ColumnCryptoMetaData {
- /// The column is encrypted with the footer key
- EncryptionWithFooterKey,
- /// The column is encrypted with a column-specific key
- EncryptionWithColumnKey(EncryptionWithColumnKey),
-}
+// define this and ColumnCryptoMetadata here so they're only defined when
+// the encryption feature is enabled
+thrift_struct!(
/// Encryption metadata for a column chunk encrypted with a column-specific key
-#[derive(Clone, Debug, PartialEq, Eq)]
pub struct EncryptionWithColumnKey {
- /// Path to the column in the Parquet schema
- pub path_in_schema: Vec<String>,
- /// Metadata required to retrieve the column encryption key
- pub key_metadata: Option<Vec<u8>>,
+ /// Path to the column in the Parquet schema
+ 1: required list<string> path_in_schema
+
+ /// Retrieval metadata of column encryption key
+ 2: optional binary key_metadata
}
+);
-/// Converts Thrift definition into `ColumnCryptoMetadata`.
-pub fn try_from_thrift(
- thrift_column_crypto_metadata: &TColumnCryptoMetaData,
-) -> Result<ColumnCryptoMetaData> {
- let crypto_metadata = match thrift_column_crypto_metadata {
- TColumnCryptoMetaData::ENCRYPTIONWITHFOOTERKEY(_) => {
- ColumnCryptoMetaData::EncryptionWithFooterKey
- }
- TColumnCryptoMetaData::ENCRYPTIONWITHCOLUMNKEY(encryption_with_column_key) => {
- ColumnCryptoMetaData::EncryptionWithColumnKey(EncryptionWithColumnKey {
- path_in_schema: encryption_with_column_key.path_in_schema.clone(),
- key_metadata: encryption_with_column_key.key_metadata.clone(),
- })
- }
- };
- Ok(crypto_metadata)
+impl HeapSize for EncryptionWithColumnKey {
+ fn heap_size(&self) -> usize {
+ self.path_in_schema.heap_size() + self.key_metadata.heap_size()
+ }
}
-/// Converts `ColumnCryptoMetadata` into Thrift definition.
-pub fn to_thrift(column_crypto_metadata: &ColumnCryptoMetaData) -> TColumnCryptoMetaData { - match column_crypto_metadata { - ColumnCryptoMetaData::EncryptionWithFooterKey => { - TColumnCryptoMetaData::ENCRYPTIONWITHFOOTERKEY(TEncryptionWithFooterKey {}) - } - ColumnCryptoMetaData::EncryptionWithColumnKey(encryption_with_column_key) => { - TColumnCryptoMetaData::ENCRYPTIONWITHCOLUMNKEY(TEncryptionWithColumnKey { - path_in_schema: encryption_with_column_key.path_in_schema.clone(), - key_metadata: encryption_with_column_key.key_metadata.clone(), - }) +thrift_union!( +/// ColumnCryptoMetadata for a column chunk +union ColumnCryptoMetaData { + 1: ENCRYPTION_WITH_FOOTER_KEY + 2: (EncryptionWithColumnKey) ENCRYPTION_WITH_COLUMN_KEY +} +); + +impl HeapSize for ColumnCryptoMetaData { + fn heap_size(&self) -> usize { + match self { + Self::ENCRYPTION_WITH_FOOTER_KEY => 0, + Self::ENCRYPTION_WITH_COLUMN_KEY(path) => path.heap_size(), } } } @@ -78,21 +67,25 @@ pub fn to_thrift(column_crypto_metadata: &ColumnCryptoMetaData) -> TColumnCrypto #[cfg(test)] mod tests { use super::*; + use crate::parquet_thrift::tests::test_roundtrip; #[test] - fn test_encryption_with_footer_key_from_thrift() { - let metadata = ColumnCryptoMetaData::EncryptionWithFooterKey; - - assert_eq!(try_from_thrift(&to_thrift(&metadata)).unwrap(), metadata); - } - - #[test] - fn test_encryption_with_column_key_from_thrift() { - let metadata = ColumnCryptoMetaData::EncryptionWithColumnKey(EncryptionWithColumnKey { - path_in_schema: vec!["abc".to_owned(), "def".to_owned()], - key_metadata: Some(vec![0, 1, 2, 3, 4, 5]), - }); + fn test_column_crypto_roundtrip() { + test_roundtrip(ColumnCryptoMetaData::ENCRYPTION_WITH_FOOTER_KEY); - assert_eq!(try_from_thrift(&to_thrift(&metadata)).unwrap(), metadata); + let path_in_schema = vec!["foo".to_owned(), "bar".to_owned(), "really".to_owned()]; + let key_metadata = vec![1u8; 32]; + test_roundtrip(ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY( + EncryptionWithColumnKey { + path_in_schema: path_in_schema.clone(), + key_metadata: None, + }, + )); + test_roundtrip(ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY( + EncryptionWithColumnKey { + path_in_schema, + key_metadata: Some(key_metadata), + }, + )); } } diff --git a/parquet/src/file/metadata/memory.rs b/parquet/src/file/metadata/memory.rs index ad452267901a..208e62537bcb 100644 --- a/parquet/src/file/metadata/memory.rs +++ b/parquet/src/file/metadata/memory.rs @@ -18,14 +18,16 @@ //! Memory calculations for [`ParquetMetadata::memory_size`] //! //! 
[`ParquetMetadata::memory_size`]: crate::file::metadata::ParquetMetaData::memory_size -use crate::basic::{ColumnOrder, Compression, Encoding, PageType}; +use crate::basic::{BoundaryOrder, ColumnOrder, Compression, Encoding, PageType}; use crate::data_type::private::ParquetValueType; -use crate::file::metadata::{ColumnChunkMetaData, FileMetaData, KeyValue, RowGroupMetaData}; -use crate::file::page_encoding_stats::PageEncodingStats; -use crate::file::page_index::index::{Index, NativeIndex, PageIndex}; -use crate::file::page_index::offset_index::OffsetIndexMetaData; +use crate::file::metadata::{ + ColumnChunkMetaData, FileMetaData, KeyValue, PageEncodingStats, RowGroupMetaData, SortingColumn, +}; +use crate::file::page_index::column_index::{ + ByteArrayColumnIndex, ColumnIndex, ColumnIndexMetaData, PrimitiveColumnIndex, +}; +use crate::file::page_index::offset_index::{OffsetIndexMetaData, PageLocation}; use crate::file::statistics::{Statistics, ValueStatistics}; -use crate::format::{BoundaryOrder, PageLocation, SortingColumn}; use std::sync::Arc; /// Trait for calculating the size of various containers @@ -91,6 +93,12 @@ impl HeapSize for RowGroupMetaData { impl HeapSize for ColumnChunkMetaData { fn heap_size(&self) -> usize { + #[cfg(feature = "encryption")] + let encryption_heap_size = + self.column_crypto_metadata.heap_size() + self.encrypted_column_metadata.heap_size(); + #[cfg(not(feature = "encryption"))] + let encryption_heap_size = 0; + // don't count column_descr here because it is already counted in // FileMetaData self.encodings.heap_size() @@ -101,6 +109,7 @@ impl HeapSize for ColumnChunkMetaData { + self.unencoded_byte_array_data_bytes.heap_size() + self.repetition_level_histogram.heap_size() + self.definition_level_histogram.heap_size() + + encryption_heap_size } } @@ -153,31 +162,45 @@ impl HeapSize for OffsetIndexMetaData { } } -impl HeapSize for Index { +impl HeapSize for ColumnIndexMetaData { fn heap_size(&self) -> usize { match self { - Index::NONE => 0, - Index::BOOLEAN(native_index) => native_index.heap_size(), - Index::INT32(native_index) => native_index.heap_size(), - Index::INT64(native_index) => native_index.heap_size(), - Index::INT96(native_index) => native_index.heap_size(), - Index::FLOAT(native_index) => native_index.heap_size(), - Index::DOUBLE(native_index) => native_index.heap_size(), - Index::BYTE_ARRAY(native_index) => native_index.heap_size(), - Index::FIXED_LEN_BYTE_ARRAY(native_index) => native_index.heap_size(), + Self::NONE => 0, + Self::BOOLEAN(native_index) => native_index.heap_size(), + Self::INT32(native_index) => native_index.heap_size(), + Self::INT64(native_index) => native_index.heap_size(), + Self::INT96(native_index) => native_index.heap_size(), + Self::FLOAT(native_index) => native_index.heap_size(), + Self::DOUBLE(native_index) => native_index.heap_size(), + Self::BYTE_ARRAY(native_index) => native_index.heap_size(), + Self::FIXED_LEN_BYTE_ARRAY(native_index) => native_index.heap_size(), } } } -impl HeapSize for NativeIndex { +impl HeapSize for ColumnIndex { + fn heap_size(&self) -> usize { + self.null_pages.heap_size() + + self.boundary_order.heap_size() + + self.null_counts.heap_size() + + self.definition_level_histograms.heap_size() + + self.repetition_level_histograms.heap_size() + } +} + +impl HeapSize for PrimitiveColumnIndex { fn heap_size(&self) -> usize { - self.indexes.heap_size() + self.boundary_order.heap_size() + self.column_index.heap_size() + self.min_values.heap_size() + self.max_values.heap_size() } } -impl HeapSize for 
PageIndex { +impl HeapSize for ByteArrayColumnIndex { fn heap_size(&self) -> usize { - self.min.heap_size() + self.max.heap_size() + self.null_count.heap_size() + self.column_index.heap_size() + + self.min_bytes.heap_size() + + self.min_offsets.heap_size() + + self.max_bytes.heap_size() + + self.max_offsets.heap_size() } } @@ -192,6 +215,11 @@ impl HeapSize for bool { 0 // no heap allocations } } +impl HeapSize for u8 { + fn heap_size(&self) -> usize { + 0 // no heap allocations + } +} impl HeapSize for i32 { fn heap_size(&self) -> usize { 0 // no heap allocations diff --git a/parquet/src/file/metadata/mod.rs b/parquet/src/file/metadata/mod.rs index 4aa0388fd2fc..b7e99e67b632 100644 --- a/parquet/src/file/metadata/mod.rs +++ b/parquet/src/file/metadata/mod.rs @@ -17,9 +17,7 @@ //! Parquet metadata API //! -//! Most users should use these structures to interact with Parquet metadata. -//! The [crate::format] module contains lower level structures generated from the -//! Parquet thrift definition. +//! Users should use these structures to interact with Parquet metadata. //! //! * [`ParquetMetaData`]: Top level metadata container, read from the Parquet //! file footer. @@ -66,7 +64,6 @@ //! with a more idiomatic API. Note that, confusingly, some but not all //! of these structures have the same name as the [`format`] structures. //! -//! [`format`]: crate::format //! [`file::metadata`]: crate::file::metadata //! [parquet.thrift]: https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift //! @@ -95,37 +92,43 @@ mod memory; mod parser; mod push_decoder; pub(crate) mod reader; +pub(crate) mod thrift_gen; mod writer; -use crate::basic::{ColumnOrder, Compression, Encoding, Type}; +use crate::basic::PageType; #[cfg(feature = "encryption")] -use crate::encryption::{ - decrypt::FileDecryptor, - modules::{create_module_aad, ModuleType}, -}; -use crate::errors::{ParquetError, Result}; +use crate::encryption::decrypt::FileDecryptor; #[cfg(feature = "encryption")] -use crate::file::column_crypto_metadata::{self, ColumnCryptoMetaData}; +use crate::file::column_crypto_metadata::ColumnCryptoMetaData; pub(crate) use crate::file::metadata::memory::HeapSize; -use crate::file::page_encoding_stats::{self, PageEncodingStats}; -use crate::file::page_index::index::Index; -use crate::file::page_index::offset_index::OffsetIndexMetaData; -use crate::file::statistics::{self, Statistics}; -use crate::format::ColumnCryptoMetaData as TColumnCryptoMetaData; -use crate::format::{ - BoundaryOrder, ColumnChunk, ColumnIndex, ColumnMetaData, OffsetIndex, PageLocation, RowGroup, - SizeStatistics, SortingColumn, -}; +use crate::file::page_index::column_index::{ByteArrayColumnIndex, PrimitiveColumnIndex}; +use crate::file::page_index::{column_index::ColumnIndexMetaData, offset_index::PageLocation}; +use crate::file::statistics::Statistics; use crate::geospatial::statistics as geo_statistics; use crate::schema::types::{ ColumnDescPtr, ColumnDescriptor, ColumnPath, SchemaDescPtr, SchemaDescriptor, Type as SchemaType, }; -#[cfg(feature = "encryption")] -use crate::thrift::{TCompactSliceInputProtocol, TSerializable}; +use crate::thrift_struct; +use crate::{ + basic::BoundaryOrder, + errors::{ParquetError, Result}, +}; +use crate::{ + basic::{ColumnOrder, Compression, Encoding, Type}, + parquet_thrift::{ + ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, + ThriftCompactOutputProtocol, WriteThrift, WriteThriftField, + }, +}; +use crate::{ + data_type::private::ParquetValueType, 
file::page_index::offset_index::OffsetIndexMetaData,
};
+
pub use footer_tail::FooterTail;
pub use push_decoder::ParquetMetaDataPushDecoder;
pub use reader::{PageIndexPolicy, ParquetMetaDataReader};
+use std::io::Write;
use std::ops::Range;
use std::sync::Arc;
pub use writer::ParquetMetaDataWriter;
@@ -135,18 +138,19 @@ pub(crate) use writer::ThriftMetadataWriter;
///
/// This structure is an in-memory representation of multiple [`ColumnIndex`]
/// structures in a parquet file footer, as described in the Parquet [PageIndex
-/// documentation]. Each [`Index`] holds statistics about all the pages in a
+/// documentation]. Each [`ColumnIndex`] holds statistics about all the pages in a
/// particular column chunk.
///
/// `column_index[row_group_number][column_number]` holds the
-/// [`Index`] corresponding to column `column_number` of row group
+/// [`ColumnIndex`] corresponding to column `column_number` of row group
/// `row_group_number`.
///
-/// For example `column_index[2][3]` holds the [`Index`] for the fourth
+/// For example `column_index[2][3]` holds the [`ColumnIndex`] for the fourth
/// column in the third row group of the parquet file.
///
/// [PageIndex documentation]: https://github.com/apache/parquet-format/blob/master/PageIndex.md
-pub type ParquetColumnIndex = Vec<Vec<Index>>;
+/// [`ColumnIndex`]: crate::file::page_index::column_index::ColumnIndexMetaData
+pub type ParquetColumnIndex = Vec<Vec<ColumnIndexMetaData>>;
/// [`OffsetIndexMetaData`] for each data page of each row group of each column
///
@@ -158,6 +162,7 @@ pub type ParquetColumnIndex = Vec<Vec<ColumnIndexMetaData>>;
/// `column_number` of row group `row_group_number`.
///
/// [PageIndex documentation]: https://github.com/apache/parquet-format/blob/master/PageIndex.md
+/// [`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md
pub type ParquetOffsetIndex = Vec<Vec<OffsetIndexMetaData>>;
/// Parsed metadata for a single Parquet file
@@ -419,8 +424,35 @@ impl From<ParquetMetaData> for ParquetMetaDataBuilder {
}
}
+thrift_struct!(
/// A key-value pair for [`FileMetaData`].
-pub type KeyValue = crate::format::KeyValue;
+pub struct KeyValue {
+ 1: required string key
+ 2: optional string value
+}
+);
+
+impl KeyValue {
+ /// Create a new key value pair
+ pub fn new<F2>(key: String, value: F2) -> KeyValue
+ where
+ F2: Into<Option<String>>,
+ {
+ KeyValue {
+ key,
+ value: value.into(),
+ }
+ }
+}
+
+thrift_struct!(
+/// PageEncodingStats for a column chunk and data page.
+pub struct PageEncodingStats {
+ 1: required PageType page_type;
+ 2: required Encoding encoding;
+ 3: required i32 count;
+}
+);
/// Reference counted pointer for [`FileMetaData`].
pub type FileMetaDataPtr = Arc<FileMetaData>;
@@ -523,6 +555,21 @@ impl FileMetaData {
}
}
+thrift_struct!(
+/// Sort order within a RowGroup of a leaf column
+pub struct SortingColumn {
+ /// The ordinal position of the column (in this row group)
+ 1: required i32 column_idx
+
+ /// If true, indicates this column is sorted in descending order.
+ 2: required bool descending
+
+ /// If true, nulls will come before non-null values, otherwise,
+ /// nulls go at the end.
+ 3: required bool nulls_first
+}
+);
/// Reference counted pointer for [`RowGroupMetaData`].
pub type RowGroupMetaDataPtr = Arc<RowGroupMetaData>;
@@ -614,129 +661,6 @@ impl RowGroupMetaData {
self.file_offset
}
- /// Method to convert from encrypted Thrift.
- #[cfg(feature = "encryption")]
- fn from_encrypted_thrift(
- schema_descr: SchemaDescPtr,
- mut rg: RowGroup,
- decryptor: Option<&FileDecryptor>,
- ) -> Result<RowGroupMetaData> {
- if schema_descr.num_columns() != rg.columns.len() {
- return Err(general_err!(
- "Column count mismatch.
Schema has {} columns while Row Group has {}", - schema_descr.num_columns(), - rg.columns.len() - )); - } - let total_byte_size = rg.total_byte_size; - let num_rows = rg.num_rows; - let mut columns = vec![]; - - for (i, (mut c, d)) in rg - .columns - .drain(0..) - .zip(schema_descr.columns()) - .enumerate() - { - // Read encrypted metadata if it's present and we have a decryptor. - if let (true, Some(decryptor)) = (c.encrypted_column_metadata.is_some(), decryptor) { - let column_decryptor = match c.crypto_metadata.as_ref() { - None => { - return Err(general_err!( - "No crypto_metadata is set for column '{}', which has encrypted metadata", - d.path().string() - )); - } - Some(TColumnCryptoMetaData::ENCRYPTIONWITHCOLUMNKEY(crypto_metadata)) => { - let column_name = crypto_metadata.path_in_schema.join("."); - decryptor.get_column_metadata_decryptor( - column_name.as_str(), - crypto_metadata.key_metadata.as_deref(), - )? - } - Some(TColumnCryptoMetaData::ENCRYPTIONWITHFOOTERKEY(_)) => { - decryptor.get_footer_decryptor()? - } - }; - - let column_aad = create_module_aad( - decryptor.file_aad(), - ModuleType::ColumnMetaData, - rg.ordinal.unwrap() as usize, - i, - None, - )?; - - let buf = c.encrypted_column_metadata.clone().unwrap(); - let decrypted_cc_buf = column_decryptor - .decrypt(buf.as_slice(), column_aad.as_ref()) - .map_err(|_| { - general_err!( - "Unable to decrypt column '{}', perhaps the column key is wrong?", - d.path().string() - ) - })?; - - let mut prot = TCompactSliceInputProtocol::new(decrypted_cc_buf.as_slice()); - c.meta_data = Some(ColumnMetaData::read_from_in_protocol(&mut prot)?); - } - columns.push(ColumnChunkMetaData::from_thrift(d.clone(), c)?); - } - - let sorting_columns = rg.sorting_columns; - Ok(RowGroupMetaData { - columns, - num_rows, - sorting_columns, - total_byte_size, - schema_descr, - file_offset: rg.file_offset, - ordinal: rg.ordinal, - }) - } - - /// Method to convert from Thrift. - pub fn from_thrift(schema_descr: SchemaDescPtr, mut rg: RowGroup) -> Result { - if schema_descr.num_columns() != rg.columns.len() { - return Err(general_err!( - "Column count mismatch. Schema has {} columns while Row Group has {}", - schema_descr.num_columns(), - rg.columns.len() - )); - } - let total_byte_size = rg.total_byte_size; - let num_rows = rg.num_rows; - let mut columns = vec![]; - - for (c, d) in rg.columns.drain(0..).zip(schema_descr.columns()) { - columns.push(ColumnChunkMetaData::from_thrift(d.clone(), c)?); - } - - let sorting_columns = rg.sorting_columns; - Ok(RowGroupMetaData { - columns, - num_rows, - sorting_columns, - total_byte_size, - schema_descr, - file_offset: rg.file_offset, - ordinal: rg.ordinal, - }) - } - - /// Method to convert to Thrift. - pub fn to_thrift(&self) -> RowGroup { - RowGroup { - columns: self.columns().iter().map(|v| v.to_thrift()).collect(), - total_byte_size: self.total_byte_size, - num_rows: self.num_rows, - sorting_columns: self.sorting_columns().cloned(), - file_offset: self.file_offset(), - total_compressed_size: Some(self.compressed_size()), - ordinal: self.ordinal, - } - } - /// Converts this [`RowGroupMetaData`] into a [`RowGroupMetaDataBuilder`] pub fn into_builder(self) -> RowGroupMetaDataBuilder { RowGroupMetaDataBuilder(self) @@ -853,6 +777,8 @@ pub struct ColumnChunkMetaData { definition_level_histogram: Option, #[cfg(feature = "encryption")] column_crypto_metadata: Option, + #[cfg(feature = "encryption")] + encrypted_column_metadata: Option>, } /// Histograms for repetition and definition levels. 
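Aside (editor's sketch, not part of the patch): the `from_thrift`/`to_thrift` conversions removed above are replaced by the `WriteThrift`/`ReadThrift` traits. A minimal round-trip in the style of this patch's tests; `row_group_meta` and `schema_descr` are assumed to already exist, and `read_row_group` is the crate-internal test helper used in the tests further below.

```rust
// Sketch only: serialize RowGroupMetaData with the new WriteThrift machinery,
// then decode it back with the crate-internal test helper `read_row_group`.
let mut buf = Vec::new();
let mut writer = ThriftCompactOutputProtocol::new(&mut buf);
row_group_meta.write_thrift(&mut writer).unwrap();
let round_tripped = read_row_group(&mut buf, schema_descr).unwrap();
assert_eq!(round_tripped, row_group_meta);
```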
@@ -1154,183 +1080,10 @@ impl ColumnChunkMetaData { self.column_crypto_metadata.as_ref() } - /// Method to convert from Thrift. - pub fn from_thrift(column_descr: ColumnDescPtr, cc: ColumnChunk) -> Result { - if cc.meta_data.is_none() { - return Err(general_err!("Expected to have column metadata")); - } - let mut col_metadata: ColumnMetaData = cc.meta_data.unwrap(); - let column_type = Type::try_from(col_metadata.type_)?; - let encodings = col_metadata - .encodings - .drain(0..) - .map(Encoding::try_from) - .collect::>()?; - let compression = Compression::try_from(col_metadata.codec)?; - let file_path = cc.file_path; - let file_offset = cc.file_offset; - let num_values = col_metadata.num_values; - let total_compressed_size = col_metadata.total_compressed_size; - let total_uncompressed_size = col_metadata.total_uncompressed_size; - let data_page_offset = col_metadata.data_page_offset; - let index_page_offset = col_metadata.index_page_offset; - let dictionary_page_offset = col_metadata.dictionary_page_offset; - let statistics = statistics::from_thrift(column_type, col_metadata.statistics)?; - let geo_statistics = - geo_statistics::from_thrift(col_metadata.geospatial_statistics).map(Box::new); - let encoding_stats = col_metadata - .encoding_stats - .as_ref() - .map(|vec| { - vec.iter() - .map(page_encoding_stats::try_from_thrift) - .collect::>() - }) - .transpose()?; - let bloom_filter_offset = col_metadata.bloom_filter_offset; - let bloom_filter_length = col_metadata.bloom_filter_length; - let offset_index_offset = cc.offset_index_offset; - let offset_index_length = cc.offset_index_length; - let column_index_offset = cc.column_index_offset; - let column_index_length = cc.column_index_length; - let ( - unencoded_byte_array_data_bytes, - repetition_level_histogram, - definition_level_histogram, - ) = if let Some(size_stats) = col_metadata.size_statistics { - ( - size_stats.unencoded_byte_array_data_bytes, - size_stats.repetition_level_histogram, - size_stats.definition_level_histogram, - ) - } else { - (None, None, None) - }; - - let repetition_level_histogram = repetition_level_histogram.map(LevelHistogram::from); - let definition_level_histogram = definition_level_histogram.map(LevelHistogram::from); - - #[cfg(feature = "encryption")] - let column_crypto_metadata = if let Some(crypto_metadata) = cc.crypto_metadata { - Some(column_crypto_metadata::try_from_thrift(&crypto_metadata)?) - } else { - None - }; - - let result = ColumnChunkMetaData { - column_descr, - encodings, - file_path, - file_offset, - num_values, - compression, - total_compressed_size, - total_uncompressed_size, - data_page_offset, - index_page_offset, - dictionary_page_offset, - statistics, - encoding_stats, - bloom_filter_offset, - bloom_filter_length, - offset_index_offset, - offset_index_length, - column_index_offset, - column_index_length, - unencoded_byte_array_data_bytes, - repetition_level_histogram, - definition_level_histogram, - geo_statistics, - #[cfg(feature = "encryption")] - column_crypto_metadata, - }; - Ok(result) - } - - /// Method to convert to Thrift. 
- pub fn to_thrift(&self) -> ColumnChunk {
- let column_metadata = self.to_column_metadata_thrift();
-
- ColumnChunk {
- file_path: self.file_path().map(|s| s.to_owned()),
- file_offset: self.file_offset,
- meta_data: Some(column_metadata),
- offset_index_offset: self.offset_index_offset,
- offset_index_length: self.offset_index_length,
- column_index_offset: self.column_index_offset,
- column_index_length: self.column_index_length,
- crypto_metadata: self.column_crypto_metadata_thrift(),
- encrypted_column_metadata: None,
- }
- }
-
- /// Method to convert to Thrift `ColumnMetaData`
- pub fn to_column_metadata_thrift(&self) -> ColumnMetaData {
- let size_statistics = if self.unencoded_byte_array_data_bytes.is_some()
- || self.repetition_level_histogram.is_some()
- || self.definition_level_histogram.is_some()
- {
- let repetition_level_histogram = self
- .repetition_level_histogram
- .as_ref()
- .map(|hist| hist.clone().into_inner());
-
- let definition_level_histogram = self
- .definition_level_histogram
- .as_ref()
- .map(|hist| hist.clone().into_inner());
-
- Some(SizeStatistics {
- unencoded_byte_array_data_bytes: self.unencoded_byte_array_data_bytes,
- repetition_level_histogram,
- definition_level_histogram,
- })
- } else {
- None
- };
-
- ColumnMetaData {
- type_: self.column_type().into(),
- encodings: self.encodings().iter().map(|&v| v.into()).collect(),
- path_in_schema: self.column_path().as_ref().to_vec(),
- codec: self.compression.into(),
- num_values: self.num_values,
- total_uncompressed_size: self.total_uncompressed_size,
- total_compressed_size: self.total_compressed_size,
- key_value_metadata: None,
- data_page_offset: self.data_page_offset,
- index_page_offset: self.index_page_offset,
- dictionary_page_offset: self.dictionary_page_offset,
- statistics: statistics::to_thrift(self.statistics.as_ref()),
- encoding_stats: self
- .encoding_stats
- .as_ref()
- .map(|vec| vec.iter().map(page_encoding_stats::to_thrift).collect()),
- bloom_filter_offset: self.bloom_filter_offset,
- bloom_filter_length: self.bloom_filter_length,
- size_statistics,
- geospatial_statistics: geo_statistics::to_thrift(
- self.geo_statistics.as_ref().map(|boxed| boxed.as_ref()),
- ),
- }
- }
/// Converts this [`ColumnChunkMetaData`] into a [`ColumnChunkMetaDataBuilder`]
pub fn into_builder(self) -> ColumnChunkMetaDataBuilder {
ColumnChunkMetaDataBuilder::from(self)
}
-
- #[cfg(feature = "encryption")]
- fn column_crypto_metadata_thrift(&self) -> Option<TColumnCryptoMetaData> {
- self.column_crypto_metadata
- .as_ref()
- .map(column_crypto_metadata::to_thrift)
- }
-
- #[cfg(not(feature = "encryption"))]
- fn column_crypto_metadata_thrift(&self) -> Option<TColumnCryptoMetaData> {
- None
- }
}
/// Builder for [`ColumnChunkMetaData`]
@@ -1384,6 +1137,8 @@ impl ColumnChunkMetaDataBuilder {
definition_level_histogram: None,
#[cfg(feature = "encryption")]
column_crypto_metadata: None,
+ #[cfg(feature = "encryption")]
+ encrypted_column_metadata: None,
})
}
@@ -1541,7 +1296,9 @@ impl ColumnChunkMetaDataBuilder {
/// Builder for Parquet [`ColumnIndex`], part of the Parquet [PageIndex]
///
/// [PageIndex]: https://github.com/apache/parquet-format/blob/master/PageIndex.md
+/// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md
pub struct ColumnIndexBuilder {
+ column_type: Type,
null_pages: Vec<bool>,
min_values: Vec<Vec<u8>>,
max_values: Vec<Vec<u8>>,
@@ -1561,16 +1318,11 @@ pub struct ColumnIndexBuilder {
valid: bool,
}
-impl Default for ColumnIndexBuilder {
- fn default() -> Self {
- Self::new()
- }
-}
impl ColumnIndexBuilder {
/// Creates a
new column index builder. - pub fn new() -> Self { + pub fn new(column_type: Type) -> Self { ColumnIndexBuilder { + column_type, null_pages: Vec::new(), min_values: Vec::new(), max_values: Vec::new(), @@ -1598,6 +1350,8 @@ impl ColumnIndexBuilder { /// Append the given page-level histograms to the [`ColumnIndex`] histograms. /// Does nothing if the `ColumnIndexBuilder` is not in the `valid` state. + /// + /// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md pub fn append_histograms( &mut self, repetition_level_histogram: &Option, @@ -1633,18 +1387,76 @@ impl ColumnIndexBuilder { self.valid } - /// Build and get the thrift metadata of column index + /// Build and get the column index /// /// Note: callers should check [`Self::valid`] before calling this method - pub fn build_to_thrift(self) -> ColumnIndex { - ColumnIndex::new( + pub fn build(self) -> Result { + Ok(match self.column_type { + Type::BOOLEAN => { + let index = self.build_page_index()?; + ColumnIndexMetaData::BOOLEAN(index) + } + Type::INT32 => { + let index = self.build_page_index()?; + ColumnIndexMetaData::INT32(index) + } + Type::INT64 => { + let index = self.build_page_index()?; + ColumnIndexMetaData::INT64(index) + } + Type::INT96 => { + let index = self.build_page_index()?; + ColumnIndexMetaData::INT96(index) + } + Type::FLOAT => { + let index = self.build_page_index()?; + ColumnIndexMetaData::FLOAT(index) + } + Type::DOUBLE => { + let index = self.build_page_index()?; + ColumnIndexMetaData::DOUBLE(index) + } + Type::BYTE_ARRAY => { + let index = self.build_byte_array_index()?; + ColumnIndexMetaData::BYTE_ARRAY(index) + } + Type::FIXED_LEN_BYTE_ARRAY => { + let index = self.build_byte_array_index()?; + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(index) + } + }) + } + + fn build_page_index(self) -> Result> + where + T: ParquetValueType, + { + let min_values: Vec<&[u8]> = self.min_values.iter().map(|v| v.as_slice()).collect(); + let max_values: Vec<&[u8]> = self.max_values.iter().map(|v| v.as_slice()).collect(); + + PrimitiveColumnIndex::try_new( + self.null_pages, + self.boundary_order, + Some(self.null_counts), + self.repetition_level_histograms, + self.definition_level_histograms, + min_values, + max_values, + ) + } + + fn build_byte_array_index(self) -> Result { + let min_values: Vec<&[u8]> = self.min_values.iter().map(|v| v.as_slice()).collect(); + let max_values: Vec<&[u8]> = self.max_values.iter().map(|v| v.as_slice()).collect(); + + ByteArrayColumnIndex::try_new( self.null_pages, - self.min_values, - self.max_values, self.boundary_order, - self.null_counts, + Some(self.null_counts), self.repetition_level_histograms, self.definition_level_histograms, + min_values, + max_values, ) } } @@ -1710,15 +1522,22 @@ impl OffsetIndexBuilder { } /// Build and get the thrift metadata of offset index - pub fn build_to_thrift(self) -> OffsetIndex { + pub fn build(self) -> OffsetIndexMetaData { let locations = self .offset_array .iter() .zip(self.compressed_page_size_array.iter()) .zip(self.first_row_index_array.iter()) - .map(|((offset, size), row_index)| PageLocation::new(*offset, *size, *row_index)) + .map(|((offset, size), row_index)| PageLocation { + offset: *offset, + compressed_page_size: *size, + first_row_index: *row_index, + }) .collect::>(); - OffsetIndex::new(locations, self.unencoded_byte_array_data_bytes_array) + OffsetIndexMetaData { + page_locations: locations, + unencoded_byte_array_data_bytes: self.unencoded_byte_array_data_bytes_array, + } } } @@ -1726,7 +1545,7 @@ impl 
OffsetIndexBuilder { mod tests { use super::*; use crate::basic::{PageType, SortOrder}; - use crate::file::page_index::index::NativeIndex; + use crate::file::metadata::thrift_gen::tests::{read_column_chunk, read_row_group}; #[test] fn test_row_group_metadata_thrift_conversion() { @@ -1745,12 +1564,13 @@ mod tests { .build() .unwrap(); - let row_group_exp = row_group_meta.to_thrift(); - let row_group_res = RowGroupMetaData::from_thrift(schema_descr, row_group_exp.clone()) - .unwrap() - .to_thrift(); + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + row_group_meta.write_thrift(&mut writer).unwrap(); - assert_eq!(row_group_res, row_group_exp); + let row_group_res = read_row_group(&mut buf, schema_descr).unwrap(); + + assert_eq!(row_group_res, row_group_meta); } #[test] @@ -1826,11 +1646,13 @@ mod tests { .set_ordinal(1) .build() .unwrap(); + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + row_group_meta_2cols.write_thrift(&mut writer).unwrap(); - let err = - RowGroupMetaData::from_thrift(schema_descr_3cols, row_group_meta_2cols.to_thrift()) - .unwrap_err() - .to_string(); + let err = read_row_group(&mut buf, schema_descr_3cols) + .unwrap_err() + .to_string(); assert_eq!( err, "Parquet error: Column count mismatch. Schema has 3 columns while Row Group has 2" @@ -1874,8 +1696,10 @@ mod tests { .build() .unwrap(); - let col_chunk_res = - ColumnChunkMetaData::from_thrift(column_descr, col_metadata.to_thrift()).unwrap(); + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + col_metadata.write_thrift(&mut writer).unwrap(); + let col_chunk_res = read_column_chunk(&mut buf, column_descr).unwrap(); assert_eq!(col_chunk_res, col_metadata); } @@ -1888,12 +1712,12 @@ mod tests { .build() .unwrap(); - let col_chunk_exp = col_metadata.to_thrift(); - let col_chunk_res = ColumnChunkMetaData::from_thrift(column_descr, col_chunk_exp.clone()) - .unwrap() - .to_thrift(); + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + col_metadata.write_thrift(&mut writer).unwrap(); + let col_chunk_res = read_column_chunk(&mut buf, column_descr).unwrap(); - assert_eq!(col_chunk_res, col_chunk_exp); + assert_eq!(col_chunk_res, col_metadata); } #[test] @@ -1992,16 +1816,19 @@ mod tests { .build(); #[cfg(not(feature = "encryption"))] - let base_expected_size = 2344; + let base_expected_size = 2312; #[cfg(feature = "encryption")] - let base_expected_size = 2680; + let base_expected_size = 2744; assert_eq!(parquet_meta.memory_size(), base_expected_size); - let mut column_index = ColumnIndexBuilder::new(); + let mut column_index = ColumnIndexBuilder::new(Type::BOOLEAN); column_index.append(false, vec![1u8], vec![2u8, 3u8], 4); - let column_index = column_index.build_to_thrift(); - let native_index = NativeIndex::::try_new(column_index).unwrap(); + let column_index = column_index.build().unwrap(); + let native_index = match column_index { + ColumnIndexMetaData::BOOLEAN(index) => index, + _ => panic!("wrong type of column index"), + }; // Now, add in OffsetIndex let mut offset_index = OffsetIndexBuilder::new(); @@ -2011,20 +1838,18 @@ mod tests { offset_index.append_row_count(1); offset_index.append_offset_and_size(2, 3); offset_index.append_unencoded_byte_array_data_bytes(Some(10)); - let offset_index = offset_index.build_to_thrift(); + let offset_index = offset_index.build(); let parquet_meta = ParquetMetaDataBuilder::new(file_metadata) .set_row_groups(row_group_meta) 
- .set_column_index(Some(vec![vec![Index::BOOLEAN(native_index)]])) - .set_offset_index(Some(vec![vec![ - OffsetIndexMetaData::try_new(offset_index).unwrap() - ]])) + .set_column_index(Some(vec![vec![ColumnIndexMetaData::BOOLEAN(native_index)]])) + .set_offset_index(Some(vec![vec![offset_index]])) .build(); #[cfg(not(feature = "encryption"))] - let bigger_expected_size = 2848; + let bigger_expected_size = 2738; #[cfg(feature = "encryption")] - let bigger_expected_size = 3184; + let bigger_expected_size = 3170; // more set fields means more memory usage assert!(bigger_expected_size > base_expected_size); diff --git a/parquet/src/file/metadata/parser.rs b/parquet/src/file/metadata/parser.rs index 2a297e227377..cbe005d8f96a 100644 --- a/parquet/src/file/metadata/parser.rs +++ b/parquet/src/file/metadata/parser.rs @@ -20,28 +20,14 @@ //! These functions parse thrift-encoded metadata from a byte slice //! into the corresponding Rust structures -use crate::basic::ColumnOrder; use crate::errors::ParquetError; -use crate::file::metadata::{ - ColumnChunkMetaData, FileMetaData, PageIndexPolicy, ParquetMetaData, RowGroupMetaData, -}; -use crate::file::page_index::index::Index; +use crate::file::metadata::{ColumnChunkMetaData, PageIndexPolicy, ParquetMetaData}; + +use crate::file::page_index::column_index::ColumnIndexMetaData; use crate::file::page_index::index_reader::{decode_column_index, decode_offset_index}; use crate::file::page_index::offset_index::OffsetIndexMetaData; -use crate::schema::types; -use crate::schema::types::SchemaDescriptor; -use crate::thrift::TCompactSliceInputProtocol; -use crate::thrift::TSerializable; +use crate::parquet_thrift::{ReadThrift, ThriftSliceInputProtocol}; use bytes::Bytes; -use std::sync::Arc; - -#[cfg(feature = "encryption")] -use crate::encryption::{ - decrypt::{FileDecryptionProperties, FileDecryptor}, - modules::create_footer_aad, -}; -#[cfg(feature = "encryption")] -use crate::format::EncryptionAlgorithm; /// Helper struct for metadata parsing /// @@ -49,11 +35,13 @@ use crate::format::EncryptionAlgorithm; /// such as [`ParquetMetaData`], handling decryption if necessary. // // Note this structure is used to minimize the number of -// places need to add `#[cfg(feature = "encryption")]` checks. +// places to add `#[cfg(feature = "encryption")]` checks. 
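+//
+// The shape of that pattern, in outline (a sketch only; the real `inner`
+// modules follow below):
+//
+//   #[cfg(feature = "encryption")]
+//   mod inner {
+//       // holds optional FileDecryptionProperties and decrypts before decoding
+//       pub(crate) struct MetadataParser { /* ... */ }
+//   }
+//
+//   #[cfg(not(feature = "encryption"))]
+//   mod inner {
+//       // same surface, plain decoding only
+//       pub(crate) struct MetadataParser;
+//   }
+//
+// Callers import `inner::MetadataParser` and never see the feature flag.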
pub(crate) use inner::MetadataParser; #[cfg(feature = "encryption")] mod inner { + use std::sync::Arc; + use super::*; use crate::encryption::decrypt::FileDecryptionProperties; use crate::errors::Result; @@ -83,13 +71,69 @@ mod inner { buf: &[u8], encrypted_footer: bool, ) -> Result { - decode_metadata_with_encryption( - buf, - encrypted_footer, + crate::file::metadata::thrift_gen::parquet_metadata_with_encryption( self.file_decryption_properties.as_deref(), + encrypted_footer, + buf, ) } } + + pub(super) fn parse_single_column_index( + bytes: &[u8], + metadata: &ParquetMetaData, + column: &ColumnChunkMetaData, + row_group_index: usize, + col_index: usize, + ) -> crate::errors::Result { + use crate::encryption::decrypt::CryptoContext; + match &column.column_crypto_metadata { + Some(crypto_metadata) => { + let file_decryptor = metadata.file_decryptor.as_ref().ok_or_else(|| { + general_err!("Cannot decrypt column index, no file decryptor set") + })?; + let crypto_context = CryptoContext::for_column( + file_decryptor, + crypto_metadata, + row_group_index, + col_index, + )?; + let column_decryptor = crypto_context.metadata_decryptor(); + let aad = crypto_context.create_column_index_aad()?; + let plaintext = column_decryptor.decrypt(bytes, &aad)?; + decode_column_index(&plaintext, column.column_type()) + } + None => decode_column_index(bytes, column.column_type()), + } + } + + pub(super) fn parse_single_offset_index( + bytes: &[u8], + metadata: &ParquetMetaData, + column: &ColumnChunkMetaData, + row_group_index: usize, + col_index: usize, + ) -> crate::errors::Result { + use crate::encryption::decrypt::CryptoContext; + match &column.column_crypto_metadata { + Some(crypto_metadata) => { + let file_decryptor = metadata.file_decryptor.as_ref().ok_or_else(|| { + general_err!("Cannot decrypt offset index, no file decryptor set") + })?; + let crypto_context = CryptoContext::for_column( + file_decryptor, + crypto_metadata, + row_group_index, + col_index, + )?; + let column_decryptor = crypto_context.metadata_decryptor(); + let aad = crypto_context.create_offset_index_aad()?; + let plaintext = column_decryptor.decrypt(bytes, &aad)?; + decode_offset_index(&plaintext) + } + None => decode_offset_index(bytes), + } + } } #[cfg(not(feature = "encryption"))] @@ -121,6 +165,26 @@ mod inner { } } } + + pub(super) fn parse_single_column_index( + bytes: &[u8], + _metadata: &ParquetMetaData, + column: &ColumnChunkMetaData, + _row_group_index: usize, + _col_index: usize, + ) -> crate::errors::Result { + decode_column_index(bytes, column.column_type()) + } + + pub(super) fn parse_single_offset_index( + bytes: &[u8], + _metadata: &ParquetMetaData, + _column: &ColumnChunkMetaData, + _row_group_index: usize, + _col_index: usize, + ) -> crate::errors::Result { + decode_offset_index(bytes) + } } /// Decodes [`ParquetMetaData`] from the provided bytes. 
@@ -131,61 +195,8 @@ mod inner { /// /// [Parquet Spec]: https://github.com/apache/parquet-format#metadata pub(crate) fn decode_metadata(buf: &[u8]) -> crate::errors::Result { - let mut prot = TCompactSliceInputProtocol::new(buf); - - let t_file_metadata: crate::format::FileMetaData = - crate::format::FileMetaData::read_from_in_protocol(&mut prot) - .map_err(|e| general_err!("Could not parse metadata: {}", e))?; - let schema = types::from_thrift(&t_file_metadata.schema)?; - let schema_descr = Arc::new(SchemaDescriptor::new(schema)); - - let mut row_groups = Vec::new(); - for rg in t_file_metadata.row_groups { - row_groups.push(RowGroupMetaData::from_thrift(schema_descr.clone(), rg)?); - } - let column_orders = parse_column_orders(t_file_metadata.column_orders, &schema_descr)?; - - let file_metadata = FileMetaData::new( - t_file_metadata.version, - t_file_metadata.num_rows, - t_file_metadata.created_by, - t_file_metadata.key_value_metadata, - schema_descr, - column_orders, - ); - - Ok(ParquetMetaData::new(file_metadata, row_groups)) -} - -/// Parses column orders from Thrift definition. -/// If no column orders are defined, returns `None`. -fn parse_column_orders( - t_column_orders: Option>, - schema_descr: &SchemaDescriptor, -) -> crate::errors::Result>> { - match t_column_orders { - Some(orders) => { - // Should always be the case - if orders.len() != schema_descr.num_columns() { - return Err(general_err!("Column order length mismatch")); - }; - let mut res = Vec::new(); - for (i, column) in schema_descr.columns().iter().enumerate() { - match orders[i] { - crate::format::ColumnOrder::TYPEORDER(_) => { - let sort_order = ColumnOrder::get_sort_order( - column.logical_type(), - column.converted_type(), - column.physical_type(), - ); - res.push(ColumnOrder::TYPE_DEFINED_ORDER(sort_order)); - } - } - } - Ok(Some(res)) - } - None => Ok(None), - } + let mut prot = ThriftSliceInputProtocol::new(buf); + ParquetMetaData::read_thrift(&mut prot) } /// Parses column index from the provided bytes and adds it to the metadata. 
@@ -217,7 +228,7 @@ pub(crate) fn parse_column_index( Some(r) => { let r_start = usize::try_from(r.start - start_offset)?; let r_end = usize::try_from(r.end - start_offset)?; - parse_single_column_index( + inner::parse_single_column_index( &bytes[r_start..r_end], metadata, c, @@ -225,7 +236,7 @@ pub(crate) fn parse_column_index( col_idx, ) } - None => Ok(Index::NONE), + None => Ok(ColumnIndexMetaData::NONE), }) .collect::>>() }) @@ -235,46 +246,6 @@ pub(crate) fn parse_column_index( Ok(()) } -#[cfg(feature = "encryption")] -fn parse_single_column_index( - bytes: &[u8], - metadata: &ParquetMetaData, - column: &ColumnChunkMetaData, - row_group_index: usize, - col_index: usize, -) -> crate::errors::Result { - use crate::encryption::decrypt::CryptoContext; - match &column.column_crypto_metadata { - Some(crypto_metadata) => { - let file_decryptor = metadata.file_decryptor.as_ref().ok_or_else(|| { - general_err!("Cannot decrypt column index, no file decryptor set") - })?; - let crypto_context = CryptoContext::for_column( - file_decryptor, - crypto_metadata, - row_group_index, - col_index, - )?; - let column_decryptor = crypto_context.metadata_decryptor(); - let aad = crypto_context.create_column_index_aad()?; - let plaintext = column_decryptor.decrypt(bytes, &aad)?; - decode_column_index(&plaintext, column.column_type()) - } - None => decode_column_index(bytes, column.column_type()), - } -} - -#[cfg(not(feature = "encryption"))] -fn parse_single_column_index( - bytes: &[u8], - _metadata: &ParquetMetaData, - column: &ColumnChunkMetaData, - _row_group_index: usize, - _col_index: usize, -) -> crate::errors::Result { - decode_column_index(bytes, column.column_type()) -} - pub(crate) fn parse_offset_index( metadata: &mut ParquetMetaData, offset_index_policy: PageIndexPolicy, @@ -293,7 +264,13 @@ pub(crate) fn parse_offset_index( Some(r) => { let r_start = usize::try_from(r.start - start_offset)?; let r_end = usize::try_from(r.end - start_offset)?; - parse_single_offset_index(&bytes[r_start..r_end], metadata, c, rg_idx, col_idx) + inner::parse_single_offset_index( + &bytes[r_start..r_end], + metadata, + c, + rg_idx, + col_idx, + ) } None => Err(general_err!("missing offset index")), }; @@ -317,239 +294,3 @@ pub(crate) fn parse_offset_index( metadata.set_offset_index(Some(all_indexes)); Ok(()) } - -#[cfg(feature = "encryption")] -fn parse_single_offset_index( - bytes: &[u8], - metadata: &ParquetMetaData, - column: &ColumnChunkMetaData, - row_group_index: usize, - col_index: usize, -) -> crate::errors::Result { - use crate::encryption::decrypt::CryptoContext; - match &column.column_crypto_metadata { - Some(crypto_metadata) => { - let file_decryptor = metadata.file_decryptor.as_ref().ok_or_else(|| { - general_err!("Cannot decrypt offset index, no file decryptor set") - })?; - let crypto_context = CryptoContext::for_column( - file_decryptor, - crypto_metadata, - row_group_index, - col_index, - )?; - let column_decryptor = crypto_context.metadata_decryptor(); - let aad = crypto_context.create_offset_index_aad()?; - let plaintext = column_decryptor.decrypt(bytes, &aad)?; - decode_offset_index(&plaintext) - } - None => decode_offset_index(bytes), - } -} - -#[cfg(not(feature = "encryption"))] -fn parse_single_offset_index( - bytes: &[u8], - _metadata: &ParquetMetaData, - _column: &ColumnChunkMetaData, - _row_group_index: usize, - _col_index: usize, -) -> crate::errors::Result { - decode_offset_index(bytes) -} - -/// Decodes [`ParquetMetaData`] from the provided bytes, handling metadata that may be encrypted. 
-/// -/// Typically this is used to decode the metadata from the end of a parquet -/// file. The format of `buf` is the Thrift compact binary protocol, as specified -/// by the [Parquet Spec]. Buffer can be encrypted with AES GCM or AES CTR -/// ciphers as specfied in the [Parquet Encryption Spec]. -/// -/// [Parquet Spec]: https://github.com/apache/parquet-format#metadata -/// [Parquet Encryption Spec]: https://parquet.apache.org/docs/file-format/data-pages/encryption/ -#[cfg(feature = "encryption")] -fn decode_metadata_with_encryption( - buf: &[u8], - encrypted_footer: bool, - file_decryption_properties: Option<&FileDecryptionProperties>, -) -> crate::errors::Result { - let mut prot = TCompactSliceInputProtocol::new(buf); - let mut file_decryptor = None; - let decrypted_fmd_buf; - - if encrypted_footer { - if let Some(file_decryption_properties) = file_decryption_properties { - let t_file_crypto_metadata: crate::format::FileCryptoMetaData = - crate::format::FileCryptoMetaData::read_from_in_protocol(&mut prot) - .map_err(|e| general_err!("Could not parse crypto metadata: {}", e))?; - let supply_aad_prefix = match &t_file_crypto_metadata.encryption_algorithm { - EncryptionAlgorithm::AESGCMV1(algo) => algo.supply_aad_prefix, - _ => Some(false), - } - .unwrap_or(false); - if supply_aad_prefix && file_decryption_properties.aad_prefix().is_none() { - return Err(general_err!( - "Parquet file was encrypted with an AAD prefix that is not stored in the file, \ - but no AAD prefix was provided in the file decryption properties" - )); - } - let decryptor = get_file_decryptor( - t_file_crypto_metadata.encryption_algorithm, - t_file_crypto_metadata.key_metadata.as_deref(), - file_decryption_properties, - )?; - let footer_decryptor = decryptor.get_footer_decryptor(); - let aad_footer = create_footer_aad(decryptor.file_aad())?; - - decrypted_fmd_buf = footer_decryptor? 
- .decrypt(prot.as_slice().as_ref(), aad_footer.as_ref()) - .map_err(|_| { - general_err!( - "Provided footer key and AAD were unable to decrypt parquet footer" - ) - })?; - prot = TCompactSliceInputProtocol::new(decrypted_fmd_buf.as_ref()); - - file_decryptor = Some(decryptor); - } else { - return Err(general_err!( - "Parquet file has an encrypted footer but decryption properties were not provided" - )); - } - } - - use crate::format::FileMetaData as TFileMetaData; - let t_file_metadata: TFileMetaData = TFileMetaData::read_from_in_protocol(&mut prot) - .map_err(|e| general_err!("Could not parse metadata: {}", e))?; - let schema = types::from_thrift(&t_file_metadata.schema)?; - let schema_descr = Arc::new(SchemaDescriptor::new(schema)); - - if let (Some(algo), Some(file_decryption_properties)) = ( - t_file_metadata.encryption_algorithm, - file_decryption_properties, - ) { - // File has a plaintext footer but encryption algorithm is set - let file_decryptor_value = get_file_decryptor( - algo, - t_file_metadata.footer_signing_key_metadata.as_deref(), - file_decryption_properties, - )?; - if file_decryption_properties.check_plaintext_footer_integrity() && !encrypted_footer { - file_decryptor_value.verify_plaintext_footer_signature(buf)?; - } - file_decryptor = Some(file_decryptor_value); - } - - let mut row_groups = Vec::new(); - for rg in t_file_metadata.row_groups { - let r = RowGroupMetaData::from_encrypted_thrift( - schema_descr.clone(), - rg, - file_decryptor.as_ref(), - )?; - row_groups.push(r); - } - let column_orders = parse_column_orders(t_file_metadata.column_orders, &schema_descr)?; - - let file_metadata = FileMetaData::new( - t_file_metadata.version, - t_file_metadata.num_rows, - t_file_metadata.created_by, - t_file_metadata.key_value_metadata, - schema_descr, - column_orders, - ); - let mut metadata = ParquetMetaData::new(file_metadata, row_groups); - - metadata.with_file_decryptor(file_decryptor); - - Ok(metadata) -} - -#[cfg(feature = "encryption")] -fn get_file_decryptor( - encryption_algorithm: EncryptionAlgorithm, - footer_key_metadata: Option<&[u8]>, - file_decryption_properties: &FileDecryptionProperties, -) -> crate::errors::Result { - match encryption_algorithm { - EncryptionAlgorithm::AESGCMV1(algo) => { - let aad_file_unique = algo - .aad_file_unique - .ok_or_else(|| general_err!("AAD unique file identifier is not set"))?; - let aad_prefix = if let Some(aad_prefix) = file_decryption_properties.aad_prefix() { - aad_prefix.clone() - } else { - algo.aad_prefix.unwrap_or_default() - }; - - FileDecryptor::new( - file_decryption_properties, - footer_key_metadata, - aad_file_unique, - aad_prefix, - ) - } - EncryptionAlgorithm::AESGCMCTRV1(_) => Err(nyi_err!( - "The AES_GCM_CTR_V1 encryption algorithm is not yet supported" - )), - } -} - -#[cfg(test)] -mod test { - use super::*; - use crate::basic::{SortOrder, Type}; - use crate::file::metadata::SchemaType; - use crate::format::ColumnOrder as TColumnOrder; - use crate::format::TypeDefinedOrder; - #[test] - fn test_metadata_column_orders_parse() { - // Define simple schema, we do not need to provide logical types. 
- let fields = vec![ - Arc::new( - SchemaType::primitive_type_builder("col1", Type::INT32) - .build() - .unwrap(), - ), - Arc::new( - SchemaType::primitive_type_builder("col2", Type::FLOAT) - .build() - .unwrap(), - ), - ]; - let schema = SchemaType::group_type_builder("schema") - .with_fields(fields) - .build() - .unwrap(); - let schema_descr = SchemaDescriptor::new(Arc::new(schema)); - - let t_column_orders = Some(vec![ - TColumnOrder::TYPEORDER(TypeDefinedOrder::new()), - TColumnOrder::TYPEORDER(TypeDefinedOrder::new()), - ]); - - assert_eq!( - parse_column_orders(t_column_orders, &schema_descr).unwrap(), - Some(vec![ - ColumnOrder::TYPE_DEFINED_ORDER(SortOrder::SIGNED), - ColumnOrder::TYPE_DEFINED_ORDER(SortOrder::SIGNED) - ]) - ); - - // Test when no column orders are defined. - assert_eq!(parse_column_orders(None, &schema_descr).unwrap(), None); - } - - #[test] - fn test_metadata_column_orders_len_mismatch() { - let schema = SchemaType::group_type_builder("schema").build().unwrap(); - let schema_descr = SchemaDescriptor::new(Arc::new(schema)); - - let t_column_orders = Some(vec![TColumnOrder::TYPEORDER(TypeDefinedOrder::new())]); - - let res = parse_column_orders(t_column_orders, &schema_descr); - assert!(res.is_err()); - assert!(format!("{:?}", res.unwrap_err()).contains("Column order length mismatch")); - } -} diff --git a/parquet/src/file/metadata/reader.rs b/parquet/src/file/metadata/reader.rs index 61bfcd443cd1..26104ea81ac1 100644 --- a/parquet/src/file/metadata/reader.rs +++ b/parquet/src/file/metadata/reader.rs @@ -18,6 +18,7 @@ #[cfg(feature = "encryption")] use crate::encryption::decrypt::FileDecryptionProperties; use crate::errors::{ParquetError, Result}; +use crate::file::metadata::parser::decode_metadata; use crate::file::metadata::{FooterTail, ParquetMetaData, ParquetMetaDataPushDecoder}; use crate::file::reader::ChunkReader; use crate::file::FOOTER_SIZE; @@ -26,7 +27,6 @@ use std::{io::Read, ops::Range}; #[cfg(all(feature = "async", feature = "arrow"))] use crate::arrow::async_reader::{MetadataFetch, MetadataSuffixFetch}; -use crate::file::metadata::parser::decode_metadata; use crate::DecodeResult; /// Reads [`ParquetMetaData`] from a byte stream, with either synchronous or @@ -789,7 +789,6 @@ impl ParquetMetaDataReader { /// /// [Parquet Spec]: https://github.com/apache/parquet-format#metadata pub fn decode_metadata(buf: &[u8]) -> Result { - // Note this API does not support encryption. decode_metadata(buf) } } diff --git a/parquet/src/file/metadata/thrift_gen.rs b/parquet/src/file/metadata/thrift_gen.rs new file mode 100644 index 000000000000..489cb44cd77b --- /dev/null +++ b/parquet/src/file/metadata/thrift_gen.rs @@ -0,0 +1,1767 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. 
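+
+// In outline, the footer decode path that lands in this module is (a sketch
+// only; `decode_metadata` in parser.rs is the actual entry point):
+//
+//   let mut prot = ThriftSliceInputProtocol::new(footer_bytes);
+//   let metadata = ParquetMetaData::read_thrift(&mut prot)?;
+//
+// The `ReadThrift` impl for `ParquetMetaData` later in this file drives the
+// generated structs below.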
+ +// a collection of generated structs used to parse thrift metadata + +use std::io::Write; +use std::sync::Arc; + +use crate::{ + basic::{ + ColumnOrder, Compression, ConvertedType, Encoding, LogicalType, PageType, Repetition, Type, + }, + data_type::{ByteArray, FixedLenByteArray, Int96}, + errors::{ParquetError, Result}, + file::{ + metadata::{ + ColumnChunkMetaData, KeyValue, LevelHistogram, PageEncodingStats, ParquetMetaData, + RowGroupMetaData, SortingColumn, + }, + statistics::ValueStatistics, + }, + parquet_thrift::{ + read_thrift_vec, ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, + ThriftCompactOutputProtocol, WriteThrift, WriteThriftField, + }, + schema::types::{ + num_nodes, parquet_schema_from_array, ColumnDescriptor, SchemaDescriptor, TypePtr, + }, + thrift_struct, thrift_union, + util::bit_util::FromBytes, +}; +#[cfg(feature = "encryption")] +use crate::{ + encryption::decrypt::{FileDecryptionProperties, FileDecryptor}, + file::column_crypto_metadata::ColumnCryptoMetaData, + parquet_thrift::ThriftSliceInputProtocol, + schema::types::SchemaDescPtr, +}; + +// this needs to be visible to the schema conversion code +thrift_struct!( +pub(crate) struct SchemaElement<'a> { + /// Data type for this field. Not set if the current element is a non-leaf node + 1: optional Type r#type; + /// If type is FIXED_LEN_BYTE_ARRAY, this is the byte length of the values. + /// Otherwise, if specified, this is the maximum bit length to store any of the values. + /// (e.g. a low cardinality INT col could have this set to 3). Note that this is + /// in the schema, and therefore fixed for the entire file. + 2: optional i32 type_length; + /// Repetition of the field. The root of the schema does not have a repetition_type. + /// All other nodes must have one. + 3: optional Repetition repetition_type; + /// Name of the field in the schema + 4: required string<'a> name; + /// Nested fields. Since thrift does not support nested fields, + /// the nesting is flattened to a single list by a depth-first traversal. + /// The children count is used to construct the nested relationship. + /// This field is not set when the element is a primitive type. + 5: optional i32 num_children; + /// DEPRECATED: When the schema is the result of a conversion from another model. + /// Used to record the original type to help with cross conversion. + /// + /// This is superseded by logical_type. + 6: optional ConvertedType converted_type; + /// DEPRECATED: Used when this column contains decimal data. + /// See the DECIMAL converted type for more details. + /// + /// This is superseded by using the DecimalType annotation in logical_type. + 7: optional i32 scale + 8: optional i32 precision + /// When the original schema supports field ids, this will save the + /// original field id in the parquet schema + 9: optional i32 field_id; + /// The logical type of this SchemaElement + /// + /// LogicalType replaces ConvertedType, but ConvertedType is still required + /// for some logical types to ensure forward-compatibility in format v1. 
+ 10: optional LogicalType logical_type +} +); + +thrift_struct!( +pub(crate) struct AesGcmV1 { + /// AAD prefix + 1: optional binary aad_prefix + + /// Unique file identifier part of AAD suffix + 2: optional binary aad_file_unique + + /// In files encrypted with AAD prefix without storing it, + /// readers must supply the prefix + 3: optional bool supply_aad_prefix +} +); + +thrift_struct!( +pub(crate) struct AesGcmCtrV1 { + /// AAD prefix + 1: optional binary aad_prefix + + /// Unique file identifier part of AAD suffix + 2: optional binary aad_file_unique + + /// In files encrypted with AAD prefix without storing it, + /// readers must supply the prefix + 3: optional bool supply_aad_prefix +} +); + +thrift_union!( +union EncryptionAlgorithm { + 1: (AesGcmV1) AES_GCM_V1 + 2: (AesGcmCtrV1) AES_GCM_CTR_V1 +} +); + +#[cfg(feature = "encryption")] +thrift_struct!( +/// Crypto metadata for files with encrypted footer +pub(crate) struct FileCryptoMetaData<'a> { + /// Encryption algorithm. This field is only used for files + /// with encrypted footer. Files with plaintext footer store algorithm id + /// inside footer (FileMetaData structure). + 1: required EncryptionAlgorithm encryption_algorithm + + /// Retrieval metadata of key used for encryption of footer, + /// and (possibly) columns. + 2: optional binary<'a> key_metadata +} +); + +// the following are only used internally so are private +thrift_struct!( +struct FileMetaData<'a> { + 1: required i32 version + 2: required list<'a> schema; + 3: required i64 num_rows + 4: required list<'a> row_groups + 5: optional list key_value_metadata + 6: optional string<'a> created_by + 7: optional list column_orders; + 8: optional EncryptionAlgorithm encryption_algorithm + 9: optional binary<'a> footer_signing_key_metadata +} +); + +thrift_struct!( +struct RowGroup<'a> { + 1: required list<'a> columns + 2: required i64 total_byte_size + 3: required i64 num_rows + 4: optional list sorting_columns + 5: optional i64 file_offset + // we don't expose total_compressed_size so skip + //6: optional i64 total_compressed_size + 7: optional i16 ordinal +} +); + +#[cfg(feature = "encryption")] +thrift_struct!( +struct ColumnChunk<'a> { + 1: optional string<'a> file_path + 2: required i64 file_offset = 0 + 3: optional ColumnMetaData<'a> meta_data + 4: optional i64 offset_index_offset + 5: optional i32 offset_index_length + 6: optional i64 column_index_offset + 7: optional i32 column_index_length + 8: optional ColumnCryptoMetaData crypto_metadata + 9: optional binary<'a> encrypted_column_metadata +} +); +#[cfg(not(feature = "encryption"))] +thrift_struct!( +struct ColumnChunk<'a> { + 1: optional string<'a> file_path + 2: required i64 file_offset = 0 + 3: optional ColumnMetaData<'a> meta_data + 4: optional i64 offset_index_offset + 5: optional i32 offset_index_length + 6: optional i64 column_index_offset + 7: optional i32 column_index_length +} +); + +type CompressionCodec = Compression; +thrift_struct!( +struct ColumnMetaData<'a> { + 1: required Type r#type + 2: required list encodings + // we don't expose path_in_schema so skip + //3: required list path_in_schema + 4: required CompressionCodec codec + 5: required i64 num_values + 6: required i64 total_uncompressed_size + 7: required i64 total_compressed_size + // we don't expose key_value_metadata so skip + //8: optional list key_value_metadata + 9: required i64 data_page_offset + 10: optional i64 index_page_offset + 11: optional i64 dictionary_page_offset + 12: optional Statistics<'a> statistics + 13: optional list 
encoding_stats; + 14: optional i64 bloom_filter_offset; + 15: optional i32 bloom_filter_length; + 16: optional SizeStatistics size_statistics; + 17: optional GeospatialStatistics geospatial_statistics; +} +); + +thrift_struct!( +struct BoundingBox { + 1: required double xmin; + 2: required double xmax; + 3: required double ymin; + 4: required double ymax; + 5: optional double zmin; + 6: optional double zmax; + 7: optional double mmin; + 8: optional double mmax; +} +); + +thrift_struct!( +struct GeospatialStatistics { + 1: optional BoundingBox bbox; + 2: optional list geospatial_types; +} +); + +thrift_struct!( +struct SizeStatistics { + 1: optional i64 unencoded_byte_array_data_bytes; + 2: optional list repetition_level_histogram; + 3: optional list definition_level_histogram; +} +); + +thrift_struct!( +pub(crate) struct Statistics<'a> { + 1: optional binary<'a> max; + 2: optional binary<'a> min; + 3: optional i64 null_count; + 4: optional i64 distinct_count; + 5: optional binary<'a> max_value; + 6: optional binary<'a> min_value; + 7: optional bool is_max_value_exact; + 8: optional bool is_min_value_exact; +} +); + +// convert collection of thrift RowGroups into RowGroupMetaData +fn convert_row_groups( + mut row_groups: Vec, + schema_descr: Arc, +) -> Result> { + let mut res: Vec = Vec::with_capacity(row_groups.len()); + for rg in row_groups.drain(0..) { + res.push(convert_row_group(rg, schema_descr.clone())?); + } + + Ok(res) +} + +fn convert_row_group( + row_group: RowGroup, + schema_descr: Arc, +) -> Result { + if schema_descr.num_columns() != row_group.columns.len() { + return Err(general_err!( + "Column count mismatch. Schema has {} columns while Row Group has {}", + schema_descr.num_columns(), + row_group.columns.len() + )); + } + + let num_rows = row_group.num_rows; + let sorting_columns = row_group.sorting_columns; + let total_byte_size = row_group.total_byte_size; + let file_offset = row_group.file_offset; + let ordinal = row_group.ordinal; + + let columns = convert_columns(row_group.columns, schema_descr.clone())?; + + Ok(RowGroupMetaData { + columns, + num_rows, + sorting_columns, + total_byte_size, + schema_descr, + file_offset, + ordinal, + }) +} + +fn convert_columns( + mut columns: Vec, + schema_descr: Arc, +) -> Result> { + let mut res: Vec = Vec::with_capacity(columns.len()); + for (c, d) in columns.drain(0..).zip(schema_descr.columns()) { + res.push(convert_column(c, d.clone())?); + } + + Ok(res) +} + +fn convert_column( + column: ColumnChunk, + column_descr: Arc, +) -> Result { + if column.meta_data.is_none() { + return Err(general_err!("Expected to have column metadata")); + } + let col_metadata = column.meta_data.unwrap(); + let column_type = col_metadata.r#type; + let encodings = col_metadata.encodings; + let compression = col_metadata.codec; + let file_path = column.file_path.map(|v| v.to_owned()); + let file_offset = column.file_offset; + let num_values = col_metadata.num_values; + let total_compressed_size = col_metadata.total_compressed_size; + let total_uncompressed_size = col_metadata.total_uncompressed_size; + let data_page_offset = col_metadata.data_page_offset; + let index_page_offset = col_metadata.index_page_offset; + let dictionary_page_offset = col_metadata.dictionary_page_offset; + let statistics = convert_stats(column_type, col_metadata.statistics)?; + let encoding_stats = col_metadata.encoding_stats; + let bloom_filter_offset = col_metadata.bloom_filter_offset; + let bloom_filter_length = col_metadata.bloom_filter_length; + let offset_index_offset = 
column.offset_index_offset; + let offset_index_length = column.offset_index_length; + let column_index_offset = column.column_index_offset; + let column_index_length = column.column_index_length; + let (unencoded_byte_array_data_bytes, repetition_level_histogram, definition_level_histogram) = + if let Some(size_stats) = col_metadata.size_statistics { + ( + size_stats.unencoded_byte_array_data_bytes, + size_stats.repetition_level_histogram, + size_stats.definition_level_histogram, + ) + } else { + (None, None, None) + }; + + let geo_statistics = convert_geo_stats(col_metadata.geospatial_statistics); + + let repetition_level_histogram = repetition_level_histogram.map(LevelHistogram::from); + let definition_level_histogram = definition_level_histogram.map(LevelHistogram::from); + + let result = ColumnChunkMetaData { + column_descr, + encodings, + file_path, + file_offset, + num_values, + compression, + total_compressed_size, + total_uncompressed_size, + data_page_offset, + index_page_offset, + dictionary_page_offset, + statistics, + geo_statistics, + encoding_stats, + bloom_filter_offset, + bloom_filter_length, + offset_index_offset, + offset_index_length, + column_index_offset, + column_index_length, + unencoded_byte_array_data_bytes, + repetition_level_histogram, + definition_level_histogram, + #[cfg(feature = "encryption")] + column_crypto_metadata: column.crypto_metadata, + #[cfg(feature = "encryption")] + encrypted_column_metadata: None, + }; + Ok(result) +} + +fn convert_geo_stats( + stats: Option, +) -> Option> { + stats.map(|st| { + let bbox = convert_bounding_box(st.bbox); + let geospatial_types: Option> = st.geospatial_types.filter(|v| !v.is_empty()); + Box::new(crate::geospatial::statistics::GeospatialStatistics::new( + bbox, + geospatial_types, + )) + }) +} + +fn convert_bounding_box( + bbox: Option, +) -> Option { + bbox.map(|bb| { + let mut newbb = crate::geospatial::bounding_box::BoundingBox::new( + bb.xmin.into(), + bb.xmax.into(), + bb.ymin.into(), + bb.ymax.into(), + ); + + newbb = match (bb.zmin, bb.zmax) { + (Some(zmin), Some(zmax)) => newbb.with_zrange(zmin.into(), zmax.into()), + // If either None or mismatch, leave it as None and don't error + _ => newbb, + }; + + newbb = match (bb.mmin, bb.mmax) { + (Some(mmin), Some(mmax)) => newbb.with_mrange(mmin.into(), mmax.into()), + // If either None or mismatch, leave it as None and don't error + _ => newbb, + }; + + newbb + }) +} + +pub(crate) fn convert_stats( + physical_type: Type, + thrift_stats: Option, +) -> Result> { + use crate::file::statistics::Statistics as FStatistics; + Ok(match thrift_stats { + Some(stats) => { + // Number of nulls recorded, when it is not available, we just mark it as 0. + // TODO this should be `None` if there is no information about NULLS. + // see https://github.com/apache/arrow-rs/pull/6216/files + let null_count = stats.null_count.unwrap_or(0); + + if null_count < 0 { + return Err(ParquetError::General(format!( + "Statistics null count is negative {null_count}", + ))); + } + + // Generic null count. + let null_count = Some(null_count as u64); + // Generic distinct count (count of distinct values occurring) + let distinct_count = stats.distinct_count.map(|value| value as u64); + // Whether or not statistics use deprecated min/max fields. + let old_format = stats.min_value.is_none() && stats.max_value.is_none(); + // Generic min value as bytes. + let min = if old_format { + stats.min + } else { + stats.min_value + }; + // Generic max value as bytes. 
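+            // (In the thrift `Statistics` struct, fields 1/2 are the
+            // deprecated `max`/`min` and fields 5/6 are the `max_value`/
+            // `min_value` that superseded them; the old pair is consulted
+            // only when both newer fields are absent.)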
+ let max = if old_format { + stats.max + } else { + stats.max_value + }; + + fn check_len(min: &Option<&[u8]>, max: &Option<&[u8]>, len: usize) -> Result<()> { + if let Some(min) = min { + if min.len() < len { + return Err(ParquetError::General( + "Insufficient bytes to parse min statistic".to_string(), + )); + } + } + if let Some(max) = max { + if max.len() < len { + return Err(ParquetError::General( + "Insufficient bytes to parse max statistic".to_string(), + )); + } + } + Ok(()) + } + + match physical_type { + Type::BOOLEAN => check_len(&min, &max, 1), + Type::INT32 | Type::FLOAT => check_len(&min, &max, 4), + Type::INT64 | Type::DOUBLE => check_len(&min, &max, 8), + Type::INT96 => check_len(&min, &max, 12), + _ => Ok(()), + }?; + + // Values are encoded using PLAIN encoding definition, except that + // variable-length byte arrays do not include a length prefix. + // + // Instead of using actual decoder, we manually convert values. + let res = match physical_type { + Type::BOOLEAN => FStatistics::boolean( + min.map(|data| data[0] != 0), + max.map(|data| data[0] != 0), + distinct_count, + null_count, + old_format, + ), + Type::INT32 => FStatistics::int32( + min.map(|data| i32::from_le_bytes(data[..4].try_into().unwrap())), + max.map(|data| i32::from_le_bytes(data[..4].try_into().unwrap())), + distinct_count, + null_count, + old_format, + ), + Type::INT64 => FStatistics::int64( + min.map(|data| i64::from_le_bytes(data[..8].try_into().unwrap())), + max.map(|data| i64::from_le_bytes(data[..8].try_into().unwrap())), + distinct_count, + null_count, + old_format, + ), + Type::INT96 => { + // INT96 statistics may not be correct, because comparison is signed + let min = if let Some(data) = min { + assert_eq!(data.len(), 12); + Some(Int96::try_from_le_slice(data)?) + } else { + None + }; + let max = if let Some(data) = max { + assert_eq!(data.len(), 12); + Some(Int96::try_from_le_slice(data)?) + } else { + None + }; + FStatistics::int96(min, max, distinct_count, null_count, old_format) + } + Type::FLOAT => FStatistics::float( + min.map(|data| f32::from_le_bytes(data[..4].try_into().unwrap())), + max.map(|data| f32::from_le_bytes(data[..4].try_into().unwrap())), + distinct_count, + null_count, + old_format, + ), + Type::DOUBLE => FStatistics::double( + min.map(|data| f64::from_le_bytes(data[..8].try_into().unwrap())), + max.map(|data| f64::from_le_bytes(data[..8].try_into().unwrap())), + distinct_count, + null_count, + old_format, + ), + Type::BYTE_ARRAY => FStatistics::ByteArray( + ValueStatistics::new( + min.map(ByteArray::from), + max.map(ByteArray::from), + distinct_count, + null_count, + old_format, + ) + .with_max_is_exact(stats.is_max_value_exact.unwrap_or(false)) + .with_min_is_exact(stats.is_min_value_exact.unwrap_or(false)), + ), + Type::FIXED_LEN_BYTE_ARRAY => FStatistics::FixedLenByteArray( + ValueStatistics::new( + min.map(ByteArray::from).map(FixedLenByteArray::from), + max.map(ByteArray::from).map(FixedLenByteArray::from), + distinct_count, + null_count, + old_format, + ) + .with_max_is_exact(stats.is_max_value_exact.unwrap_or(false)) + .with_min_is_exact(stats.is_min_value_exact.unwrap_or(false)), + ), + }; + + Some(res) + } + None => None, + }) +} + +#[cfg(feature = "encryption")] +fn row_group_from_encrypted_thrift( + mut rg: RowGroup, + schema_descr: SchemaDescPtr, + decryptor: Option<&FileDecryptor>, +) -> Result { + if schema_descr.num_columns() != rg.columns.len() { + return Err(general_err!( + "Column count mismatch. 
Schema has {} columns while Row Group has {}",
+            schema_descr.num_columns(),
+            rg.columns.len()
+        ));
+    }
+    let total_byte_size = rg.total_byte_size;
+    let num_rows = rg.num_rows;
+    let mut columns = vec![];
+
+    for (i, (mut c, d)) in rg
+        .columns
+        .drain(0..)
+        .zip(schema_descr.columns())
+        .enumerate()
+    {
+        // Read encrypted metadata if it's present and we have a decryptor.
+        if let (true, Some(decryptor)) = (c.encrypted_column_metadata.is_some(), decryptor) {
+            let column_decryptor = match c.crypto_metadata.as_ref() {
+                None => {
+                    return Err(general_err!(
+                        "No crypto_metadata is set for column '{}', which has encrypted metadata",
+                        d.path().string()
+                    ));
+                }
+                Some(ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY(crypto_metadata)) => {
+                    let column_name = crypto_metadata.path_in_schema.join(".");
+                    decryptor.get_column_metadata_decryptor(
+                        column_name.as_str(),
+                        crypto_metadata.key_metadata.as_deref(),
+                    )?
+                }
+                Some(ColumnCryptoMetaData::ENCRYPTION_WITH_FOOTER_KEY) => {
+                    decryptor.get_footer_decryptor()?
+                }
+            };
+
+            let column_aad = crate::encryption::modules::create_module_aad(
+                decryptor.file_aad(),
+                crate::encryption::modules::ModuleType::ColumnMetaData,
+                rg.ordinal.unwrap() as usize,
+                i,
+                None,
+            )?;
+
+            let buf = c.encrypted_column_metadata.unwrap();
+            let decrypted_cc_buf =
+                column_decryptor
+                    .decrypt(buf, column_aad.as_ref())
+                    .map_err(|_| {
+                        general_err!(
+                            "Unable to decrypt column '{}', perhaps the column key is wrong?",
+                            d.path().string()
+                        )
+                    })?;
+
+            let mut prot = ThriftSliceInputProtocol::new(decrypted_cc_buf.as_slice());
+            let col_meta = ColumnMetaData::read_thrift(&mut prot)?;
+            c.meta_data = Some(col_meta);
+            columns.push(convert_column(c, d.clone())?);
+        } else {
+            columns.push(convert_column(c, d.clone())?);
+        }
+    }
+
+    let sorting_columns = rg.sorting_columns;
+    let file_offset = rg.file_offset;
+    let ordinal = rg.ordinal;
+
+    Ok(RowGroupMetaData {
+        columns,
+        num_rows,
+        sorting_columns,
+        total_byte_size,
+        schema_descr,
+        file_offset,
+        ordinal,
+    })
+}
+
+#[cfg(feature = "encryption")]
+/// Decodes [`ParquetMetaData`] from the provided bytes, handling metadata that may be encrypted.
+///
+/// Typically this is used to decode the metadata from the end of a parquet
+/// file. The format of `buf` is the Thrift compact binary protocol, as specified
+/// by the [Parquet Spec]. The buffer may be encrypted with AES GCM or AES CTR
+/// ciphers as specified in the [Parquet Encryption Spec].
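+///
+/// Note that only the AES_GCM_V1 algorithm is supported when constructing the
+/// file decryptor; AES_GCM_CTR_V1 footers currently yield a "not yet
+/// implemented" error (see `get_file_decryptor` below).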
+/// +/// [Parquet Spec]: https://github.com/apache/parquet-format#metadata +/// [Parquet Encryption Spec]: https://parquet.apache.org/docs/file-format/data-pages/encryption/ +pub(crate) fn parquet_metadata_with_encryption( + file_decryption_properties: Option<&FileDecryptionProperties>, + encrypted_footer: bool, + buf: &[u8], +) -> Result { + let mut prot = ThriftSliceInputProtocol::new(buf); + let mut file_decryptor = None; + let decrypted_fmd_buf; + + if encrypted_footer { + if let Some(file_decryption_properties) = file_decryption_properties { + let t_file_crypto_metadata: FileCryptoMetaData = + FileCryptoMetaData::read_thrift(&mut prot) + .map_err(|e| general_err!("Could not parse crypto metadata: {}", e))?; + let supply_aad_prefix = match &t_file_crypto_metadata.encryption_algorithm { + EncryptionAlgorithm::AES_GCM_V1(algo) => algo.supply_aad_prefix, + _ => Some(false), + } + .unwrap_or(false); + if supply_aad_prefix && file_decryption_properties.aad_prefix().is_none() { + return Err(general_err!( + "Parquet file was encrypted with an AAD prefix that is not stored in the file, \ + but no AAD prefix was provided in the file decryption properties" + )); + } + let decryptor = get_file_decryptor( + t_file_crypto_metadata.encryption_algorithm, + t_file_crypto_metadata.key_metadata, + file_decryption_properties, + )?; + let footer_decryptor = decryptor.get_footer_decryptor(); + let aad_footer = crate::encryption::modules::create_footer_aad(decryptor.file_aad())?; + + decrypted_fmd_buf = footer_decryptor? + .decrypt(prot.as_slice().as_ref(), aad_footer.as_ref()) + .map_err(|_| { + general_err!( + "Provided footer key and AAD were unable to decrypt parquet footer" + ) + })?; + prot = ThriftSliceInputProtocol::new(decrypted_fmd_buf.as_ref()); + + file_decryptor = Some(decryptor); + } else { + return Err(general_err!( + "Parquet file has an encrypted footer but decryption properties were not provided" + )); + } + } + + let file_meta = FileMetaData::read_thrift(&mut prot) + .map_err(|e| general_err!("Could not parse metadata: {}", e))?; + + let version = file_meta.version; + let num_rows = file_meta.num_rows; + let created_by = file_meta.created_by.map(|c| c.to_owned()); + let key_value_metadata = file_meta.key_value_metadata; + + let val = parquet_schema_from_array(file_meta.schema)?; + let schema_descr = Arc::new(SchemaDescriptor::new(val)); + + if let (Some(algo), Some(file_decryption_properties)) = + (file_meta.encryption_algorithm, file_decryption_properties) + { + // File has a plaintext footer but encryption algorithm is set + let file_decryptor_value = get_file_decryptor( + algo, + file_meta.footer_signing_key_metadata, + file_decryption_properties, + )?; + if file_decryption_properties.check_plaintext_footer_integrity() && !encrypted_footer { + file_decryptor_value.verify_plaintext_footer_signature(buf)?; + } + file_decryptor = Some(file_decryptor_value); + } + + // decrypt column chunk info + let mut row_groups = Vec::with_capacity(file_meta.row_groups.len()); + for rg in file_meta.row_groups { + let r = row_group_from_encrypted_thrift(rg, schema_descr.clone(), file_decryptor.as_ref())?; + row_groups.push(r); + } + + // need to map read column orders to actual values based on the schema + if file_meta + .column_orders + .as_ref() + .is_some_and(|cos| cos.len() != schema_descr.num_columns()) + { + return Err(general_err!("Column order length mismatch")); + } + + let column_orders = file_meta.column_orders.map(|cos| { + let mut res = Vec::with_capacity(cos.len()); + for (i, column) in 
schema_descr.columns().iter().enumerate() { + match cos[i] { + ColumnOrder::TYPE_DEFINED_ORDER(_) => { + let sort_order = ColumnOrder::get_sort_order( + column.logical_type(), + column.converted_type(), + column.physical_type(), + ); + res.push(ColumnOrder::TYPE_DEFINED_ORDER(sort_order)); + } + _ => res.push(cos[i]), + } + } + res + }); + + let fmd = crate::file::metadata::FileMetaData::new( + version, + num_rows, + created_by, + key_value_metadata, + schema_descr, + column_orders, + ); + let mut metadata = ParquetMetaData::new(fmd, row_groups); + + metadata.with_file_decryptor(file_decryptor); + + Ok(metadata) +} + +#[cfg(feature = "encryption")] +fn get_file_decryptor( + encryption_algorithm: EncryptionAlgorithm, + footer_key_metadata: Option<&[u8]>, + file_decryption_properties: &FileDecryptionProperties, +) -> Result { + match encryption_algorithm { + EncryptionAlgorithm::AES_GCM_V1(algo) => { + let aad_file_unique = algo + .aad_file_unique + .ok_or_else(|| general_err!("AAD unique file identifier is not set"))?; + let aad_prefix = if let Some(aad_prefix) = file_decryption_properties.aad_prefix() { + aad_prefix.clone() + } else { + algo.aad_prefix.map(|v| v.to_vec()).unwrap_or_default() + }; + let aad_file_unique = aad_file_unique.to_vec(); + + FileDecryptor::new( + file_decryption_properties, + footer_key_metadata, + aad_file_unique, + aad_prefix, + ) + } + EncryptionAlgorithm::AES_GCM_CTR_V1(_) => Err(nyi_err!( + "The AES_GCM_CTR_V1 encryption algorithm is not yet supported" + )), + } +} + +/// Create ParquetMetaData from thrift input. Note that this only decodes the file metadata in +/// the Parquet footer. Page indexes will need to be added later. +impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for ParquetMetaData { + fn read_thrift(prot: &mut R) -> Result { + let file_meta = FileMetaData::read_thrift(prot)?; + + let version = file_meta.version; + let num_rows = file_meta.num_rows; + let row_groups = file_meta.row_groups; + let created_by = file_meta.created_by.map(|c| c.to_owned()); + let key_value_metadata = file_meta.key_value_metadata; + + let val = parquet_schema_from_array(file_meta.schema)?; + let schema_descr = Arc::new(SchemaDescriptor::new(val)); + + // need schema_descr to get final RowGroupMetaData + let row_groups = convert_row_groups(row_groups, schema_descr.clone())?; + + // need to map read column orders to actual values based on the schema + if file_meta + .column_orders + .as_ref() + .is_some_and(|cos| cos.len() != schema_descr.num_columns()) + { + return Err(general_err!("Column order length mismatch")); + } + + let column_orders = file_meta.column_orders.map(|cos| { + let mut res = Vec::with_capacity(cos.len()); + for (i, column) in schema_descr.columns().iter().enumerate() { + match cos[i] { + ColumnOrder::TYPE_DEFINED_ORDER(_) => { + let sort_order = ColumnOrder::get_sort_order( + column.logical_type(), + column.converted_type(), + column.physical_type(), + ); + res.push(ColumnOrder::TYPE_DEFINED_ORDER(sort_order)); + } + _ => res.push(cos[i]), + } + } + res + }); + + let fmd = crate::file::metadata::FileMetaData::new( + version, + num_rows, + created_by, + key_value_metadata, + schema_descr, + column_orders, + ); + + Ok(ParquetMetaData::new(fmd, row_groups)) + } +} + +thrift_struct!( + pub(crate) struct IndexPageHeader {} +); + +thrift_struct!( +pub(crate) struct DictionaryPageHeader { + /// Number of values in the dictionary + 1: required i32 num_values; + + /// Encoding using this dictionary page + 2: required Encoding encoding + + /// If 
true, the entries in the dictionary are sorted in ascending order + 3: optional bool is_sorted; +} +); + +thrift_struct!( +/// Statistics for the page header. +/// +/// This is a duplicate of the [`Statistics`] struct above. Because the page reader uses +/// the [`Read`] API, we cannot read the min/max values as slices. This should not be +/// a huge problem since this crate no longer reads the page header statistics by default. +/// +/// [`Read`]: crate::parquet_thrift::ThriftReadInputProtocol +pub(crate) struct PageStatistics { + 1: optional binary max; + 2: optional binary min; + 3: optional i64 null_count; + 4: optional i64 distinct_count; + 5: optional binary max_value; + 6: optional binary min_value; + 7: optional bool is_max_value_exact; + 8: optional bool is_min_value_exact; +} +); + +thrift_struct!( +pub(crate) struct DataPageHeader { + 1: required i32 num_values + 2: required Encoding encoding + 3: required Encoding definition_level_encoding; + 4: required Encoding repetition_level_encoding; + 5: optional PageStatistics statistics; +} +); + +impl DataPageHeader { + // reader that skips decoding page statistics + fn read_thrift_without_stats<'a, R>(prot: &mut R) -> Result + where + R: ThriftCompactInputProtocol<'a>, + { + let mut num_values: Option = None; + let mut encoding: Option = None; + let mut definition_level_encoding: Option = None; + let mut repetition_level_encoding: Option = None; + let statistics: Option = None; + let mut last_field_id = 0i16; + loop { + let field_ident = prot.read_field_begin(last_field_id)?; + if field_ident.field_type == FieldType::Stop { + break; + } + match field_ident.id { + 1 => { + let val = i32::read_thrift(&mut *prot)?; + num_values = Some(val); + } + 2 => { + let val = Encoding::read_thrift(&mut *prot)?; + encoding = Some(val); + } + 3 => { + let val = Encoding::read_thrift(&mut *prot)?; + definition_level_encoding = Some(val); + } + 4 => { + let val = Encoding::read_thrift(&mut *prot)?; + repetition_level_encoding = Some(val); + } + _ => { + prot.skip(field_ident.field_type)?; + } + }; + last_field_id = field_ident.id; + } + let Some(num_values) = num_values else { + return Err(ParquetError::General( + "Required field num_values is missing".to_owned(), + )); + }; + let Some(encoding) = encoding else { + return Err(ParquetError::General( + "Required field encoding is missing".to_owned(), + )); + }; + let Some(definition_level_encoding) = definition_level_encoding else { + return Err(ParquetError::General( + "Required field definition_level_encoding is missing".to_owned(), + )); + }; + let Some(repetition_level_encoding) = repetition_level_encoding else { + return Err(ParquetError::General( + "Required field repetition_level_encoding is missing".to_owned(), + )); + }; + Ok(Self { + num_values, + encoding, + definition_level_encoding, + repetition_level_encoding, + statistics, + }) + } +} + +thrift_struct!( +pub(crate) struct DataPageHeaderV2 { + 1: required i32 num_values + 2: required i32 num_nulls + 3: required i32 num_rows + 4: required Encoding encoding + 5: required i32 definition_levels_byte_length; + 6: required i32 repetition_levels_byte_length; + 7: optional bool is_compressed = true; + 8: optional PageStatistics statistics; +} +); + +impl DataPageHeaderV2 { + // reader that skips decoding page statistics + fn read_thrift_without_stats<'a, R>(prot: &mut R) -> Result + where + R: ThriftCompactInputProtocol<'a>, + { + let mut num_values: Option = None; + let mut num_nulls: Option = None; + let mut num_rows: Option = None; + let mut 
encoding: Option = None; + let mut definition_levels_byte_length: Option = None; + let mut repetition_levels_byte_length: Option = None; + let mut is_compressed: Option = None; + let statistics: Option = None; + let mut last_field_id = 0i16; + loop { + let field_ident = prot.read_field_begin(last_field_id)?; + if field_ident.field_type == FieldType::Stop { + break; + } + match field_ident.id { + 1 => { + let val = i32::read_thrift(&mut *prot)?; + num_values = Some(val); + } + 2 => { + let val = i32::read_thrift(&mut *prot)?; + num_nulls = Some(val); + } + 3 => { + let val = i32::read_thrift(&mut *prot)?; + num_rows = Some(val); + } + 4 => { + let val = Encoding::read_thrift(&mut *prot)?; + encoding = Some(val); + } + 5 => { + let val = i32::read_thrift(&mut *prot)?; + definition_levels_byte_length = Some(val); + } + 6 => { + let val = i32::read_thrift(&mut *prot)?; + repetition_levels_byte_length = Some(val); + } + 7 => { + let val = field_ident.bool_val.unwrap(); + is_compressed = Some(val); + } + _ => { + prot.skip(field_ident.field_type)?; + } + }; + last_field_id = field_ident.id; + } + let Some(num_values) = num_values else { + return Err(ParquetError::General( + "Required field num_values is missing".to_owned(), + )); + }; + let Some(num_nulls) = num_nulls else { + return Err(ParquetError::General( + "Required field num_nulls is missing".to_owned(), + )); + }; + let Some(num_rows) = num_rows else { + return Err(ParquetError::General( + "Required field num_rows is missing".to_owned(), + )); + }; + let Some(encoding) = encoding else { + return Err(ParquetError::General( + "Required field encoding is missing".to_owned(), + )); + }; + let Some(definition_levels_byte_length) = definition_levels_byte_length else { + return Err(ParquetError::General( + "Required field definition_levels_byte_length is missing".to_owned(), + )); + }; + let Some(repetition_levels_byte_length) = repetition_levels_byte_length else { + return Err(ParquetError::General( + "Required field repetition_levels_byte_length is missing".to_owned(), + )); + }; + Ok(Self { + num_values, + num_nulls, + num_rows, + encoding, + definition_levels_byte_length, + repetition_levels_byte_length, + is_compressed, + statistics, + }) + } +} + +thrift_struct!( +pub(crate) struct PageHeader { + /// the type of the page: indicates which of the *_header fields is set + 1: required PageType r#type + + /// Uncompressed page size in bytes (not including this header) + 2: required i32 uncompressed_page_size + + /// Compressed (and potentially encrypted) page size in bytes, not including this header + 3: required i32 compressed_page_size + + /// The 32-bit CRC checksum for the page, to be be calculated as follows: + 4: optional i32 crc + + // Headers for page specific data. One only will be set. + 5: optional DataPageHeader data_page_header; + 6: optional IndexPageHeader index_page_header; + 7: optional DictionaryPageHeader dictionary_page_header; + 8: optional DataPageHeaderV2 data_page_header_v2; +} +); + +impl PageHeader { + // reader that skips reading page statistics. 
obtained by running + // `cargo expand -p parquet --all-features --lib file::metadata::thrift_gen` + // and modifying the impl of `read_thrift` + pub(crate) fn read_thrift_without_stats<'a, R>(prot: &mut R) -> Result + where + R: ThriftCompactInputProtocol<'a>, + { + let mut type_: Option = None; + let mut uncompressed_page_size: Option = None; + let mut compressed_page_size: Option = None; + let mut crc: Option = None; + let mut data_page_header: Option = None; + let mut index_page_header: Option = None; + let mut dictionary_page_header: Option = None; + let mut data_page_header_v2: Option = None; + let mut last_field_id = 0i16; + loop { + let field_ident = prot.read_field_begin(last_field_id)?; + if field_ident.field_type == FieldType::Stop { + break; + } + match field_ident.id { + 1 => { + let val = PageType::read_thrift(&mut *prot)?; + type_ = Some(val); + } + 2 => { + let val = i32::read_thrift(&mut *prot)?; + uncompressed_page_size = Some(val); + } + 3 => { + let val = i32::read_thrift(&mut *prot)?; + compressed_page_size = Some(val); + } + 4 => { + let val = i32::read_thrift(&mut *prot)?; + crc = Some(val); + } + 5 => { + let val = DataPageHeader::read_thrift_without_stats(&mut *prot)?; + data_page_header = Some(val); + } + 6 => { + let val = IndexPageHeader::read_thrift(&mut *prot)?; + index_page_header = Some(val); + } + 7 => { + let val = DictionaryPageHeader::read_thrift(&mut *prot)?; + dictionary_page_header = Some(val); + } + 8 => { + let val = DataPageHeaderV2::read_thrift_without_stats(&mut *prot)?; + data_page_header_v2 = Some(val); + } + _ => { + prot.skip(field_ident.field_type)?; + } + }; + last_field_id = field_ident.id; + } + let Some(type_) = type_ else { + return Err(ParquetError::General( + "Required field type_ is missing".to_owned(), + )); + }; + let Some(uncompressed_page_size) = uncompressed_page_size else { + return Err(ParquetError::General( + "Required field uncompressed_page_size is missing".to_owned(), + )); + }; + let Some(compressed_page_size) = compressed_page_size else { + return Err(ParquetError::General( + "Required field compressed_page_size is missing".to_owned(), + )); + }; + Ok(Self { + r#type: type_, + uncompressed_page_size, + compressed_page_size, + crc, + data_page_header, + index_page_header, + dictionary_page_header, + data_page_header_v2, + }) + } +} + +///////////////////////////////////////////////// +// helper functions for writing file meta data + +// serialize the bits of the column chunk needed for a thrift ColumnMetaData +// struct ColumnMetaData { +// 1: required Type type +// 2: required list encodings +// 3: required list path_in_schema +// 4: required CompressionCodec codec +// 5: required i64 num_values +// 6: required i64 total_uncompressed_size +// 7: required i64 total_compressed_size +// 8: optional list key_value_metadata +// 9: required i64 data_page_offset +// 10: optional i64 index_page_offset +// 11: optional i64 dictionary_page_offset +// 12: optional Statistics statistics; +// 13: optional list encoding_stats; +// 14: optional i64 bloom_filter_offset; +// 15: optional i32 bloom_filter_length; +// 16: optional SizeStatistics size_statistics; +// 17: optional GeospatialStatistics geospatial_statistics; +// } +pub(crate) fn serialize_column_meta_data( + column_chunk: &ColumnChunkMetaData, + w: &mut ThriftCompactOutputProtocol, +) -> Result<()> { + use crate::file::statistics::page_stats_to_thrift; + + column_chunk.column_type().write_thrift_field(w, 1, 0)?; + column_chunk.encodings.write_thrift_field(w, 2, 1)?; + let 
path = column_chunk.column_descr.path().parts(); + let path: Vec<&str> = path.iter().map(|v| v.as_str()).collect(); + path.write_thrift_field(w, 3, 2)?; + column_chunk.compression.write_thrift_field(w, 4, 3)?; + column_chunk.num_values.write_thrift_field(w, 5, 4)?; + column_chunk + .total_uncompressed_size + .write_thrift_field(w, 6, 5)?; + column_chunk + .total_compressed_size + .write_thrift_field(w, 7, 6)?; + // no key_value_metadata here + let mut last_field_id = column_chunk.data_page_offset.write_thrift_field(w, 9, 7)?; + if let Some(index_page_offset) = column_chunk.index_page_offset { + last_field_id = index_page_offset.write_thrift_field(w, 10, last_field_id)?; + } + if let Some(dictionary_page_offset) = column_chunk.dictionary_page_offset { + last_field_id = dictionary_page_offset.write_thrift_field(w, 11, last_field_id)?; + } + // PageStatistics is the same as thrift Statistics, but writable + let stats = page_stats_to_thrift(column_chunk.statistics()); + if let Some(stats) = stats { + last_field_id = stats.write_thrift_field(w, 12, last_field_id)?; + } + if let Some(page_encoding_stats) = column_chunk.page_encoding_stats() { + last_field_id = page_encoding_stats.write_thrift_field(w, 13, last_field_id)?; + } + if let Some(bloom_filter_offset) = column_chunk.bloom_filter_offset { + last_field_id = bloom_filter_offset.write_thrift_field(w, 14, last_field_id)?; + } + if let Some(bloom_filter_length) = column_chunk.bloom_filter_length { + last_field_id = bloom_filter_length.write_thrift_field(w, 15, last_field_id)?; + } + + // SizeStatistics + let size_stats = if column_chunk.unencoded_byte_array_data_bytes.is_some() + || column_chunk.repetition_level_histogram.is_some() + || column_chunk.definition_level_histogram.is_some() + { + let repetition_level_histogram = column_chunk + .repetition_level_histogram() + .map(|hist| hist.clone().into_inner()); + + let definition_level_histogram = column_chunk + .definition_level_histogram() + .map(|hist| hist.clone().into_inner()); + + Some(SizeStatistics { + unencoded_byte_array_data_bytes: column_chunk.unencoded_byte_array_data_bytes, + repetition_level_histogram, + definition_level_histogram, + }) + } else { + None + }; + if let Some(size_stats) = size_stats { + last_field_id = size_stats.write_thrift_field(w, 16, last_field_id)?; + } + + if let Some(geo_stats) = column_chunk.geo_statistics() { + geo_stats.write_thrift_field(w, 17, last_field_id)?; + } + + w.write_struct_end() +} + +// temp struct used for writing +pub(crate) struct FileMeta<'a> { + pub(crate) file_metadata: &'a crate::file::metadata::FileMetaData, + pub(crate) row_groups: &'a Vec, + pub(crate) encryption_algorithm: Option, + pub(crate) footer_signing_key_metadata: Option>, +} + +impl<'a> WriteThrift for FileMeta<'a> { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + self.file_metadata + .version + .write_thrift_field(writer, 1, 0)?; + + // field 2 is schema. do depth-first traversal of tree, converting to SchemaElement and + // writing along the way. 
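+        // A hypothetical illustration (not taken from this crate): a schema such as
+        //   message root { required int32 a; optional group g { optional binary b; } }
+        // is flattened depth-first into the SchemaElement list
+        //   root (num_children = 2), a, g (num_children = 1), b
+        // and readers rebuild the tree from the num_children counts alone.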
+ let root = self.file_metadata.schema_descr().root_schema_ptr(); + let schema_len = num_nodes(&root)?; + writer.write_field_begin(FieldType::List, 2, 1)?; + writer.write_list_begin(ElementType::Struct, schema_len)?; + // recursively write Type nodes as SchemaElements + write_schema(&root, writer)?; + + self.file_metadata + .num_rows + .write_thrift_field(writer, 3, 2)?; + + // this will call RowGroupMetaData::write_thrift + let mut last_field_id = self.row_groups.write_thrift_field(writer, 4, 3)?; + + if let Some(kv_metadata) = self.file_metadata.key_value_metadata() { + last_field_id = kv_metadata.write_thrift_field(writer, 5, last_field_id)?; + } + if let Some(created_by) = self.file_metadata.created_by() { + last_field_id = created_by.write_thrift_field(writer, 6, last_field_id)?; + } + if let Some(column_orders) = self.file_metadata.column_orders() { + last_field_id = column_orders.write_thrift_field(writer, 7, last_field_id)?; + } + if let Some(algo) = self.encryption_algorithm.as_ref() { + last_field_id = algo.write_thrift_field(writer, 8, last_field_id)?; + } + if let Some(key) = self.footer_signing_key_metadata.as_ref() { + key.as_slice() + .write_thrift_field(writer, 9, last_field_id)?; + } + + writer.write_struct_end() + } +} + +fn write_schema( + schema: &TypePtr, + writer: &mut ThriftCompactOutputProtocol, +) -> Result<()> { + if !schema.is_group() { + return Err(general_err!("Root schema must be Group type")); + } + write_schema_helper(schema, writer) +} + +fn write_schema_helper( + node: &TypePtr, + writer: &mut ThriftCompactOutputProtocol, +) -> Result<()> { + match node.as_ref() { + crate::schema::types::Type::PrimitiveType { + basic_info, + physical_type, + type_length, + scale, + precision, + } => { + let element = SchemaElement { + r#type: Some(*physical_type), + type_length: if *type_length >= 0 { + Some(*type_length) + } else { + None + }, + repetition_type: Some(basic_info.repetition()), + name: basic_info.name(), + num_children: None, + converted_type: match basic_info.converted_type() { + ConvertedType::NONE => None, + other => Some(other), + }, + scale: if *scale >= 0 { Some(*scale) } else { None }, + precision: if *precision >= 0 { + Some(*precision) + } else { + None + }, + field_id: if basic_info.has_id() { + Some(basic_info.id()) + } else { + None + }, + logical_type: basic_info.logical_type(), + }; + element.write_thrift(writer) + } + crate::schema::types::Type::GroupType { basic_info, fields } => { + let repetition = if basic_info.has_repetition() { + Some(basic_info.repetition()) + } else { + None + }; + + let element = SchemaElement { + r#type: None, + type_length: None, + repetition_type: repetition, + name: basic_info.name(), + num_children: Some(fields.len().try_into()?), + converted_type: match basic_info.converted_type() { + ConvertedType::NONE => None, + other => Some(other), + }, + scale: None, + precision: None, + field_id: if basic_info.has_id() { + Some(basic_info.id()) + } else { + None + }, + logical_type: basic_info.logical_type(), + }; + + element.write_thrift(writer)?; + + // Add child elements for a group + for field in fields { + write_schema_helper(field, writer)?; + } + Ok(()) + } + } +} + +// struct RowGroup { +// 1: required list columns +// 2: required i64 total_byte_size +// 3: required i64 num_rows +// 4: optional list sorting_columns +// 5: optional i64 file_offset +// 6: optional i64 total_compressed_size +// 7: optional i16 ordinal +// } +impl WriteThrift for RowGroupMetaData { + const ELEMENT_TYPE: ElementType = 
ElementType::Struct; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + // this will call ColumnChunkMetaData::write_thrift + self.columns.write_thrift_field(writer, 1, 0)?; + self.total_byte_size.write_thrift_field(writer, 2, 1)?; + let mut last_field_id = self.num_rows.write_thrift_field(writer, 3, 2)?; + if let Some(sorting_columns) = self.sorting_columns() { + last_field_id = sorting_columns.write_thrift_field(writer, 4, last_field_id)?; + } + if let Some(file_offset) = self.file_offset() { + last_field_id = file_offset.write_thrift_field(writer, 5, last_field_id)?; + } + // this is optional, but we'll always write it + last_field_id = self + .compressed_size() + .write_thrift_field(writer, 6, last_field_id)?; + if let Some(ordinal) = self.ordinal() { + ordinal.write_thrift_field(writer, 7, last_field_id)?; + } + writer.write_struct_end() + } +} + +// struct ColumnChunk { +// 1: optional string file_path +// 2: required i64 file_offset = 0 +// 3: optional ColumnMetaData meta_data +// 4: optional i64 offset_index_offset +// 5: optional i32 offset_index_length +// 6: optional i64 column_index_offset +// 7: optional i32 column_index_length +// 8: optional ColumnCryptoMetaData crypto_metadata +// 9: optional binary encrypted_column_metadata +// } +impl WriteThrift for ColumnChunkMetaData { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + #[allow(unused_assignments)] + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + let mut last_field_id = 0i16; + if let Some(file_path) = self.file_path() { + last_field_id = file_path.write_thrift_field(writer, 1, last_field_id)?; + } + last_field_id = self + .file_offset() + .write_thrift_field(writer, 2, last_field_id)?; + + #[cfg(feature = "encryption")] + { + // only write the ColumnMetaData if we haven't already encrypted it + if self.encrypted_column_metadata.is_none() { + writer.write_field_begin(FieldType::Struct, 3, last_field_id)?; + serialize_column_meta_data(self, writer)?; + last_field_id = 3; + } + } + #[cfg(not(feature = "encryption"))] + { + // always write the ColumnMetaData + writer.write_field_begin(FieldType::Struct, 3, last_field_id)?; + serialize_column_meta_data(self, writer)?; + last_field_id = 3; + } + + if let Some(offset_idx_off) = self.offset_index_offset() { + last_field_id = offset_idx_off.write_thrift_field(writer, 4, last_field_id)?; + } + if let Some(offset_idx_len) = self.offset_index_length() { + last_field_id = offset_idx_len.write_thrift_field(writer, 5, last_field_id)?; + } + if let Some(column_idx_off) = self.column_index_offset() { + last_field_id = column_idx_off.write_thrift_field(writer, 6, last_field_id)?; + } + if let Some(column_idx_len) = self.column_index_length() { + last_field_id = column_idx_len.write_thrift_field(writer, 7, last_field_id)?; + } + #[cfg(feature = "encryption")] + { + if let Some(crypto_metadata) = self.crypto_metadata() { + last_field_id = crypto_metadata.write_thrift_field(writer, 8, last_field_id)?; + } + if let Some(encrypted_meta) = self.encrypted_column_metadata.as_ref() { + encrypted_meta + .as_slice() + .write_thrift_field(writer, 9, last_field_id)?; + } + } + + writer.write_struct_end() + } +} + +// struct GeospatialStatistics { +// 1: optional BoundingBox bbox; +// 2: optional list geospatial_types; +// } +impl WriteThrift for crate::geospatial::statistics::GeospatialStatistics { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift(&self, writer: &mut 
ThriftCompactOutputProtocol) -> Result<()> { + let mut last_field_id = 0i16; + if let Some(bbox) = self.bounding_box() { + last_field_id = bbox.write_thrift_field(writer, 1, last_field_id)?; + } + if let Some(geo_types) = self.geospatial_types() { + geo_types.write_thrift_field(writer, 2, last_field_id)?; + } + + writer.write_struct_end() + } +} + +impl WriteThriftField for crate::geospatial::statistics::GeospatialStatistics { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + +// struct BoundingBox { +// 1: required double xmin; +// 2: required double xmax; +// 3: required double ymin; +// 4: required double ymax; +// 5: optional double zmin; +// 6: optional double zmax; +// 7: optional double mmin; +// 8: optional double mmax; +// } +impl WriteThrift for crate::geospatial::bounding_box::BoundingBox { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + self.get_xmin().write_thrift_field(writer, 1, 0)?; + self.get_xmax().write_thrift_field(writer, 2, 1)?; + self.get_ymin().write_thrift_field(writer, 3, 2)?; + let mut last_field_id = self.get_ymax().write_thrift_field(writer, 4, 3)?; + + if let Some(zmin) = self.get_zmin() { + last_field_id = zmin.write_thrift_field(writer, 5, last_field_id)?; + } + if let Some(zmax) = self.get_zmax() { + last_field_id = zmax.write_thrift_field(writer, 6, last_field_id)?; + } + if let Some(mmin) = self.get_mmin() { + last_field_id = mmin.write_thrift_field(writer, 7, last_field_id)?; + } + if let Some(mmax) = self.get_mmax() { + mmax.write_thrift_field(writer, 8, last_field_id)?; + } + + writer.write_struct_end() + } +} + +impl WriteThriftField for crate::geospatial::bounding_box::BoundingBox { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + +#[cfg(test)] +pub(crate) mod tests { + use crate::errors::Result; + use crate::file::metadata::thrift_gen::{ + convert_column, convert_row_group, write_schema, BoundingBox, ColumnChunk, RowGroup, + SchemaElement, + }; + use crate::file::metadata::{ColumnChunkMetaData, RowGroupMetaData}; + use crate::parquet_thrift::tests::test_roundtrip; + use crate::parquet_thrift::{ + read_thrift_vec, ElementType, ReadThrift, ThriftCompactOutputProtocol, + ThriftSliceInputProtocol, + }; + use crate::schema::types::{ + num_nodes, parquet_schema_from_array, ColumnDescriptor, SchemaDescriptor, TypePtr, + }; + use std::sync::Arc; + + // for testing. 
decode thrift encoded RowGroup + pub(crate) fn read_row_group( + buf: &mut [u8], + schema_descr: Arc, + ) -> Result { + let mut reader = ThriftSliceInputProtocol::new(buf); + let rg = RowGroup::read_thrift(&mut reader)?; + convert_row_group(rg, schema_descr) + } + + pub(crate) fn read_column_chunk( + buf: &mut [u8], + column_descr: Arc, + ) -> Result { + let mut reader = ThriftSliceInputProtocol::new(buf); + let cc = ColumnChunk::read_thrift(&mut reader)?; + convert_column(cc, column_descr) + } + + pub(crate) fn roundtrip_schema(schema: TypePtr) -> Result { + let num_nodes = num_nodes(&schema)?; + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + + // kick off writing list + writer.write_list_begin(ElementType::Struct, num_nodes)?; + + // write SchemaElements + write_schema(&schema, &mut writer)?; + + let mut prot = ThriftSliceInputProtocol::new(&buf); + let se: Vec = read_thrift_vec(&mut prot)?; + parquet_schema_from_array(se) + } + + pub(crate) fn schema_to_buf(schema: &TypePtr) -> Result> { + let num_nodes = num_nodes(schema)?; + let mut buf = Vec::new(); + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + + // kick off writing list + writer.write_list_begin(ElementType::Struct, num_nodes)?; + + // write SchemaElements + write_schema(schema, &mut writer)?; + Ok(buf) + } + + pub(crate) fn buf_to_schema_list<'a>(buf: &'a mut Vec) -> Result>> { + let mut prot = ThriftSliceInputProtocol::new(buf.as_mut_slice()); + read_thrift_vec(&mut prot) + } + + #[test] + fn test_bounding_box_roundtrip() { + test_roundtrip(BoundingBox { + xmin: 0.1.into(), + xmax: 10.3.into(), + ymin: 0.001.into(), + ymax: 128.5.into(), + zmin: None, + zmax: None, + mmin: None, + mmax: None, + }); + + test_roundtrip(BoundingBox { + xmin: 0.1.into(), + xmax: 10.3.into(), + ymin: 0.001.into(), + ymax: 128.5.into(), + zmin: Some(11.0.into()), + zmax: Some(1300.0.into()), + mmin: None, + mmax: None, + }); + + test_roundtrip(BoundingBox { + xmin: 0.1.into(), + xmax: 10.3.into(), + ymin: 0.001.into(), + ymax: 128.5.into(), + zmin: Some(11.0.into()), + zmax: Some(1300.0.into()), + mmin: Some(3.7.into()), + mmax: Some(42.0.into()), + }); + } +} diff --git a/parquet/src/file/metadata/writer.rs b/parquet/src/file/metadata/writer.rs index 5bb59b6b2faf..97d008e17308 100644 --- a/parquet/src/file/metadata/writer.rs +++ b/parquet/src/file/metadata/writer.rs @@ -15,40 +15,49 @@ // specific language governing permissions and limitations // under the License. 
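+// Illustrative flow (a sketch, not a doc test): a `ThriftMetadataWriter` is built over
+// a `TrackedWrite`, optional column/offset indexes and key-value metadata are attached
+// with the `with_*` builders, and `finish()` writes the page indexes, the encoded
+// `FileMetaData`, its length, and the magic bytes, returning the assembled
+// `ParquetMetaData`.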
+use crate::file::metadata::thrift_gen::{EncryptionAlgorithm, FileMeta}; +use crate::file::metadata::{ + ColumnChunkMetaData, ParquetColumnIndex, ParquetOffsetIndex, RowGroupMetaData, +}; +use crate::schema::types::{SchemaDescPtr, SchemaDescriptor}; +use crate::{ + basic::ColumnOrder, + file::metadata::{FileMetaData, ParquetMetaDataBuilder}, +}; #[cfg(feature = "encryption")] -use crate::encryption::{ - encrypt::{ - encrypt_object, encrypt_object_to_vec, write_signed_plaintext_object, FileEncryptor, +use crate::{ + encryption::{ + encrypt::{encrypt_thrift_object, write_signed_plaintext_thrift_object, FileEncryptor}, + modules::{create_footer_aad, create_module_aad, ModuleType}, }, - modules::{create_footer_aad, create_module_aad, ModuleType}, + file::column_crypto_metadata::ColumnCryptoMetaData, + file::metadata::thrift_gen::{AesGcmV1, FileCryptoMetaData}, +}; +use crate::{errors::Result, file::page_index::column_index::ColumnIndexMetaData}; + +use crate::{ + file::writer::{get_file_magic, TrackedWrite}, + parquet_thrift::WriteThrift, +}; +use crate::{ + file::{ + metadata::{KeyValue, ParquetMetaData}, + page_index::offset_index::OffsetIndexMetaData, + }, + parquet_thrift::ThriftCompactOutputProtocol, }; -#[cfg(feature = "encryption")] -use crate::errors::ParquetError; -use crate::errors::Result; -use crate::file::metadata::{KeyValue, ParquetMetaData}; -use crate::file::page_index::index::Index; -use crate::file::writer::{get_file_magic, TrackedWrite}; -use crate::format::EncryptionAlgorithm; -#[cfg(feature = "encryption")] -use crate::format::{AesGcmV1, ColumnCryptoMetaData}; -use crate::format::{ColumnChunk, ColumnIndex, FileMetaData, OffsetIndex, RowGroup}; -use crate::schema::types; -use crate::schema::types::{SchemaDescPtr, SchemaDescriptor, TypePtr}; -use crate::thrift::TSerializable; use std::io::Write; use std::sync::Arc; -use thrift::protocol::TCompactOutputProtocol; /// Writes `crate::file::metadata` structures to a thrift encoded byte stream /// /// See [`ParquetMetaDataWriter`] for background and example. pub(crate) struct ThriftMetadataWriter<'a, W: Write> { buf: &'a mut TrackedWrite, - schema: &'a TypePtr, schema_descr: &'a SchemaDescPtr, - row_groups: Vec, - column_indexes: Option<&'a [Vec>]>, - offset_indexes: Option<&'a [Vec>]>, + row_groups: Vec, + column_indexes: Option>>>, + offset_indexes: Option>>>, key_value_metadata: Option>, created_by: Option, object_writer: MetadataObjectWriter, @@ -61,7 +70,10 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { /// Note: also updates the `ColumnChunk::offset_index_offset` and /// `ColumnChunk::offset_index_length` to reflect the position and length /// of the serialized offset indexes. - fn write_offset_indexes(&mut self, offset_indexes: &[Vec>]) -> Result<()> { + fn write_offset_indexes( + &mut self, + offset_indexes: &[Vec>], + ) -> Result<()> { // iter row group // iter each column // write offset index to the file @@ -91,7 +103,10 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { /// Note: also updates the `ColumnChunk::column_index_offset` and /// `ColumnChunk::column_index_length` to reflect the position and length /// of the serialized column indexes. 
- fn write_column_indexes(&mut self, column_indexes: &[Vec>]) -> Result<()> { + fn write_column_indexes( + &mut self, + column_indexes: &[Vec>], + ) -> Result<()> { // iter row group // iter each column // write column index to the file @@ -117,14 +132,17 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { } /// Assembles and writes the final metadata to self.buf - pub fn finish(mut self) -> Result { + pub fn finish(mut self) -> Result { let num_rows = self.row_groups.iter().map(|x| x.num_rows).sum(); + let column_indexes = std::mem::take(&mut self.column_indexes); + let offset_indexes = std::mem::take(&mut self.offset_indexes); + // Write column indexes and offset indexes - if let Some(column_indexes) = self.column_indexes { + if let Some(column_indexes) = column_indexes.as_ref() { self.write_column_indexes(column_indexes)?; } - if let Some(offset_indexes) = self.offset_indexes { + if let Some(offset_indexes) = offset_indexes.as_ref() { self.write_offset_indexes(offset_indexes)?; } @@ -133,27 +151,44 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { // for all leaf nodes. // Even if the column has an undefined sort order, such as INTERVAL, this // is still technically the defined TYPEORDER so it should still be set. - let column_orders = (0..self.schema_descr.num_columns()) - .map(|_| crate::format::ColumnOrder::TYPEORDER(crate::format::TypeDefinedOrder {})) + let column_orders = self + .schema_descr + .columns() + .iter() + .map(|col| { + let sort_order = ColumnOrder::get_sort_order( + col.logical_type(), + col.converted_type(), + col.physical_type(), + ); + ColumnOrder::TYPE_DEFINED_ORDER(sort_order) + }) .collect(); + // This field is optional, perhaps in cases where no min/max fields are set // in any Statistics or ColumnIndex object in the whole file. // But for simplicity we always set this field. let column_orders = Some(column_orders); + let (row_groups, unencrypted_row_groups) = self .object_writer .apply_row_group_encryption(self.row_groups)?; let (encryption_algorithm, footer_signing_key_metadata) = self.object_writer.get_plaintext_footer_crypto_metadata(); - let mut file_metadata = FileMetaData { + + let file_metadata = FileMetaData::new( + self.writer_version, num_rows, - row_groups, - key_value_metadata: self.key_value_metadata.clone(), - version: self.writer_version, - schema: types::to_thrift(self.schema.as_ref())?, - created_by: self.created_by.clone(), + self.created_by, + self.key_value_metadata, + self.schema_descr.clone(), column_orders, + ); + + let file_meta = FileMeta { + file_metadata: &file_metadata, + row_groups: &row_groups, encryption_algorithm, footer_signing_key_metadata, }; @@ -161,7 +196,7 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { // Write file metadata let start_pos = self.buf.bytes_written(); self.object_writer - .write_file_metadata(&file_metadata, &mut self.buf)?; + .write_file_metadata(&file_meta, &mut self.buf)?; let end_pos = self.buf.bytes_written(); // Write footer @@ -170,28 +205,49 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { self.buf.write_all(&metadata_len.to_le_bytes())?; self.buf.write_all(self.object_writer.get_file_magic())?; - if let Some(row_groups) = unencrypted_row_groups { - // If row group metadata was encrypted, we replace the encrypted row groups with - // unencrypted metadata before it is returned to users. This allows the metadata - // to be usable for retrieving the row group statistics for example, without users - // needing to decrypt the metadata. 
- file_metadata.row_groups = row_groups; - } + // If row group metadata was encrypted, we replace the encrypted row groups with + // unencrypted metadata before it is returned to users. This allows the metadata + // to be usable for retrieving the row group statistics for example, without users + // needing to decrypt the metadata. + let mut builder = ParquetMetaDataBuilder::new(file_metadata); + + builder = match unencrypted_row_groups { + Some(rg) => builder.set_row_groups(rg), + None => builder.set_row_groups(row_groups), + }; + + let column_indexes: Option = column_indexes.map(|ovvi| { + ovvi.into_iter() + .map(|vi| { + vi.into_iter() + .map(|oi| oi.unwrap_or(ColumnIndexMetaData::NONE)) + .collect() + }) + .collect() + }); + + // FIXME(ets): this will panic if there's a missing index. + let offset_indexes: Option = offset_indexes.map(|ovvi| { + ovvi.into_iter() + .map(|vi| vi.into_iter().map(|oi| oi.unwrap()).collect()) + .collect() + }); + + builder = builder.set_column_index(column_indexes); + builder = builder.set_offset_index(offset_indexes); - Ok(file_metadata) + Ok(builder.build()) } pub fn new( buf: &'a mut TrackedWrite, - schema: &'a TypePtr, schema_descr: &'a SchemaDescPtr, - row_groups: Vec, + row_groups: Vec, created_by: Option, writer_version: i32, ) -> Self { Self { buf, - schema, schema_descr, row_groups, column_indexes: None, @@ -203,12 +259,18 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { } } - pub fn with_column_indexes(mut self, column_indexes: &'a [Vec>]) -> Self { + pub fn with_column_indexes( + mut self, + column_indexes: Vec>>, + ) -> Self { self.column_indexes = Some(column_indexes); self } - pub fn with_offset_indexes(mut self, offset_indexes: &'a [Vec>]) -> Self { + pub fn with_offset_indexes( + mut self, + offset_indexes: Vec>>, + ) -> Self { self.offset_indexes = Some(offset_indexes); self } @@ -255,8 +317,10 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> { /// 4. Length of encoded `FileMetaData` (4 bytes, little endian) /// 5. 
Parquet Magic Bytes (4 bytes) /// -/// [`FileMetaData`]: crate::format::FileMetaData +/// [`FileMetaData`]: https://github.com/apache/parquet-format/tree/master?tab=readme-ov-file#metadata /// [`ColumnChunkMetaData`]: crate::file::metadata::ColumnChunkMetaData +/// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md +/// [`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md /// /// ```text /// ┌──────────────────────┐ @@ -335,12 +399,7 @@ impl<'a, W: Write> ParquetMetaDataWriter<'a, W> { let schema_descr = Arc::new(SchemaDescriptor::new(schema.clone())); let created_by = file_metadata.created_by().map(str::to_string); - let row_groups = self - .metadata - .row_groups() - .iter() - .map(|rg| rg.to_thrift()) - .collect::>(); + let row_groups = self.metadata.row_groups.clone(); let key_value_metadata = file_metadata.key_value_metadata().cloned(); @@ -349,14 +408,20 @@ impl<'a, W: Write> ParquetMetaDataWriter<'a, W> { let mut encoder = ThriftMetadataWriter::new( &mut self.buf, - &schema, &schema_descr, row_groups, created_by, file_metadata.version(), ); - encoder = encoder.with_column_indexes(&column_indexes); - encoder = encoder.with_offset_indexes(&offset_indexes); + + if let Some(column_indexes) = column_indexes { + encoder = encoder.with_column_indexes(column_indexes); + } + + if let Some(offset_indexes) = offset_indexes { + encoder = encoder.with_offset_indexes(offset_indexes); + } + if let Some(key_value_metadata) = key_value_metadata { encoder = encoder.with_key_value_metadata(key_value_metadata); } @@ -365,58 +430,38 @@ impl<'a, W: Write> ParquetMetaDataWriter<'a, W> { Ok(()) } - fn convert_column_indexes(&self) -> Vec>> { - if let Some(row_group_column_indexes) = self.metadata.column_index() { - (0..self.metadata.row_groups().len()) - .map(|rg_idx| { - let column_indexes = &row_group_column_indexes[rg_idx]; - column_indexes - .iter() - .map(|column_index| match column_index { - Index::NONE => None, - Index::BOOLEAN(column_index) => Some(column_index.to_thrift()), - Index::BYTE_ARRAY(column_index) => Some(column_index.to_thrift()), - Index::DOUBLE(column_index) => Some(column_index.to_thrift()), - Index::FIXED_LEN_BYTE_ARRAY(column_index) => { - Some(column_index.to_thrift()) - } - Index::FLOAT(column_index) => Some(column_index.to_thrift()), - Index::INT32(column_index) => Some(column_index.to_thrift()), - Index::INT64(column_index) => Some(column_index.to_thrift()), - Index::INT96(column_index) => Some(column_index.to_thrift()), - }) - .collect() - }) - .collect() - } else { - // make a None for each row group, for each column - self.metadata - .row_groups() - .iter() - .map(|rg| std::iter::repeat_n(None, rg.columns().len()).collect()) - .collect() - } + fn convert_column_indexes(&self) -> Option>>> { + // TODO(ets): we're converting from ParquetColumnIndex to vec>, + // but then converting back to ParquetColumnIndex in the end. need to unify this. 
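+        // Shape note: the result holds one Vec per row group, with one Option per
+        // column chunk, which is the layout ThriftMetadataWriter expects.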
+ self.metadata + .column_index() + .map(|row_group_column_indexes| { + (0..self.metadata.row_groups().len()) + .map(|rg_idx| { + let column_indexes = &row_group_column_indexes[rg_idx]; + column_indexes + .iter() + .map(|column_index| Some(column_index.clone())) + .collect() + }) + .collect() + }) } - fn convert_offset_index(&self) -> Vec>> { - if let Some(row_group_offset_indexes) = self.metadata.offset_index() { - (0..self.metadata.row_groups().len()) - .map(|rg_idx| { - let offset_indexes = &row_group_offset_indexes[rg_idx]; - offset_indexes - .iter() - .map(|offset_index| Some(offset_index.to_thrift())) - .collect() - }) - .collect() - } else { - // make a None for each row group, for each column - self.metadata - .row_groups() - .iter() - .map(|rg| std::iter::repeat_n(None, rg.columns().len()).collect()) - .collect() - } + fn convert_offset_index(&self) -> Option>>> { + self.metadata + .offset_index() + .map(|row_group_offset_indexes| { + (0..self.metadata.row_groups().len()) + .map(|rg_idx| { + let offset_indexes = &row_group_offset_indexes[rg_idx]; + offset_indexes + .iter() + .map(|offset_index| Some(offset_index.clone())) + .collect() + }) + .collect() + }) } } @@ -428,9 +473,9 @@ struct MetadataObjectWriter { impl MetadataObjectWriter { #[inline] - fn write_object(object: &impl TSerializable, sink: impl Write) -> Result<()> { - let mut protocol = TCompactOutputProtocol::new(sink); - object.write_to_out_protocol(&mut protocol)?; + fn write_thrift_object(object: &impl WriteThrift, sink: impl Write) -> Result<()> { + let mut protocol = ThriftCompactOutputProtocol::new(sink); + object.write_thrift(&mut protocol)?; Ok(()) } } @@ -439,39 +484,39 @@ impl MetadataObjectWriter { #[cfg(not(feature = "encryption"))] impl MetadataObjectWriter { /// Write [`FileMetaData`] in Thrift format - fn write_file_metadata(&self, file_metadata: &FileMetaData, sink: impl Write) -> Result<()> { - Self::write_object(file_metadata, sink) + fn write_file_metadata(&self, file_metadata: &FileMeta, sink: impl Write) -> Result<()> { + Self::write_thrift_object(file_metadata, sink) } /// Write a column [`OffsetIndex`] in Thrift format fn write_offset_index( &self, - offset_index: &OffsetIndex, - _column_chunk: &ColumnChunk, + offset_index: &OffsetIndexMetaData, + _column_chunk: &ColumnChunkMetaData, _row_group_idx: usize, _column_idx: usize, sink: impl Write, ) -> Result<()> { - Self::write_object(offset_index, sink) + Self::write_thrift_object(offset_index, sink) } /// Write a column [`ColumnIndex`] in Thrift format fn write_column_index( &self, - column_index: &ColumnIndex, - _column_chunk: &ColumnChunk, + column_index: &ColumnIndexMetaData, + _column_chunk: &ColumnChunkMetaData, _row_group_idx: usize, _column_idx: usize, sink: impl Write, ) -> Result<()> { - Self::write_object(column_index, sink) + Self::write_thrift_object(column_index, sink) } /// No-op implementation of row-group metadata encryption fn apply_row_group_encryption( &self, - row_groups: Vec, - ) -> Result<(Vec, Option>)> { + row_groups: Vec, + ) -> Result<(Vec, Option>)> { Ok((row_groups, None)) } @@ -497,43 +542,43 @@ impl MetadataObjectWriter { } /// Write [`FileMetaData`] in Thrift format, possibly encrypting it if required - fn write_file_metadata( - &self, - file_metadata: &FileMetaData, - mut sink: impl Write, - ) -> Result<()> { + /// + /// [`FileMetaData`]: https://github.com/apache/parquet-format/tree/master?tab=readme-ov-file#metadata + fn write_file_metadata(&self, file_metadata: &FileMeta, mut sink: impl Write) -> Result<()> { 
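+        // Three cases, summarized: (1) an encrypted footer writes FileCryptoMetaData
+        // followed by the encrypted footer; (2) a plaintext footer with a signing key is
+        // written along with its signature; (3) otherwise the footer is plain thrift.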
match self.file_encryptor.as_ref() { Some(file_encryptor) if file_encryptor.properties().encrypt_footer() => { // First write FileCryptoMetadata let crypto_metadata = Self::file_crypto_metadata(file_encryptor)?; - let mut protocol = TCompactOutputProtocol::new(&mut sink); - crypto_metadata.write_to_out_protocol(&mut protocol)?; + let mut protocol = ThriftCompactOutputProtocol::new(&mut sink); + crypto_metadata.write_thrift(&mut protocol)?; // Then write encrypted footer let aad = create_footer_aad(file_encryptor.file_aad())?; let mut encryptor = file_encryptor.get_footer_encryptor()?; - encrypt_object(file_metadata, &mut encryptor, &mut sink, &aad) + encrypt_thrift_object(file_metadata, &mut encryptor, &mut sink, &aad) } Some(file_encryptor) if file_metadata.encryption_algorithm.is_some() => { let aad = create_footer_aad(file_encryptor.file_aad())?; let mut encryptor = file_encryptor.get_footer_encryptor()?; - write_signed_plaintext_object(file_metadata, &mut encryptor, &mut sink, &aad) + write_signed_plaintext_thrift_object(file_metadata, &mut encryptor, &mut sink, &aad) } - _ => Self::write_object(file_metadata, &mut sink), + _ => Self::write_thrift_object(file_metadata, &mut sink), } } /// Write a column [`OffsetIndex`] in Thrift format, possibly encrypting it if required + /// + /// [`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md fn write_offset_index( &self, - offset_index: &OffsetIndex, - column_chunk: &ColumnChunk, + offset_index: &OffsetIndexMetaData, + column_chunk: &ColumnChunkMetaData, row_group_idx: usize, column_idx: usize, sink: impl Write, ) -> Result<()> { match &self.file_encryptor { - Some(file_encryptor) => Self::write_object_with_encryption( + Some(file_encryptor) => Self::write_thrift_object_with_encryption( offset_index, sink, file_encryptor, @@ -542,21 +587,23 @@ impl MetadataObjectWriter { row_group_idx, column_idx, ), - None => Self::write_object(offset_index, sink), + None => Self::write_thrift_object(offset_index, sink), } } /// Write a column [`ColumnIndex`] in Thrift format, possibly encrypting it if required + /// + /// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md fn write_column_index( &self, - column_index: &ColumnIndex, - column_chunk: &ColumnChunk, + column_index: &ColumnIndexMetaData, + column_chunk: &ColumnChunkMetaData, row_group_idx: usize, column_idx: usize, sink: impl Write, ) -> Result<()> { match &self.file_encryptor { - Some(file_encryptor) => Self::write_object_with_encryption( + Some(file_encryptor) => Self::write_thrift_object_with_encryption( column_index, sink, file_encryptor, @@ -565,7 +612,7 @@ impl MetadataObjectWriter { row_group_idx, column_idx, ), - None => Self::write_object(column_index, sink), + None => Self::write_thrift_object(column_index, sink), } } @@ -574,8 +621,8 @@ impl MetadataObjectWriter { /// and possibly unencrypted metadata to be returned to clients if data was encrypted. 
fn apply_row_group_encryption( &self, - row_groups: Vec, - ) -> Result<(Vec, Option>)> { + row_groups: Vec, + ) -> Result<(Vec, Option>)> { match &self.file_encryptor { Some(file_encryptor) => { let unencrypted_row_groups = row_groups.clone(); @@ -595,25 +642,16 @@ impl MetadataObjectWriter { ) } - fn write_object_with_encryption( - object: &impl TSerializable, + fn write_thrift_object_with_encryption( + object: &impl WriteThrift, mut sink: impl Write, file_encryptor: &FileEncryptor, - column_metadata: &ColumnChunk, + column_metadata: &ColumnChunkMetaData, module_type: ModuleType, row_group_index: usize, column_index: usize, ) -> Result<()> { - let column_path_vec = &column_metadata - .meta_data - .as_ref() - .ok_or_else(|| { - general_err!( - "Column metadata not set for column {} when encrypting object", - column_index - ) - })? - .path_in_schema; + let column_path_vec = column_metadata.column_path().as_ref(); let joined_column_path; let column_path = if column_path_vec.len() == 1 { @@ -624,6 +662,8 @@ impl MetadataObjectWriter { }; if file_encryptor.is_column_encrypted(column_path) { + use crate::encryption::encrypt::encrypt_thrift_object; + let aad = create_module_aad( file_encryptor.file_aad(), module_type, @@ -632,9 +672,9 @@ impl MetadataObjectWriter { None, )?; let mut encryptor = file_encryptor.get_column_encryptor(column_path)?; - encrypt_object(object, &mut encryptor, &mut sink, &aad) + encrypt_thrift_object(object, &mut encryptor, &mut sink, &aad) } else { - Self::write_object(object, sink) + Self::write_thrift_object(object, sink) } } @@ -660,36 +700,34 @@ impl MetadataObjectWriter { .aad_prefix() .map(|_| !file_encryptor.properties().store_aad_prefix()); let aad_prefix = if file_encryptor.properties().store_aad_prefix() { - file_encryptor.properties().aad_prefix().cloned() + file_encryptor.properties().aad_prefix() } else { None }; - EncryptionAlgorithm::AESGCMV1(AesGcmV1 { - aad_prefix, + EncryptionAlgorithm::AES_GCM_V1(AesGcmV1 { + aad_prefix: aad_prefix.cloned(), aad_file_unique: Some(file_encryptor.aad_file_unique().clone()), supply_aad_prefix, }) } - fn file_crypto_metadata( - file_encryptor: &FileEncryptor, - ) -> Result { + fn file_crypto_metadata(file_encryptor: &'_ FileEncryptor) -> Result> { let properties = file_encryptor.properties(); - Ok(crate::format::FileCryptoMetaData { + Ok(FileCryptoMetaData { encryption_algorithm: Self::encryption_algorithm_from_encryptor(file_encryptor), - key_metadata: properties.footer_key_metadata().cloned(), + key_metadata: properties.footer_key_metadata().map(|v| v.as_slice()), }) } fn encrypt_row_groups( - row_groups: Vec, + row_groups: Vec, file_encryptor: &Arc, - ) -> Result> { + ) -> Result> { row_groups .into_iter() .enumerate() .map(|(rg_idx, mut rg)| { - let cols: Result> = rg + let cols: Result> = rg .columns .into_iter() .enumerate() @@ -705,26 +743,24 @@ impl MetadataObjectWriter { /// Apply column encryption to column chunk metadata fn encrypt_column_chunk( - mut column_chunk: ColumnChunk, + mut column_chunk: ColumnChunkMetaData, file_encryptor: &Arc, row_group_index: usize, column_index: usize, - ) -> Result { + ) -> Result { // Column crypto metadata should have already been set when the column was created. // Here we apply the encryption by encrypting the column metadata if required. 
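+        // ENCRYPTION_WITH_FOOTER_KEY requires no work here (the footer encryption covers
+        // it); ENCRYPTION_WITH_COLUMN_KEY serializes the ColumnMetaData to a scratch
+        // buffer and encrypts it with the column key into encrypted_column_metadata.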
- match column_chunk.crypto_metadata.as_ref() { + match column_chunk.column_crypto_metadata.as_ref() { None => {} - Some(ColumnCryptoMetaData::ENCRYPTIONWITHFOOTERKEY(_)) => { + Some(ColumnCryptoMetaData::ENCRYPTION_WITH_FOOTER_KEY) => { // When uniform encryption is used the footer is already encrypted, // so the column chunk does not need additional encryption. } - Some(ColumnCryptoMetaData::ENCRYPTIONWITHCOLUMNKEY(col_key)) => { + Some(ColumnCryptoMetaData::ENCRYPTION_WITH_COLUMN_KEY(col_key)) => { + use crate::file::metadata::thrift_gen::serialize_column_meta_data; + let column_path = col_key.path_in_schema.join("."); let mut column_encryptor = file_encryptor.get_column_encryptor(&column_path)?; - let meta_data = column_chunk - .meta_data - .take() - .ok_or_else(|| general_err!("Column metadata not set for encryption"))?; let aad = create_module_aad( file_encryptor.file_aad(), ModuleType::ColumnMetaData, @@ -732,10 +768,15 @@ impl MetadataObjectWriter { column_index, None, )?; - let ciphertext = encrypt_object_to_vec(&meta_data, &mut column_encryptor, &aad)?; + // create temp ColumnMetaData that we can encrypt + let mut buffer: Vec = vec![]; + { + let mut prot = ThriftCompactOutputProtocol::new(&mut buffer); + serialize_column_meta_data(&column_chunk, &mut prot)?; + } + let ciphertext = column_encryptor.encrypt(&buffer, &aad)?; column_chunk.encrypted_column_metadata = Some(ciphertext); - debug_assert!(column_chunk.meta_data.is_none()); } } diff --git a/parquet/src/file/mod.rs b/parquet/src/file/mod.rs index 976b36dc2358..09036cd7d7b9 100644 --- a/parquet/src/file/mod.rs +++ b/parquet/src/file/mod.rs @@ -100,7 +100,6 @@ #[cfg(feature = "encryption")] pub mod column_crypto_metadata; pub mod metadata; -pub mod page_encoding_stats; pub mod page_index; pub mod properties; pub mod reader; diff --git a/parquet/src/file/page_encoding_stats.rs b/parquet/src/file/page_encoding_stats.rs deleted file mode 100644 index edb6a8fa9d4c..000000000000 --- a/parquet/src/file/page_encoding_stats.rs +++ /dev/null @@ -1,77 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -//! Per-page encoding information. - -use crate::basic::{Encoding, PageType}; -use crate::errors::Result; -use crate::format::{ - Encoding as TEncoding, PageEncodingStats as TPageEncodingStats, PageType as TPageType, -}; - -/// PageEncodingStats for a column chunk and data page. -#[derive(Clone, Debug, PartialEq, Eq)] -pub struct PageEncodingStats { - /// the page type (data/dic/...) - pub page_type: PageType, - /// encoding of the page - pub encoding: Encoding, - /// number of pages of this type with this encoding - pub count: i32, -} - -/// Converts Thrift definition into `PageEncodingStats`. 
-pub fn try_from_thrift(thrift_encoding_stats: &TPageEncodingStats) -> Result { - let page_type = PageType::try_from(thrift_encoding_stats.page_type)?; - let encoding = Encoding::try_from(thrift_encoding_stats.encoding)?; - let count = thrift_encoding_stats.count; - - Ok(PageEncodingStats { - page_type, - encoding, - count, - }) -} - -/// Converts `PageEncodingStats` into Thrift definition. -pub fn to_thrift(encoding_stats: &PageEncodingStats) -> TPageEncodingStats { - let page_type = TPageType::from(encoding_stats.page_type); - let encoding = TEncoding::from(encoding_stats.encoding); - let count = encoding_stats.count; - - TPageEncodingStats { - page_type, - encoding, - count, - } -} - -#[cfg(test)] -mod tests { - use super::*; - - #[test] - fn test_page_encoding_stats_from_thrift() { - let stats = PageEncodingStats { - page_type: PageType::DATA_PAGE, - encoding: Encoding::PLAIN, - count: 1, - }; - - assert_eq!(try_from_thrift(&to_thrift(&stats)).unwrap(), stats); - } -} diff --git a/parquet/src/file/page_index/column_index.rs b/parquet/src/file/page_index/column_index.rs new file mode 100644 index 000000000000..2aa155a2825d --- /dev/null +++ b/parquet/src/file/page_index/column_index.rs @@ -0,0 +1,750 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +//! [`ColumnIndexMetaData`] structures holding decoded [`ColumnIndex`] information +//! +//! [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md +//! 
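+//!
+//! A minimal usage sketch (hypothetical and marked `ignore`, for illustration only):
+//!
+//! ```ignore
+//! use parquet::file::page_index::column_index::ColumnIndexMetaData;
+//!
+//! // Print the per-page min values of an INT32 column index; all-null pages
+//! // have no min/max.
+//! fn print_mins(index: &ColumnIndexMetaData) {
+//!     if let ColumnIndexMetaData::INT32(typed) = index {
+//!         for (i, min) in typed.min_values_iter().enumerate() {
+//!             match min {
+//!                 Some(v) => println!("page {i}: min = {v}"),
+//!                 None => println!("page {i}: all nulls"),
+//!             }
+//!         }
+//!     }
+//! }
+//! ```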
+ +use crate::{ + data_type::{ByteArray, FixedLenByteArray}, + errors::{ParquetError, Result}, + parquet_thrift::{ + ElementType, FieldType, ThriftCompactOutputProtocol, WriteThrift, WriteThriftField, + }, +}; +use std::ops::Deref; + +use crate::{ + basic::BoundaryOrder, + data_type::{private::ParquetValueType, Int96}, + file::page_index::index_reader::ThriftColumnIndex, +}; + +/// Common bits of the column index +#[derive(Debug, Clone, PartialEq)] +pub struct ColumnIndex { + pub(crate) null_pages: Vec, + pub(crate) boundary_order: BoundaryOrder, + pub(crate) null_counts: Option>, + pub(crate) repetition_level_histograms: Option>, + pub(crate) definition_level_histograms: Option>, +} + +impl ColumnIndex { + /// Returns the number of pages + pub fn num_pages(&self) -> u64 { + self.null_pages.len() as u64 + } + + /// Returns the number of null values in the page indexed by `idx` + /// + /// Returns `None` if no null counts have been set in the index + pub fn null_count(&self, idx: usize) -> Option { + self.null_counts.as_ref().map(|nc| nc[idx]) + } + + /// Returns the repetition level histogram for the page indexed by `idx` + pub fn repetition_level_histogram(&self, idx: usize) -> Option<&[i64]> { + if let Some(rep_hists) = self.repetition_level_histograms.as_ref() { + let num_lvls = rep_hists.len() / self.num_pages() as usize; + let start = num_lvls * idx; + Some(&rep_hists[start..start + num_lvls]) + } else { + None + } + } + + /// Returns the definition level histogram for the page indexed by `idx` + pub fn definition_level_histogram(&self, idx: usize) -> Option<&[i64]> { + if let Some(def_hists) = self.definition_level_histograms.as_ref() { + let num_lvls = def_hists.len() / self.num_pages() as usize; + let start = num_lvls * idx; + Some(&def_hists[start..start + num_lvls]) + } else { + None + } + } + + /// Returns whether the page indexed by `idx` consists of all null values + pub fn is_null_page(&self, idx: usize) -> bool { + self.null_pages[idx] + } +} + +/// Column index for primitive types +#[derive(Debug, Clone, PartialEq)] +pub struct PrimitiveColumnIndex { + pub(crate) column_index: ColumnIndex, + pub(crate) min_values: Vec, + pub(crate) max_values: Vec, +} + +impl PrimitiveColumnIndex { + pub(crate) fn try_new( + null_pages: Vec, + boundary_order: BoundaryOrder, + null_counts: Option>, + repetition_level_histograms: Option>, + definition_level_histograms: Option>, + min_bytes: Vec<&[u8]>, + max_bytes: Vec<&[u8]>, + ) -> Result { + let len = null_pages.len(); + + let mut min_values = Vec::with_capacity(len); + let mut max_values = Vec::with_capacity(len); + + for (i, is_null) in null_pages.iter().enumerate().take(len) { + if !is_null { + let min = min_bytes[i]; + min_values.push(T::try_from_le_slice(min)?); + + let max = max_bytes[i]; + max_values.push(T::try_from_le_slice(max)?); + } else { + // need placeholders + min_values.push(Default::default()); + max_values.push(Default::default()); + } + } + + Ok(Self { + column_index: ColumnIndex { + null_pages, + boundary_order, + null_counts, + repetition_level_histograms, + definition_level_histograms, + }, + min_values, + max_values, + }) + } + + pub(super) fn try_from_thrift(index: ThriftColumnIndex) -> Result { + Self::try_new( + index.null_pages, + index.boundary_order, + index.null_counts, + index.repetition_level_histograms, + index.definition_level_histograms, + index.min_values, + index.max_values, + ) + } +} + +impl PrimitiveColumnIndex { + /// Returns an array containing the min values for each page. 
+ /// + /// Values in the returned slice are only valid if [`ColumnIndex::is_null_page()`] + /// is `false` for the same index. + pub fn min_values(&self) -> &[T] { + &self.min_values + } + + /// Returns an array containing the max values for each page. + /// + /// Values in the returned slice are only valid if [`ColumnIndex::is_null_page()`] + /// is `false` for the same index. + pub fn max_values(&self) -> &[T] { + &self.max_values + } + + /// Returns an iterator over the min values. + /// + /// Values may be `None` when [`ColumnIndex::is_null_page()`] is `true`. + pub fn min_values_iter(&self) -> impl Iterator> { + self.min_values.iter().enumerate().map(|(i, min)| { + if self.is_null_page(i) { + None + } else { + Some(min) + } + }) + } + + /// Returns an iterator over the max values. + /// + /// Values may be `None` when [`ColumnIndex::is_null_page()`] is `true`. + pub fn max_values_iter(&self) -> impl Iterator> { + self.max_values.iter().enumerate().map(|(i, min)| { + if self.is_null_page(i) { + None + } else { + Some(min) + } + }) + } + + /// Returns the min value for the page indexed by `idx` + /// + /// It is `None` when all values are null + pub fn min_value(&self, idx: usize) -> Option<&T> { + if self.null_pages[idx] { + None + } else { + Some(&self.min_values[idx]) + } + } + + /// Returns the max value for the page indexed by `idx` + /// + /// It is `None` when all values are null + pub fn max_value(&self, idx: usize) -> Option<&T> { + if self.null_pages[idx] { + None + } else { + Some(&self.max_values[idx]) + } + } +} + +impl Deref for PrimitiveColumnIndex { + type Target = ColumnIndex; + + fn deref(&self) -> &Self::Target { + &self.column_index + } +} + +impl WriteThrift for PrimitiveColumnIndex { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + fn write_thrift( + &self, + writer: &mut ThriftCompactOutputProtocol, + ) -> Result<()> { + self.null_pages.write_thrift_field(writer, 1, 0)?; + + // need to handle min/max manually + let len = self.null_pages.len(); + writer.write_field_begin(FieldType::List, 2, 1)?; + writer.write_list_begin(ElementType::Binary, len)?; + for i in 0..len { + let min = self.min_value(i).map(|m| m.as_bytes()).unwrap_or(&[]); + min.write_thrift(writer)?; + } + writer.write_field_begin(FieldType::List, 3, 2)?; + writer.write_list_begin(ElementType::Binary, len)?; + for i in 0..len { + let max = self.max_value(i).map(|m| m.as_bytes()).unwrap_or(&[]); + max.write_thrift(writer)?; + } + let mut last_field_id = self.boundary_order.write_thrift_field(writer, 4, 3)?; + if self.null_counts.is_some() { + last_field_id = + self.null_counts + .as_ref() + .unwrap() + .write_thrift_field(writer, 5, last_field_id)?; + } + if self.repetition_level_histograms.is_some() { + last_field_id = self + .repetition_level_histograms + .as_ref() + .unwrap() + .write_thrift_field(writer, 6, last_field_id)?; + } + if self.definition_level_histograms.is_some() { + self.definition_level_histograms + .as_ref() + .unwrap() + .write_thrift_field(writer, 7, last_field_id)?; + } + writer.write_struct_end() + } +} + +/// Column index for byte arrays (fixed length and variable) +#[derive(Debug, Clone, PartialEq)] +pub struct ByteArrayColumnIndex { + pub(crate) column_index: ColumnIndex, + // raw bytes for min and max values + pub(crate) min_bytes: Vec, + pub(crate) min_offsets: Vec, + pub(crate) max_bytes: Vec, + pub(crate) max_offsets: Vec, +} + +impl ByteArrayColumnIndex { + pub(crate) fn try_new( + null_pages: Vec, + boundary_order: BoundaryOrder, + null_counts: Option>, + 
repetition_level_histograms: Option>, + definition_level_histograms: Option>, + min_values: Vec<&[u8]>, + max_values: Vec<&[u8]>, + ) -> Result { + let len = null_pages.len(); + + let min_len = min_values.iter().map(|&v| v.len()).sum(); + let max_len = max_values.iter().map(|&v| v.len()).sum(); + let mut min_bytes = vec![0u8; min_len]; + let mut max_bytes = vec![0u8; max_len]; + + let mut min_offsets = vec![0usize; len + 1]; + let mut max_offsets = vec![0usize; len + 1]; + + let mut min_pos = 0; + let mut max_pos = 0; + + for (i, is_null) in null_pages.iter().enumerate().take(len) { + if !is_null { + let min = min_values[i]; + let dst = &mut min_bytes[min_pos..min_pos + min.len()]; + dst.copy_from_slice(min); + min_offsets[i] = min_pos; + min_pos += min.len(); + + let max = max_values[i]; + let dst = &mut max_bytes[max_pos..max_pos + max.len()]; + dst.copy_from_slice(max); + max_offsets[i] = max_pos; + max_pos += max.len(); + } else { + min_offsets[i] = min_pos; + max_offsets[i] = max_pos; + } + } + + min_offsets[len] = min_pos; + max_offsets[len] = max_pos; + + Ok(Self { + column_index: ColumnIndex { + null_pages, + boundary_order, + null_counts, + repetition_level_histograms, + definition_level_histograms, + }, + min_bytes, + min_offsets, + max_bytes, + max_offsets, + }) + } + + pub(super) fn try_from_thrift(index: ThriftColumnIndex) -> Result { + Self::try_new( + index.null_pages, + index.boundary_order, + index.null_counts, + index.repetition_level_histograms, + index.definition_level_histograms, + index.min_values, + index.max_values, + ) + } + + /// Returns the min value for the page indexed by `idx` + /// + /// It is `None` when all values are null + pub fn min_value(&self, idx: usize) -> Option<&[u8]> { + if self.null_pages[idx] { + None + } else { + let start = self.min_offsets[idx]; + let end = self.min_offsets[idx + 1]; + Some(&self.min_bytes[start..end]) + } + } + + /// Returns the max value for the page indexed by `idx` + /// + /// It is `None` when all values are null + pub fn max_value(&self, idx: usize) -> Option<&[u8]> { + if self.null_pages[idx] { + None + } else { + let start = self.max_offsets[idx]; + let end = self.max_offsets[idx + 1]; + Some(&self.max_bytes[start..end]) + } + } + + /// Returns an iterator over the min values. + /// + /// Values may be `None` when [`ColumnIndex::is_null_page()`] is `true`. + pub fn min_values_iter(&self) -> impl Iterator> { + (0..self.num_pages() as usize).map(|i| { + if self.is_null_page(i) { + None + } else { + self.min_value(i) + } + }) + } + + /// Returns an iterator over the max values. + /// + /// Values may be `None` when [`ColumnIndex::is_null_page()`] is `true`. 
+ pub fn max_values_iter(&self) -> impl Iterator> { + (0..self.num_pages() as usize).map(|i| { + if self.is_null_page(i) { + None + } else { + self.max_value(i) + } + }) + } +} + +impl Deref for ByteArrayColumnIndex { + type Target = ColumnIndex; + + fn deref(&self) -> &Self::Target { + &self.column_index + } +} + +impl WriteThrift for ByteArrayColumnIndex { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + fn write_thrift( + &self, + writer: &mut ThriftCompactOutputProtocol, + ) -> Result<()> { + self.null_pages.write_thrift_field(writer, 1, 0)?; + + // need to handle min/max manually + let len = self.null_pages.len(); + writer.write_field_begin(FieldType::List, 2, 1)?; + writer.write_list_begin(ElementType::Binary, len)?; + for i in 0..len { + let min = self.min_value(i).unwrap_or(&[]); + min.write_thrift(writer)?; + } + writer.write_field_begin(FieldType::List, 3, 2)?; + writer.write_list_begin(ElementType::Binary, len)?; + for i in 0..len { + let max = self.max_value(i).unwrap_or(&[]); + max.write_thrift(writer)?; + } + let mut last_field_id = self.boundary_order.write_thrift_field(writer, 4, 3)?; + if self.null_counts.is_some() { + last_field_id = + self.null_counts + .as_ref() + .unwrap() + .write_thrift_field(writer, 5, last_field_id)?; + } + if self.repetition_level_histograms.is_some() { + last_field_id = self + .repetition_level_histograms + .as_ref() + .unwrap() + .write_thrift_field(writer, 6, last_field_id)?; + } + if self.definition_level_histograms.is_some() { + self.definition_level_histograms + .as_ref() + .unwrap() + .write_thrift_field(writer, 7, last_field_id)?; + } + writer.write_struct_end() + } +} + +// Macro to generate getter functions for ColumnIndexMetaData. +macro_rules! colidx_enum_func { + ($self:ident, $func:ident, $arg:ident) => {{ + match *$self { + Self::BOOLEAN(ref typed) => typed.$func($arg), + Self::INT32(ref typed) => typed.$func($arg), + Self::INT64(ref typed) => typed.$func($arg), + Self::INT96(ref typed) => typed.$func($arg), + Self::FLOAT(ref typed) => typed.$func($arg), + Self::DOUBLE(ref typed) => typed.$func($arg), + Self::BYTE_ARRAY(ref typed) => typed.$func($arg), + Self::FIXED_LEN_BYTE_ARRAY(ref typed) => typed.$func($arg), + _ => panic!(concat!( + "Cannot call ", + stringify!($func), + " on ColumnIndexMetaData::NONE" + )), + } + }}; + ($self:ident, $func:ident) => {{ + match *$self { + Self::BOOLEAN(ref typed) => typed.$func(), + Self::INT32(ref typed) => typed.$func(), + Self::INT64(ref typed) => typed.$func(), + Self::INT96(ref typed) => typed.$func(), + Self::FLOAT(ref typed) => typed.$func(), + Self::DOUBLE(ref typed) => typed.$func(), + Self::BYTE_ARRAY(ref typed) => typed.$func(), + Self::FIXED_LEN_BYTE_ARRAY(ref typed) => typed.$func(), + _ => panic!(concat!( + "Cannot call ", + stringify!($func), + " on ColumnIndexMetaData::NONE" + )), + } + }}; +} + +/// Parsed [`ColumnIndex`] information for a Parquet file. +/// +/// See [`ParquetColumnIndex`] for more information. 
+/// +/// [`ParquetColumnIndex`]: crate::file::metadata::ParquetColumnIndex +/// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md +#[derive(Debug, Clone, PartialEq)] +#[allow(non_camel_case_types)] +pub enum ColumnIndexMetaData { + /// Sometimes reading page index from parquet file + /// will only return pageLocations without min_max index, + /// `NONE` represents this lack of index information + NONE, + /// Boolean type index + BOOLEAN(PrimitiveColumnIndex), + /// 32-bit integer type index + INT32(PrimitiveColumnIndex), + /// 64-bit integer type index + INT64(PrimitiveColumnIndex), + /// 96-bit integer type (timestamp) index + INT96(PrimitiveColumnIndex), + /// 32-bit floating point type index + FLOAT(PrimitiveColumnIndex), + /// 64-bit floating point type index + DOUBLE(PrimitiveColumnIndex), + /// Byte array type index + BYTE_ARRAY(ByteArrayColumnIndex), + /// Fixed length byte array type index + FIXED_LEN_BYTE_ARRAY(ByteArrayColumnIndex), +} + +impl ColumnIndexMetaData { + /// Return min/max elements inside ColumnIndex are ordered or not. + pub fn is_sorted(&self) -> bool { + // 0:UNORDERED, 1:ASCENDING ,2:DESCENDING, + if let Some(order) = self.get_boundary_order() { + order != BoundaryOrder::UNORDERED + } else { + false + } + } + + /// Get boundary_order of this page index. + pub fn get_boundary_order(&self) -> Option { + match self { + Self::NONE => None, + Self::BOOLEAN(index) => Some(index.boundary_order), + Self::INT32(index) => Some(index.boundary_order), + Self::INT64(index) => Some(index.boundary_order), + Self::INT96(index) => Some(index.boundary_order), + Self::FLOAT(index) => Some(index.boundary_order), + Self::DOUBLE(index) => Some(index.boundary_order), + Self::BYTE_ARRAY(index) => Some(index.boundary_order), + Self::FIXED_LEN_BYTE_ARRAY(index) => Some(index.boundary_order), + } + } + + /// Returns array of null counts, one per page. 
+    ///
+    /// Returns `None` if no null counts have been set in the index
+    pub fn null_counts(&self) -> Option<&Vec<i64>> {
+        match self {
+            Self::NONE => None,
+            Self::BOOLEAN(index) => index.null_counts.as_ref(),
+            Self::INT32(index) => index.null_counts.as_ref(),
+            Self::INT64(index) => index.null_counts.as_ref(),
+            Self::INT96(index) => index.null_counts.as_ref(),
+            Self::FLOAT(index) => index.null_counts.as_ref(),
+            Self::DOUBLE(index) => index.null_counts.as_ref(),
+            Self::BYTE_ARRAY(index) => index.null_counts.as_ref(),
+            Self::FIXED_LEN_BYTE_ARRAY(index) => index.null_counts.as_ref(),
+        }
+    }
+
+    /// Returns the number of pages
+    pub fn num_pages(&self) -> u64 {
+        colidx_enum_func!(self, num_pages)
+    }
+
+    /// Returns the number of null values in the page indexed by `idx`
+    ///
+    /// Returns `None` if no null counts have been set in the index
+    pub fn null_count(&self, idx: usize) -> Option<i64> {
+        colidx_enum_func!(self, null_count, idx)
+    }
+
+    /// Returns the repetition level histogram for the page indexed by `idx`
+    pub fn repetition_level_histogram(&self, idx: usize) -> Option<&[i64]> {
+        colidx_enum_func!(self, repetition_level_histogram, idx)
+    }
+
+    /// Returns the definition level histogram for the page indexed by `idx`
+    pub fn definition_level_histogram(&self, idx: usize) -> Option<&[i64]> {
+        colidx_enum_func!(self, definition_level_histogram, idx)
+    }
+
+    /// Returns whether the page indexed by `idx` consists of all null values
+    pub fn is_null_page(&self, idx: usize) -> bool {
+        colidx_enum_func!(self, is_null_page, idx)
+    }
+}
+
+/// Provides iterators over the min and max values of a [`ColumnIndexMetaData`]
+pub trait ColumnIndexIterators {
+    /// Can be one of `bool`, `i32`, `i64`, `Int96`, `f32`, `f64`, [`ByteArray`],
+    /// or [`FixedLenByteArray`]
+    type Item;
+
+    /// Returns an iterator over the min values for the index
+    fn min_values_iter(colidx: &ColumnIndexMetaData) -> impl Iterator<Item = Option<Self::Item>>;
+
+    /// Returns an iterator over the max values for the index
+    fn max_values_iter(colidx: &ColumnIndexMetaData) -> impl Iterator<Item = Option<Self::Item>>;
+}
+
+macro_rules!
column_index_iters { + ($item: ident, $variant: ident, $conv:expr) => { + impl ColumnIndexIterators for $item { + type Item = $item; + + fn min_values_iter( + colidx: &ColumnIndexMetaData, + ) -> impl Iterator> { + if let ColumnIndexMetaData::$variant(index) = colidx { + index.min_values_iter().map($conv) + } else { + panic!(concat!("Wrong type for ", stringify!($item), " iterator")) + } + } + + fn max_values_iter( + colidx: &ColumnIndexMetaData, + ) -> impl Iterator> { + if let ColumnIndexMetaData::$variant(index) = colidx { + index.max_values_iter().map($conv) + } else { + panic!(concat!("Wrong type for ", stringify!($item), " iterator")) + } + } + } + }; +} + +column_index_iters!(bool, BOOLEAN, |v| v.copied()); +column_index_iters!(i32, INT32, |v| v.copied()); +column_index_iters!(i64, INT64, |v| v.copied()); +column_index_iters!(Int96, INT96, |v| v.copied()); +column_index_iters!(f32, FLOAT, |v| v.copied()); +column_index_iters!(f64, DOUBLE, |v| v.copied()); +column_index_iters!(ByteArray, BYTE_ARRAY, |v| v + .map(|v| ByteArray::from(v.to_owned()))); +column_index_iters!(FixedLenByteArray, FIXED_LEN_BYTE_ARRAY, |v| v + .map(|v| FixedLenByteArray::from(v.to_owned()))); + +impl WriteThrift for ColumnIndexMetaData { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift( + &self, + writer: &mut ThriftCompactOutputProtocol, + ) -> Result<()> { + match self { + ColumnIndexMetaData::BOOLEAN(index) => index.write_thrift(writer), + ColumnIndexMetaData::INT32(index) => index.write_thrift(writer), + ColumnIndexMetaData::INT64(index) => index.write_thrift(writer), + ColumnIndexMetaData::INT96(index) => index.write_thrift(writer), + ColumnIndexMetaData::FLOAT(index) => index.write_thrift(writer), + ColumnIndexMetaData::DOUBLE(index) => index.write_thrift(writer), + ColumnIndexMetaData::BYTE_ARRAY(index) => index.write_thrift(writer), + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(index) => index.write_thrift(writer), + _ => Err(general_err!("Cannot serialize NONE index")), + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_page_index_min_max_null() { + let column_index = PrimitiveColumnIndex { + column_index: ColumnIndex { + null_pages: vec![false], + boundary_order: BoundaryOrder::ASCENDING, + null_counts: Some(vec![0]), + repetition_level_histograms: Some(vec![1, 2]), + definition_level_histograms: Some(vec![1, 2, 3]), + }, + min_values: vec![-123], + max_values: vec![234], + }; + + assert_eq!(column_index.min_value(0), Some(&-123)); + assert_eq!(column_index.max_value(0), Some(&234)); + assert_eq!(column_index.null_count(0), Some(0)); + assert_eq!(column_index.repetition_level_histogram(0).unwrap(), &[1, 2]); + assert_eq!( + column_index.definition_level_histogram(0).unwrap(), + &[1, 2, 3] + ); + } + + #[test] + fn test_page_index_min_max_null_none() { + let column_index: PrimitiveColumnIndex = PrimitiveColumnIndex:: { + column_index: ColumnIndex { + null_pages: vec![true], + boundary_order: BoundaryOrder::ASCENDING, + null_counts: Some(vec![1]), + repetition_level_histograms: None, + definition_level_histograms: Some(vec![1, 0]), + }, + min_values: vec![Default::default()], + max_values: vec![Default::default()], + }; + + assert_eq!(column_index.min_value(0), None); + assert_eq!(column_index.max_value(0), None); + assert_eq!(column_index.null_count(0), Some(1)); + assert_eq!(column_index.repetition_level_histogram(0), None); + assert_eq!(column_index.definition_level_histogram(0).unwrap(), &[1, 0]); + } + + #[test] + fn test_invalid_column_index() 
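A usage sketch for the macro-generated implementations above: the caller names the native type and receives page-aligned `Option` values. Note that the generated impls panic when the enum variant does not match the requested type, so this hypothetical helper assumes an `INT32` index:

```rust
fn collect_int32_mins(colidx: &ColumnIndexMetaData) -> Vec<Option<i32>> {
    <i32 as ColumnIndexIterators>::min_values_iter(colidx).collect()
}
```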
{ + let column_index = ThriftColumnIndex { + null_pages: vec![true, false], + min_values: vec![ + &[], + &[], // this shouldn't be empty as null_pages[1] is false + ], + max_values: vec![ + &[], + &[], // this shouldn't be empty as null_pages[1] is false + ], + null_counts: None, + repetition_level_histograms: None, + definition_level_histograms: None, + boundary_order: BoundaryOrder::UNORDERED, + }; + + let err = PrimitiveColumnIndex::::try_from_thrift(column_index).unwrap_err(); + assert_eq!( + err.to_string(), + "Parquet error: error converting value, expected 4 bytes got 0" + ); + } +} diff --git a/parquet/src/file/page_index/index.rs b/parquet/src/file/page_index/index.rs deleted file mode 100644 index a66509e14c7a..000000000000 --- a/parquet/src/file/page_index/index.rs +++ /dev/null @@ -1,375 +0,0 @@ -// Licensed to the Apache Software Foundation (ASF) under one -// or more contributor license agreements. See the NOTICE file -// distributed with this work for additional information -// regarding copyright ownership. The ASF licenses this file -// to you under the Apache License, Version 2.0 (the -// "License"); you may not use this file except in compliance -// with the License. You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, -// software distributed under the License is distributed on an -// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -// KIND, either express or implied. See the License for the -// specific language governing permissions and limitations -// under the License. - -//! [`Index`] structures holding decoded [`ColumnIndex`] information - -use crate::basic::Type; -use crate::data_type::private::ParquetValueType; -use crate::data_type::{AsBytes, ByteArray, FixedLenByteArray, Int96}; -use crate::errors::ParquetError; -use crate::file::metadata::LevelHistogram; -use crate::format::{BoundaryOrder, ColumnIndex}; -use std::fmt::Debug; - -/// Typed statistics for one data page -/// -/// See [`NativeIndex`] for more details -#[derive(Debug, Clone, PartialEq, Eq, Hash)] -pub struct PageIndex { - /// The minimum value, It is None when all values are null - pub min: Option, - /// The maximum value, It is None when all values are null - pub max: Option, - /// Null values in the page - pub null_count: Option, - /// Repetition level histogram for the page - /// - /// `repetition_level_histogram[i]` is a count of how many values are at repetition level `i`. - /// For example, `repetition_level_histogram[0]` indicates how many rows the page contains. - pub repetition_level_histogram: Option, - /// Definition level histogram for the page - /// - /// `definition_level_histogram[i]` is a count of how many values are at definition level `i`. - /// For example, `definition_level_histogram[max_definition_level]` indicates how many - /// non-null values are present in the page. 
- pub definition_level_histogram: Option, -} - -impl PageIndex { - /// Returns the minimum value in the page - /// - /// It is `None` when all values are null - pub fn min(&self) -> Option<&T> { - self.min.as_ref() - } - - /// Returns the maximum value in the page - /// - /// It is `None` when all values are null - pub fn max(&self) -> Option<&T> { - self.max.as_ref() - } - - /// Returns the number of null values in the page - pub fn null_count(&self) -> Option { - self.null_count - } - - /// Returns the repetition level histogram for the page - pub fn repetition_level_histogram(&self) -> Option<&LevelHistogram> { - self.repetition_level_histogram.as_ref() - } - - /// Returns the definition level histogram for the page - pub fn definition_level_histogram(&self) -> Option<&LevelHistogram> { - self.definition_level_histogram.as_ref() - } -} - -impl PageIndex -where - T: AsBytes, -{ - /// Returns the minimum value in the page as bytes - /// - /// It is `None` when all values are null - pub fn max_bytes(&self) -> Option<&[u8]> { - self.max.as_ref().map(|x| x.as_bytes()) - } - - /// Returns the maximum value in the page as bytes - /// - /// It is `None` when all values are null - pub fn min_bytes(&self) -> Option<&[u8]> { - self.min.as_ref().map(|x| x.as_bytes()) - } -} - -#[derive(Debug, Clone, PartialEq)] -#[allow(non_camel_case_types)] -/// Statistics for data pages in a column chunk. -/// -/// See [`NativeIndex`] for more information -pub enum Index { - /// Sometimes reading page index from parquet file - /// will only return pageLocations without min_max index, - /// `NONE` represents this lack of index information - NONE, - /// Boolean type index - BOOLEAN(NativeIndex), - /// 32-bit integer type index - INT32(NativeIndex), - /// 64-bit integer type index - INT64(NativeIndex), - /// 96-bit integer type (timestamp) index - INT96(NativeIndex), - /// 32-bit floating point type index - FLOAT(NativeIndex), - /// 64-bit floating point type index - DOUBLE(NativeIndex), - /// Byte array type index - BYTE_ARRAY(NativeIndex), - /// Fixed length byte array type index - FIXED_LEN_BYTE_ARRAY(NativeIndex), -} - -impl Index { - /// Return min/max elements inside ColumnIndex are ordered or not. - pub fn is_sorted(&self) -> bool { - // 0:UNORDERED, 1:ASCENDING ,2:DESCENDING, - if let Some(order) = self.get_boundary_order() { - order.0 > (BoundaryOrder::UNORDERED.0) - } else { - false - } - } - - /// Get boundary_order of this page index. - pub fn get_boundary_order(&self) -> Option { - match self { - Index::NONE => None, - Index::BOOLEAN(index) => Some(index.boundary_order), - Index::INT32(index) => Some(index.boundary_order), - Index::INT64(index) => Some(index.boundary_order), - Index::INT96(index) => Some(index.boundary_order), - Index::FLOAT(index) => Some(index.boundary_order), - Index::DOUBLE(index) => Some(index.boundary_order), - Index::BYTE_ARRAY(index) => Some(index.boundary_order), - Index::FIXED_LEN_BYTE_ARRAY(index) => Some(index.boundary_order), - } - } -} - -/// Strongly typed statistics for data pages in a column chunk. -/// -/// This structure is a natively typed, in memory representation of the -/// [`ColumnIndex`] structure in a parquet file footer, as described in the -/// Parquet [PageIndex documentation]. The statistics stored in this structure -/// can be used by query engines to skip decoding pages while reading parquet -/// data. 
-/// -/// # Differences with Row Group Level Statistics -/// -/// One significant difference between `NativeIndex` and row group level -/// [`Statistics`] is that page level statistics may not store actual column -/// values as min and max (e.g. they may store truncated strings to save space) -/// -/// [PageIndex documentation]: https://github.com/apache/parquet-format/blob/master/PageIndex.md -/// [`Statistics`]: crate::file::statistics::Statistics -#[derive(Debug, Clone, PartialEq, Eq, Hash)] -pub struct NativeIndex { - /// The actual column indexes, one item per page - pub indexes: Vec>, - /// If the min/max elements are ordered, and if so in which - /// direction. See [source] for details. - /// - /// [source]: https://github.com/apache/parquet-format/blob/bfc549b93e6927cb1fc425466e4084f76edc6d22/src/main/thrift/parquet.thrift#L959-L964 - pub boundary_order: BoundaryOrder, -} - -impl NativeIndex { - /// The physical data type of the column - pub const PHYSICAL_TYPE: Type = T::PHYSICAL_TYPE; - - /// Creates a new [`NativeIndex`] - pub(crate) fn try_new(index: ColumnIndex) -> Result { - let len = index.min_values.len(); - - let null_counts = index - .null_counts - .map(|x| x.into_iter().map(Some).collect::>()) - .unwrap_or_else(|| vec![None; len]); - - // histograms are a 1D array encoding a 2D num_pages X num_levels matrix. - let to_page_histograms = |opt_hist: Option>| { - if let Some(hist) = opt_hist { - // TODO: should we assert (hist.len() % len) == 0? - let num_levels = hist.len() / len; - let mut res = Vec::with_capacity(len); - for i in 0..len { - let page_idx = i * num_levels; - let page_hist = hist[page_idx..page_idx + num_levels].to_vec(); - res.push(Some(LevelHistogram::from(page_hist))); - } - res - } else { - vec![None; len] - } - }; - - let rep_hists: Vec> = - to_page_histograms(index.repetition_level_histograms); - let def_hists: Vec> = - to_page_histograms(index.definition_level_histograms); - - let indexes = index - .min_values - .iter() - .zip(index.max_values.iter()) - .zip(index.null_pages.into_iter()) - .zip(null_counts.into_iter()) - .zip(rep_hists.into_iter()) - .zip(def_hists.into_iter()) - .map( - |( - ((((min, max), is_null), null_count), repetition_level_histogram), - definition_level_histogram, - )| { - let (min, max) = if is_null { - (None, None) - } else { - ( - Some(T::try_from_le_slice(min)?), - Some(T::try_from_le_slice(max)?), - ) - }; - Ok(PageIndex { - min, - max, - null_count, - repetition_level_histogram, - definition_level_histogram, - }) - }, - ) - .collect::, ParquetError>>()?; - - Ok(Self { - indexes, - boundary_order: index.boundary_order, - }) - } - - pub(crate) fn to_thrift(&self) -> ColumnIndex { - let min_values = self - .indexes - .iter() - .map(|x| x.min_bytes().unwrap_or(&[]).to_vec()) - .collect::>(); - - let max_values = self - .indexes - .iter() - .map(|x| x.max_bytes().unwrap_or(&[]).to_vec()) - .collect::>(); - - let null_counts = self - .indexes - .iter() - .map(|x| x.null_count()) - .collect::>>(); - - // Concatenate page histograms into a single Option - let repetition_level_histograms = self - .indexes - .iter() - .map(|x| x.repetition_level_histogram().map(|v| v.values())) - .collect::>>() - .map(|hists| hists.concat()); - - let definition_level_histograms = self - .indexes - .iter() - .map(|x| x.definition_level_histogram().map(|v| v.values())) - .collect::>>() - .map(|hists| hists.concat()); - - ColumnIndex::new( - self.indexes.iter().map(|x| x.min().is_none()).collect(), - min_values, - max_values, - self.boundary_order, - 
null_counts, - repetition_level_histograms, - definition_level_histograms, - ) - } -} - -#[cfg(test)] -mod tests { - use super::*; - - #[test] - fn test_page_index_min_max_null() { - let page_index = PageIndex { - min: Some(-123), - max: Some(234), - null_count: Some(0), - repetition_level_histogram: Some(LevelHistogram::from(vec![1, 2])), - definition_level_histogram: Some(LevelHistogram::from(vec![1, 2, 3])), - }; - - assert_eq!(page_index.min().unwrap(), &-123); - assert_eq!(page_index.max().unwrap(), &234); - assert_eq!(page_index.min_bytes().unwrap(), (-123).as_bytes()); - assert_eq!(page_index.max_bytes().unwrap(), 234.as_bytes()); - assert_eq!(page_index.null_count().unwrap(), 0); - assert_eq!( - page_index.repetition_level_histogram().unwrap().values(), - &vec![1, 2] - ); - assert_eq!( - page_index.definition_level_histogram().unwrap().values(), - &vec![1, 2, 3] - ); - } - - #[test] - fn test_page_index_min_max_null_none() { - let page_index: PageIndex = PageIndex { - min: None, - max: None, - null_count: None, - repetition_level_histogram: None, - definition_level_histogram: None, - }; - - assert_eq!(page_index.min(), None); - assert_eq!(page_index.max(), None); - assert_eq!(page_index.min_bytes(), None); - assert_eq!(page_index.max_bytes(), None); - assert_eq!(page_index.null_count(), None); - assert_eq!(page_index.repetition_level_histogram(), None); - assert_eq!(page_index.definition_level_histogram(), None); - } - - #[test] - fn test_invalid_column_index() { - let column_index = ColumnIndex { - null_pages: vec![true, false], - min_values: vec![ - vec![], - vec![], // this shouldn't be empty as null_pages[1] is false - ], - max_values: vec![ - vec![], - vec![], // this shouldn't be empty as null_pages[1] is false - ], - null_counts: None, - repetition_level_histograms: None, - definition_level_histograms: None, - boundary_order: BoundaryOrder::UNORDERED, - }; - - let err = NativeIndex::::try_new(column_index).unwrap_err(); - assert_eq!( - err.to_string(), - "Parquet error: error converting value, expected 4 bytes got 0" - ); - } -} diff --git a/parquet/src/file/page_index/index_reader.rs b/parquet/src/file/page_index/index_reader.rs index d0537711dc20..fd10b9fe8b3c 100644 --- a/parquet/src/file/page_index/index_reader.rs +++ b/parquet/src/file/page_index/index_reader.rs @@ -15,17 +15,23 @@ // specific language governing permissions and limitations // under the License. -//! Support for reading [`Index`] and [`OffsetIndex`] from parquet metadata. +//! Support for reading [`ColumnIndexMetaData`] and [`OffsetIndexMetaData`] from parquet metadata. 
-use crate::basic::Type; +use crate::basic::{BoundaryOrder, Type}; use crate::data_type::Int96; -use crate::errors::ParquetError; +use crate::errors::{ParquetError, Result}; use crate::file::metadata::ColumnChunkMetaData; -use crate::file::page_index::index::{Index, NativeIndex}; +use crate::file::page_index::column_index::{ + ByteArrayColumnIndex, ColumnIndexMetaData, PrimitiveColumnIndex, +}; use crate::file::page_index::offset_index::OffsetIndexMetaData; use crate::file::reader::ChunkReader; -use crate::format::{ColumnIndex, OffsetIndex}; -use crate::thrift::{TCompactSliceInputProtocol, TSerializable}; +use crate::parquet_thrift::{ + read_thrift_vec, ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, + ThriftCompactOutputProtocol, ThriftSliceInputProtocol, WriteThrift, WriteThriftField, +}; +use crate::thrift_struct; +use std::io::Write; use std::ops::Range; /// Computes the covering range of two optional ranges @@ -38,7 +44,7 @@ pub(crate) fn acc_range(a: Option>, b: Option>) -> Option< } } -/// Reads per-column [`Index`] for all columns of a row group by +/// Reads per-column [`ColumnIndexMetaData`] for all columns of a row group by /// decoding [`ColumnIndex`] . /// /// Returns a vector of `index[column_number]`. @@ -48,6 +54,7 @@ pub(crate) fn acc_range(a: Option>, b: Option>) -> Option< /// See [Page Index Documentation] for more details. /// /// [Page Index Documentation]: https://github.com/apache/parquet-format/blob/master/PageIndex.md +/// [`ColumnIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md #[deprecated( since = "55.2.0", note = "Use ParquetMetaDataReader instead; will be removed in 58.0.0" @@ -55,7 +62,7 @@ pub(crate) fn acc_range(a: Option>, b: Option>) -> Option< pub fn read_columns_indexes( reader: &R, chunks: &[ColumnChunkMetaData], -) -> Result>, ParquetError> { +) -> Result>, ParquetError> { let fetch = chunks .iter() .fold(None, |range, c| acc_range(range, c.column_index_range())); @@ -76,7 +83,7 @@ pub fn read_columns_indexes( ..usize::try_from(r.end - fetch.start)?], c.column_type(), ), - None => Ok(Index::NONE), + None => Ok(ColumnIndexMetaData::NONE), }) .collect(), ) @@ -93,6 +100,7 @@ pub fn read_columns_indexes( /// See [Page Index Documentation] for more details. /// /// [Page Index Documentation]: https://github.com/apache/parquet-format/blob/master/PageIndex.md +/// [`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md #[deprecated( since = "55.2.0", note = "Use ParquetMetaDataReader instead; will be removed in 58.0.0" @@ -128,25 +136,64 @@ pub fn read_offset_indexes( } pub(crate) fn decode_offset_index(data: &[u8]) -> Result { - let mut prot = TCompactSliceInputProtocol::new(data); - let offset = OffsetIndex::read_from_in_protocol(&mut prot)?; - OffsetIndexMetaData::try_new(offset) + let mut prot = ThriftSliceInputProtocol::new(data); + + // Try to read fast-path first. If that fails, fall back to slower but more robust + // decoder. 
+ match OffsetIndexMetaData::try_from_fast(&mut prot) { + Ok(offset_index) => Ok(offset_index), + Err(_) => { + prot = ThriftSliceInputProtocol::new(data); + OffsetIndexMetaData::read_thrift(&mut prot) + } + } } -pub(crate) fn decode_column_index(data: &[u8], column_type: Type) -> Result { - let mut prot = TCompactSliceInputProtocol::new(data); +// private struct only used for decoding then discarded +thrift_struct!( +pub(super) struct ThriftColumnIndex<'a> { + 1: required list null_pages + 2: required list<'a> min_values + 3: required list<'a> max_values + 4: required BoundaryOrder boundary_order + 5: optional list null_counts + 6: optional list repetition_level_histograms; + 7: optional list definition_level_histograms; +} +); - let index = ColumnIndex::read_from_in_protocol(&mut prot)?; +pub(crate) fn decode_column_index( + data: &[u8], + column_type: Type, +) -> Result { + let mut prot = ThriftSliceInputProtocol::new(data); + let index = ThriftColumnIndex::read_thrift(&mut prot)?; let index = match column_type { - Type::BOOLEAN => Index::BOOLEAN(NativeIndex::::try_new(index)?), - Type::INT32 => Index::INT32(NativeIndex::::try_new(index)?), - Type::INT64 => Index::INT64(NativeIndex::::try_new(index)?), - Type::INT96 => Index::INT96(NativeIndex::::try_new(index)?), - Type::FLOAT => Index::FLOAT(NativeIndex::::try_new(index)?), - Type::DOUBLE => Index::DOUBLE(NativeIndex::::try_new(index)?), - Type::BYTE_ARRAY => Index::BYTE_ARRAY(NativeIndex::try_new(index)?), - Type::FIXED_LEN_BYTE_ARRAY => Index::FIXED_LEN_BYTE_ARRAY(NativeIndex::try_new(index)?), + Type::BOOLEAN => { + ColumnIndexMetaData::BOOLEAN(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::INT32 => { + ColumnIndexMetaData::INT32(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::INT64 => { + ColumnIndexMetaData::INT64(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::INT96 => { + ColumnIndexMetaData::INT96(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::FLOAT => { + ColumnIndexMetaData::FLOAT(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::DOUBLE => { + ColumnIndexMetaData::DOUBLE(PrimitiveColumnIndex::::try_from_thrift(index)?) + } + Type::BYTE_ARRAY => { + ColumnIndexMetaData::BYTE_ARRAY(ByteArrayColumnIndex::try_from_thrift(index)?) + } + Type::FIXED_LEN_BYTE_ARRAY => { + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(ByteArrayColumnIndex::try_from_thrift(index)?) + } }; Ok(index) diff --git a/parquet/src/file/page_index/mod.rs b/parquet/src/file/page_index/mod.rs index a8077896db34..71b8290d5d36 100644 --- a/parquet/src/file/page_index/mod.rs +++ b/parquet/src/file/page_index/mod.rs @@ -19,6 +19,6 @@ //! //! [Column Index]: https://github.com/apache/parquet-format/blob/master/PageIndex.md -pub mod index; +pub mod column_index; pub mod index_reader; pub mod offset_index; diff --git a/parquet/src/file/page_index/offset_index.rs b/parquet/src/file/page_index/offset_index.rs index d48d1b6c083d..d79da37824c8 100644 --- a/parquet/src/file/page_index/offset_index.rs +++ b/parquet/src/file/page_index/offset_index.rs @@ -16,30 +16,52 @@ // under the License. //! [`OffsetIndexMetaData`] structure holding decoded [`OffsetIndex`] information +//! +//! 
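`decode_offset_index` above follows an optimistic shape: attempt the specialized fast decoder and, on any error, restart from the beginning of the buffer with the general-purpose one. A generic sketch of that pattern (names illustrative, not the crate's API):

```rust
fn parse_with_fallback<T, E>(
    data: &[u8],
    fast: impl FnOnce(&[u8]) -> Result<T, E>,
    slow: impl FnOnce(&[u8]) -> Result<T, E>,
) -> Result<T, E> {
    // the fast path may fail part-way through, so the fallback must re-read
    // from the original, untouched input rather than resuming mid-stream
    fast(data).or_else(|_| slow(data))
}
```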
[`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md -use crate::errors::ParquetError; -use crate::format::{OffsetIndex, PageLocation}; +use std::io::Write; +use crate::parquet_thrift::{ + read_thrift_vec, ElementType, FieldType, ReadThrift, ThriftCompactInputProtocol, + ThriftCompactOutputProtocol, WriteThrift, WriteThriftField, +}; +use crate::{ + errors::{ParquetError, Result}, + thrift_struct, +}; + +thrift_struct!( +/// Page location information for [`OffsetIndexMetaData`] +pub struct PageLocation { + /// Offset of the page in the file + 1: required i64 offset + /// Size of the page, including header. Sum of compressed_page_size and header + 2: required i32 compressed_page_size + /// Index within the RowGroup of the first row of the page. When an + /// OffsetIndex is present, pages must begin on row boundaries + /// (repetition_level = 0). + 3: required i64 first_row_index +} +); + +thrift_struct!( /// [`OffsetIndex`] information for a column chunk. Contains offsets and sizes for each page /// in the chunk. Optionally stores fully decoded page sizes for BYTE_ARRAY columns. -#[derive(Debug, Clone, PartialEq)] +/// +/// See [`ParquetOffsetIndex`] for more information. +/// +/// [`ParquetOffsetIndex`]: crate::file::metadata::ParquetOffsetIndex +/// [`OffsetIndex`]: https://github.com/apache/parquet-format/blob/master/PageIndex.md pub struct OffsetIndexMetaData { - /// Vector of [`PageLocation`] objects, one per page in the chunk. - pub page_locations: Vec, - /// Optional vector of unencoded page sizes, one per page in the chunk. - /// Only defined for BYTE_ARRAY columns. - pub unencoded_byte_array_data_bytes: Option>, + /// Vector of [`PageLocation`] objects, one per page in the chunk. + 1: required list page_locations + /// Optional vector of unencoded page sizes, one per page in the chunk. + /// Only defined for BYTE_ARRAY columns. + 2: optional list unencoded_byte_array_data_bytes } +); impl OffsetIndexMetaData { - /// Creates a new [`OffsetIndexMetaData`] from an [`OffsetIndex`]. - pub(crate) fn try_new(index: OffsetIndex) -> Result { - Ok(Self { - page_locations: index.page_locations, - unencoded_byte_array_data_bytes: index.unencoded_byte_array_data_bytes, - }) - } - /// Vector of [`PageLocation`] objects, one per page in the chunk. pub fn page_locations(&self) -> &Vec { &self.page_locations @@ -51,12 +73,126 @@ impl OffsetIndexMetaData { self.unencoded_byte_array_data_bytes.as_ref() } - // TODO: remove annotation after merge - #[allow(dead_code)] - pub(crate) fn to_thrift(&self) -> OffsetIndex { - OffsetIndex::new( - self.page_locations.clone(), - self.unencoded_byte_array_data_bytes.clone(), - ) + // Fast-path read of offset index. This works because we expect all field deltas to be 1, + // and there's no nesting beyond PageLocation, so no need to save the last field id. Like + // read_page_locations(), this will fail if absolute field id's are used. + pub(super) fn try_from_fast<'a, R: ThriftCompactInputProtocol<'a>>( + prot: &mut R, + ) -> Result { + // Offset index is a struct with 2 fields. First field is an array of PageLocations, + // the second an optional array of i64. 
+ + // read field 1 header, then list header, then vec of PageLocations + let (field_type, delta) = prot.read_field_header()?; + if delta != 1 || field_type != FieldType::List as u8 { + return Err(general_err!("error reading OffsetIndex::page_locations")); + } + + // we have to do this manually because we want to use the fast PageLocation decoder + let list_ident = prot.read_list_begin()?; + let mut page_locations = Vec::with_capacity(list_ident.size as usize); + for _ in 0..list_ident.size { + page_locations.push(read_page_location(prot)?); + } + + let mut unencoded_byte_array_data_bytes: Option> = None; + + // read second field...if it's Stop we're done + let (mut field_type, delta) = prot.read_field_header()?; + if field_type == FieldType::List as u8 { + if delta != 1 { + return Err(general_err!( + "encountered unknown field while reading OffsetIndex" + )); + } + let vec = read_thrift_vec::(&mut *prot)?; + unencoded_byte_array_data_bytes = Some(vec); + + // this one should be Stop + (field_type, _) = prot.read_field_header()?; + } + + if field_type != FieldType::Stop as u8 { + return Err(general_err!( + "encountered unknown field while reading OffsetIndex" + )); + } + + Ok(Self { + page_locations, + unencoded_byte_array_data_bytes, + }) + } +} + +// hand coding this one because it is very time critical + +// Note: this will fail if the fields are either out of order, or if a suboptimal +// encoder doesn't use field deltas. +fn read_page_location<'a, R: ThriftCompactInputProtocol<'a>>(prot: &mut R) -> Result { + // there are 3 fields, all mandatory, so all field deltas should be 1 + let (field_type, delta) = prot.read_field_header()?; + if delta != 1 || field_type != FieldType::I64 as u8 { + return Err(general_err!("error reading PageLocation::offset")); + } + let offset = prot.read_i64()?; + + let (field_type, delta) = prot.read_field_header()?; + if delta != 1 || field_type != FieldType::I32 as u8 { + return Err(general_err!( + "error reading PageLocation::compressed_page_size" + )); + } + let compressed_page_size = prot.read_i32()?; + + let (field_type, delta) = prot.read_field_header()?; + if delta != 1 || field_type != FieldType::I64 as u8 { + return Err(general_err!("error reading PageLocation::first_row_index")); + } + let first_row_index = prot.read_i64()?; + + // read end of struct...return error if there are unknown fields present + let (field_type, _) = prot.read_field_header()?; + if field_type != FieldType::Stop as u8 { + return Err(general_err!("unexpected field in PageLocation")); + } + + Ok(PageLocation { + offset, + compressed_page_size, + first_row_index, + }) +} + +#[cfg(test)] +mod tests { + use super::*; + use crate::parquet_thrift::tests::test_roundtrip; + + #[test] + fn test_offset_idx_roundtrip() { + let page_locations = [ + PageLocation { + offset: 0, + compressed_page_size: 10, + first_row_index: 0, + }, + PageLocation { + offset: 10, + compressed_page_size: 20, + first_row_index: 100, + }, + ] + .to_vec(); + let unenc = [0i64, 100i64].to_vec(); + + test_roundtrip(OffsetIndexMetaData { + page_locations: page_locations.clone(), + unencoded_byte_array_data_bytes: Some(unenc), + }); + test_roundtrip(OffsetIndexMetaData { + page_locations, + unencoded_byte_array_data_bytes: None, + }); } } diff --git a/parquet/src/file/properties.rs b/parquet/src/file/properties.rs index 603db6660f45..a76db6465602 100644 --- a/parquet/src/file/properties.rs +++ b/parquet/src/file/properties.rs @@ -20,8 +20,7 @@ use crate::basic::{Compression, Encoding}; use 
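To make the checks in `read_page_location` concrete, here is the byte stream it expects for `PageLocation { offset: 4, compressed_page_size: 100, first_row_index: 0 }` under the compact protocol. The bytes are hand-assembled for illustration: each short-form field header packs the id delta in the high nibble and the compact type in the low nibble, and integer values are zigzag-encoded varints.

```rust
const EXAMPLE_PAGE_LOCATION: &[u8] = &[
    0x16, 0x08,       // field 1 header: delta 1, type I64 (6); zigzag(4) = 8
    0x15, 0xC8, 0x01, // field 2 header: delta 1, type I32 (5); zigzag(100) = 200 = varint [0xC8, 0x01]
    0x16, 0x00,       // field 3 header: delta 1, type I64 (6); zigzag(0) = 0
    0x00,             // Stop: end of struct
];
```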
crate::compression::{CodecOptions, CodecOptionsBuilder}; #[cfg(feature = "encryption")] use crate::encryption::encrypt::FileEncryptionProperties; -use crate::file::metadata::KeyValue; -use crate::format::SortingColumn; +use crate::file::metadata::{KeyValue, SortingColumn}; use crate::schema::types::ColumnPath; use std::str::FromStr; use std::{collections::HashMap, sync::Arc}; @@ -640,7 +639,7 @@ impl WriterPropertiesBuilder { /// * If `Some`, must be greater than 0, otherwise will panic /// * If `None`, there's no effective limit. /// - /// [`Index`]: crate::file::page_index::index::Index + /// [`Index`]: crate::file::page_index::column_index::ColumnIndexMetaData pub fn set_column_index_truncate_length(mut self, max_length: Option) -> Self { if let Some(value) = max_length { assert!(value > 0, "Cannot have a 0 column index truncate length. If you wish to disable min/max value truncation, set it to `None`."); @@ -1192,6 +1191,7 @@ impl ColumnProperties { pub type ReaderPropertiesPtr = Arc; const DEFAULT_READ_BLOOM_FILTER: bool = false; +const DEFAULT_READ_PAGE_STATS: bool = false; /// Configuration settings for reading parquet files. /// @@ -1214,6 +1214,7 @@ const DEFAULT_READ_BLOOM_FILTER: bool = false; pub struct ReaderProperties { codec_options: CodecOptions, read_bloom_filter: bool, + read_page_stats: bool, } impl ReaderProperties { @@ -1231,6 +1232,11 @@ impl ReaderProperties { pub(crate) fn read_bloom_filter(&self) -> bool { self.read_bloom_filter } + + /// Returns whether to read page level statistics + pub(crate) fn read_page_stats(&self) -> bool { + self.read_page_stats + } } /// Builder for parquet file reader configuration. See example on @@ -1238,6 +1244,7 @@ impl ReaderProperties { pub struct ReaderPropertiesBuilder { codec_options_builder: CodecOptionsBuilder, read_bloom_filter: Option, + read_page_stats: Option, } /// Reader properties builder. @@ -1247,6 +1254,7 @@ impl ReaderPropertiesBuilder { Self { codec_options_builder: CodecOptionsBuilder::default(), read_bloom_filter: None, + read_page_stats: None, } } @@ -1255,6 +1263,7 @@ impl ReaderPropertiesBuilder { ReaderProperties { codec_options: self.codec_options_builder.build(), read_bloom_filter: self.read_bloom_filter.unwrap_or(DEFAULT_READ_BLOOM_FILTER), + read_page_stats: self.read_page_stats.unwrap_or(DEFAULT_READ_PAGE_STATS), } } @@ -1283,6 +1292,20 @@ impl ReaderPropertiesBuilder { self.read_bloom_filter = Some(value); self } + + /// Enable/disable reading page-level statistics + /// + /// If set to `true`, then the reader will decode and populate the [`Statistics`] for + /// each page, if present. + /// If set to `false`, then the reader will skip decoding the statistics. + /// + /// By default statistics will not be decoded. + /// + /// [`Statistics`]: crate::file::statistics::Statistics + pub fn set_read_page_statistics(mut self, value: bool) -> Self { + self.read_page_stats = Some(value); + self + } } #[cfg(test)] diff --git a/parquet/src/file/serialized_reader.rs b/parquet/src/file/serialized_reader.rs index b36a76f472f5..c47c118e43bb 100644 --- a/parquet/src/file/serialized_reader.rs +++ b/parquet/src/file/serialized_reader.rs @@ -18,31 +18,30 @@ //! Contains implementations of the reader traits FileReader, RowGroupReader and PageReader //! 
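Since page-level statistics now default to off, callers that still want them must opt in when building reader properties. A usage sketch of the builder method added above (`props_with_page_stats` is an illustrative wrapper):

```rust
use parquet::file::properties::ReaderProperties;

fn props_with_page_stats() -> ReaderProperties {
    ReaderProperties::builder()
        .set_read_page_statistics(true) // default is false: page stats are skipped
        .build()
}
```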
Also contains implementations of the ChunkReader for files (with buffering) and byte arrays (RAM) -use crate::basic::{Encoding, Type}; +use crate::basic::{PageType, Type}; use crate::bloom_filter::Sbbf; use crate::column::page::{Page, PageMetadata, PageReader}; use crate::compression::{create_codec, Codec}; #[cfg(feature = "encryption")] use crate::encryption::decrypt::{read_and_decrypt, CryptoContext}; use crate::errors::{ParquetError, Result}; -use crate::file::page_index::offset_index::OffsetIndexMetaData; +use crate::file::metadata::thrift_gen::PageHeader; +use crate::file::page_index::offset_index::{OffsetIndexMetaData, PageLocation}; +use crate::file::statistics; use crate::file::{ metadata::*, properties::{ReaderProperties, ReaderPropertiesPtr}, reader::*, - statistics, }; -use crate::format::{PageHeader, PageLocation, PageType}; +#[cfg(feature = "encryption")] +use crate::parquet_thrift::ThriftSliceInputProtocol; +use crate::parquet_thrift::{ReadThrift, ThriftReadInputProtocol}; use crate::record::reader::RowIter; use crate::record::Row; use crate::schema::types::Type as SchemaType; -#[cfg(feature = "encryption")] -use crate::thrift::TCompactSliceInputProtocol; -use crate::thrift::TSerializable; use bytes::Bytes; use std::collections::VecDeque; use std::{fs::File, io::Read, path::Path, sync::Arc}; -use thrift::protocol::TCompactInputProtocol; impl TryFrom for SerializedFileReader { type Error = ParquetError; @@ -414,7 +413,7 @@ pub(crate) fn decode_page( _ => buffer, }; - let result = match page_header.type_ { + let result = match page_header.r#type { PageType::DICTIONARY_PAGE => { let dict_header = page_header.dictionary_page_header.as_ref().ok_or_else(|| { ParquetError::General("Missing dictionary page header".to_string()) @@ -423,7 +422,7 @@ pub(crate) fn decode_page( Page::DictionaryPage { buf: buffer, num_values: dict_header.num_values.try_into()?, - encoding: Encoding::try_from(dict_header.encoding)?, + encoding: dict_header.encoding, is_sorted, } } @@ -434,10 +433,10 @@ pub(crate) fn decode_page( Page::DataPage { buf: buffer, num_values: header.num_values.try_into()?, - encoding: Encoding::try_from(header.encoding)?, - def_level_encoding: Encoding::try_from(header.definition_level_encoding)?, - rep_level_encoding: Encoding::try_from(header.repetition_level_encoding)?, - statistics: statistics::from_thrift(physical_type, header.statistics)?, + encoding: header.encoding, + def_level_encoding: header.definition_level_encoding, + rep_level_encoding: header.repetition_level_encoding, + statistics: statistics::from_thrift_page_stats(physical_type, header.statistics)?, } } PageType::DATA_PAGE_V2 => { @@ -448,18 +447,18 @@ pub(crate) fn decode_page( Page::DataPageV2 { buf: buffer, num_values: header.num_values.try_into()?, - encoding: Encoding::try_from(header.encoding)?, + encoding: header.encoding, num_nulls: header.num_nulls.try_into()?, num_rows: header.num_rows.try_into()?, def_levels_byte_len: header.definition_levels_byte_length.try_into()?, rep_levels_byte_len: header.repetition_levels_byte_length.try_into()?, is_compressed, - statistics: statistics::from_thrift(physical_type, header.statistics)?, + statistics: statistics::from_thrift_page_stats(physical_type, header.statistics)?, } } _ => { // For unknown page type (e.g., INDEX_PAGE), skip and read next. 
- unimplemented!("Page type {:?} is not supported", page_header.type_) + unimplemented!("Page type {:?} is not supported", page_header.r#type) } }; @@ -499,6 +498,8 @@ enum SerializedPageReaderState { #[derive(Default)] struct SerializedPageReaderContext { + /// Controls decoding of page-level statistics + read_stats: bool, /// Crypto context carrying objects required for decryption #[cfg(feature = "encryption")] crypto_context: Option>, @@ -610,12 +611,16 @@ impl SerializedPageReader { require_dictionary: meta.dictionary_page_offset().is_some(), }, }; + let mut context = SerializedPageReaderContext::default(); + if props.read_page_stats() { + context.read_stats = true; + } Ok(Self { reader, decompressor, state, physical_type: meta.column_type(), - context: Default::default(), + context, }) } @@ -732,8 +737,12 @@ impl SerializedPageReaderContext { _page_index: usize, _dictionary_page: bool, ) -> Result { - let mut prot = TCompactInputProtocol::new(input); - Ok(PageHeader::read_from_in_protocol(&mut prot)?) + let mut prot = ThriftReadInputProtocol::new(input); + if self.read_stats { + Ok(PageHeader::read_thrift(&mut prot)?) + } else { + Ok(PageHeader::read_thrift_without_stats(&mut prot)?) + } } fn decrypt_page_data( @@ -756,8 +765,14 @@ impl SerializedPageReaderContext { ) -> Result { match self.page_crypto_context(page_index, dictionary_page) { None => { - let mut prot = TCompactInputProtocol::new(input); - Ok(PageHeader::read_from_in_protocol(&mut prot)?) + let mut prot = ThriftReadInputProtocol::new(input); + if self.read_stats { + Ok(PageHeader::read_thrift(&mut prot)?) + } else { + use crate::file::metadata::thrift_gen::PageHeader; + + Ok(PageHeader::read_thrift_without_stats(&mut prot)?) + } } Some(page_crypto_context) => { let data_decryptor = page_crypto_context.data_decryptor(); @@ -770,8 +785,12 @@ impl SerializedPageReaderContext { )) })?; - let mut prot = TCompactSliceInputProtocol::new(buf.as_slice()); - Ok(PageHeader::read_from_in_protocol(&mut prot)?) + let mut prot = ThriftSliceInputProtocol::new(buf.as_slice()); + if self.read_stats { + Ok(PageHeader::read_thrift(&mut prot)?) + } else { + Ok(PageHeader::read_thrift_without_stats(&mut prot)?) 
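Putting the previous hunks together, the property flows into page decoding through the reader constructor. A hedged sketch (the chunk reader and column metadata setup are elided, and the argument list assumes the crate's existing `new_with_properties` signature):

```rust
use std::sync::Arc;
use parquet::errors::Result;
use parquet::file::metadata::ColumnChunkMetaData;
use parquet::file::properties::ReaderProperties;
use parquet::file::reader::ChunkReader;
use parquet::file::serialized_reader::SerializedPageReader;

fn page_reader_with_stats<R: ChunkReader>(
    chunk_reader: Arc<R>,
    column_meta: &ColumnChunkMetaData,
    total_rows: usize,
) -> Result<SerializedPageReader<R>> {
    let props = Arc::new(
        ReaderProperties::builder()
            .set_read_page_statistics(true)
            .build(),
    );
    // with read_page_stats enabled, decoded pages carry their Statistics again
    SerializedPageReader::new_with_properties(chunk_reader, column_meta, total_rows, None, props)
}
```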
+ } } } } @@ -875,7 +894,7 @@ impl PageReader for SerializedPageReader { *offset += data_len as u64; *remaining -= data_len as u64; - if header.type_ == PageType::INDEX_PAGE { + if header.r#type == PageType::INDEX_PAGE { continue; } @@ -1102,14 +1121,15 @@ mod tests { use bytes::Buf; + use crate::file::page_index::column_index::{ + ByteArrayColumnIndex, ColumnIndexMetaData, PrimitiveColumnIndex, + }; use crate::file::properties::{EnabledStatistics, WriterProperties}; - use crate::format::BoundaryOrder; - use crate::basic::{self, ColumnOrder, SortOrder}; + use crate::basic::{self, BoundaryOrder, ColumnOrder, Encoding, SortOrder}; use crate::column::reader::ColumnReader; use crate::data_type::private::ParquetValueType; use crate::data_type::{AsBytes, FixedLenByteArrayType, Int32Type}; - use crate::file::page_index::index::{Index, NativeIndex}; #[allow(deprecated)] use crate::file::page_index::index_reader::{read_columns_indexes, read_offset_indexes}; use crate::file::writer::SerializedFileWriter; @@ -1395,7 +1415,7 @@ mod tests { assert_eq!(def_levels_byte_len, 2); assert_eq!(rep_levels_byte_len, 0); assert!(is_compressed); - assert!(statistics.is_some()); + assert!(statistics.is_none()); // page stats are no longer read true } _ => false, @@ -1497,7 +1517,7 @@ mod tests { assert_eq!(def_levels_byte_len, 2); assert_eq!(rep_levels_byte_len, 0); assert!(is_compressed); - assert!(statistics.is_some()); + assert!(statistics.is_none()); // page stats are no longer read true } _ => false, @@ -1874,9 +1894,15 @@ mod tests { 80, 65, 82, 49, ]; let ret = SerializedFileReader::new(Bytes::copy_from_slice(&data)); + #[cfg(feature = "encryption")] + assert_eq!( + ret.err().unwrap().to_string(), + "Parquet error: Could not parse metadata: Parquet error: Received empty union from remote ColumnOrder" + ); + #[cfg(not(feature = "encryption"))] assert_eq!( ret.err().unwrap().to_string(), - "Parquet error: Could not parse metadata: bad data" + "Parquet error: Received empty union from remote ColumnOrder" ); } @@ -1913,21 +1939,19 @@ mod tests { // only one row group assert_eq!(column_index.len(), 1); - let index = if let Index::BYTE_ARRAY(index) = &column_index[0][0] { + let index = if let ColumnIndexMetaData::BYTE_ARRAY(index) = &column_index[0][0] { index } else { unreachable!() }; assert_eq!(index.boundary_order, BoundaryOrder::ASCENDING); - let index_in_pages = &index.indexes; //only one page group - assert_eq!(index_in_pages.len(), 1); + assert_eq!(index.num_pages(), 1); - let page0 = &index_in_pages[0]; - let min = page0.min.as_ref().unwrap(); - let max = page0.max.as_ref().unwrap(); + let min = index.min_value(0).unwrap(); + let max = index.max_value(0).unwrap(); assert_eq!(b"Hello", min.as_bytes()); assert_eq!(b"today", max.as_bytes()); @@ -1992,7 +2016,7 @@ mod tests { let boundary_order = &column_index[0][0].get_boundary_order(); assert!(boundary_order.is_some()); matches!(boundary_order.unwrap(), BoundaryOrder::UNORDERED); - if let Index::INT32(index) = &column_index[0][0] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][0] { check_native_page_index( index, 325, @@ -2005,15 +2029,15 @@ mod tests { }; //col1->bool_col:BOOLEAN UNCOMPRESSED DO:0 FPO:37329 SZ:3022/3022/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: false, max: true, num_nulls: 0] assert!(&column_index[0][1].is_sorted()); - if let Index::BOOLEAN(index) = &column_index[0][1] { - assert_eq!(index.indexes.len(), 82); + if let ColumnIndexMetaData::BOOLEAN(index) = &column_index[0][1] { + assert_eq!(index.num_pages(), 82); 
assert_eq!(row_group_offset_indexes[1].page_locations.len(), 82); } else { unreachable!() }; //col2->tinyint_col: INT32 UNCOMPRESSED DO:0 FPO:40351 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 9, num_nulls: 0] assert!(&column_index[0][2].is_sorted()); - if let Index::INT32(index) = &column_index[0][2] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][2] { check_native_page_index( index, 325, @@ -2026,7 +2050,7 @@ mod tests { }; //col4->smallint_col: INT32 UNCOMPRESSED DO:0 FPO:77676 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 9, num_nulls: 0] assert!(&column_index[0][3].is_sorted()); - if let Index::INT32(index) = &column_index[0][3] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][3] { check_native_page_index( index, 325, @@ -2039,7 +2063,7 @@ mod tests { }; //col5->smallint_col: INT32 UNCOMPRESSED DO:0 FPO:77676 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 9, num_nulls: 0] assert!(&column_index[0][4].is_sorted()); - if let Index::INT32(index) = &column_index[0][4] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][4] { check_native_page_index( index, 325, @@ -2052,7 +2076,7 @@ mod tests { }; //col6->bigint_col: INT64 UNCOMPRESSED DO:0 FPO:152326 SZ:71598/71598/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 90, num_nulls: 0] assert!(!&column_index[0][5].is_sorted()); - if let Index::INT64(index) = &column_index[0][5] { + if let ColumnIndexMetaData::INT64(index) = &column_index[0][5] { check_native_page_index( index, 528, @@ -2065,7 +2089,7 @@ mod tests { }; //col7->float_col: FLOAT UNCOMPRESSED DO:0 FPO:223924 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: -0.0, max: 9.9, num_nulls: 0] assert!(&column_index[0][6].is_sorted()); - if let Index::FLOAT(index) = &column_index[0][6] { + if let ColumnIndexMetaData::FLOAT(index) = &column_index[0][6] { check_native_page_index( index, 325, @@ -2078,7 +2102,7 @@ mod tests { }; //col8->double_col: DOUBLE UNCOMPRESSED DO:0 FPO:261249 SZ:71598/71598/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: -0.0, max: 90.89999999999999, num_nulls: 0] assert!(!&column_index[0][7].is_sorted()); - if let Index::DOUBLE(index) = &column_index[0][7] { + if let ColumnIndexMetaData::DOUBLE(index) = &column_index[0][7] { check_native_page_index( index, 528, @@ -2091,8 +2115,8 @@ mod tests { }; //col9->date_string_col: BINARY UNCOMPRESSED DO:0 FPO:332847 SZ:111948/111948/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 01/01/09, max: 12/31/10, num_nulls: 0] assert!(!&column_index[0][8].is_sorted()); - if let Index::BYTE_ARRAY(index) = &column_index[0][8] { - check_native_page_index( + if let ColumnIndexMetaData::BYTE_ARRAY(index) = &column_index[0][8] { + check_byte_array_page_index( index, 974, get_row_group_min_max_bytes(row_group_metadata, 8), @@ -2104,8 +2128,8 @@ mod tests { }; //col10->string_col: BINARY UNCOMPRESSED DO:0 FPO:444795 SZ:45298/45298/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 9, num_nulls: 0] assert!(&column_index[0][9].is_sorted()); - if let Index::BYTE_ARRAY(index) = &column_index[0][9] { - check_native_page_index( + if let ColumnIndexMetaData::BYTE_ARRAY(index) = &column_index[0][9] { + check_byte_array_page_index( index, 352, get_row_group_min_max_bytes(row_group_metadata, 9), @@ -2118,14 +2142,14 @@ mod tests { //col11->timestamp_col: INT96 UNCOMPRESSED DO:0 FPO:490093 SZ:111948/111948/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[num_nulls: 0, min/max not defined] //Notice: min_max values for each page 
for this col not exits. assert!(!&column_index[0][10].is_sorted()); - if let Index::NONE = &column_index[0][10] { + if let ColumnIndexMetaData::NONE = &column_index[0][10] { assert_eq!(row_group_offset_indexes[10].page_locations.len(), 974); } else { unreachable!() }; //col12->year: INT32 UNCOMPRESSED DO:0 FPO:602041 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 2009, max: 2010, num_nulls: 0] assert!(&column_index[0][11].is_sorted()); - if let Index::INT32(index) = &column_index[0][11] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][11] { check_native_page_index( index, 325, @@ -2138,7 +2162,7 @@ mod tests { }; //col13->month: INT32 UNCOMPRESSED DO:0 FPO:639366 SZ:37325/37325/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 1, max: 12, num_nulls: 0] assert!(!&column_index[0][12].is_sorted()); - if let Index::INT32(index) = &column_index[0][12] { + if let ColumnIndexMetaData::INT32(index) = &column_index[0][12] { check_native_page_index( index, 325, @@ -2152,17 +2176,31 @@ mod tests { } fn check_native_page_index( - row_group_index: &NativeIndex, + row_group_index: &PrimitiveColumnIndex, + page_size: usize, + min_max: (&[u8], &[u8]), + boundary_order: BoundaryOrder, + ) { + assert_eq!(row_group_index.num_pages() as usize, page_size); + assert_eq!(row_group_index.boundary_order, boundary_order); + assert!(row_group_index.min_values().iter().all(|x| { + x >= &T::try_from_le_slice(min_max.0).unwrap() + && x <= &T::try_from_le_slice(min_max.1).unwrap() + })); + } + + fn check_byte_array_page_index( + row_group_index: &ByteArrayColumnIndex, page_size: usize, min_max: (&[u8], &[u8]), boundary_order: BoundaryOrder, ) { - assert_eq!(row_group_index.indexes.len(), page_size); + assert_eq!(row_group_index.num_pages() as usize, page_size); assert_eq!(row_group_index.boundary_order, boundary_order); - row_group_index.indexes.iter().all(|x| { - x.min.as_ref().unwrap() >= &T::try_from_le_slice(min_max.0).unwrap() - && x.max.as_ref().unwrap() <= &T::try_from_le_slice(min_max.1).unwrap() - }); + for i in 0..row_group_index.num_pages() as usize { + let x = row_group_index.min_value(i).unwrap(); + assert!(x >= min_max.0 && x <= min_max.1); + } } fn get_row_group_min_max_bytes(r: &RowGroupMetaData, col_num: usize) -> (&[u8], &[u8]) { @@ -2403,12 +2441,11 @@ mod tests { assert_eq!(c.len(), 1); match &c[0] { - Index::FIXED_LEN_BYTE_ARRAY(v) => { - assert_eq!(v.indexes.len(), 1); - let page_idx = &v.indexes[0]; - assert_eq!(page_idx.null_count.unwrap(), 1); - assert_eq!(page_idx.min.as_ref().unwrap().as_ref(), &[0; 11]); - assert_eq!(page_idx.max.as_ref().unwrap().as_ref(), &[5; 11]); + ColumnIndexMetaData::FIXED_LEN_BYTE_ARRAY(v) => { + assert_eq!(v.num_pages(), 1); + assert_eq!(v.null_count(0).unwrap(), 1); + assert_eq!(v.min_value(0).unwrap(), &[0; 11]); + assert_eq!(v.max_value(0).unwrap(), &[5; 11]); } _ => unreachable!(), } @@ -2509,8 +2546,8 @@ mod tests { } let file_metadata = file_writer.close().unwrap(); - assert_eq!(file_metadata.num_rows, 25); - assert_eq!(file_metadata.row_groups.len(), 5); + assert_eq!(file_metadata.file_metadata().num_rows(), 25); + assert_eq!(file_metadata.num_row_groups(), 5); // read only the 3rd row group let read_options = ReadOptionsBuilder::new() @@ -2539,11 +2576,11 @@ mod tests { // test that we got the index matching the row group match pg_idx { - Index::INT32(int_idx) => { + ColumnIndexMetaData::INT32(int_idx) => { let min = col_stats.min_bytes_opt().unwrap().get_i32_le(); let max = col_stats.max_bytes_opt().unwrap().get_i32_le(); - 
assert_eq!(int_idx.indexes[0].min(), Some(min).as_ref()); - assert_eq!(int_idx.indexes[0].max(), Some(max).as_ref()); + assert_eq!(int_idx.min_value(0), Some(min).as_ref()); + assert_eq!(int_idx.max_value(0), Some(max).as_ref()); } _ => panic!("wrong stats type"), } @@ -2584,11 +2621,11 @@ mod tests { // test that we got the index matching the row group match pg_idx { - Index::INT32(int_idx) => { + ColumnIndexMetaData::INT32(int_idx) => { let min = col_stats.min_bytes_opt().unwrap().get_i32_le(); let max = col_stats.max_bytes_opt().unwrap().get_i32_le(); - assert_eq!(int_idx.indexes[0].min(), Some(min).as_ref()); - assert_eq!(int_idx.indexes[0].max(), Some(max).as_ref()); + assert_eq!(int_idx.min_value(0), Some(min).as_ref()); + assert_eq!(int_idx.max_value(0), Some(max).as_ref()); } _ => panic!("wrong stats type"), } diff --git a/parquet/src/file/statistics.rs b/parquet/src/file/statistics.rs index 02729a5016bb..0c54940fac3b 100644 --- a/parquet/src/file/statistics.rs +++ b/parquet/src/file/statistics.rs @@ -41,12 +41,11 @@ use std::fmt; -use crate::format::Statistics as TStatistics; - use crate::basic::Type; use crate::data_type::private::ParquetValueType; use crate::data_type::*; use crate::errors::{ParquetError, Result}; +use crate::file::metadata::thrift_gen::PageStatistics; use crate::util::bit_util::FromBytes; pub(crate) mod private { @@ -120,9 +119,9 @@ macro_rules! statistics_enum_func { } /// Converts Thrift definition into `Statistics`. -pub fn from_thrift( +pub(crate) fn from_thrift_page_stats( physical_type: Type, - thrift_stats: Option, + thrift_stats: Option, ) -> Result> { Ok(match thrift_stats { Some(stats) => { @@ -269,7 +268,7 @@ pub fn from_thrift( } /// Convert Statistics into Thrift definition. -pub fn to_thrift(stats: Option<&Statistics>) -> Option { +pub(crate) fn page_stats_to_thrift(stats: Option<&Statistics>) -> Option { let stats = stats?; // record null count if it can fit in i64 @@ -282,7 +281,7 @@ pub fn to_thrift(stats: Option<&Statistics>) -> Option { .distinct_count_opt() .and_then(|value| i64::try_from(value).ok()); - let mut thrift_stats = TStatistics { + let mut thrift_stats = PageStatistics { max: None, min: None, null_count, @@ -319,15 +318,14 @@ pub fn to_thrift(stats: Option<&Statistics>) -> Option { /// Strongly typed statistics for a column chunk within a row group. /// -/// This structure is a natively typed, in memory representation of the -/// [`Statistics`] structure in a parquet file footer. The statistics stored in +/// This structure is a natively typed, in memory representation of the thrift +/// `Statistics` structure in a Parquet file footer. The statistics stored in /// this structure can be used by query engines to skip decoding pages while /// reading parquet data. /// -/// Page level statistics are stored separately, in [NativeIndex]. +/// Page level statistics are stored separately, in [ColumnIndexMetaData]. 
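The renamed pair `page_stats_to_thrift` / `from_thrift_page_stats` (both now crate-private) still round-trips `Statistics`, which is how the tests below exercise it. A sketch mirroring that check, with the constructor arguments assumed to follow the existing `Statistics::boolean(min, max, distinct, nulls, is_deprecated)` usage in this file:

```rust
fn roundtrip_bool_stats() {
    let stats = Statistics::boolean(Some(false), Some(true), None, Some(7), false);
    let thrift = page_stats_to_thrift(Some(&stats));
    assert_eq!(
        from_thrift_page_stats(Type::BOOLEAN, thrift).unwrap(),
        Some(stats)
    );
}
```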
/// -/// [`Statistics`]: crate::format::Statistics -/// [NativeIndex]: crate::file::page_index::index::NativeIndex +/// [ColumnIndexMetaData]: crate::file::page_index::column_index::ColumnIndexMetaData #[derive(Debug, Clone, PartialEq)] pub enum Statistics { /// Statistics for Boolean column @@ -702,7 +700,7 @@ mod tests { #[test] #[should_panic(expected = "General(\"Statistics null count is negative -10\")")] fn test_statistics_negative_null_count() { - let thrift_stats = TStatistics { + let thrift_stats = PageStatistics { max: None, min: None, null_count: Some(-10), @@ -713,13 +711,16 @@ mod tests { is_min_value_exact: None, }; - from_thrift(Type::INT32, Some(thrift_stats)).unwrap(); + from_thrift_page_stats(Type::INT32, Some(thrift_stats)).unwrap(); } #[test] fn test_statistics_thrift_none() { - assert_eq!(from_thrift(Type::INT32, None).unwrap(), None); - assert_eq!(from_thrift(Type::BYTE_ARRAY, None).unwrap(), None); + assert_eq!(from_thrift_page_stats(Type::INT32, None).unwrap(), None); + assert_eq!( + from_thrift_page_stats(Type::BYTE_ARRAY, None).unwrap(), + None + ); } #[test] @@ -864,8 +865,11 @@ mod tests { // Helper method to check statistics conversion. fn check_stats(stats: Statistics) { let tpe = stats.physical_type(); - let thrift_stats = to_thrift(Some(&stats)); - assert_eq!(from_thrift(tpe, thrift_stats).unwrap(), Some(stats)); + let thrift_stats = page_stats_to_thrift(Some(&stats)); + assert_eq!( + from_thrift_page_stats(tpe, thrift_stats).unwrap(), + Some(stats) + ); } check_stats(Statistics::boolean( @@ -1001,7 +1005,7 @@ mod tests { fn test_count_encoding_distinct_too_large() { // statistics are stored using i64, so test trying to store larger values let statistics = make_bool_stats(Some(u64::MAX), Some(100)); - let thrift_stats = to_thrift(Some(&statistics)).unwrap(); + let thrift_stats = page_stats_to_thrift(Some(&statistics)).unwrap(); assert_eq!(thrift_stats.distinct_count, None); // can't store u64 max --> null assert_eq!(thrift_stats.null_count, Some(100)); } @@ -1010,18 +1014,24 @@ mod tests { fn test_count_encoding_null_too_large() { // statistics are stored using i64, so test trying to store larger values let statistics = make_bool_stats(Some(100), Some(u64::MAX)); - let thrift_stats = to_thrift(Some(&statistics)).unwrap(); + let thrift_stats = page_stats_to_thrift(Some(&statistics)).unwrap(); assert_eq!(thrift_stats.distinct_count, Some(100)); assert_eq!(thrift_stats.null_count, None); // can' store u64 max --> null } #[test] fn test_count_decoding_null_invalid() { - let tstatistics = TStatistics { + let tstatistics = PageStatistics { null_count: Some(-42), - ..Default::default() + max: None, + min: None, + distinct_count: None, + max_value: None, + min_value: None, + is_max_value_exact: None, + is_min_value_exact: None, }; - let err = from_thrift(Type::BOOLEAN, Some(tstatistics)).unwrap_err(); + let err = from_thrift_page_stats(Type::BOOLEAN, Some(tstatistics)).unwrap_err(); assert_eq!( err.to_string(), "Parquet error: Statistics null count is negative -42" @@ -1034,14 +1044,14 @@ mod tests { fn statistics_count_test(distinct_count: Option, null_count: Option) { let statistics = make_bool_stats(distinct_count, null_count); - let thrift_stats = to_thrift(Some(&statistics)).unwrap(); + let thrift_stats = page_stats_to_thrift(Some(&statistics)).unwrap(); assert_eq!(thrift_stats.null_count.map(|c| c as u64), null_count); assert_eq!( thrift_stats.distinct_count.map(|c| c as u64), distinct_count ); - let round_tripped = from_thrift(Type::BOOLEAN, 
Some(thrift_stats)) + let round_tripped = from_thrift_page_stats(Type::BOOLEAN, Some(thrift_stats)) .unwrap() .unwrap(); // TODO: remove branch when we no longer support assuming null_count==None in the thrift diff --git a/parquet/src/file/writer.rs b/parquet/src/file/writer.rs index fa72b060ea84..a6c13cfa2cb0 100644 --- a/parquet/src/file/writer.rs +++ b/parquet/src/file/writer.rs @@ -18,13 +18,13 @@ //! [`SerializedFileWriter`]: Low level Parquet writer API use crate::bloom_filter::Sbbf; -use crate::format as parquet; -use crate::format::{ColumnIndex, OffsetIndex}; -use crate::thrift::TSerializable; +use crate::file::metadata::thrift_gen::PageHeader; +use crate::file::page_index::column_index::ColumnIndexMetaData; +use crate::file::page_index::offset_index::OffsetIndexMetaData; +use crate::parquet_thrift::{ThriftCompactOutputProtocol, WriteThrift}; use std::fmt::Debug; use std::io::{BufWriter, IoSlice, Read}; use std::{io::Write, sync::Arc}; -use thrift::protocol::TCompactOutputProtocol; use crate::column::page_encryption::PageEncryptor; use crate::column::writer::{get_typed_column_writer_mut, ColumnCloseResult, ColumnWriterImpl}; @@ -127,8 +127,8 @@ pub type OnCloseRowGroup<'a, W> = Box< &'a mut TrackedWrite<W>, RowGroupMetaData, Vec<Option<Sbbf>>, - Vec<Option<ColumnIndex>>, - Vec<Option<OffsetIndex>>, + Vec<Option<ColumnIndexMetaData>>, + Vec<Option<OffsetIndexMetaData>>, ) -> Result<()> + 'a + Send, @@ -155,13 +155,12 @@ /// - After all row groups have been written, close the file writer using `close` method. pub struct SerializedFileWriter<W: Write> { buf: TrackedWrite<W>, - schema: TypePtr, descr: SchemaDescPtr, props: WriterPropertiesPtr, row_groups: Vec<RowGroupMetaData>, bloom_filters: Vec<Vec<Option<Sbbf>>>, - column_indexes: Vec<Vec<Option<ColumnIndex>>>, - offset_indexes: Vec<Vec<Option<OffsetIndex>>>, + column_indexes: Vec<Vec<Option<ColumnIndexMetaData>>>, + offset_indexes: Vec<Vec<Option<OffsetIndexMetaData>>>, row_group_index: usize, // kv_metadatas will be appended to `props` when `write_metadata` kv_metadatas: Vec<KeyValue>, @@ -195,7 +194,6 @@ impl<W: Write + Send> SerializedFileWriter<W> { Self::start_file(&properties, &mut buf)?; Ok(Self { buf, - schema, descr: Arc::new(schema_descriptor), props: properties, row_groups: vec![], @@ -298,7 +296,7 @@ impl<W: Write + Send> SerializedFileWriter<W> { /// Unlike [`Self::close`] this does not consume self /// /// Attempting to write after calling finish will result in an error - pub fn finish(&mut self) -> Result<parquet::FileMetaData> { + pub fn finish(&mut self) -> Result<ParquetMetaData> { self.assert_previous_writer_closed()?; let metadata = self.write_metadata()?; self.buf.flush()?; @@ -306,7 +304,7 @@ } /// Closes and finalises file writer, returning the file metadata. - pub fn close(mut self) -> Result<parquet::FileMetaData> { + pub fn close(mut self) -> Result<ParquetMetaData> { self.finish() } @@ -326,8 +324,9 @@ Ok(()) } - /// Assembles and writes metadata at the end of the file. - fn write_metadata(&mut self) -> Result<parquet::FileMetaData> { + /// Assembles and writes metadata at the end of the file. This will take ownership + /// of `row_groups` and the page index structures.
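+    /// Not called directly; it runs as part of [`Self::finish`] and [`Self::close`].
+    /// A minimal usage sketch (schema and property setup elided):
+    ///
+    /// ```ignore
+    /// let mut writer = SerializedFileWriter::new(file, schema, props)?;
+    /// let mut row_group = writer.next_row_group()?;
+    /// // ...write column data...
+    /// row_group.close()?;
+    /// let metadata = writer.close()?; // assembles and writes the footer
+    /// ```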
+ fn write_metadata(&mut self) -> Result { self.finished = true; // write out any remaining bloom filters after all row groups @@ -341,15 +340,13 @@ impl SerializedFileWriter { None => Some(self.kv_metadatas.clone()), }; - let row_groups = self - .row_groups - .iter() - .map(|v| v.to_thrift()) - .collect::>(); + // take ownership of metadata + let row_groups = std::mem::take(&mut self.row_groups); + let column_indexes = std::mem::take(&mut self.column_indexes); + let offset_indexes = std::mem::take(&mut self.offset_indexes); let mut encoder = ThriftMetadataWriter::new( &mut self.buf, - &self.schema, &self.descr, row_groups, Some(self.props.created_by().to_string()), @@ -364,8 +361,11 @@ impl SerializedFileWriter { if let Some(key_value_metadata) = key_value_metadata { encoder = encoder.with_key_value_metadata(key_value_metadata) } - encoder = encoder.with_column_indexes(&self.column_indexes); - encoder = encoder.with_offset_indexes(&self.offset_indexes); + + encoder = encoder.with_column_indexes(column_indexes); + if !self.props.offset_index_disabled() { + encoder = encoder.with_offset_indexes(offset_indexes); + } encoder.finish() } @@ -508,8 +508,8 @@ pub struct SerializedRowGroupWriter<'a, W: Write> { row_group_metadata: Option, column_chunks: Vec, bloom_filters: Vec>, - column_indexes: Vec>, - offset_indexes: Vec>, + column_indexes: Vec>, + offset_indexes: Vec>, row_group_index: i16, file_offset: i64, on_close: Option>, @@ -918,15 +918,15 @@ impl<'a, W: Write> SerializedPageWriter<'a, W> { /// Serializes page header into Thrift. /// Returns number of bytes that have been written into the sink. #[inline] - fn serialize_page_header(&mut self, header: parquet::PageHeader) -> Result { + fn serialize_page_header(&mut self, header: PageHeader) -> Result { let start_pos = self.sink.bytes_written(); match self.page_encryptor_and_sink_mut() { Some((page_encryptor, sink)) => { page_encryptor.encrypt_page_header(&header, sink)?; } None => { - let mut protocol = TCompactOutputProtocol::new(&mut self.sink); - header.write_to_out_protocol(&mut protocol)?; + let mut protocol = ThriftCompactOutputProtocol::new(&mut self.sink); + header.write_thrift(&mut protocol)?; } } Ok(self.sink.bytes_written() - start_pos) @@ -1041,15 +1041,15 @@ mod tests { use crate::column::reader::get_typed_column_reader; use crate::compression::{create_codec, Codec, CodecOptionsBuilder}; use crate::data_type::{BoolType, ByteArrayType, Int32Type}; - use crate::file::page_index::index::Index; + use crate::file::page_index::column_index::ColumnIndexMetaData; use crate::file::properties::EnabledStatistics; use crate::file::serialized_reader::ReadOptionsBuilder; + use crate::file::statistics::{from_thrift_page_stats, page_stats_to_thrift}; use crate::file::{ properties::{ReaderProperties, WriterProperties, WriterVersion}, reader::{FileReader, SerializedFileReader, SerializedPageReader}, - statistics::{from_thrift, to_thrift, Statistics}, + statistics::Statistics, }; - use crate::format::SortingColumn; use crate::record::{Row, RowAccessor}; use crate::schema::parser::parse_message_type; use crate::schema::types; @@ -1499,8 +1499,11 @@ mod tests { encoding, def_level_encoding, rep_level_encoding, - statistics: from_thrift(physical_type, to_thrift(statistics.as_ref())) - .unwrap(), + statistics: from_thrift_page_stats( + physical_type, + page_stats_to_thrift(statistics.as_ref()), + ) + .unwrap(), } } Page::DataPageV2 { @@ -1529,8 +1532,11 @@ mod tests { def_levels_byte_len, rep_levels_byte_len, is_compressed: compressor.is_some(), - 
statistics: from_thrift(physical_type, to_thrift(statistics.as_ref())) - .unwrap(), + statistics: from_thrift_page_stats( + physical_type, + page_stats_to_thrift(statistics.as_ref()), + ) + .unwrap(), } } Page::DictionaryPage { @@ -1582,6 +1588,7 @@ mod tests { let props = ReaderProperties::builder() .set_backward_compatible_lz4(false) + .set_read_page_statistics(true) .build(); let mut page_reader = SerializedPageReader::new_with_properties( Arc::new(reader), @@ -1620,7 +1627,10 @@ mod tests { assert_eq!(&left.buffer(), &right.buffer()); assert_eq!(left.num_values(), right.num_values()); assert_eq!(left.encoding(), right.encoding()); - assert_eq!(to_thrift(left.statistics()), to_thrift(right.statistics())); + assert_eq!( + page_stats_to_thrift(left.statistics()), + page_stats_to_thrift(right.statistics()) + ); } /// Tests roundtrip of i32 data written using `W` and read using `R` @@ -1628,7 +1638,7 @@ mod tests { file: W, data: Vec>, compression: Compression, - ) -> crate::format::FileMetaData + ) -> ParquetMetaData where W: Write + Send, R: ChunkReader + From + 'static, @@ -1643,7 +1653,7 @@ mod tests { data: Vec>, value: F, compression: Compression, - ) -> crate::format::FileMetaData + ) -> ParquetMetaData where W: Write + Send, R: ChunkReader + From + 'static, @@ -1714,7 +1724,7 @@ mod tests { /// File write-read roundtrip. /// `data` consists of arrays of values for each row group. - fn test_file_roundtrip(file: File, data: Vec>) -> crate::format::FileMetaData { + fn test_file_roundtrip(file: File, data: Vec>) -> ParquetMetaData { test_roundtrip_i32::(file, data, Compression::UNCOMPRESSED) } @@ -1789,13 +1799,12 @@ mod tests { fn test_column_offset_index_file() { let file = tempfile::tempfile().unwrap(); let file_metadata = test_file_roundtrip(file, vec![vec![1, 2, 3, 4, 5]]); - file_metadata.row_groups.iter().for_each(|row_group| { - row_group.columns.iter().for_each(|column_chunk| { - assert_ne!(None, column_chunk.column_index_offset); - assert_ne!(None, column_chunk.column_index_length); - - assert_ne!(None, column_chunk.offset_index_offset); - assert_ne!(None, column_chunk.offset_index_length); + file_metadata.row_groups().iter().for_each(|row_group| { + row_group.columns().iter().for_each(|column_chunk| { + assert!(column_chunk.column_index_offset().is_some()); + assert!(column_chunk.column_index_length().is_some()); + assert!(column_chunk.offset_index_offset().is_some()); + assert!(column_chunk.offset_index_length().is_some()); }) }); } @@ -1888,29 +1897,22 @@ mod tests { let metadata = row_group_writer.close().unwrap(); writer.close().unwrap(); - let thrift = metadata.to_thrift(); - let encoded_stats: Vec<_> = thrift - .columns - .into_iter() - .map(|x| x.meta_data.unwrap().statistics.unwrap()) - .collect(); - // decimal - let s = &encoded_stats[0]; + let s = page_stats_to_thrift(metadata.column(0).statistics()).unwrap(); assert_eq!(s.min.as_deref(), Some(1_i32.to_le_bytes().as_ref())); assert_eq!(s.max.as_deref(), Some(3_i32.to_le_bytes().as_ref())); assert_eq!(s.min_value.as_deref(), Some(1_i32.to_le_bytes().as_ref())); assert_eq!(s.max_value.as_deref(), Some(3_i32.to_le_bytes().as_ref())); // i32 - let s = &encoded_stats[1]; + let s = page_stats_to_thrift(metadata.column(1).statistics()).unwrap(); assert_eq!(s.min.as_deref(), Some(1_i32.to_le_bytes().as_ref())); assert_eq!(s.max.as_deref(), Some(3_i32.to_le_bytes().as_ref())); assert_eq!(s.min_value.as_deref(), Some(1_i32.to_le_bytes().as_ref())); assert_eq!(s.max_value.as_deref(), Some(3_i32.to_le_bytes().as_ref())); // u32 
- let s = &encoded_stats[2]; + let s = page_stats_to_thrift(metadata.column(2).statistics()).unwrap(); assert_eq!(s.min.as_deref(), None); assert_eq!(s.max.as_deref(), None); assert_eq!(s.min_value.as_deref(), Some(1_i32.to_le_bytes().as_ref())); @@ -2036,15 +2038,15 @@ mod tests { row_group_writer.close().unwrap(); let metadata = file_writer.finish().unwrap(); - assert_eq!(metadata.row_groups.len(), 1); - let row_group = &metadata.row_groups[0]; - assert_eq!(row_group.columns.len(), 2); + assert_eq!(metadata.num_row_groups(), 1); + let row_group = metadata.row_group(0); + assert_eq!(row_group.num_columns(), 2); // Column "a" has both offset and column index, as requested - assert!(row_group.columns[0].offset_index_offset.is_some()); - assert!(row_group.columns[0].column_index_offset.is_some()); + assert!(row_group.column(0).offset_index_offset().is_some()); + assert!(row_group.column(0).column_index_offset().is_some()); // Column "b" should only have offset index - assert!(row_group.columns[1].offset_index_offset.is_some()); - assert!(row_group.columns[1].column_index_offset.is_none()); + assert!(row_group.column(1).offset_index_offset().is_some()); + assert!(row_group.column(1).column_index_offset().is_none()); let err = file_writer.next_row_group().err().unwrap().to_string(); assert_eq!(err, "Parquet error: SerializedFileWriter already finished"); @@ -2063,9 +2065,9 @@ mod tests { assert_eq!(column_index[0].len(), 2); // 2 column let a_idx = &column_index[0][0]; - assert!(matches!(a_idx, Index::INT32(_)), "{a_idx:?}"); + assert!(matches!(a_idx, ColumnIndexMetaData::INT32(_)), "{a_idx:?}"); let b_idx = &column_index[0][1]; - assert!(matches!(b_idx, Index::NONE), "{b_idx:?}"); + assert!(matches!(b_idx, ColumnIndexMetaData::NONE), "{b_idx:?}"); } #[test] @@ -2098,9 +2100,8 @@ mod tests { row_group_writer.close().unwrap(); let file_metadata = writer.close().unwrap(); - assert_eq!(file_metadata.row_groups.len(), 1); - assert_eq!(file_metadata.row_groups[0].columns.len(), 1); - assert!(file_metadata.row_groups[0].columns[0].meta_data.is_some()); + assert_eq!(file_metadata.num_row_groups(), 1); + assert_eq!(file_metadata.row_group(0).num_columns(), 1); let check_def_hist = |def_hist: &[i64]| { assert_eq!(def_hist.len(), 2); @@ -2108,29 +2109,26 @@ mod tests { assert_eq!(def_hist[1], 7); }; - assert!(file_metadata.row_groups[0].columns[0].meta_data.is_some()); - let meta_data = file_metadata.row_groups[0].columns[0] - .meta_data - .as_ref() - .unwrap(); - assert!(meta_data.size_statistics.is_some()); - let size_stats = meta_data.size_statistics.as_ref().unwrap(); + let meta_data = file_metadata.row_group(0).column(0); - assert!(size_stats.repetition_level_histogram.is_none()); - assert!(size_stats.definition_level_histogram.is_some()); - assert!(size_stats.unencoded_byte_array_data_bytes.is_some()); + assert!(meta_data.repetition_level_histogram().is_none()); + assert!(meta_data.definition_level_histogram().is_some()); + assert!(meta_data.unencoded_byte_array_data_bytes().is_some()); assert_eq!( unenc_size, - size_stats.unencoded_byte_array_data_bytes.unwrap() + meta_data.unencoded_byte_array_data_bytes().unwrap() ); - check_def_hist(size_stats.definition_level_histogram.as_ref().unwrap()); + check_def_hist(meta_data.definition_level_histogram().unwrap().values()); // check that the read metadata is also correct let options = ReadOptionsBuilder::new().with_page_index().build(); let reader = SerializedFileReader::new_with_options(file, options).unwrap(); let rfile_metadata = 
reader.metadata().file_metadata(); - assert_eq!(rfile_metadata.num_rows(), file_metadata.num_rows); + assert_eq!( + rfile_metadata.num_rows(), + file_metadata.file_metadata().num_rows() + ); assert_eq!(reader.num_row_groups(), 1); let rowgroup = reader.get_row_group(0).unwrap(); assert_eq!(rowgroup.num_columns(), 1); @@ -2149,16 +2147,16 @@ mod tests { let column_index = reader.metadata().column_index().unwrap(); assert_eq!(column_index.len(), 1); assert_eq!(column_index[0].len(), 1); - let col_idx = if let Index::BYTE_ARRAY(index) = &column_index[0][0] { - assert_eq!(index.indexes.len(), 1); - &index.indexes[0] + let col_idx = if let ColumnIndexMetaData::BYTE_ARRAY(index) = &column_index[0][0] { + assert_eq!(index.num_pages(), 1); + index } else { unreachable!() }; - assert!(col_idx.repetition_level_histogram().is_none()); - assert!(col_idx.definition_level_histogram().is_some()); - check_def_hist(col_idx.definition_level_histogram().unwrap().values()); + assert!(col_idx.repetition_level_histogram(0).is_none()); + assert!(col_idx.definition_level_histogram(0).is_some()); + check_def_hist(col_idx.definition_level_histogram(0).unwrap()); assert!(reader.metadata().offset_index().is_some()); let offset_index = reader.metadata().offset_index().unwrap(); @@ -2250,9 +2248,8 @@ mod tests { row_group_writer.close().unwrap(); let file_metadata = writer.close().unwrap(); - assert_eq!(file_metadata.row_groups.len(), 1); - assert_eq!(file_metadata.row_groups[0].columns.len(), 1); - assert!(file_metadata.row_groups[0].columns[0].meta_data.is_some()); + assert_eq!(file_metadata.num_row_groups(), 1); + assert_eq!(file_metadata.row_group(0).num_columns(), 1); let check_def_hist = |def_hist: &[i64]| { assert_eq!(def_hist.len(), 4); @@ -2270,25 +2267,22 @@ mod tests { // check that histograms are set properly in the write and read metadata // also check that unencoded_byte_array_data_bytes is not set - assert!(file_metadata.row_groups[0].columns[0].meta_data.is_some()); - let meta_data = file_metadata.row_groups[0].columns[0] - .meta_data - .as_ref() - .unwrap(); - assert!(meta_data.size_statistics.is_some()); - let size_stats = meta_data.size_statistics.as_ref().unwrap(); - assert!(size_stats.repetition_level_histogram.is_some()); - assert!(size_stats.definition_level_histogram.is_some()); - assert!(size_stats.unencoded_byte_array_data_bytes.is_none()); - check_def_hist(size_stats.definition_level_histogram.as_ref().unwrap()); - check_rep_hist(size_stats.repetition_level_histogram.as_ref().unwrap()); + let meta_data = file_metadata.row_group(0).column(0); + assert!(meta_data.repetition_level_histogram().is_some()); + assert!(meta_data.definition_level_histogram().is_some()); + assert!(meta_data.unencoded_byte_array_data_bytes().is_none()); + check_def_hist(meta_data.definition_level_histogram().unwrap().values()); + check_rep_hist(meta_data.repetition_level_histogram().unwrap().values()); // check that the read metadata is also correct let options = ReadOptionsBuilder::new().with_page_index().build(); let reader = SerializedFileReader::new_with_options(file, options).unwrap(); let rfile_metadata = reader.metadata().file_metadata(); - assert_eq!(rfile_metadata.num_rows(), file_metadata.num_rows); + assert_eq!( + rfile_metadata.num_rows(), + file_metadata.file_metadata().num_rows() + ); assert_eq!(reader.num_row_groups(), 1); let rowgroup = reader.get_row_group(0).unwrap(); assert_eq!(rowgroup.num_columns(), 1); @@ -2304,15 +2298,15 @@ mod tests { let column_index = reader.metadata().column_index().unwrap(); 
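+        // the page index is nested per row group and then per column, so the
+        // assertions below index as [row_group][column]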
assert_eq!(column_index.len(), 1); assert_eq!(column_index[0].len(), 1); - let col_idx = if let Index::INT32(index) = &column_index[0][0] { - assert_eq!(index.indexes.len(), 1); - &index.indexes[0] + let col_idx = if let ColumnIndexMetaData::INT32(index) = &column_index[0][0] { + assert_eq!(index.num_pages(), 1); + index } else { unreachable!() }; - check_def_hist(col_idx.definition_level_histogram().unwrap().values()); - check_rep_hist(col_idx.repetition_level_histogram().unwrap().values()); + check_def_hist(col_idx.definition_level_histogram(0).unwrap()); + check_rep_hist(col_idx.repetition_level_histogram(0).unwrap()); assert!(reader.metadata().offset_index().is_some()); let offset_index = reader.metadata().offset_index().unwrap(); diff --git a/parquet/src/geospatial/bounding_box.rs b/parquet/src/geospatial/bounding_box.rs index 7d9eb58d0032..ce23696afcf3 100644 --- a/parquet/src/geospatial/bounding_box.rs +++ b/parquet/src/geospatial/bounding_box.rs @@ -21,7 +21,6 @@ //! Derived from the parquet format spec: //! //! -use crate::format as parquet; /// A geospatial instance has at least two coordinate dimensions: X and Y for 2D coordinates of each point. /// X represents longitude/easting and Y represents latitude/northing. A geospatial instance can optionally @@ -171,47 +170,6 @@ impl BoundingBox { } } -impl From for parquet::BoundingBox { - /// Converts our internal `BoundingBox` to the Thrift-generated format. - fn from(b: BoundingBox) -> parquet::BoundingBox { - parquet::BoundingBox { - xmin: b.x_range.0.into(), - xmax: b.x_range.1.into(), - ymin: b.y_range.0.into(), - ymax: b.y_range.1.into(), - zmin: b.z_range.map(|z| z.0.into()), - zmax: b.z_range.map(|z| z.1.into()), - mmin: b.m_range.map(|m| m.0.into()), - mmax: b.m_range.map(|m| m.1.into()), - } - } -} - -impl From for BoundingBox { - fn from(bbox: parquet::BoundingBox) -> Self { - let mut new_bbox = Self::new( - bbox.xmin.into(), - bbox.xmax.into(), - bbox.ymin.into(), - bbox.ymax.into(), - ); - - new_bbox = match (bbox.zmin, bbox.zmax) { - (Some(zmin), Some(zmax)) => new_bbox.with_zrange(zmin.into(), zmax.into()), - // If either None or mismatch, leave it as None and don't error - _ => new_bbox, - }; - - new_bbox = match (bbox.mmin, bbox.mmax) { - (Some(mmin), Some(mmax)) => new_bbox.with_mrange(mmin.into(), mmax.into()), - // If either None or mismatch, leave it as None and don't error - _ => new_bbox, - }; - - new_bbox - } -} - #[cfg(test)] mod tests { use super::*; @@ -255,159 +213,4 @@ mod tests { assert!(bbox_zm.is_z_valid()); assert!(bbox_zm.is_m_valid()); } - - #[test] - fn test_bounding_box_to_thrift() { - use thrift::OrderedFloat; - - let bbox = BoundingBox::new(0.0, 0.0, 10.0, 10.0); - let thrift_bbox: parquet::BoundingBox = bbox.into(); - assert_eq!(thrift_bbox.xmin, 0.0); - assert_eq!(thrift_bbox.xmax, 0.0); - assert_eq!(thrift_bbox.ymin, 10.0); - assert_eq!(thrift_bbox.ymax, 10.0); - assert_eq!(thrift_bbox.zmin, None); - assert_eq!(thrift_bbox.zmax, None); - assert_eq!(thrift_bbox.mmin, None); - assert_eq!(thrift_bbox.mmax, None); - - let bbox_z = BoundingBox::new(0.0, 0.0, 10.0, 10.0).with_zrange(5.0, 15.0); - let thrift_bbox_z: parquet::BoundingBox = bbox_z.into(); - assert_eq!(thrift_bbox_z.zmin, Some(OrderedFloat(5.0))); - assert_eq!(thrift_bbox_z.zmax, Some(OrderedFloat(15.0))); - assert_eq!(thrift_bbox_z.mmin, None); - assert_eq!(thrift_bbox_z.mmax, None); - - let bbox_m = BoundingBox::new(0.0, 0.0, 10.0, 10.0).with_mrange(10.0, 20.0); - let thrift_bbox_m: parquet::BoundingBox = bbox_m.into(); - 
assert_eq!(thrift_bbox_m.zmin, None); - assert_eq!(thrift_bbox_m.zmax, None); - assert_eq!(thrift_bbox_m.mmin, Some(OrderedFloat(10.0))); - assert_eq!(thrift_bbox_m.mmax, Some(OrderedFloat(20.0))); - - let bbox_z_m = BoundingBox::new(0.0, 0.0, 10.0, 10.0) - .with_zrange(5.0, 15.0) - .with_mrange(10.0, 20.0); - let thrift_bbox_zm: parquet::BoundingBox = bbox_z_m.into(); - assert_eq!(thrift_bbox_zm.zmin, Some(OrderedFloat(5.0))); - assert_eq!(thrift_bbox_zm.zmax, Some(OrderedFloat(15.0))); - assert_eq!(thrift_bbox_zm.mmin, Some(OrderedFloat(10.0))); - assert_eq!(thrift_bbox_zm.mmax, Some(OrderedFloat(20.0))); - } - - #[test] - fn test_bounding_box_from_thrift() { - use thrift::OrderedFloat; - - let thrift_bbox = parquet::BoundingBox { - xmin: OrderedFloat(0.0), - xmax: OrderedFloat(0.0), - ymin: OrderedFloat(10.0), - ymax: OrderedFloat(10.0), - zmin: None, - zmax: None, - mmin: None, - mmax: None, - }; - let bbox: BoundingBox = thrift_bbox.into(); - assert_eq!(bbox.get_xmin(), 0.0); - assert_eq!(bbox.get_xmax(), 0.0); - assert_eq!(bbox.get_ymin(), 10.0); - assert_eq!(bbox.get_ymax(), 10.0); - assert_eq!(bbox.get_zmin(), None); - assert_eq!(bbox.get_zmax(), None); - assert_eq!(bbox.get_mmin(), None); - assert_eq!(bbox.get_mmax(), None); - - let thrift_bbox_z = parquet::BoundingBox { - xmin: OrderedFloat(0.0), - xmax: OrderedFloat(0.0), - ymin: OrderedFloat(10.0), - ymax: OrderedFloat(10.0), - zmin: Some(OrderedFloat(130.0)), - zmax: Some(OrderedFloat(130.0)), - mmin: None, - mmax: None, - }; - let bbox_z: BoundingBox = thrift_bbox_z.into(); - assert_eq!(bbox_z.get_xmin(), 0.0); - assert_eq!(bbox_z.get_xmax(), 0.0); - assert_eq!(bbox_z.get_ymin(), 10.0); - assert_eq!(bbox_z.get_ymax(), 10.0); - assert_eq!(bbox_z.get_zmin(), Some(130.0)); - assert_eq!(bbox_z.get_zmax(), Some(130.0)); - assert_eq!(bbox_z.get_mmin(), None); - assert_eq!(bbox_z.get_mmax(), None); - - let thrift_bbox_m = parquet::BoundingBox { - xmin: OrderedFloat(0.0), - xmax: OrderedFloat(0.0), - ymin: OrderedFloat(10.0), - ymax: OrderedFloat(10.0), - zmin: None, - zmax: None, - mmin: Some(OrderedFloat(120.0)), - mmax: Some(OrderedFloat(120.0)), - }; - let bbox_m: BoundingBox = thrift_bbox_m.into(); - assert_eq!(bbox_m.get_xmin(), 0.0); - assert_eq!(bbox_m.get_xmax(), 0.0); - assert_eq!(bbox_m.get_ymin(), 10.0); - assert_eq!(bbox_m.get_ymax(), 10.0); - assert_eq!(bbox_m.get_zmin(), None); - assert_eq!(bbox_m.get_zmax(), None); - assert_eq!(bbox_m.get_mmin(), Some(120.0)); - assert_eq!(bbox_m.get_mmax(), Some(120.0)); - - let thrift_bbox_zm = parquet::BoundingBox { - xmin: OrderedFloat(0.0), - xmax: OrderedFloat(0.0), - ymin: OrderedFloat(10.0), - ymax: OrderedFloat(10.0), - zmin: Some(OrderedFloat(130.0)), - zmax: Some(OrderedFloat(130.0)), - mmin: Some(OrderedFloat(120.0)), - mmax: Some(OrderedFloat(120.0)), - }; - - let bbox_zm: BoundingBox = thrift_bbox_zm.into(); - assert_eq!(bbox_zm.get_xmin(), 0.0); - assert_eq!(bbox_zm.get_xmax(), 0.0); - assert_eq!(bbox_zm.get_ymin(), 10.0); - assert_eq!(bbox_zm.get_ymax(), 10.0); - assert_eq!(bbox_zm.get_zmin(), Some(130.0)); - assert_eq!(bbox_zm.get_zmax(), Some(130.0)); - assert_eq!(bbox_zm.get_mmin(), Some(120.0)); - assert_eq!(bbox_zm.get_mmax(), Some(120.0)); - } - - #[test] - fn test_bounding_box_thrift_roundtrip() { - use thrift::OrderedFloat; - - let thrift_bbox = parquet::BoundingBox { - xmin: OrderedFloat(0.0), - xmax: OrderedFloat(0.0), - ymin: OrderedFloat(10.0), - ymax: OrderedFloat(10.0), - zmin: Some(OrderedFloat(130.0)), - zmax: Some(OrderedFloat(130.0)), - mmin: 
Some(OrderedFloat(120.0)), - mmax: Some(OrderedFloat(120.0)), - }; - - // cloning to make sure it's not moved - let bbox: BoundingBox = thrift_bbox.clone().into(); - assert_eq!(bbox.get_xmin(), 0.0); - assert_eq!(bbox.get_xmax(), 0.0); - assert_eq!(bbox.get_ymin(), 10.0); - assert_eq!(bbox.get_ymax(), 10.0); - assert_eq!(bbox.get_zmin(), Some(130.0)); - assert_eq!(bbox.get_zmax(), Some(130.0)); - assert_eq!(bbox.get_mmin(), Some(120.0)); - assert_eq!(bbox.get_mmax(), Some(120.0)); - - let thrift_bbox_2: parquet::BoundingBox = bbox.into(); - assert_eq!(thrift_bbox_2, thrift_bbox); - } } diff --git a/parquet/src/geospatial/statistics.rs b/parquet/src/geospatial/statistics.rs index 2a39c494bd0f..d3287412b143 100644 --- a/parquet/src/geospatial/statistics.rs +++ b/parquet/src/geospatial/statistics.rs @@ -20,7 +20,6 @@ //! This module provides functionality for working with geospatial statistics in Parquet files. //! It includes support for bounding boxes and geospatial statistics in column chunk metadata. -use crate::format::GeospatialStatistics as TGeospatialStatistics; use crate::geospatial::bounding_box::BoundingBox; // ---------------------------------------------------------------------- @@ -58,75 +57,23 @@ impl GeospatialStatistics { geospatial_types, } } -} -/// Converts a Thrift-generated geospatial statistics object to the internal representation. -pub fn from_thrift(geo_statistics: Option) -> Option { - let geo_stats = geo_statistics?; - let bbox = geo_stats.bbox.map(|bbox| bbox.into()); - // If vector is empty, then set it to None - let geospatial_types: Option> = geo_stats.geospatial_types.filter(|v| !v.is_empty()); - Some(GeospatialStatistics::new(bbox, geospatial_types)) -} + /// Optional bounding defining the spatial extent, where `None` represents a lack of information. + pub fn bounding_box(&self) -> Option<&BoundingBox> { + self.bbox.as_ref() + } -/// Converts our internal geospatial statistics to the Thrift-generated format. -pub fn to_thrift(geo_statistics: Option<&GeospatialStatistics>) -> Option { - let geo_stats = geo_statistics?; - let bbox = geo_stats.bbox.clone().map(|bbox| bbox.into()); - let geospatial_types = geo_stats.geospatial_types.clone(); - Some(TGeospatialStatistics::new(bbox, geospatial_types)) + /// Optional list of geometry type identifiers, where `None` represents a lack of information. + pub fn geospatial_types(&self) -> Option<&Vec> { + self.geospatial_types.as_ref() + } } #[cfg(test)] mod tests { use super::*; - /// Tests the conversion from Thrift format when no statistics are provided. - #[test] - fn test_from_thrift() { - assert_eq!(from_thrift(None), None); - assert_eq!( - from_thrift(Some(TGeospatialStatistics::new(None, None))), - Some(GeospatialStatistics::default()) - ); - } - - /// Tests the conversion from Thrift format with actual geospatial data. 
- #[test] - fn test_geo_statistics_from_thrift() { - let bbox = BoundingBox::new(0.0, 0.0, 100.0, 100.0); - let geospatial_types = vec![1, 2, 3]; - let stats = GeospatialStatistics::new(Some(bbox), Some(geospatial_types)); - let thrift_stats = to_thrift(Some(&stats)); - assert_eq!(from_thrift(thrift_stats), Some(stats)); - } - - #[test] - fn test_bbox_to_thrift() { - use crate::format as parquet; - use thrift::OrderedFloat; - - let bbox = BoundingBox::new(0.0, 0.0, 100.0, 100.0); - let thrift_bbox: parquet::BoundingBox = bbox.into(); - assert_eq!(thrift_bbox.xmin, 0.0); - assert_eq!(thrift_bbox.xmax, 0.0); - assert_eq!(thrift_bbox.ymin, 100.0); - assert_eq!(thrift_bbox.ymax, 100.0); - assert_eq!(thrift_bbox.zmin, None); - assert_eq!(thrift_bbox.zmax, None); - assert_eq!(thrift_bbox.mmin, None); - assert_eq!(thrift_bbox.mmax, None); - - let bbox_z = BoundingBox::new(0.0, 0.0, 100.0, 100.0).with_zrange(5.0, 15.0); - let thrift_bbox_z: parquet::BoundingBox = bbox_z.into(); - assert_eq!(thrift_bbox_z.zmin, Some(OrderedFloat(5.0))); - assert_eq!(thrift_bbox_z.zmax, Some(OrderedFloat(15.0))); - - let bbox_m = BoundingBox::new(0.0, 0.0, 100.0, 100.0).with_mrange(10.0, 20.0); - let thrift_bbox_m: parquet::BoundingBox = bbox_m.into(); - assert_eq!(thrift_bbox_m.mmin, Some(OrderedFloat(10.0))); - assert_eq!(thrift_bbox_m.mmax, Some(OrderedFloat(20.0))); - } + // TODO(ets): add round trip to/from parquet tests #[test] fn test_read_geospatial_statistics_from_file() { diff --git a/parquet/src/lib.rs b/parquet/src/lib.rs index 19a69bce900b..f33365a2a8b1 100644 --- a/parquet/src/lib.rs +++ b/parquet/src/lib.rs @@ -188,6 +188,8 @@ pub mod file; pub mod record; pub mod schema; +mod parquet_macros; +mod parquet_thrift; pub mod thrift; /// What data is needed to read the next item from a decoder. /// diff --git a/parquet/src/parquet_macros.rs b/parquet/src/parquet_macros.rs new file mode 100644 index 000000000000..80dc9658c04f --- /dev/null +++ b/parquet/src/parquet_macros.rs @@ -0,0 +1,479 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +// These macros are adapted from Jörn Horstmann's thrift macros at +// https://github.com/jhorstmann/compact-thrift +// They allow for pasting sections of the Parquet thrift IDL file +// into a macro to generate rust structures and implementations. + +//! This is a collection of macros used to parse Thrift IDL descriptions of structs, +//! unions, and enums into their corresponding Rust types. These macros will also +//! generate the code necessary to serialize and deserialize to/from the [Thrift compact] +//! protocol. +//! +//! Further details of how to use them (and other aspects of the Thrift serialization process) +//! can be found in [THRIFT.md]. +//! +//! 
[Thrift compact]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set +//! [THRIFT.md]: https://github.com/apache/arrow-rs/blob/main/parquet/THRIFT.md + +#[macro_export] +#[allow(clippy::crate_in_macro_def)] +/// Macro used to generate Rust enums from a Thrift `enum` definition. +/// +/// When utilizing this macro the Thrift serialization traits and structs need to be in scope. +macro_rules! thrift_enum { + ($(#[$($def_attrs:tt)*])* enum $identifier:ident { $($(#[$($field_attrs:tt)*])* $field_name:ident = $field_value:literal;)* }) => { + $(#[$($def_attrs)*])* + #[derive(Clone, Copy, Debug, Eq, Hash, Ord, PartialEq, PartialOrd)] + #[allow(non_camel_case_types)] + #[allow(missing_docs)] + pub enum $identifier { + $($(#[cfg_attr(not(doctest), $($field_attrs)*)])* $field_name = $field_value,)* + } + + impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for $identifier { + #[allow(deprecated)] + fn read_thrift(prot: &mut R) -> Result<Self> { + let val = prot.read_i32()?; + match val { + $($field_value => Ok(Self::$field_name),)* + _ => Err(general_err!("Unexpected {} {}", stringify!($identifier), val)), + } + } + } + + impl fmt::Display for $identifier { + fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result { + write!(f, "{self:?}") + } + } + + impl WriteThrift for $identifier { + const ELEMENT_TYPE: ElementType = ElementType::I32; + + fn write_thrift<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>) -> Result<()> { + writer.write_i32(*self as i32) + } + } + + impl WriteThriftField for $identifier { + fn write_thrift_field<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>, field_id: i16, last_field_id: i16) -> Result<i16> { + writer.write_field_begin(FieldType::I32, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } + } + } +} + +/// Macro used to generate Rust enums for Thrift unions in which all variants are typed with empty +/// structs. +/// +/// Because the compact protocol does not write any struct type information, these empty structs +/// become a single `0` (end-of-fields marker) upon serialization. Rather than trying to deserialize +/// an empty struct, we can instead simply read the `0` and discard it. +/// +/// The resulting Rust enum will have all unit variants. +/// +/// When utilizing this macro the Thrift serialization traits and structs need to be in scope. +#[macro_export] +#[allow(clippy::crate_in_macro_def)] +macro_rules! thrift_union_all_empty { + ($(#[$($def_attrs:tt)*])* union $identifier:ident { $($(#[$($field_attrs:tt)*])* $field_id:literal : $field_type:ident $(< $element_type:ident >)?
$field_name:ident $(;)?)* }) => { + $(#[cfg_attr(not(doctest), $($def_attrs)*)])* + #[derive(Clone, Copy, Debug, Eq, PartialEq)] + #[allow(non_camel_case_types)] + #[allow(non_snake_case)] + #[allow(missing_docs)] + pub enum $identifier { + $($(#[cfg_attr(not(doctest), $($field_attrs)*)])* $field_name),* + } + + impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for $identifier { + fn read_thrift(prot: &mut R) -> Result { + let field_ident = prot.read_field_begin(0)?; + if field_ident.field_type == FieldType::Stop { + return Err(general_err!("Received empty union from remote {}", stringify!($identifier))); + } + let ret = match field_ident.id { + $($field_id => { + prot.skip_empty_struct()?; + Self::$field_name + } + )* + _ => { + return Err(general_err!("Unexpected {} {}", stringify!($identifier), field_ident.id)); + } + }; + let field_ident = prot.read_field_begin(field_ident.id)?; + if field_ident.field_type != FieldType::Stop { + return Err(general_err!( + "Received multiple fields for union from remote {}", stringify!($identifier) + )); + } + Ok(ret) + } + } + + impl WriteThrift for $identifier { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + match *self { + $(Self::$field_name => writer.write_empty_struct($field_id, 0)?,)* + }; + // write end of struct for this union + writer.write_struct_end() + } + } + + impl WriteThriftField for $identifier { + fn write_thrift_field(&self, writer: &mut ThriftCompactOutputProtocol, field_id: i16, last_field_id: i16) -> Result { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } + } + } +} + +/// Macro used to generate Rust enums for Thrift unions where variants are a mix of unit and +/// tuple types. +/// +/// Use of this macro requires modifying the thrift IDL. For variants with empty structs as their +/// type, delete the typename (i.e. `1: EmptyStruct Var1;` becomes `1: Var1`). For variants with a +/// non-empty type, the typename must be contained within parens (e.g. `1: MyType Var1;` becomes +/// `1: (MyType) Var1;`). +/// +/// This macro allows for specifying lifetime annotations for the resulting `enum` and its fields. +/// +/// When utilizing this macro the Thrift serialization traits and structs need to be in scope. +#[macro_export] +#[allow(clippy::crate_in_macro_def)] +macro_rules! thrift_union { + ($(#[$($def_attrs:tt)*])* union $identifier:ident $(< $lt:lifetime >)? { $($(#[$($field_attrs:tt)*])* $field_id:literal : $( ( $field_type:ident $(< $element_type:ident >)? $(< $field_lt:lifetime >)?) )? $field_name:ident $(;)?)* }) => { + $(#[cfg_attr(not(doctest), $($def_attrs)*)])* + #[derive(Clone, Debug, Eq, PartialEq)] + #[allow(non_camel_case_types)] + #[allow(non_snake_case)] + #[allow(missing_docs)] + pub enum $identifier $(<$lt>)? { + $($(#[cfg_attr(not(doctest), $($field_attrs)*)])* $field_name $( ( $crate::__thrift_union_type!{$field_type $($field_lt)? $($element_type)?} ) )?),* + } + + impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for $identifier $(<$lt>)? 
{ + fn read_thrift(prot: &mut R) -> Result<Self> { + let field_ident = prot.read_field_begin(0)?; + if field_ident.field_type == FieldType::Stop { + return Err(general_err!("Received empty union from remote {}", stringify!($identifier))); + } + let ret = match field_ident.id { + $($field_id => { + let val = $crate::__thrift_read_variant!(prot, $field_name $($field_type $($element_type)?)?); + val + })* + _ => { + return Err(general_err!("Unexpected {} {}", stringify!($identifier), field_ident.id)); + } + }; + let field_ident = prot.read_field_begin(field_ident.id)?; + if field_ident.field_type != FieldType::Stop { + return Err(general_err!( + "Received multiple fields for union from remote {}", stringify!($identifier) + )); + } + Ok(ret) + } + } + + impl $(<$lt>)? WriteThrift for $identifier $(<$lt>)? { + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + fn write_thrift<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>) -> Result<()> { + match self { + $($crate::__thrift_write_variant_lhs!($field_name $($field_type)?, variant_val) => + $crate::__thrift_write_variant_rhs!($field_id $($field_type)?, writer, variant_val),)* + }; + writer.write_struct_end() + } + } + + impl $(<$lt>)? WriteThriftField for $identifier $(<$lt>)? { + fn write_thrift_field<W: Write>(&self, writer: &mut ThriftCompactOutputProtocol<W>, field_id: i16, last_field_id: i16) -> Result<i16> { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } + } + } +} + +/// Macro used to generate Rust structs from a Thrift `struct` definition. +/// +/// This macro allows for specifying lifetime annotations for the resulting `struct` and its fields. +/// +/// When utilizing this macro the Thrift serialization traits and structs need to be in scope. +#[macro_export] +macro_rules! thrift_struct { + ($(#[$($def_attrs:tt)*])* $vis:vis struct $identifier:ident $(< $lt:lifetime >)? { $($(#[$($field_attrs:tt)*])* $field_id:literal : $required_or_optional:ident $field_type:ident $(< $field_lt:lifetime >)? $(< $element_type:ident >)? $field_name:ident $(= $default_value:literal)? $(;)?)* }) => { + $(#[cfg_attr(not(doctest), $($def_attrs)*)])* + #[derive(Clone, Debug, Eq, PartialEq)] + #[allow(non_camel_case_types)] + #[allow(non_snake_case)] + #[allow(missing_docs)] + $vis struct $identifier $(<$lt>)? { + $($(#[cfg_attr(not(doctest), $($field_attrs)*)])* $vis $field_name: $crate::__thrift_required_or_optional!($required_or_optional $crate::__thrift_field_type!($field_type $($field_lt)? $($element_type)?))),* + } + + impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for $identifier $(<$lt>)? { + fn read_thrift(prot: &mut R) -> Result<Self> { + $(let mut $field_name: Option<$crate::__thrift_field_type!($field_type $($field_lt)? $($element_type)?)> = None;)* + let mut last_field_id = 0i16; + loop { + let field_ident = prot.read_field_begin(last_field_id)?; + if field_ident.field_type == FieldType::Stop { + break; + } + match field_ident.id { + $($field_id => { + let val = $crate::__thrift_read_field!(prot, field_ident, $field_type $($field_lt)? $($element_type)?); + $field_name = Some(val); + })* + _ => { + prot.skip(field_ident.field_type)?; + } + }; + last_field_id = field_ident.id; + } + $($crate::__thrift_result_required_or_optional!($required_or_optional $field_name);)* + Ok(Self { + $($field_name),* + }) + } + } + + impl $(<$lt>)? WriteThrift for $identifier $(<$lt>)?
{ + const ELEMENT_TYPE: ElementType = ElementType::Struct; + + #[allow(unused_assignments)] + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + #[allow(unused_mut, unused_variables)] + let mut last_field_id = 0i16; + $($crate::__thrift_write_required_or_optional_field!($required_or_optional $field_name, $field_id, $field_type, self, writer, last_field_id);)* + writer.write_struct_end() + } + } + + impl $(<$lt>)? WriteThriftField for $identifier $(<$lt>)? { + fn write_thrift_field(&self, writer: &mut ThriftCompactOutputProtocol, field_id: i16, last_field_id: i16) -> Result { + writer.write_field_begin(FieldType::Struct, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } + } + } +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_write_required_or_optional_field { + (required $field_name:ident, $field_id:literal, $field_type:ident, $self:tt, $writer:tt, $last_id:tt) => { + $crate::__thrift_write_required_field!( + $field_type, + $field_name, + $field_id, + $self, + $writer, + $last_id + ) + }; + (optional $field_name:ident, $field_id:literal, $field_type:ident, $self:tt, $writer:tt, $last_id:tt) => { + $crate::__thrift_write_optional_field!( + $field_type, + $field_name, + $field_id, + $self, + $writer, + $last_id + ) + }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_write_required_field { + (binary, $field_name:ident, $field_id:literal, $self:ident, $writer:ident, $last_id:ident) => { + $writer.write_field_begin(FieldType::Binary, $field_id, $last_id)?; + $writer.write_bytes($self.$field_name)?; + $last_id = $field_id; + }; + ($field_type:ident, $field_name:ident, $field_id:literal, $self:ident, $writer:ident, $last_id:ident) => { + $last_id = $self + .$field_name + .write_thrift_field($writer, $field_id, $last_id)?; + }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_write_optional_field { + (binary, $field_name:ident, $field_id:literal, $self:ident, $writer:tt, $last_id:tt) => { + if $self.$field_name.is_some() { + $writer.write_field_begin(FieldType::Binary, $field_id, $last_id)?; + $writer.write_bytes($self.$field_name.as_ref().unwrap())?; + $last_id = $field_id; + } + }; + ($field_type:ident, $field_name:ident, $field_id:literal, $self:ident, $writer:tt, $last_id:tt) => { + if $self.$field_name.is_some() { + $last_id = $self + .$field_name + .as_ref() + .unwrap() + .write_thrift_field($writer, $field_id, $last_id)?; + } + }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_required_or_optional { + (required $field_type:ty) => { $field_type }; + (optional $field_type:ty) => { Option<$field_type> }; +} + +// Performance note: using `expect` here is about 4% faster on the page index bench, +// but we want to propagate errors. Using `ok_or` is *much* slower. +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_result_required_or_optional { + (required $field_name:ident) => { + let Some($field_name) = $field_name else { + return Err(general_err!(concat!( + "Required field ", + stringify!($field_name), + " is missing", + ))); + }; + }; + (optional $field_name:ident) => {}; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_read_field { + ($prot:tt, $field_ident:tt, list $lt:lifetime binary) => { + read_thrift_vec::<&'a [u8], R>(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, list $lt:lifetime $element_type:ident) => { + read_thrift_vec::<$element_type, R>(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, list string) => { + read_thrift_vec::(&mut *$prot)? 
+ }; + ($prot:tt, $field_ident:tt, list $element_type:ident) => { + read_thrift_vec::<$element_type, R>(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, string $lt:lifetime) => { + <&$lt str>::read_thrift(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, binary $lt:lifetime) => { + <&$lt [u8]>::read_thrift(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, $field_type:ident $lt:lifetime) => { + $field_type::read_thrift(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, string) => { + String::read_thrift(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, binary) => { + // this one needs to not conflict with `list` + $prot.read_bytes_owned()? + }; + ($prot:tt, $field_ident:tt, double) => { + $crate::parquet_thrift::OrderedF64::read_thrift(&mut *$prot)? + }; + ($prot:tt, $field_ident:tt, bool) => { + $field_ident.bool_val.unwrap() + }; + ($prot:tt, $field_ident:tt, $field_type:ident) => { + $field_type::read_thrift(&mut *$prot)? + }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_field_type { + (binary $lt:lifetime) => { &$lt [u8] }; + (string $lt:lifetime) => { &$lt str }; + ($field_type:ident $lt:lifetime) => { $field_type<$lt> }; + (list $lt:lifetime $element_type:ident) => { Vec< $crate::__thrift_field_type!($element_type $lt) > }; + (list string) => { Vec }; + (list $element_type:ident) => { Vec< $crate::__thrift_field_type!($element_type) > }; + (binary) => { Vec }; + (string) => { String }; + (double) => { $crate::parquet_thrift::OrderedF64 }; + ($field_type:ty) => { $field_type }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_union_type { + (binary $lt:lifetime) => { &$lt [u8] }; + (string $lt:lifetime) => { &$lt str }; + ($field_type:ident $lt:lifetime) => { $field_type<$lt> }; + ($field_type:ident) => { $field_type }; + (list $field_type:ident) => { Vec<$field_type> }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_read_variant { + ($prot:tt, $field_name:ident $field_type:ident) => { + Self::$field_name($field_type::read_thrift(&mut *$prot)?) + }; + ($prot:tt, $field_name:ident list $field_type:ident) => { + Self::$field_name(Vec::<$field_type>::read_thrift(&mut *$prot)?) + }; + ($prot:tt, $field_name:ident) => {{ + $prot.skip_empty_struct()?; + Self::$field_name + }}; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_write_variant_lhs { + ($field_name:ident $field_type:ident, $val:tt) => { + Self::$field_name($val) + }; + ($field_name:ident, $val:tt) => { + Self::$field_name + }; +} + +#[doc(hidden)] +#[macro_export] +macro_rules! __thrift_write_variant_rhs { + ($field_id:literal $field_type:ident, $writer:tt, $val:ident) => { + $val.write_thrift_field($writer, $field_id, 0)? + }; + ($field_id:literal, $writer:tt, $val:tt) => { + $writer.write_empty_struct($field_id, 0)? + }; +} diff --git a/parquet/src/parquet_thrift.rs b/parquet/src/parquet_thrift.rs new file mode 100644 index 000000000000..e27c7d16efdb --- /dev/null +++ b/parquet/src/parquet_thrift.rs @@ -0,0 +1,1105 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor license agreements. See the NOTICE file +// distributed with this work for additional information +// regarding copyright ownership. The ASF licenses this file +// to you under the Apache License, Version 2.0 (the +// "License"); you may not use this file except in compliance +// with the License. 
You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, +// software distributed under the License is distributed on an +// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +// KIND, either express or implied. See the License for the +// specific language governing permissions and limitations +// under the License. + +//! Structs used for encoding and decoding Parquet Thrift objects. +//! +//! These include: +//! * [`ThriftCompactInputProtocol`]: Trait implemented by Thrift decoders. +//! * [`ThriftSliceInputProtocol`]: Thrift decoder that takes a slice of bytes as input. +//! * [`ThriftReadInputProtocol`]: Thrift decoder that takes a [`Read`] as input. +//! * [`ReadThrift`]: Trait implemented by serializable objects. +//! * [`ThriftCompactOutputProtocol`]: Thrift encoder. +//! * [`WriteThrift`]: Trait implemented by serializable objects. +//! * [`WriteThriftField`]: Trait implemented by serializable objects that are fields in Thrift structs. + +use std::{ + cmp::Ordering, + io::{Read, Write}, +}; + +use crate::errors::{ParquetError, Result}; + +/// Wrapper for thrift `double` fields. This is used to provide +/// an implementation of `Eq` for floats. This implementation +/// uses IEEE 754 total order. +#[derive(Debug, Clone, Copy, PartialEq)] +pub struct OrderedF64(f64); + +impl From for OrderedF64 { + fn from(value: f64) -> Self { + Self(value) + } +} + +impl From for f64 { + fn from(value: OrderedF64) -> Self { + value.0 + } +} + +impl Eq for OrderedF64 {} // Marker trait, requires PartialEq + +impl Ord for OrderedF64 { + fn cmp(&self, other: &Self) -> Ordering { + self.0.total_cmp(&other.0) + } +} + +impl PartialOrd for OrderedF64 { + fn partial_cmp(&self, other: &Self) -> Option { + Some(self.cmp(other)) + } +} + +// Thrift compact protocol types for struct fields. 
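+// In the compact protocol a struct field header packs one of these types into
+// the low nibble of the header byte and the field-id delta into the high
+// nibble; for example `0x15` means "field id delta 1, type I32 (5)".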
+#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub(crate) enum FieldType { + Stop = 0, + BooleanTrue = 1, + BooleanFalse = 2, + Byte = 3, + I16 = 4, + I32 = 5, + I64 = 6, + Double = 7, + Binary = 8, + List = 9, + Set = 10, + Map = 11, + Struct = 12, +} + +impl TryFrom<u8> for FieldType { + type Error = ParquetError; + fn try_from(value: u8) -> Result<Self> { + match value { + 0 => Ok(Self::Stop), + 1 => Ok(Self::BooleanTrue), + 2 => Ok(Self::BooleanFalse), + 3 => Ok(Self::Byte), + 4 => Ok(Self::I16), + 5 => Ok(Self::I32), + 6 => Ok(Self::I64), + 7 => Ok(Self::Double), + 8 => Ok(Self::Binary), + 9 => Ok(Self::List), + 10 => Ok(Self::Set), + 11 => Ok(Self::Map), + 12 => Ok(Self::Struct), + _ => Err(general_err!("Unexpected struct field type {}", value)), + } + } +} + +impl TryFrom<ElementType> for FieldType { + type Error = ParquetError; + fn try_from(value: ElementType) -> std::result::Result<Self, Self::Error> { + match value { + ElementType::Bool => Ok(Self::BooleanTrue), + ElementType::Byte => Ok(Self::Byte), + ElementType::I16 => Ok(Self::I16), + ElementType::I32 => Ok(Self::I32), + ElementType::I64 => Ok(Self::I64), + ElementType::Double => Ok(Self::Double), + ElementType::Binary => Ok(Self::Binary), + ElementType::List => Ok(Self::List), + ElementType::Struct => Ok(Self::Struct), + _ => Err(general_err!("Unexpected list element type {:?}", value)), + } + } +} + +// Thrift compact protocol types for list elements +#[derive(Clone, Copy, Debug, Eq, PartialEq)] +pub(crate) enum ElementType { + Bool = 2, + Byte = 3, + I16 = 4, + I32 = 5, + I64 = 6, + Double = 7, + Binary = 8, + List = 9, + Set = 10, + Map = 11, + Struct = 12, +} + +impl TryFrom<u8> for ElementType { + type Error = ParquetError; + fn try_from(value: u8) -> Result<Self> { + match value { + // For historical and compatibility reasons, a reader should be capable of dealing with both cases. + // The only valid value in the original spec was 2, but due to a widespread implementation bug + // the de facto standard across large parts of the library became 1 instead. + // As a result, both values are now allowed. + // https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set + 1 | 2 => Ok(Self::Bool), + 3 => Ok(Self::Byte), + 4 => Ok(Self::I16), + 5 => Ok(Self::I32), + 6 => Ok(Self::I64), + 7 => Ok(Self::Double), + 8 => Ok(Self::Binary), + 9 => Ok(Self::List), + 10 => Ok(Self::Set), + 11 => Ok(Self::Map), + 12 => Ok(Self::Struct), + _ => Err(general_err!("Unexpected list/set element type {}", value)), + } + } +} + +/// Struct used to describe a [thrift struct] field during decoding. +/// +/// [thrift struct]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#struct-encoding +pub(crate) struct FieldIdentifier { + /// The type for the field. + pub(crate) field_type: FieldType, + /// The field's `id`. May be computed from delta or directly decoded. + pub(crate) id: i16, + /// Stores the value for booleans. + /// + /// Boolean fields store no data; instead the field type is either boolean true + /// or boolean false. + pub(crate) bool_val: Option<bool>, +} + +/// Struct used to describe a [thrift list]. +/// +/// [thrift list]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set +#[derive(Clone, Debug, Eq, PartialEq)] +pub(crate) struct ListIdentifier { + /// The type for each element in the list. + pub(crate) element_type: ElementType, + /// Number of elements contained in the list.
+ pub(crate) size: i32, +} + +/// Low-level object used to deserialize structs encoded with the Thrift [compact] protocol. +/// +/// Implementation of this trait must provide the low-level functions `read_byte`, `read_bytes`, +/// `skip_bytes`, and `read_double`. These primitives are used by the default functions provided +/// here to perform deserialization. +/// +/// [compact]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md +pub(crate) trait ThriftCompactInputProtocol<'a> { + /// Read a single byte from the input. + fn read_byte(&mut self) -> Result; + + /// Read a Thrift encoded [binary] from the input. + /// + /// [binary]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#binary-encoding + fn read_bytes(&mut self) -> Result<&'a [u8]>; + + fn read_bytes_owned(&mut self) -> Result>; + + /// Skip the next `n` bytes of input. + fn skip_bytes(&mut self, n: usize) -> Result<()>; + + /// Read a ULEB128 encoded unsigned varint from the input. + fn read_vlq(&mut self) -> Result { + let mut in_progress = 0; + let mut shift = 0; + loop { + let byte = self.read_byte()?; + in_progress |= ((byte & 0x7F) as u64).wrapping_shl(shift); + if byte & 0x80 == 0 { + return Ok(in_progress); + } + shift += 7; + } + } + + /// Read a zig-zag encoded signed varint from the input. + fn read_zig_zag(&mut self) -> Result { + let val = self.read_vlq()?; + Ok((val >> 1) as i64 ^ -((val & 1) as i64)) + } + + /// Read the [`ListIdentifier`] for a Thrift encoded list. + fn read_list_begin(&mut self) -> Result { + let header = self.read_byte()?; + let element_type = ElementType::try_from(header & 0x0f)?; + + let possible_element_count = (header & 0xF0) >> 4; + let element_count = if possible_element_count != 15 { + // high bits set high if count and type encoded separately + possible_element_count as i32 + } else { + self.read_vlq()? as _ + }; + + Ok(ListIdentifier { + element_type, + size: element_count, + }) + } + + /// Read the [`FieldIdentifier`] for a field in a Thrift encoded struct. + fn read_field_begin(&mut self, last_field_id: i16) -> Result { + // we can read at least one byte, which is: + // - the type + // - the field delta and the type + let field_type = self.read_byte()?; + let field_delta = (field_type & 0xf0) >> 4; + let field_type = FieldType::try_from(field_type & 0xf)?; + let mut bool_val: Option = None; + + match field_type { + FieldType::Stop => Ok(FieldIdentifier { + field_type: FieldType::Stop, + id: 0, + bool_val, + }), + _ => { + // special handling for bools + if field_type == FieldType::BooleanFalse { + bool_val = Some(false); + } else if field_type == FieldType::BooleanTrue { + bool_val = Some(true); + } + let field_id = if field_delta != 0 { + last_field_id.checked_add(field_delta as i16).map_or_else( + || { + Err(general_err!(format!( + "cannot add {} to {}", + field_delta, last_field_id + ))) + }, + Ok, + )? + } else { + self.read_i16()? + }; + + Ok(FieldIdentifier { + field_type, + id: field_id, + bool_val, + }) + } + } + } + + /// This is a specialized version of [`Self::read_field_begin`], solely for use in parsing + /// simple structs. This function assumes that the delta field will always be less than 0xf, + /// fields will be in order, and no boolean fields will be read. + /// This also skips validation of the field type. + /// + /// Returns a tuple of `(field_type, field_delta)`. 
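+    /// For example, a header byte of `0x35` decodes to `(5, 3)`: an i32 field
+    /// (type 5) whose id is three greater than the previous field's id.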
+ fn read_field_header(&mut self) -> Result<(u8, u8)> { + let field_type = self.read_byte()?; + let field_delta = (field_type & 0xf0) >> 4; + let field_type = field_type & 0xf; + Ok((field_type, field_delta)) + } + + /// Read a boolean list element. This should not be used for struct fields. For the latter, + /// use the [`FieldIdentifier::bool_val`] field. + fn read_bool(&mut self) -> Result<bool> { + let b = self.read_byte()?; + // Previous versions of the thrift specification said to use 0 and 1 inside collections, + // but that differed from existing implementations. + // The specification was updated in https://github.com/apache/thrift/commit/2c29c5665bc442e703480bb0ee60fe925ffe02e8. + // At least the Go implementation seems to have followed the previously documented values. + match b { + 0x01 => Ok(true), + 0x00 | 0x02 => Ok(false), + unkn => Err(general_err!(format!("cannot convert {unkn} into bool"))), + } + } + + /// Read a Thrift [binary] as a UTF-8 encoded string. + /// + /// [binary]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#binary-encoding + fn read_string(&mut self) -> Result<&'a str> { + let slice = self.read_bytes()?; + Ok(std::str::from_utf8(slice)?) + } + + /// Read an `i8`. + fn read_i8(&mut self) -> Result<i8> { + Ok(self.read_byte()? as _) + } + + /// Read an `i16`. + fn read_i16(&mut self) -> Result<i16> { + Ok(self.read_zig_zag()? as _) + } + + /// Read an `i32`. + fn read_i32(&mut self) -> Result<i32> { + Ok(self.read_zig_zag()? as _) + } + + /// Read an `i64`. + fn read_i64(&mut self) -> Result<i64> { + self.read_zig_zag() + } + + /// Read a Thrift `double` as `f64`. + fn read_double(&mut self) -> Result<f64>; + + /// Skip a ULEB128 encoded varint. + fn skip_vlq(&mut self) -> Result<()> { + loop { + let byte = self.read_byte()?; + if byte & 0x80 == 0 { + return Ok(()); + } + } + } + + /// Skip a thrift [binary]. + /// + /// [binary]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#binary-encoding + fn skip_binary(&mut self) -> Result<()> { + let len = self.read_vlq()? as usize; + self.skip_bytes(len) + } + + /// Skip a field with type `field_type` recursively until the default + /// maximum skip depth (currently 64) is reached. + fn skip(&mut self, field_type: FieldType) -> Result<()> { + const DEFAULT_SKIP_DEPTH: i8 = 64; + self.skip_till_depth(field_type, DEFAULT_SKIP_DEPTH) + } + + /// Empty structs in unions consist of a single byte of 0 for the field stop record. + /// This skips that byte without incurring the cost of processing the [`FieldIdentifier`]. + /// Will return an error if the struct is not actually empty. + fn skip_empty_struct(&mut self) -> Result<()> { + let b = self.read_byte()?; + if b != 0 { + Err(general_err!("Empty struct has fields")) + } else { + Ok(()) + } + } + + /// Skip a field with type `field_type` recursively up to `depth` levels.
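+    /// Each nested struct or list element consumes one level of `depth`, so
+    /// maliciously deep metadata produces an error rather than unbounded recursion.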
+    fn skip_till_depth(&mut self, field_type: FieldType, depth: i8) -> Result<()> {
+        if depth == 0 {
+            return Err(general_err!(format!("cannot parse past {:?}", field_type)));
+        }
+
+        match field_type {
+            // boolean field has no data
+            FieldType::BooleanFalse | FieldType::BooleanTrue => Ok(()),
+            FieldType::Byte => self.read_i8().map(|_| ()),
+            FieldType::I16 => self.skip_vlq().map(|_| ()),
+            FieldType::I32 => self.skip_vlq().map(|_| ()),
+            FieldType::I64 => self.skip_vlq().map(|_| ()),
+            FieldType::Double => self.skip_bytes(8).map(|_| ()),
+            FieldType::Binary => self.skip_binary().map(|_| ()),
+            FieldType::Struct => {
+                let mut last_field_id = 0i16;
+                loop {
+                    let field_ident = self.read_field_begin(last_field_id)?;
+                    if field_ident.field_type == FieldType::Stop {
+                        break;
+                    }
+                    self.skip_till_depth(field_ident.field_type, depth - 1)?;
+                    last_field_id = field_ident.id;
+                }
+                Ok(())
+            }
+            FieldType::List => {
+                let list_ident = self.read_list_begin()?;
+                for _ in 0..list_ident.size {
+                    let element_type = FieldType::try_from(list_ident.element_type)?;
+                    self.skip_till_depth(element_type, depth - 1)?;
+                }
+                Ok(())
+            }
+            // no set or map types in the parquet format
+            u => Err(general_err!(format!("cannot skip field type {:?}", &u))),
+        }
+    }
+}
+
+/// A high performance Thrift reader that reads from a slice of bytes.
+pub(crate) struct ThriftSliceInputProtocol<'a> {
+    buf: &'a [u8],
+}
+
+impl<'a> ThriftSliceInputProtocol<'a> {
+    /// Create a new `ThriftSliceInputProtocol` using the bytes in `buf`.
+    pub fn new(buf: &'a [u8]) -> Self {
+        Self { buf }
+    }
+
+    /// Return the current buffer as a slice.
+    pub fn as_slice(&self) -> &'a [u8] {
+        self.buf
+    }
+}
+
+impl<'b, 'a: 'b> ThriftCompactInputProtocol<'b> for ThriftSliceInputProtocol<'a> {
+    #[inline]
+    fn read_byte(&mut self) -> Result<u8> {
+        let ret = *self.buf.first().ok_or_else(eof_error)?;
+        self.buf = &self.buf[1..];
+        Ok(ret)
+    }
+
+    fn read_bytes(&mut self) -> Result<&'b [u8]> {
+        let len = self.read_vlq()? as usize;
+        let ret = self.buf.get(..len).ok_or_else(eof_error)?;
+        self.buf = &self.buf[len..];
+        Ok(ret)
+    }
+
+    fn read_bytes_owned(&mut self) -> Result<Vec<u8>> {
+        Ok(self.read_bytes()?.to_vec())
+    }
+
+    #[inline]
+    fn skip_bytes(&mut self, n: usize) -> Result<()> {
+        self.buf.get(..n).ok_or_else(eof_error)?;
+        self.buf = &self.buf[n..];
+        Ok(())
+    }
+
+    fn read_double(&mut self) -> Result<f64> {
+        let slice = self.buf.get(..8).ok_or_else(eof_error)?;
+        self.buf = &self.buf[8..];
+        match slice.try_into() {
+            Ok(slice) => Ok(f64::from_le_bytes(slice)),
+            Err(_) => Err(general_err!("Unexpected error converting slice")),
+        }
+    }
+}
+
+fn eof_error() -> ParquetError {
+    eof_err!("Unexpected EOF")
+}
+
+/// A Thrift input protocol that wraps a [`Read`] object.
+///
+/// Note that this is only intended for use in reading Parquet page headers. This will panic
+/// if Thrift `binary` data is encountered because a slice of that data cannot be returned.
+pub(crate) struct ThriftReadInputProtocol<R: Read> {
+    reader: R,
+}
+
+impl<R: Read> ThriftReadInputProtocol<R> {
+    pub(crate) fn new(reader: R) -> Self {
+        Self { reader }
+    }
+}
+
+impl<'a, R: Read> ThriftCompactInputProtocol<'a> for ThriftReadInputProtocol<R> {
+    #[inline]
+    fn read_byte(&mut self) -> Result<u8> {
+        let mut buf = [0_u8; 1];
+        self.reader.read_exact(&mut buf)?;
+        Ok(buf[0])
+    }
+
+    fn read_bytes(&mut self) -> Result<&'a [u8]> {
+        unimplemented!()
+    }
+
+    fn read_bytes_owned(&mut self) -> Result<Vec<u8>> {
+        let len = self.read_vlq()? as usize;
+        let mut v = Vec::with_capacity(len);
+        std::io::copy(&mut self.reader.by_ref().take(len as u64), &mut v)?;
+        Ok(v)
+    }
+
+    fn skip_bytes(&mut self, n: usize) -> Result<()> {
+        std::io::copy(
+            &mut self.reader.by_ref().take(n as u64),
+            &mut std::io::sink(),
+        )?;
+        Ok(())
+    }
+
+    fn read_double(&mut self) -> Result<f64> {
+        let mut buf = [0_u8; 8];
+        self.reader.read_exact(&mut buf)?;
+        Ok(f64::from_le_bytes(buf))
+    }
+}
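+
+// A short example (editor's sketch, not part of the crate's API surface): reading a
+// zig-zag encoded i32 with the slice reader defined above. zig-zag(42) is 84 (0x54),
+// which fits in a single varint byte.
+//
+//     let mut prot = ThriftSliceInputProtocol::new(&[0x54]);
+//     assert_eq!(prot.read_i32().unwrap(), 42);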
+/// Trait implemented for objects that can be deserialized from a Thrift input stream.
+/// Implementations are provided for Thrift primitive types.
+pub(crate) trait ReadThrift<'a, R: ThriftCompactInputProtocol<'a>> {
+    /// Read an object of type `Self` from the input protocol object.
+    fn read_thrift(prot: &mut R) -> Result<Self>
+    where
+        Self: Sized;
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for bool {
+    fn read_thrift(prot: &mut R) -> Result<bool> {
+        prot.read_bool()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for i8 {
+    fn read_thrift(prot: &mut R) -> Result<i8> {
+        prot.read_i8()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for i16 {
+    fn read_thrift(prot: &mut R) -> Result<i16> {
+        prot.read_i16()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for i32 {
+    fn read_thrift(prot: &mut R) -> Result<i32> {
+        prot.read_i32()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for i64 {
+    fn read_thrift(prot: &mut R) -> Result<i64> {
+        prot.read_i64()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for OrderedF64 {
+    fn read_thrift(prot: &mut R) -> Result<OrderedF64> {
+        Ok(OrderedF64(prot.read_double()?))
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for &'a str {
+    fn read_thrift(prot: &mut R) -> Result<&'a str> {
+        prot.read_string()
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for String {
+    fn read_thrift(prot: &mut R) -> Result<String> {
+        Ok(String::from_utf8(prot.read_bytes_owned()?)?)
+    }
+}
+
+impl<'a, R: ThriftCompactInputProtocol<'a>> ReadThrift<'a, R> for &'a [u8] {
+    fn read_thrift(prot: &mut R) -> Result<&'a [u8]> {
+        prot.read_bytes()
+    }
+}
+
+/// Read a Thrift encoded [list] from the input protocol object.
+///
+/// [list]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set
+pub(crate) fn read_thrift_vec<'a, T, R>(prot: &mut R) -> Result<Vec<T>>
+where
+    R: ThriftCompactInputProtocol<'a>,
+    T: ReadThrift<'a, R>,
+{
+    let list_ident = prot.read_list_begin()?;
+    let mut res = Vec::with_capacity(list_ident.size as usize);
+    for _ in 0..list_ident.size {
+        let val = T::read_thrift(prot)?;
+        res.push(val);
+    }
+    Ok(res)
+}
+
+/////////////////////////
+// thrift compact output
+
+/// Low-level object used to serialize structs to the Thrift [compact output] protocol.
+///
+/// This struct serves as a wrapper around a [`Write`] object, to which Thrift encoded data
+/// will be written. The implementation provides functions to write Thrift primitive types, as
+/// well as functions used in the encoding of lists and structs. This should rarely be used
+/// directly, but is instead intended for use by implementers of [`WriteThrift`] and
+/// [`WriteThriftField`].
+///
+/// [compact output]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
+pub(crate) struct ThriftCompactOutputProtocol<W: Write> {
+    writer: W,
+}
+
+impl<W: Write> ThriftCompactOutputProtocol<W> {
+    /// Create a new `ThriftCompactOutputProtocol` wrapping the byte sink `writer`.
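+    ///
+    /// A minimal sketch (any [`Write`] impl can serve as the sink):
+    /// `let mut prot = ThriftCompactOutputProtocol::new(Vec::<u8>::new());`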
+    pub(crate) fn new(writer: W) -> Self {
+        Self { writer }
+    }
+
+    /// Write a single byte to the output stream.
+    fn write_byte(&mut self, b: u8) -> Result<()> {
+        self.writer.write_all(&[b])?;
+        Ok(())
+    }
+
+    /// Write the given `u64` as a ULEB128 encoded varint.
+    fn write_vlq(&mut self, val: u64) -> Result<()> {
+        let mut v = val;
+        while v > 0x7f {
+            self.write_byte(v as u8 | 0x80)?;
+            v >>= 7;
+        }
+        self.write_byte(v as u8)
+    }
+
+    /// Write the given `i64` as a zig-zag encoded varint.
+    fn write_zig_zag(&mut self, val: i64) -> Result<()> {
+        let s = (val < 0) as i64;
+        self.write_vlq((((val ^ -s) << 1) + s) as u64)
+    }
+
+    /// Used to mark the start of a Thrift struct field of type `field_type`. `last_field_id`
+    /// is used to compute a delta to the given `field_id` per the compact protocol [spec].
+    ///
+    /// [spec]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#struct-encoding
+    pub(crate) fn write_field_begin(
+        &mut self,
+        field_type: FieldType,
+        field_id: i16,
+        last_field_id: i16,
+    ) -> Result<()> {
+        let delta = field_id.wrapping_sub(last_field_id);
+        if delta > 0 && delta <= 0xf {
+            self.write_byte((delta as u8) << 4 | field_type as u8)
+        } else {
+            self.write_byte(field_type as u8)?;
+            self.write_i16(field_id)
+        }
+    }
+
+    /// Used to indicate the start of a list of `element_type` elements.
+    pub(crate) fn write_list_begin(&mut self, element_type: ElementType, len: usize) -> Result<()> {
+        if len < 15 {
+            self.write_byte((len as u8) << 4 | element_type as u8)
+        } else {
+            self.write_byte(0xf0u8 | element_type as u8)?;
+            self.write_vlq(len as _)
+        }
+    }
+
+    /// Used to mark the end of a struct. This must be called after all fields of the struct have
+    /// been written.
+    pub(crate) fn write_struct_end(&mut self) -> Result<()> {
+        self.write_byte(0)
+    }
+
+    /// Serialize a slice of `u8`s. This will encode a length, and then write the bytes without
+    /// further encoding.
+    pub(crate) fn write_bytes(&mut self, val: &[u8]) -> Result<()> {
+        self.write_vlq(val.len() as u64)?;
+        self.writer.write_all(val)?;
+        Ok(())
+    }
+
+    /// Short-cut method used to encode structs that have no fields (often used in Thrift unions).
+    /// This simply encodes the field id and then immediately writes the end-of-struct marker.
+    pub(crate) fn write_empty_struct(&mut self, field_id: i16, last_field_id: i16) -> Result<i16> {
+        self.write_field_begin(FieldType::Struct, field_id, last_field_id)?;
+        self.write_struct_end()?;
+        Ok(last_field_id)
+    }
+
+    /// Write a boolean list element, encoded as `1` for true and `2` for false. Boolean struct
+    /// fields are instead encoded in the field header.
+    pub(crate) fn write_bool(&mut self, val: bool) -> Result<()> {
+        match val {
+            true => self.write_byte(1),
+            false => self.write_byte(2),
+        }
+    }
+
+    /// Write an `i8` value as a single byte.
+    pub(crate) fn write_i8(&mut self, val: i8) -> Result<()> {
+        self.write_byte(val as u8)
+    }
+
+    /// Write a zig-zag encoded `i16` value.
+    pub(crate) fn write_i16(&mut self, val: i16) -> Result<()> {
+        self.write_zig_zag(val as _)
+    }
+
+    /// Write a zig-zag encoded `i32` value.
+    pub(crate) fn write_i32(&mut self, val: i32) -> Result<()> {
+        self.write_zig_zag(val as _)
+    }
+
+    /// Write a zig-zag encoded `i64` value.
+    pub(crate) fn write_i64(&mut self, val: i64) -> Result<()> {
+        self.write_zig_zag(val as _)
+    }
+
+    /// Write a double value.
+    pub(crate) fn write_double(&mut self, val: f64) -> Result<()> {
+        self.writer.write_all(&val.to_le_bytes())?;
+        Ok(())
+    }
+}
+
+/// Trait implemented by objects that are to be serialized to a Thrift [compact output] protocol
+/// stream.
Implementations are also provided for primitive Thrift types. +/// +/// [compact output]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md +pub(crate) trait WriteThrift { + /// The [`ElementType`] to use when a list of this object is written. + const ELEMENT_TYPE: ElementType; + + /// Serialize this object to the given `writer`. + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()>; +} + +/// Implementation for a vector of thrift serializable objects that implement [`WriteThrift`]. +/// This will write the necessary list header and then serialize the elements one-at-a-time. +impl WriteThrift for Vec +where + T: WriteThrift, +{ + const ELEMENT_TYPE: ElementType = ElementType::List; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_list_begin(T::ELEMENT_TYPE, self.len())?; + for item in self { + item.write_thrift(writer)?; + } + Ok(()) + } +} + +impl WriteThrift for bool { + const ELEMENT_TYPE: ElementType = ElementType::Bool; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_bool(*self) + } +} + +impl WriteThrift for i8 { + const ELEMENT_TYPE: ElementType = ElementType::Byte; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_i8(*self) + } +} + +impl WriteThrift for i16 { + const ELEMENT_TYPE: ElementType = ElementType::I16; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_i16(*self) + } +} + +impl WriteThrift for i32 { + const ELEMENT_TYPE: ElementType = ElementType::I32; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_i32(*self) + } +} + +impl WriteThrift for i64 { + const ELEMENT_TYPE: ElementType = ElementType::I64; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_i64(*self) + } +} + +impl WriteThrift for OrderedF64 { + const ELEMENT_TYPE: ElementType = ElementType::Double; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_double(self.0) + } +} + +impl WriteThrift for f64 { + const ELEMENT_TYPE: ElementType = ElementType::Double; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_double(*self) + } +} + +impl WriteThrift for &[u8] { + const ELEMENT_TYPE: ElementType = ElementType::Binary; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_bytes(self) + } +} + +impl WriteThrift for &str { + const ELEMENT_TYPE: ElementType = ElementType::Binary; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_bytes(self.as_bytes()) + } +} + +impl WriteThrift for String { + const ELEMENT_TYPE: ElementType = ElementType::Binary; + + fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { + writer.write_bytes(self.as_bytes()) + } +} + +/// Trait implemented by objects that are fields of Thrift structs. 
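+/// A field impl writes the compact protocol field header (field type plus field id delta)
+/// followed, except for booleans, by the value encoded as in [`WriteThrift`].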
+/// +/// For example, given the Thrift struct definition +/// ```ignore +/// struct MyStruct { +/// 1: required i32 field1 +/// 2: optional bool field2 +/// 3: optional OtherStruct field3 +/// } +/// ``` +/// +/// which becomes in Rust +/// ```no_run +/// # struct OtherStruct {} +/// struct MyStruct { +/// field1: i32, +/// field2: Option, +/// field3: Option, +/// } +/// ``` +/// the impl of `WriteThrift` for `MyStruct` will use the `WriteThriftField` impls for `i32`, +/// `bool`, and `OtherStruct`. +/// +/// ```ignore +/// impl WriteThrift for MyStruct { +/// fn write_thrift(&self, writer: &mut ThriftCompactOutputProtocol) -> Result<()> { +/// let mut last_field_id = 0i16; +/// last_field_id = self.field1.write_thrift_field(writer, 1, last_field_id)?; +/// if self.field2.is_some() { +/// // if field2 is `None` then this assignment won't happen and last_field_id will remain +/// // `1` when writing `field3` +/// last_field_id = self.field2.write_thrift_field(writer, 2, last_field_id)?; +/// } +/// if self.field3.is_some() { +/// // no need to assign last_field_id since this is the final field. +/// self.field3.write_thrift_field(writer, 3, last_field_id)?; +/// } +/// writer.write_struct_end() +/// } +/// } +/// ``` +/// +pub(crate) trait WriteThriftField { + /// Used to write struct fields (which may be primitive or IDL defined types). This will + /// write the field marker for the given `field_id`, using `last_field_id` to compute the + /// field delta used by the Thrift [compact protocol]. On success this will return `field_id` + /// to be used in chaining. + /// + /// [compact protocol]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#struct-encoding + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result; +} + +impl WriteThriftField for bool { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + // boolean only writes the field header + match *self { + true => writer.write_field_begin(FieldType::BooleanTrue, field_id, last_field_id)?, + false => writer.write_field_begin(FieldType::BooleanFalse, field_id, last_field_id)?, + } + Ok(field_id) + } +} + +impl WriteThriftField for i8 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Byte, field_id, last_field_id)?; + writer.write_i8(*self)?; + Ok(field_id) + } +} + +impl WriteThriftField for i16 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I16, field_id, last_field_id)?; + writer.write_i16(*self)?; + Ok(field_id) + } +} + +impl WriteThriftField for i32 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I32, field_id, last_field_id)?; + writer.write_i32(*self)?; + Ok(field_id) + } +} + +impl WriteThriftField for i64 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::I64, field_id, last_field_id)?; + writer.write_i64(*self)?; + Ok(field_id) + } +} + +impl WriteThriftField for OrderedF64 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + 
field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Double, field_id, last_field_id)?; + writer.write_double(self.0)?; + Ok(field_id) + } +} + +impl WriteThriftField for f64 { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Double, field_id, last_field_id)?; + writer.write_double(*self)?; + Ok(field_id) + } +} + +impl WriteThriftField for &[u8] { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Binary, field_id, last_field_id)?; + writer.write_bytes(self)?; + Ok(field_id) + } +} + +impl WriteThriftField for &str { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Binary, field_id, last_field_id)?; + writer.write_bytes(self.as_bytes())?; + Ok(field_id) + } +} + +impl WriteThriftField for String { + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::Binary, field_id, last_field_id)?; + writer.write_bytes(self.as_bytes())?; + Ok(field_id) + } +} + +impl WriteThriftField for Vec +where + T: WriteThrift, +{ + fn write_thrift_field( + &self, + writer: &mut ThriftCompactOutputProtocol, + field_id: i16, + last_field_id: i16, + ) -> Result { + writer.write_field_begin(FieldType::List, field_id, last_field_id)?; + self.write_thrift(writer)?; + Ok(field_id) + } +} + +#[cfg(test)] +pub(crate) mod tests { + use crate::basic::{TimeUnit, Type}; + + use super::*; + use std::fmt::Debug; + + pub(crate) fn test_roundtrip(val: T) + where + T: for<'a> ReadThrift<'a, ThriftSliceInputProtocol<'a>> + WriteThrift + PartialEq + Debug, + { + let mut buf = Vec::::new(); + { + let mut writer = ThriftCompactOutputProtocol::new(&mut buf); + val.write_thrift(&mut writer).unwrap(); + } + + let mut prot = ThriftSliceInputProtocol::new(&buf); + let read_val = T::read_thrift(&mut prot).unwrap(); + assert_eq!(val, read_val); + } + + #[test] + fn test_enum_roundtrip() { + test_roundtrip(Type::BOOLEAN); + test_roundtrip(Type::INT32); + test_roundtrip(Type::INT64); + test_roundtrip(Type::INT96); + test_roundtrip(Type::FLOAT); + test_roundtrip(Type::DOUBLE); + test_roundtrip(Type::BYTE_ARRAY); + test_roundtrip(Type::FIXED_LEN_BYTE_ARRAY); + } + + #[test] + fn test_union_all_empty_roundtrip() { + test_roundtrip(TimeUnit::MILLIS); + test_roundtrip(TimeUnit::MICROS); + test_roundtrip(TimeUnit::NANOS); + } +} diff --git a/parquet/src/schema/parser.rs b/parquet/src/schema/parser.rs index 0a67250476c7..700be8a15fd6 100644 --- a/parquet/src/schema/parser.rs +++ b/parquet/src/schema/parser.rs @@ -178,9 +178,9 @@ fn parse_timeunit( value .ok_or_else(|| general_err!(not_found_msg)) .and_then(|v| match v.to_uppercase().as_str() { - "MILLIS" => Ok(TimeUnit::MILLIS(Default::default())), - "MICROS" => Ok(TimeUnit::MICROS(Default::default())), - "NANOS" => Ok(TimeUnit::NANOS(Default::default())), + "MILLIS" => Ok(TimeUnit::MILLIS), + "MICROS" => Ok(TimeUnit::MICROS), + "NANOS" => Ok(TimeUnit::NANOS), _ => Err(general_err!(parse_fail_msg)), }) } @@ -1075,7 +1075,7 @@ mod tests { Arc::new( Type::primitive_type_builder("_6", PhysicalType::INT32) .with_logical_type(Some(LogicalType::Time { - unit: TimeUnit::MILLIS(Default::default()), 
+ unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: false, })) .build() @@ -1084,7 +1084,7 @@ mod tests { Arc::new( Type::primitive_type_builder("_7", PhysicalType::INT64) .with_logical_type(Some(LogicalType::Time { - unit: TimeUnit::MICROS(Default::default()), + unit: TimeUnit::MICROS, is_adjusted_to_u_t_c: true, })) .build() @@ -1093,7 +1093,7 @@ mod tests { Arc::new( Type::primitive_type_builder("_8", PhysicalType::INT64) .with_logical_type(Some(LogicalType::Timestamp { - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: true, })) .build() @@ -1102,7 +1102,7 @@ mod tests { Arc::new( Type::primitive_type_builder("_9", PhysicalType::INT64) .with_logical_type(Some(LogicalType::Timestamp { - unit: TimeUnit::NANOS(Default::default()), + unit: TimeUnit::NANOS, is_adjusted_to_u_t_c: false, })) .build() diff --git a/parquet/src/schema/printer.rs b/parquet/src/schema/printer.rs index 5ef068da915b..0cc5df59f329 100644 --- a/parquet/src/schema/printer.rs +++ b/parquet/src/schema/printer.rs @@ -277,9 +277,9 @@ impl<'a> Printer<'a> { #[inline] fn print_timeunit(unit: &TimeUnit) -> &str { match unit { - TimeUnit::MILLIS(_) => "MILLIS", - TimeUnit::MICROS(_) => "MICROS", - TimeUnit::NANOS(_) => "NANOS", + TimeUnit::MILLIS => "MILLIS", + TimeUnit::MICROS => "MICROS", + TimeUnit::NANOS => "NANOS", } } @@ -326,10 +326,26 @@ fn print_logical_and_converted( LogicalType::List => "LIST".to_string(), LogicalType::Map => "MAP".to_string(), LogicalType::Float16 => "FLOAT16".to_string(), - LogicalType::Variant => "VARIANT".to_string(), - LogicalType::Geometry => "GEOMETRY".to_string(), - LogicalType::Geography => "GEOGRAPHY".to_string(), + LogicalType::Variant { + specification_version, + } => format!("VARIANT({specification_version:?})"), + LogicalType::Geometry { crs } => { + if let Some(crs) = crs { + format!("GEOMETRY({crs})") + } else { + "GEOMETRY".to_string() + } + } + LogicalType::Geography { crs, algorithm } => { + let algorithm = algorithm.unwrap_or_default(); + if let Some(crs) = crs { + format!("GEOGRAPHY({algorithm}, {crs})") + } else { + format!("GEOGRAPHY({algorithm})") + } + } LogicalType::Unknown => "UNKNOWN".to_string(), + LogicalType::_Unknown { field_id } => format!("_Unknown({field_id})"), }, None => { // Also print converted type if it is available @@ -449,7 +465,7 @@ mod tests { use std::sync::Arc; - use crate::basic::{Repetition, Type as PhysicalType}; + use crate::basic::{EdgeInterpolationAlgorithm, Repetition, Type as PhysicalType}; use crate::errors::Result; use crate::schema::parser::parse_message_type; @@ -645,7 +661,7 @@ mod tests { PhysicalType::INT64, Some(LogicalType::Timestamp { is_adjusted_to_u_t_c: true, - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, }), ConvertedType::NONE, Repetition::REQUIRED, @@ -671,7 +687,7 @@ mod tests { None, PhysicalType::INT32, Some(LogicalType::Time { - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: false, }), ConvertedType::TIME_MILLIS, @@ -686,7 +702,7 @@ mod tests { Some(42), PhysicalType::INT32, Some(LogicalType::Time { - unit: TimeUnit::MILLIS(Default::default()), + unit: TimeUnit::MILLIS, is_adjusted_to_u_t_c: false, }), ConvertedType::TIME_MILLIS, @@ -779,6 +795,62 @@ mod tests { .unwrap(), "REQUIRED BYTE_ARRAY field [42] (STRING);", ), + ( + build_primitive_type( + "field", + None, + PhysicalType::BYTE_ARRAY, + Some(LogicalType::Geometry { crs: None }), + ConvertedType::NONE, + Repetition::REQUIRED, + ) + .unwrap(), + "REQUIRED 
BYTE_ARRAY field (GEOMETRY);", + ), + ( + build_primitive_type( + "field", + None, + PhysicalType::BYTE_ARRAY, + Some(LogicalType::Geometry { + crs: Some("non-missing CRS".to_string()), + }), + ConvertedType::NONE, + Repetition::REQUIRED, + ) + .unwrap(), + "REQUIRED BYTE_ARRAY field (GEOMETRY(non-missing CRS));", + ), + ( + build_primitive_type( + "field", + None, + PhysicalType::BYTE_ARRAY, + Some(LogicalType::Geography { + crs: None, + algorithm: Some(EdgeInterpolationAlgorithm::default()), + }), + ConvertedType::NONE, + Repetition::REQUIRED, + ) + .unwrap(), + "REQUIRED BYTE_ARRAY field (GEOGRAPHY(SPHERICAL));", + ), + ( + build_primitive_type( + "field", + None, + PhysicalType::BYTE_ARRAY, + Some(LogicalType::Geography { + crs: Some("non-missing CRS".to_string()), + algorithm: Some(EdgeInterpolationAlgorithm::default()), + }), + ConvertedType::NONE, + Repetition::REQUIRED, + ) + .unwrap(), + "REQUIRED BYTE_ARRAY field (GEOGRAPHY(SPHERICAL, non-missing CRS));", + ), ]; types_and_strings.into_iter().for_each(|(field, expected)| { diff --git a/parquet/src/schema/types.rs b/parquet/src/schema/types.rs index 2f6131571ecc..9629e17b4752 100644 --- a/parquet/src/schema/types.rs +++ b/parquet/src/schema/types.rs @@ -17,10 +17,11 @@ //! Contains structs and methods to build Parquet schema and schema descriptors. +use std::vec::IntoIter; use std::{collections::HashMap, fmt, sync::Arc}; +use crate::file::metadata::thrift_gen::SchemaElement; use crate::file::metadata::HeapSize; -use crate::format::SchemaElement; use crate::basic::{ ColumnOrder, ConvertedType, LogicalType, Repetition, SortOrder, TimeUnit, Type as PhysicalType, @@ -378,13 +379,13 @@ impl<'a> PrimitiveTypeBuilder<'a> { (LogicalType::Date, PhysicalType::INT32) => {} ( LogicalType::Time { - unit: TimeUnit::MILLIS(_), + unit: TimeUnit::MILLIS, .. }, PhysicalType::INT32, ) => {} (LogicalType::Time { unit, .. }, PhysicalType::INT64) => { - if *unit == TimeUnit::MILLIS(Default::default()) { + if *unit == TimeUnit::MILLIS { return Err(general_err!( "Cannot use millisecond unit on INT64 type for field '{}'", self.name @@ -401,8 +402,8 @@ impl<'a> PrimitiveTypeBuilder<'a> { (LogicalType::String, PhysicalType::BYTE_ARRAY) => {} (LogicalType::Json, PhysicalType::BYTE_ARRAY) => {} (LogicalType::Bson, PhysicalType::BYTE_ARRAY) => {} - (LogicalType::Geometry, PhysicalType::BYTE_ARRAY) => {} - (LogicalType::Geography, PhysicalType::BYTE_ARRAY) => {} + (LogicalType::Geometry { .. }, PhysicalType::BYTE_ARRAY) => {} + (LogicalType::Geography { .. }, PhysicalType::BYTE_ARRAY) => {} (LogicalType::Uuid, PhysicalType::FIXED_LEN_BYTE_ARRAY) if self.length == 16 => {} (LogicalType::Uuid, PhysicalType::FIXED_LEN_BYTE_ARRAY) => { return Err(general_err!( @@ -1029,11 +1030,15 @@ impl HeapSize for SchemaDescriptor { impl SchemaDescriptor { /// Creates new schema descriptor from Parquet schema. 
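+    ///
+    /// A minimal usage sketch (assuming `root` is a parsed message type):
+    /// `let descr = SchemaDescriptor::new(Arc::new(root));`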
pub fn new(tp: TypePtr) -> Self { + const INIT_SCHEMA_DEPTH: usize = 16; assert!(tp.is_group(), "SchemaDescriptor should take a GroupType"); - let mut leaves = vec![]; - let mut leaf_to_base = Vec::new(); + // unwrap should be safe since we just asserted tp is a group + let n_leaves = num_leaves(&tp).unwrap(); + let mut leaves = Vec::with_capacity(n_leaves); + let mut leaf_to_base = Vec::with_capacity(n_leaves); + let mut path = Vec::with_capacity(INIT_SCHEMA_DEPTH); for (root_idx, f) in tp.get_fields().iter().enumerate() { - let mut path = vec![]; + path.clear(); build_tree(f, root_idx, 0, 0, &mut leaves, &mut leaf_to_base, &mut path); } @@ -1112,6 +1117,50 @@ impl SchemaDescriptor { } } +// walk tree and count nodes +pub(crate) fn num_nodes(tp: &TypePtr) -> Result { + if !tp.is_group() { + return Err(general_err!("Root schema must be Group type")); + } + let mut n_nodes = 1usize; // count root + for f in tp.get_fields().iter() { + count_nodes(f, &mut n_nodes); + } + Ok(n_nodes) +} + +pub(crate) fn count_nodes(tp: &TypePtr, n_nodes: &mut usize) { + *n_nodes += 1; + if let Type::GroupType { ref fields, .. } = tp.as_ref() { + for f in fields { + count_nodes(f, n_nodes); + } + } +} + +// do a quick walk of the tree to get proper sizing for SchemaDescriptor arrays +fn num_leaves(tp: &TypePtr) -> Result { + if !tp.is_group() { + return Err(general_err!("Root schema must be Group type")); + } + let mut n_leaves = 0usize; + for f in tp.get_fields().iter() { + count_leaves(f, &mut n_leaves); + } + Ok(n_leaves) +} + +fn count_leaves(tp: &TypePtr, n_leaves: &mut usize) { + match tp.as_ref() { + Type::PrimitiveType { .. } => *n_leaves += 1, + Type::GroupType { ref fields, .. } => { + for f in fields { + count_leaves(f, n_leaves); + } + } + } +} + fn build_tree<'a>( tp: &'a TypePtr, root_idx: usize, @@ -1164,12 +1213,30 @@ fn build_tree<'a>( } } -/// Method to convert from Thrift. -pub fn from_thrift(elements: &[SchemaElement]) -> Result { +/// Checks if the logical type is valid. +fn check_logical_type(logical_type: &Option) -> Result<()> { + if let Some(LogicalType::Integer { bit_width, .. }) = *logical_type { + if bit_width != 8 && bit_width != 16 && bit_width != 32 && bit_width != 64 { + return Err(general_err!( + "Bit width must be 8, 16, 32, or 64 for Integer logical type" + )); + } + } + Ok(()) +} + +// convert thrift decoded array of `SchemaElement` into this crate's representation of +// parquet types. this function consumes `elements`. +pub(crate) fn parquet_schema_from_array<'a>(elements: Vec>) -> Result { let mut index = 0; - let mut schema_nodes = Vec::new(); - while index < elements.len() { - let t = from_thrift_helper(elements, index)?; + let num_elements = elements.len(); + let mut schema_nodes = Vec::with_capacity(1); // there should only be one element when done + + // turn into iterator so we can take ownership of elements of the vector + let mut elements = elements.into_iter(); + + while index < num_elements { + let t = schema_from_array_helper(&mut elements, num_elements, index)?; index = t.0; schema_nodes.push(t.1); } @@ -1187,54 +1254,40 @@ pub fn from_thrift(elements: &[SchemaElement]) -> Result { Ok(schema_nodes.remove(0)) } -/// Checks if the logical type is valid. -fn check_logical_type(logical_type: &Option) -> Result<()> { - if let Some(LogicalType::Integer { bit_width, .. 
}) = *logical_type {
-        if bit_width != 8 && bit_width != 16 && bit_width != 32 && bit_width != 64 {
-            return Err(general_err!(
-                "Bit width must be 8, 16, 32, or 64 for Integer logical type"
-            ));
-        }
-    }
-    Ok(())
-}
-
-/// Constructs a new Type from the `elements`, starting at index `index`.
-/// The first result is the starting index for the next Type after this one. If it is
-/// equal to `elements.len()`, then this Type is the last one.
-/// The second result is the result Type.
-fn from_thrift_helper(elements: &[SchemaElement], index: usize) -> Result<(usize, TypePtr)> {
+// recursive helper function for schema conversion
+fn schema_from_array_helper<'a>(
+    elements: &mut IntoIter<SchemaElement<'a>>,
+    num_elements: usize,
+    index: usize,
+) -> Result<(usize, TypePtr)> {
     // Whether or not the current node is root (message type).
     // There is only one message type node in the schema tree.
     let is_root_node = index == 0;
-    if index >= elements.len() {
+    if index >= num_elements {
         return Err(general_err!(
             "Index out of bound, index = {}, len = {}",
             index,
-            elements.len()
+            num_elements
         ));
     }
-    let element = &elements[index];
+    let element = elements.next().expect("schema vector should not be empty");
     // Check for empty schema
     if let (true, None | Some(0)) = (is_root_node, element.num_children) {
-        let builder = Type::group_type_builder(&element.name);
+        let builder = Type::group_type_builder(element.name);
         return Ok((index + 1, Arc::new(builder.build().unwrap())));
    }
-    let converted_type = ConvertedType::try_from(element.converted_type)?;
-    // LogicalType is only present in v2 Parquet files. ConvertedType is always
-    // populated, regardless of the version of the file (v1 or v2).
-    let logical_type = element
-        .logical_type
-        .as_ref()
-        .map(|value| LogicalType::from(value.clone()));
+    let converted_type = element.converted_type.unwrap_or(ConvertedType::NONE);
+
+    // LogicalType is preferred to ConvertedType, but both may be present.
+    let logical_type = element.logical_type;
     check_logical_type(&logical_type)?;
-    let field_id = elements[index].field_id;
-    match elements[index].num_children {
+    let field_id = element.field_id;
+    match element.num_children {
         // From parquet-format:
         // The children count is used to construct the nested relationship.
         // This field is not set when the element is a primitive type
@@ -1242,18 +1295,17 @@
         // have to handle this case too.
None | Some(0) => { // primitive type - if elements[index].repetition_type.is_none() { + if element.repetition_type.is_none() { return Err(general_err!( "Repetition level must be defined for a primitive type" )); } - let repetition = Repetition::try_from(elements[index].repetition_type.unwrap())?; - if let Some(type_) = elements[index].type_ { - let physical_type = PhysicalType::try_from(type_)?; - let length = elements[index].type_length.unwrap_or(-1); - let scale = elements[index].scale.unwrap_or(-1); - let precision = elements[index].precision.unwrap_or(-1); - let name = &elements[index].name; + let repetition = element.repetition_type.unwrap(); + if let Some(physical_type) = element.r#type { + let length = element.type_length.unwrap_or(-1); + let scale = element.scale.unwrap_or(-1); + let precision = element.precision.unwrap_or(-1); + let name = element.name; let builder = Type::primitive_type_builder(name, physical_type) .with_repetition(repetition) .with_converted_type(converted_type) @@ -1264,7 +1316,7 @@ fn from_thrift_helper(elements: &[SchemaElement], index: usize) -> Result<(usize .with_id(field_id); Ok((index + 1, Arc::new(builder.build()?))) } else { - let mut builder = Type::group_type_builder(&elements[index].name) + let mut builder = Type::group_type_builder(element.name) .with_converted_type(converted_type) .with_logical_type(logical_type) .with_id(field_id); @@ -1282,20 +1334,17 @@ fn from_thrift_helper(elements: &[SchemaElement], index: usize) -> Result<(usize } } Some(n) => { - let repetition = elements[index] - .repetition_type - .map(Repetition::try_from) - .transpose()?; + let repetition = element.repetition_type; - let mut fields = vec![]; + let mut fields = Vec::with_capacity(n as usize); let mut next_index = index + 1; for _ in 0..n { - let child_result = from_thrift_helper(elements, next_index)?; + let child_result = schema_from_array_helper(elements, num_elements, next_index)?; next_index = child_result.0; fields.push(child_result.1); } - let mut builder = Type::group_type_builder(&elements[index].name) + let mut builder = Type::group_type_builder(element.name) .with_converted_type(converted_type) .with_logical_type(logical_type) .with_fields(fields) @@ -1317,96 +1366,14 @@ fn from_thrift_helper(elements: &[SchemaElement], index: usize) -> Result<(usize } } -/// Method to convert to Thrift. -pub fn to_thrift(schema: &Type) -> Result> { - if !schema.is_group() { - return Err(general_err!("Root schema must be Group type")); - } - let mut elements: Vec = Vec::new(); - to_thrift_helper(schema, &mut elements); - Ok(elements) -} - -/// Constructs list of `SchemaElement` from the schema using depth-first traversal. -/// Here we assume that schema is always valid and starts with group type. 
-fn to_thrift_helper(schema: &Type, elements: &mut Vec) { - match *schema { - Type::PrimitiveType { - ref basic_info, - physical_type, - type_length, - scale, - precision, - } => { - let element = SchemaElement { - type_: Some(physical_type.into()), - type_length: if type_length >= 0 { - Some(type_length) - } else { - None - }, - repetition_type: Some(basic_info.repetition().into()), - name: basic_info.name().to_owned(), - num_children: None, - converted_type: basic_info.converted_type().into(), - scale: if scale >= 0 { Some(scale) } else { None }, - precision: if precision >= 0 { - Some(precision) - } else { - None - }, - field_id: if basic_info.has_id() { - Some(basic_info.id()) - } else { - None - }, - logical_type: basic_info.logical_type().map(|value| value.into()), - }; - - elements.push(element); - } - Type::GroupType { - ref basic_info, - ref fields, - } => { - let repetition = if basic_info.has_repetition() { - Some(basic_info.repetition().into()) - } else { - None - }; - - let element = SchemaElement { - type_: None, - type_length: None, - repetition_type: repetition, - name: basic_info.name().to_owned(), - num_children: Some(fields.len() as i32), - converted_type: basic_info.converted_type().into(), - scale: None, - precision: None, - field_id: if basic_info.has_id() { - Some(basic_info.id()) - } else { - None - }, - logical_type: basic_info.logical_type().map(|value| value.into()), - }; - - elements.push(element); - - // Add child elements for a group - for field in fields { - to_thrift_helper(field, elements); - } - } - } -} - #[cfg(test)] mod tests { use super::*; - use crate::schema::parser::parse_message_type; + use crate::{ + file::metadata::thrift_gen::tests::{buf_to_schema_list, roundtrip_schema, schema_to_buf}, + schema::parser::parse_message_type, + }; // TODO: add tests for v2 types @@ -2205,7 +2172,8 @@ mod tests { let schema = Type::primitive_type_builder("col", PhysicalType::INT32) .build() .unwrap(); - let thrift_schema = to_thrift(&schema); + let schema = Arc::new(schema); + let thrift_schema = schema_to_buf(&schema); assert!(thrift_schema.is_err()); if let Err(e) = thrift_schema { assert_eq!( @@ -2265,8 +2233,7 @@ mod tests { } "; let expected_schema = parse_message_type(message_type).unwrap(); - let thrift_schema = to_thrift(&expected_schema).unwrap(); - let result_schema = from_thrift(&thrift_schema).unwrap(); + let result_schema = roundtrip_schema(Arc::new(expected_schema.clone())).unwrap(); assert_eq!(result_schema, Arc::new(expected_schema)); } @@ -2281,8 +2248,7 @@ mod tests { } "; let expected_schema = parse_message_type(message_type).unwrap(); - let thrift_schema = to_thrift(&expected_schema).unwrap(); - let result_schema = from_thrift(&thrift_schema).unwrap(); + let result_schema = roundtrip_schema(Arc::new(expected_schema.clone())).unwrap(); assert_eq!(result_schema, Arc::new(expected_schema)); } @@ -2302,8 +2268,10 @@ mod tests { } "; - let expected_schema = parse_message_type(message_type).unwrap(); - let mut thrift_schema = to_thrift(&expected_schema).unwrap(); + let expected_schema = Arc::new(parse_message_type(message_type).unwrap()); + let mut buf = schema_to_buf(&expected_schema).unwrap(); + let mut thrift_schema = buf_to_schema_list(&mut buf).unwrap(); + // Change all of None to Some(0) for elem in &mut thrift_schema[..] 
{ if elem.num_children.is_none() { @@ -2311,8 +2279,8 @@ mod tests { } } - let result_schema = from_thrift(&thrift_schema).unwrap(); - assert_eq!(result_schema, Arc::new(expected_schema)); + let result_schema = parquet_schema_from_array(thrift_schema).unwrap(); + assert_eq!(result_schema, expected_schema); } // Sometimes parquet-cpp sets repetition level for the root node, which is against @@ -2327,23 +2295,25 @@ mod tests { } "; - let expected_schema = parse_message_type(message_type).unwrap(); - let mut thrift_schema = to_thrift(&expected_schema).unwrap(); - thrift_schema[0].repetition_type = Some(Repetition::REQUIRED.into()); + let expected_schema = Arc::new(parse_message_type(message_type).unwrap()); + let mut buf = schema_to_buf(&expected_schema).unwrap(); + let mut thrift_schema = buf_to_schema_list(&mut buf).unwrap(); + thrift_schema[0].repetition_type = Some(Repetition::REQUIRED); - let result_schema = from_thrift(&thrift_schema).unwrap(); - assert_eq!(result_schema, Arc::new(expected_schema)); + let result_schema = parquet_schema_from_array(thrift_schema).unwrap(); + assert_eq!(result_schema, expected_schema); } #[test] fn test_schema_from_thrift_group_has_no_child() { let message_type = "message schema {}"; - let expected_schema = parse_message_type(message_type).unwrap(); - let mut thrift_schema = to_thrift(&expected_schema).unwrap(); - thrift_schema[0].repetition_type = Some(Repetition::REQUIRED.into()); + let expected_schema = Arc::new(parse_message_type(message_type).unwrap()); + let mut buf = schema_to_buf(&expected_schema).unwrap(); + let mut thrift_schema = buf_to_schema_list(&mut buf).unwrap(); + thrift_schema[0].repetition_type = Some(Repetition::REQUIRED); - let result_schema = from_thrift(&thrift_schema).unwrap(); - assert_eq!(result_schema, Arc::new(expected_schema)); + let result_schema = parquet_schema_from_array(thrift_schema).unwrap(); + assert_eq!(result_schema, expected_schema); } } diff --git a/parquet/src/thrift.rs b/parquet/src/thrift.rs index 1cbd47a90001..2eb91162ac38 100644 --- a/parquet/src/thrift.rs +++ b/parquet/src/thrift.rs @@ -18,10 +18,7 @@ //! Custom thrift definitions pub use thrift::protocol::TCompactOutputProtocol; -use thrift::protocol::{ - TFieldIdentifier, TInputProtocol, TListIdentifier, TMapIdentifier, TMessageIdentifier, - TOutputProtocol, TSetIdentifier, TStructIdentifier, TType, -}; +use thrift::protocol::{TInputProtocol, TOutputProtocol}; /// Reads and writes the struct to Thrift protocols. /// @@ -33,333 +30,57 @@ pub trait TSerializable: Sized { fn write_to_out_protocol(&self, o_prot: &mut T) -> thrift::Result<()>; } -/// A more performant implementation of [`TCompactInputProtocol`] that reads a slice -/// -/// [`TCompactInputProtocol`]: thrift::protocol::TCompactInputProtocol -pub(crate) struct TCompactSliceInputProtocol<'a> { - buf: &'a [u8], - // Identifier of the last field deserialized for a struct. - last_read_field_id: i16, - // Stack of the last read field ids (a new entry is added each time a nested struct is read). - read_field_id_stack: Vec, - // Boolean value for a field. - // Saved because boolean fields and their value are encoded in a single byte, - // and reading the field only occurs after the field id is read. 
- pending_read_bool_value: Option, -} - -impl<'a> TCompactSliceInputProtocol<'a> { - pub fn new(buf: &'a [u8]) -> Self { - Self { - buf, - last_read_field_id: 0, - read_field_id_stack: Vec::with_capacity(16), - pending_read_bool_value: None, - } - } - - pub fn as_slice(&self) -> &'a [u8] { - self.buf - } - - fn read_vlq(&mut self) -> thrift::Result { - let mut in_progress = 0; - let mut shift = 0; - loop { - let byte = self.read_byte()?; - in_progress |= ((byte & 0x7F) as u64).wrapping_shl(shift); - shift += 7; - if byte & 0x80 == 0 { - return Ok(in_progress); - } - } - } - - fn read_zig_zag(&mut self) -> thrift::Result { - let val = self.read_vlq()?; - Ok((val >> 1) as i64 ^ -((val & 1) as i64)) - } - - fn read_list_set_begin(&mut self) -> thrift::Result<(TType, i32)> { - let header = self.read_byte()?; - let element_type = collection_u8_to_type(header & 0x0F)?; - - let possible_element_count = (header & 0xF0) >> 4; - let element_count = if possible_element_count != 15 { - // high bits set high if count and type encoded separately - possible_element_count as i32 - } else { - self.read_vlq()? as _ - }; - - Ok((element_type, element_count)) - } -} - -macro_rules! thrift_unimplemented { - () => { - Err(thrift::Error::Protocol(thrift::ProtocolError { - kind: thrift::ProtocolErrorKind::NotImplemented, - message: "not implemented".to_string(), - })) - }; -} - -impl TInputProtocol for TCompactSliceInputProtocol<'_> { - fn read_message_begin(&mut self) -> thrift::Result { - unimplemented!() - } - - fn read_message_end(&mut self) -> thrift::Result<()> { - thrift_unimplemented!() - } - - fn read_struct_begin(&mut self) -> thrift::Result> { - self.read_field_id_stack.push(self.last_read_field_id); - self.last_read_field_id = 0; - Ok(None) - } - - fn read_struct_end(&mut self) -> thrift::Result<()> { - self.last_read_field_id = self - .read_field_id_stack - .pop() - .expect("should have previous field ids"); - Ok(()) - } - - fn read_field_begin(&mut self) -> thrift::Result { - // we can read at least one byte, which is: - // - the type - // - the field delta and the type - let field_type = self.read_byte()?; - let field_delta = (field_type & 0xF0) >> 4; - let field_type = match field_type & 0x0F { - 0x01 => { - self.pending_read_bool_value = Some(true); - Ok(TType::Bool) - } - 0x02 => { - self.pending_read_bool_value = Some(false); - Ok(TType::Bool) - } - ttu8 => u8_to_type(ttu8), - }?; - - match field_type { - TType::Stop => Ok( - TFieldIdentifier::new::, String, Option>( - None, - TType::Stop, - None, - ), - ), - _ => { - if field_delta != 0 { - self.last_read_field_id = self - .last_read_field_id - .checked_add(field_delta as i16) - .map_or_else( - || { - Err(thrift::Error::Protocol(thrift::ProtocolError { - kind: thrift::ProtocolErrorKind::InvalidData, - message: format!( - "cannot add {} to {}", - field_delta, self.last_read_field_id - ), - })) - }, - Ok, - )?; - } else { - self.last_read_field_id = self.read_i16()?; - }; - - Ok(TFieldIdentifier { - name: None, - field_type, - id: Some(self.last_read_field_id), - }) - } - } - } - - fn read_field_end(&mut self) -> thrift::Result<()> { - Ok(()) - } - - fn read_bool(&mut self) -> thrift::Result { - match self.pending_read_bool_value.take() { - Some(b) => Ok(b), - None => { - let b = self.read_byte()?; - // Previous versions of the thrift specification said to use 0 and 1 inside collections, - // but that differed from existing implementations. 
- // The specification was updated in https://github.com/apache/thrift/commit/2c29c5665bc442e703480bb0ee60fe925ffe02e8. - // At least the go implementation seems to have followed the previously documented values. - match b { - 0x01 => Ok(true), - 0x00 | 0x02 => Ok(false), - unkn => Err(thrift::Error::Protocol(thrift::ProtocolError { - kind: thrift::ProtocolErrorKind::InvalidData, - message: format!("cannot convert {unkn} into bool"), - })), - } - } - } - } - - fn read_bytes(&mut self) -> thrift::Result> { - let len = self.read_vlq()? as usize; - let ret = self.buf.get(..len).ok_or_else(eof_error)?.to_vec(); - self.buf = &self.buf[len..]; - Ok(ret) - } - - fn read_i8(&mut self) -> thrift::Result { - Ok(self.read_byte()? as _) - } - - fn read_i16(&mut self) -> thrift::Result { - Ok(self.read_zig_zag()? as _) - } - - fn read_i32(&mut self) -> thrift::Result { - Ok(self.read_zig_zag()? as _) - } - - fn read_i64(&mut self) -> thrift::Result { - self.read_zig_zag() - } - - fn read_double(&mut self) -> thrift::Result { - let slice = (self.buf[..8]).try_into().unwrap(); - self.buf = &self.buf[8..]; - Ok(f64::from_le_bytes(slice)) - } - - fn read_string(&mut self) -> thrift::Result { - let bytes = self.read_bytes()?; - String::from_utf8(bytes).map_err(From::from) - } - - fn read_list_begin(&mut self) -> thrift::Result { - let (element_type, element_count) = self.read_list_set_begin()?; - Ok(TListIdentifier::new(element_type, element_count)) - } - - fn read_list_end(&mut self) -> thrift::Result<()> { - Ok(()) - } - - fn read_set_begin(&mut self) -> thrift::Result { - thrift_unimplemented!() - } - - fn read_set_end(&mut self) -> thrift::Result<()> { - thrift_unimplemented!() - } - - fn read_map_begin(&mut self) -> thrift::Result { - thrift_unimplemented!() - } - - fn read_map_end(&mut self) -> thrift::Result<()> { - Ok(()) - } - - #[inline] - fn read_byte(&mut self) -> thrift::Result { - let ret = *self.buf.first().ok_or_else(eof_error)?; - self.buf = &self.buf[1..]; - Ok(ret) - } -} - -fn collection_u8_to_type(b: u8) -> thrift::Result { - match b { - // For historical and compatibility reasons, a reader should be capable to deal with both cases. - // The only valid value in the original spec was 2, but due to an widespread implementation bug - // the defacto standard across large parts of the library became 1 instead. - // As a result, both values are now allowed. 
- // https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#list-and-set - 0x01 | 0x02 => Ok(TType::Bool), - o => u8_to_type(o), - } -} - -fn u8_to_type(b: u8) -> thrift::Result { - match b { - 0x00 => Ok(TType::Stop), - 0x03 => Ok(TType::I08), // equivalent to TType::Byte - 0x04 => Ok(TType::I16), - 0x05 => Ok(TType::I32), - 0x06 => Ok(TType::I64), - 0x07 => Ok(TType::Double), - 0x08 => Ok(TType::String), - 0x09 => Ok(TType::List), - 0x0A => Ok(TType::Set), - 0x0B => Ok(TType::Map), - 0x0C => Ok(TType::Struct), - unkn => Err(thrift::Error::Protocol(thrift::ProtocolError { - kind: thrift::ProtocolErrorKind::InvalidData, - message: format!("cannot convert {unkn} into TType"), - })), - } -} - -fn eof_error() -> thrift::Error { - thrift::Error::Transport(thrift::TransportError { - kind: thrift::TransportErrorKind::EndOfFile, - message: "Unexpected EOF".to_string(), - }) -} - #[cfg(test)] mod tests { - use crate::format::{BoundaryOrder, ColumnIndex}; - use crate::thrift::{TCompactSliceInputProtocol, TSerializable}; + use crate::{ + basic::Type, + file::page_index::{column_index::ColumnIndexMetaData, index_reader::decode_column_index}, + }; #[test] pub fn read_boolean_list_field_type() { // Boolean collection type encoded as 0x01, as used by this crate when writing. // Values encoded as 1 (true) or 2 (false) as in the current version of the thrift // documentation. - let bytes = vec![0x19, 0x21, 2, 1, 0x19, 8, 0x19, 8, 0x15, 0, 0]; - - let mut protocol = TCompactSliceInputProtocol::new(bytes.as_slice()); - let index = ColumnIndex::read_from_in_protocol(&mut protocol).unwrap(); - let expected = ColumnIndex { - null_pages: vec![false, true], - min_values: vec![], - max_values: vec![], - boundary_order: BoundaryOrder::UNORDERED, - null_counts: None, - repetition_level_histograms: None, - definition_level_histograms: None, + let bytes = vec![ + 0x19, 0x21, 2, 1, 0x19, 0x28, 1, 0, 0, 0x19, 0x28, 1, 1, 0, 0x15, 0, 0, + ]; + let index = decode_column_index(&bytes, Type::BOOLEAN).unwrap(); + + let index = match index { + ColumnIndexMetaData::BOOLEAN(index) => index, + _ => panic!("expected boolean column index"), }; - assert_eq!(&index, &expected); + // should be false, true + assert!(!index.is_null_page(0)); + assert!(index.is_null_page(1)); + assert!(!index.min_value(0).unwrap()); // min is false + assert!(index.max_value(0).unwrap()); // max is true + assert!(index.min_value(1).is_none()); + assert!(index.max_value(1).is_none()); } #[test] pub fn read_boolean_list_alternative_encoding() { // Boolean collection type encoded as 0x02, as allowed by the spec. // Values encoded as 1 (true) or 0 (false) as before the thrift documentation change on 2024-12-13. 
- let bytes = vec![0x19, 0x22, 0, 1, 0x19, 8, 0x19, 8, 0x15, 0, 0]; - - let mut protocol = TCompactSliceInputProtocol::new(bytes.as_slice()); - let index = ColumnIndex::read_from_in_protocol(&mut protocol).unwrap(); - let expected = ColumnIndex { - null_pages: vec![false, true], - min_values: vec![], - max_values: vec![], - boundary_order: BoundaryOrder::UNORDERED, - null_counts: None, - repetition_level_histograms: None, - definition_level_histograms: None, + let bytes = vec![ + 0x19, 0x22, 0, 1, 0x19, 0x28, 1, 0, 0, 0x19, 0x28, 1, 1, 0, 0x15, 0, 0, + ]; + let index = decode_column_index(&bytes, Type::BOOLEAN).unwrap(); + + let index = match index { + ColumnIndexMetaData::BOOLEAN(index) => index, + _ => panic!("expected boolean column index"), }; - assert_eq!(&index, &expected); + // should be false, true + assert!(!index.is_null_page(0)); + assert!(index.is_null_page(1)); + assert!(!index.min_value(0).unwrap()); // min is false + assert!(index.max_value(0).unwrap()); // max is true + assert!(index.min_value(1).is_none()); + assert!(index.max_value(1).is_none()); } } diff --git a/parquet/src/variant.rs b/parquet/src/variant.rs index b135f2bb7a59..413c5570cdb7 100644 --- a/parquet/src/variant.rs +++ b/parquet/src/variant.rs @@ -199,7 +199,9 @@ mod tests { // data should have been written with the Variant logical type assert_eq!( field.get_basic_info().logical_type(), - Some(crate::basic::LogicalType::Variant) + Some(crate::basic::LogicalType::Variant { + specification_version: None + }) ); } diff --git a/parquet/tests/arrow_reader/bad_data.rs b/parquet/tests/arrow_reader/bad_data.rs index c767115eaa7b..be401030e7f9 100644 --- a/parquet/tests/arrow_reader/bad_data.rs +++ b/parquet/tests/arrow_reader/bad_data.rs @@ -80,10 +80,13 @@ fn test_invalid_files() { #[test] fn test_parquet_1481() { let err = read_file("PARQUET-1481.parquet").unwrap_err(); + #[cfg(feature = "encryption")] assert_eq!( err.to_string(), - "Parquet error: unexpected parquet type: -7" + "Parquet error: Could not parse metadata: Parquet error: Unexpected Type -7" ); + #[cfg(not(feature = "encryption"))] + assert_eq!(err.to_string(), "Parquet error: Unexpected Type -7"); } #[test] @@ -98,7 +101,7 @@ fn test_arrow_gh_41317() { let err = read_file("ARROW-GH-41317.parquet").unwrap_err(); assert_eq!( err.to_string(), - "External: Parquet argument error: External: bad data" + "External: Parquet argument error: Parquet error: StructArrayReader out of sync in read_records, expected 5 read, got 2" ); } diff --git a/parquet/tests/arrow_reader/io/mod.rs b/parquet/tests/arrow_reader/io/mod.rs index 2895b61eaf2b..2f335d9f7f82 100644 --- a/parquet/tests/arrow_reader/io/mod.rs +++ b/parquet/tests/arrow_reader/io/mod.rs @@ -47,9 +47,9 @@ use parquet::arrow::arrow_reader::{ use parquet::arrow::{ArrowWriter, ProjectionMask}; use parquet::data_type::AsBytes; use parquet::file::metadata::{FooterTail, ParquetMetaData, ParquetOffsetIndex}; +use parquet::file::page_index::offset_index::PageLocation; use parquet::file::properties::WriterProperties; use parquet::file::FOOTER_SIZE; -use parquet::format::PageLocation; use parquet::schema::types::SchemaDescriptor; use std::collections::BTreeMap; use std::fmt::Display; @@ -287,8 +287,7 @@ impl TestRowGroups { .enumerate() .map(|(col_idx, col_meta)| { let column_name = col_meta.column_descr().name().to_string(); - let page_locations = - offset_index[rg_index][col_idx].page_locations().to_vec(); + let page_locations = offset_index[rg_index][col_idx].page_locations(); let dictionary_page_location = 
col_meta.dictionary_page_offset(); // We can find the byte range of the entire column chunk @@ -300,7 +299,7 @@ impl TestRowGroups { name: column_name.clone(), location: start_offset..end_offset, dictionary_page_location, - page_locations, + page_locations: page_locations.clone(), } }) .map(|test_column_chunk| { diff --git a/parquet/tests/encryption/encryption.rs b/parquet/tests/encryption/encryption.rs index 96dd8654cd76..0261c22c2c2d 100644 --- a/parquet/tests/encryption/encryption.rs +++ b/parquet/tests/encryption/encryption.rs @@ -982,23 +982,17 @@ pub fn test_retrieve_row_group_statistics_after_encrypted_write() { } let file_metadata = writer.close().unwrap(); - assert_eq!(file_metadata.row_groups.len(), 1); - let row_group = &file_metadata.row_groups[0]; - assert_eq!(row_group.columns.len(), 1); - let column = &row_group.columns[0]; - let column_stats = column - .meta_data - .as_ref() - .unwrap() - .statistics - .as_ref() - .unwrap(); + assert_eq!(file_metadata.num_row_groups(), 1); + let row_group = file_metadata.row_group(0); + assert_eq!(row_group.num_columns(), 1); + let column = row_group.column(0); + let column_stats = column.statistics().unwrap(); assert_eq!( - column_stats.min_value.as_deref(), + column_stats.min_bytes_opt(), Some(3i32.to_le_bytes().as_slice()) ); assert_eq!( - column_stats.max_value.as_deref(), + column_stats.max_bytes_opt(), Some(19i32.to_le_bytes().as_slice()) ); } diff --git a/parquet/tests/encryption/encryption_agnostic.rs b/parquet/tests/encryption/encryption_agnostic.rs index e071471712f4..48b5c77d9b97 100644 --- a/parquet/tests/encryption/encryption_agnostic.rs +++ b/parquet/tests/encryption/encryption_agnostic.rs @@ -72,7 +72,7 @@ pub fn read_plaintext_footer_file_without_decryption_properties() { match record_reader.next() { Some(Err(ArrowError::ParquetError(s))) => { - assert!(s.contains("protocol error")); + assert!(s.contains("Parquet error")); } _ => { panic!("Expected ArrowError::ParquetError"); @@ -137,7 +137,7 @@ pub async fn read_plaintext_footer_file_without_decryption_properties_async() { match record_reader.next().await { Some(Err(ParquetError::ArrowError(s))) => { - assert!(s.contains("protocol error")); + assert!(s.contains("Parquet error")); } _ => { panic!("Expected ArrowError::ParquetError"); diff --git a/parquet/tests/encryption/encryption_async.rs b/parquet/tests/encryption/encryption_async.rs index 9c1e0c00a3f6..6999b1a931f4 100644 --- a/parquet/tests/encryption/encryption_async.rs +++ b/parquet/tests/encryption/encryption_async.rs @@ -34,9 +34,9 @@ use parquet::arrow::{ArrowWriter, AsyncArrowWriter}; use parquet::encryption::decrypt::FileDecryptionProperties; use parquet::encryption::encrypt::FileEncryptionProperties; use parquet::errors::ParquetError; +use parquet::file::metadata::ParquetMetaData; use parquet::file::properties::{WriterProperties, WriterPropertiesBuilder}; use parquet::file::writer::SerializedFileWriter; -use parquet::format::FileMetaData; use std::io::Write; use std::sync::Arc; use tokio::fs::File; @@ -647,7 +647,7 @@ fn spawn_column_parallel_row_group_writer( async fn concatenate_parallel_row_groups( mut parquet_writer: SerializedFileWriter, mut serialize_rx: Receiver>, -) -> Result { +) -> Result { while let Some(task) = serialize_rx.recv().await { let result = task.await; let mut rg_out = parquet_writer.next_row_group()?; @@ -818,8 +818,7 @@ async fn test_multi_threaded_encrypted_writing() { let metadata = serialized_file_writer.close().unwrap(); // Close the file writer which writes the footer - 
-    assert_eq!(metadata.schema, metadata.schema);
+    assert_eq!(metadata.file_metadata().num_rows(), 50);
 
     // Check that the file was written correctly
     let (read_record_batches, read_metadata) =
@@ -909,8 +908,7 @@ async fn test_multi_threaded_encrypted_writing_deprecated() {
 
     // Close the file writer which writes the footer
     let metadata = writer.finish().unwrap();
-    assert_eq!(metadata.num_rows, 100);
-    assert_eq!(metadata.schema, metadata.schema);
+    assert_eq!(metadata.file_metadata().num_rows(), 100);
 
     // Check that the file was written correctly
     let (read_record_batches, read_metadata) =
diff --git a/parquet/tests/encryption/encryption_util.rs b/parquet/tests/encryption/encryption_util.rs
index f53e12adb720..32372fcca31b 100644
--- a/parquet/tests/encryption/encryption_util.rs
+++ b/parquet/tests/encryption/encryption_util.rs
@@ -202,11 +202,11 @@ pub(crate) fn verify_column_indexes(metadata: &ParquetMetaData) {
     let column_index = &column_index[0][float_col_idx];
 
     match column_index {
-        parquet::file::page_index::index::Index::FLOAT(float_index) => {
-            assert_eq!(float_index.indexes.len(), 1);
-            assert_eq!(float_index.indexes[0].min, Some(0.0f32));
-            assert!(float_index.indexes[0]
-                .max
+        parquet::file::page_index::column_index::ColumnIndexMetaData::FLOAT(float_index) => {
+            assert_eq!(float_index.num_pages(), 1);
+            assert_eq!(float_index.min_value(0), Some(&0.0f32));
+            assert!(float_index
+                .max_value(0)
                 .is_some_and(|max| (max - 53.9).abs() < 1e-6));
         }
         _ => {
diff --git a/parquet/tests/geospatial.rs b/parquet/tests/geospatial.rs
new file mode 100644
index 000000000000..b3de40491b30
--- /dev/null
+++ b/parquet/tests/geospatial.rs
@@ -0,0 +1,123 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Tests for Geometry and Geography logical types
+use parquet::{
+    basic::{EdgeInterpolationAlgorithm, LogicalType},
+    file::{
+        metadata::ParquetMetaData,
+        reader::{FileReader, SerializedFileReader},
+    },
+    geospatial::bounding_box::BoundingBox,
+};
+use serde_json::Value;
+use std::fs::File;
+
+fn read_metadata(geospatial_test_file: &str) -> ParquetMetaData {
+    let path = format!(
+        "{}/geospatial/{geospatial_test_file}",
+        arrow::util::test_util::parquet_test_data(),
+    );
+    let file = File::open(path).unwrap();
+    let reader = SerializedFileReader::try_from(file).unwrap();
+    reader.metadata().clone()
+}
+
+#[test]
+fn test_read_logical_type() {
+    // Some crs values are short strings
+    let expected_logical_type = [
+        ("crs-default.parquet", LogicalType::Geometry { crs: None }),
+        (
+            "crs-srid.parquet",
+            LogicalType::Geometry {
+                crs: Some("srid:5070".to_string()),
+            },
+        ),
+        (
+            "crs-projjson.parquet",
+            LogicalType::Geometry {
+                crs: Some("projjson:projjson_epsg_5070".to_string()),
+            },
+        ),
+        (
+            "crs-geography.parquet",
+            LogicalType::Geography {
+                crs: None,
+                algorithm: Some(EdgeInterpolationAlgorithm::SPHERICAL),
+            },
+        ),
+    ];
+
+    for (geospatial_file, expected_type) in expected_logical_type {
+        let metadata = read_metadata(geospatial_file);
+        let logical_type = metadata
+            .file_metadata()
+            .schema_descr()
+            .column(1)
+            .logical_type()
+            .unwrap();
+
+        assert_eq!(logical_type, expected_type);
+    }
+
+    // The crs value may also contain arbitrary values (in this case some JSON
+    // a bit too lengthy to type out)
+    let metadata = read_metadata("crs-arbitrary-value.parquet");
+    let logical_type = metadata
+        .file_metadata()
+        .schema_descr()
+        .column(1)
+        .logical_type()
+        .unwrap();
+
+    if let LogicalType::Geometry { crs } = logical_type {
+        let crs_parsed: Value = serde_json::from_str(&crs.unwrap()).unwrap();
+        assert_eq!(crs_parsed.get("id").unwrap().get("code").unwrap(), 5070);
+    } else {
+        panic!("Expected geometry type but got {logical_type:?}");
+    }
+}
+
+#[test]
+fn test_read_geospatial_statistics() {
+    let metadata = read_metadata("geospatial.parquet");
+
+    // geospatial.parquet schema:
+    //   optional binary field_id=-1 group (String);
+    //   optional binary field_id=-1 wkt (String);
+    //   optional binary field_id=-1 geometry (Geometry(crs=));
+    let fields = metadata.file_metadata().schema().get_fields();
+    let logical_type = fields[2].get_basic_info().logical_type().unwrap();
+    assert_eq!(logical_type, LogicalType::Geometry { crs: None });
+
+    let geo_statistics = metadata.row_group(0).column(2).geo_statistics();
+    assert!(geo_statistics.is_some());
+
+    let expected_bbox = BoundingBox::new(10.0, 40.0, 10.0, 40.0)
+        .with_zrange(30.0, 80.0)
+        .with_mrange(200.0, 1600.0);
+    let expected_geospatial_types = vec![
+        1, 2, 3, 4, 5, 6, 7, 1001, 1002, 1003, 1004, 1005, 1006, 1007, 2001, 2002, 2003, 2004,
+        2005, 2006, 2007, 3001, 3002, 3003, 3004, 3005, 3006, 3007,
+    ];
+    assert_eq!(
+        geo_statistics.unwrap().geospatial_types(),
+        Some(&expected_geospatial_types)
+    );
+    assert_eq!(geo_statistics.unwrap().bounding_box(), Some(&expected_bbox));
+}
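
The column index hunks above all move from the thrift-generated `Index`/`ColumnIndex` structs to the typed `ColumnIndexMetaData` enum and its per-page accessors. For readers porting similar code, here is a minimal sketch of that access pattern, assuming the page index was decoded along with the rest of the footer metadata; `first_page_float_range` is a hypothetical helper written for illustration, not an API of this crate.

```rust
use parquet::file::metadata::ParquetMetaData;
use parquet::file::page_index::column_index::ColumnIndexMetaData;

/// Hypothetical helper: (min, max) for the first page of a FLOAT column,
/// using the same accessors the updated tests above exercise.
fn first_page_float_range(
    metadata: &ParquetMetaData,
    rg_idx: usize,
    col_idx: usize,
) -> Option<(f32, f32)> {
    // `column_index()` is `None` unless the page index was requested
    // when the file metadata was loaded.
    let index = metadata.column_index()?.get(rg_idx)?.get(col_idx)?;
    match index {
        // `min_value`/`max_value` return `None` for null pages.
        ColumnIndexMetaData::FLOAT(float_index) => {
            Some((*float_index.min_value(0)?, *float_index.max_value(0)?))
        }
        _ => None,
    }
}
```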