### What problem does this PR solve?
Problem Summary:
### Release note
[Feature] Implementation of Parquet File Page Cache and Integration with
Unified Page Cache Framework
#### Solution Overview
This PR implements a page-level caching mechanism for Parquet files and
integrates it with Apache Doris's existing unified page cache framework.
Data pages are cached in memory in either decompressed or compressed
form, so repeated reads of the same page can skip file I/O (and, for
decompressed pages, decompression), which improves query performance.
Key Features
1. Unified Page Cache Integration
• Leverages Existing Framework: Directly integrates with Doris's
StoragePageCache infrastructure used for internal tables
• Shared Resource Management: Parquet cache shares memory pool and
eviction policies with internal table caches
• Consistent Monitoring: Reuses existing cache statistics and
RuntimeProfile for unified performance monitoring
• Cache Type Identification: Uses segment_v2::DATA_PAGE as cache page
type, consistent with internal table data page caching
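
To illustrate what sharing one memory pool and one eviction policy means
in practice, here is a minimal sketch. It assumes a plain LRU policy with
a byte-capacity limit; the eviction policy Doris actually uses is not
described here. The `SharedPageCache` class and the key tuples are
hypothetical and are not the `StoragePageCache` API.

```python
from collections import OrderedDict


class SharedPageCache:
    """Hypothetical byte-bounded LRU cache shared by several page sources."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.hits = 0      # shared statistics, regardless of page source
        self.misses = 0
        self._pages = OrderedDict()  # key -> bytes, least recently used first

    def insert(self, key, page):
        # Replace an existing entry so its size is not counted twice.
        if key in self._pages:
            self.used_bytes -= len(self._pages.pop(key))
        self._pages[key] = page
        self.used_bytes += len(page)
        # Evict least recently used pages until the pool fits again.
        while self.used_bytes > self.capacity_bytes and self._pages:
            _, evicted = self._pages.popitem(last=False)
            self.used_bytes -= len(evicted)

    def lookup(self, key):
        page = self._pages.get(key)
        if page is None:
            self.misses += 1
            return None
        self._pages.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return page
```

Because internal-table pages and Parquet pages live in the same
structure, inserting a Parquet page can evict an internal-table page and
vice versa, and the hit and miss counters cover both.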
2. Smart Caching Strategy
• Compression Ratio Awareness: Automatically chooses between caching
compressed or decompressed data based on
parquet_page_cache_decompress_threshold (default: 1.5)
• Flexible Storage: Caches decompressed data when
uncompressed_size / compressed_size ≤ threshold; otherwise caches
compressed data, provided enable_parquet_cache_compressed_pages=true
• Cache Key Design: Uses file_path::mtime::offset as key to ensure cache
consistency across file modifications
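
The selection rule and key format above can be sketched as follows. This
is an illustrative Python sketch, not the Doris C++ implementation. The
function names `choose_cached_form` and `make_cache_key` are
hypothetical, and `mtime` is assumed to be an integer timestamp. What
happens when the ratio exceeds the threshold and compressed caching is
disabled is not stated above; the sketch assumes the page is simply not
cached.

```python
from typing import Optional

# Default taken from parquet_page_cache_decompress_threshold above.
DEFAULT_DECOMPRESS_THRESHOLD = 1.5


def choose_cached_form(uncompressed_size: int,
                       compressed_size: int,
                       cache_compressed_pages: bool,
                       threshold: float = DEFAULT_DECOMPRESS_THRESHOLD) -> Optional[str]:
    """Return "decompressed", "compressed", or None (do not cache)."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    ratio = uncompressed_size / compressed_size
    if ratio <= threshold:
        # Decompression saves little memory here, so keep the ready-to-use form.
        return "decompressed"
    if cache_compressed_pages:
        # Mirrors enable_parquet_cache_compressed_pages=true.
        return "compressed"
    # Assumption: skip caching when neither rule applies.
    return None


def make_cache_key(file_path: str, mtime: int, offset: int) -> str:
    """Build a file_path::mtime::offset key; a new mtime yields a new key."""
    return f"{file_path}::{mtime}::{offset}"
```

Including `mtime` in the key means a rewritten file never matches
entries cached for its previous contents; the stale entries are simply
never looked up again and age out through normal eviction.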