@@ -132,6 +132,33 @@ impl RowGroupAccessPlanFilter {
132132 /// | +-----------------------------------+-----------------------------+ |
133133 /// +-----------------------------------------------------------------------+
134134 ///
135+ /// # Example with Statistics Truncation and NOT Inversion
136+ ///
137+ /// When statistics are truncated to length 6 (e.g., `statistics_truncate_length = 6`),
138+ /// the min/max values become:
139+ ///
140+ /// ```text
141+ /// Row group 3: species_min="Alpine", species_max="Alpine" (truncated from "Alpine Ibex"/"Alpine Sheep")
142+ /// s_min=76, s_max=101
143+ /// ```
144+ ///
145+ /// To identify this row group as fully matched, the system uses NOT inversion:
146+ /// 1. Original predicate: `species LIKE 'Alpine%' AND s >= 50`
147+ /// 2. Inverted predicate: `NOT (species LIKE 'Alpine%' AND s >= 50)`
148+ /// Simplified to: `species NOT LIKE 'Alpine%' OR s < 50`
149+ /// 3. Pruning predicate generated:
150+ /// `(species_min NOT LIKE 'Alpine%' OR species_max NOT LIKE 'Alpine%') OR s_min < 50`
151+ ///
152+ /// For row group 3 with truncated stats:
153+ /// - Evaluating `species_min NOT LIKE 'Alpine%'`: `"Alpine" NOT LIKE 'Alpine%'` = `false`
154+ /// - Evaluating `species_max NOT LIKE 'Alpine%'`: `"Alpine" NOT LIKE 'Alpine%'` = `false`
155+ /// - Evaluating `s_min < 50`: `76 < 50` = `false`
156+ /// - Final result: `(false OR false) OR false` = `false`
157+ ///
158+ /// Because the inverted pruning predicate evaluates to `false`, no row in this
159+ /// row group can satisfy the inverted predicate.
160+ /// Every row must therefore satisfy the original predicate, so the row group is fully matched.
161+ ///
135162 /// Without limit pruning: Scan Partition 2 → Partition 3 → Partition 4 (until limit reached)
136163 /// With limit pruning: If Partition 3 contains enough rows to satisfy the limit,
137164 /// skip Partitions 2 and 4 entirely and go directly to Partition 3.
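
The NOT-inversion check walked through in the doc comment above can be sketched in a few lines of standalone Rust. This is only an illustration under stated assumptions: `RowGroupStats`, `like_prefix`, `inverted_may_match`, and `is_fully_matched` are hypothetical names invented for this sketch, not the pruning API touched by this diff, and `LIKE 'Alpine%'` is modeled as a plain prefix test.

```rust
/// Truncated min/max statistics for one row group.
/// Hypothetical type for this sketch; not part of the real pruning code.
struct RowGroupStats<'a> {
    species_min: &'a str,
    species_max: &'a str,
    s_min: i64,
}

/// `value LIKE 'prefix%'` reduces to a prefix test (assumption of this sketch).
fn like_prefix(value: &str, prefix: &str) -> bool {
    value.starts_with(prefix)
}

/// Evaluates the pruning predicate derived from the inverted predicate:
/// `(species_min NOT LIKE 'Alpine%' OR species_max NOT LIKE 'Alpine%') OR s_min < 50`.
/// `true` means some row might satisfy the inverted predicate.
fn inverted_may_match(stats: &RowGroupStats) -> bool {
    !like_prefix(stats.species_min, "Alpine")
        || !like_prefix(stats.species_max, "Alpine")
        || stats.s_min < 50
}

/// A row group is treated as fully matched when the inverted predicate
/// cannot match any of its rows.
fn is_fully_matched(stats: &RowGroupStats) -> bool {
    !inverted_may_match(stats)
}

fn main() {
    // Row group 3 from the example, with statistics truncated to length 6.
    let row_group_3 = RowGroupStats {
        species_min: "Alpine",
        species_max: "Alpine",
        s_min: 76,
    };
    println!("fully matched: {}", is_fully_matched(&row_group_3));
}
```

With the row group 3 statistics, every disjunct of the inverted pruning predicate is `false`, so the sketch reports the group as fully matched. Lowering `s_min` below 50, or changing either species bound so it no longer starts with `Alpine`, flips the result.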