|
42 | 42 |
|
43 | 43 | /** |
44 | 44 | * Loads and manages metadata filter configurations for the CLP connector. |
45 | | - * <p> |
| 45 | + * <p></p> |
46 | 46 | * The configuration file is specified by the {@code clp.metadata-filter-config} property |
47 | 47 | * and defines metadata filters used to optimize query execution through split pruning. |
48 | | - * Filters can be declared at different scopes: |
| 48 | + * <p></p> |
| 49 | + * Filter configs can be declared at the catalog, schema, or table scope. Filter configs under
| 50 | + * a particular scope will apply to all child scopes (e.g., schema-level filter configs will apply |
| 51 | + * to all tables within that schema). |
| 52 | + * <p></p> |
| 53 | + * Each filter config includes the following fields: |
49 | 54 | * <ul> |
50 | | - * <li><b>Catalog-level</b>: applies to all schemas and tables within a catalog.</li> |
51 | | - * <li><b>Schema-level</b>: applies to all tables within a specific catalog and schema.</li> |
52 | | - * <li><b>Table-level</b>: applies to a fully qualified table {@code catalog.schema.table}.</li> |
53 | | - * </ul> |
| 55 | + * <li><b>{@code columnName}</b>: the name of a column in the table's logical schema. Currently, |
| 56 | + * only numeric-type columns can be used as metadata filters.</li> |
54 | 57 | * |
55 | | - * <p>Each scope maps to a list of filter definitions. Each filter includes the following fields: |
56 | | - * <ul> |
57 | | - * <li><b>{@code columnName}</b> (required): the name of a column in the table's logical schema. |
58 | | - * Only columns of numeric type are currently supported as metadata filters.</li> |
| 58 | + * <li><b>{@code rangeMapping}</b> <i>(optional)</i>: an object with the following properties: |
| 59 | + * |
| 60 | + * <br><br> |
| 61 | + * <b>Note:</b> This option is only valid if the column has a numeric type. |
| 62 | + * |
| 63 | + * <ul> |
| 64 | + * <li>{@code lowerBound}: The metadata column that represents the lower bound of values |
| 65 | + * in a split for the data column.</li> |
| 66 | + * <li>{@code upperBound}: The metadata column that represents the upper bound of values |
| 67 | + * in a split for the data column.</li> |
| 68 | + * </ul> |
59 | 69 | * |
60 | | - * <li><b>{@code rangeMapping}</b> (optional): remaps a logical filter to physical metadata-only columns. |
61 | | - * This field is valid only for numeric-type columns. |
62 | | - * For example, a condition such as: |
63 | | - * <pre>{@code |
64 | | - * "msg.timestamp" > 1234 AND "msg.timestamp" < 5678 |
65 | | - * }</pre> |
66 | | - * will be rewritten as: |
67 | | - * <pre>{@code |
68 | | - * "end_timestamp" > 1234 AND "begin_timestamp" < 5678 |
69 | | - * }</pre> |
70 | | - * This ensures the filter applies to a superset of the actual result set, enabling safe pruning.</li> |
| 70 | + * <p> |
| 71 | + * For example, a condition such as: |
| 72 | + * </p> |
| 73 | + * <pre>{@code |
| 74 | + * "msg.timestamp" > 1234 AND "msg.timestamp" < 5678 |
| 75 | + * }</pre> |
71 | 76 | * |
72 | | - * <li><b>{@code required}</b> (optional, default: {@code false}): indicates whether the filter must be present |
73 | | - * in the extracted metadata filter SQL query. If a required filter is missing or cannot be pushed down, |
74 | | - * the query will be rejected.</li> |
| 77 | + * <p> |
| 78 | + * will be rewritten as: |
| 79 | + * </p> |
| 80 | + * <pre>{@code |
| 81 | + * "end_timestamp" > 1234 AND "begin_timestamp" < 5678 |
| 82 | + * }</pre> |
| 83 | + * |
| 84 | + * <p> |
| 85 | + * This ensures the filter applies to a superset of the actual result set, enabling safe |
| 86 | + * pruning. |
| 87 | + * </p> |
| 88 | + * </li> |
| 89 | + * |
| 90 | + * <li><b>{@code required}</b> (optional, defaults to {@code false}): indicates whether the |
| 91 | + * filter must be present in the translated metadata filter SQL query. If a required filter |
| 92 | + * is missing or cannot be pushed down, the query will be rejected.</li> |
75 | 93 | * </ul> |
76 | 94 | */ |
77 | 95 | public class ClpMetadataFilterProvider |
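
Reviewer note: the scope inheritance described in the Javadoc above (filter configs at a broader scope also apply to its child scopes) can be sketched as follows. This is a minimal illustration, not the connector's implementation. The class `ScopeResolutionSketch`, its methods, and the in-memory `Map` layout (scope string to column name to a `{lowerBound, upperBound}` pair) are all hypothetical, and per-column overriding by the more specific scope is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resolves range mappings for a scope by walking from
// the broadest prefix (catalog) to the most specific one (table).
final class ScopeResolutionSketch
{
    private ScopeResolutionSketch() {}

    // Expands "catalog.schema.table" into
    // ["catalog", "catalog.schema", "catalog.schema.table"].
    // Simplification: assumes that no name contains a '.' itself.
    static List<String> expandScopes(String scope)
    {
        List<String> scopes = new ArrayList<>();
        StringBuilder prefix = new StringBuilder();
        for (String part : scope.split("\\.")) {
            if (prefix.length() > 0) {
                prefix.append('.');
            }
            prefix.append(part);
            scopes.add(prefix.toString());
        }
        return scopes;
    }

    // config maps a scope to (columnName -> {lowerBound, upperBound}).
    // Broader scopes are applied first, so a more specific scope overrides
    // an entry for the same column and supplements entries for new columns.
    static Map<String, String[]> resolve(Map<String, Map<String, String[]>> config, String scope)
    {
        Map<String, String[]> resolved = new LinkedHashMap<>();
        for (String prefix : expandScopes(scope)) {
            Map<String, String[]> mappings = config.get(prefix);
            if (mappings != null) {
                resolved.putAll(mappings);
            }
        }
        return resolved;
    }

    public static void main(String[] args)
    {
        Map<String, String[]> catalogLevel = new LinkedHashMap<>();
        catalogLevel.put("msg.timestamp", new String[] {"begin_timestamp", "end_timestamp"});
        Map<String, Map<String, String[]>> config = new LinkedHashMap<>();
        config.put("clp", catalogLevel);

        // The table inherits the catalog-level mapping for "msg.timestamp".
        String[] bounds = resolve(config, "clp.default.table_1").get("msg.timestamp");
        System.out.println(bounds[0] + ", " + bounds[1]);
    }
}
```

Applying the broadest scope first keeps the override rule to a single `putAll` per level. Whether the real provider merges per column or per scope is not visible in this diff.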
@@ -112,29 +130,27 @@ public void checkContainsRequiredFilters(SchemaTableName schemaTableName, String |
112 | 130 | } |
113 | 131 |
|
114 | 132 | /** |
115 | | - * Rewrites the input SQL string by remapping filter conditions based on the configured |
116 | | - * metadata filter range mappings for the given scope. |
| 133 | + * Rewrites the given SQL string to remap filter conditions based on the configured range |
| 134 | + * mappings for the given scope. |
117 | 135 | * |
118 | 136 | * <p>The {@code scope} follows the format {@code catalog[.schema][.table]}, and determines |
119 | | - * which filter mappings to apply. For each level of scope (catalog, schema, table), this |
120 | | - * method collects all range mappings defined in the metadata filter configuration. Mappings |
121 | | - * from more specific scopes (e.g., table-level) override or supplement those from broader |
122 | | - * scopes (e.g., catalog-level). |
| 137 | + * which filter mappings to apply. For each scope level (catalog, schema, table), this method
| 138 | + * collects all range mappings defined in the metadata filter configuration. Mappings from
| 139 | + * more specific scopes (e.g., table-level) override or supplement those from broader scopes
| 140 | + * (e.g., catalog-level).
123 | 141 | * |
124 | 142 | * <p>This method performs regex-based replacements to convert numeric filter expressions such |
125 | 143 | * as: |
126 | | - * |
127 | 144 | * <ul> |
128 | 145 | * <li>{@code "msg.timestamp" >= 1234} → {@code end_timestamp >= 1234}</li> |
129 | 146 | * <li>{@code "msg.timestamp" <= 5678} → {@code begin_timestamp <= 5678}</li> |
130 | 147 | * <li>{@code "msg.timestamp" = 4567} → |
131 | 148 | * {@code (begin_timestamp <= 4567 AND end_timestamp >= 4567)}</li> |
132 | 149 | * </ul> |
133 | 150 | * |
134 | | - * @param scope the catalog.schema.table scope used to resolve applicable filter mappings |
135 | | - * @param sql the original SQL expression to be remapped |
136 | | - * @return the rewritten SQL string with metadata filter expressions remapped according to the |
137 | | - * configured range mappings |
| 151 | + * @param scope |
| 152 | + * @param sql |
| 153 | + * @return the rewritten SQL string |
138 | 154 | */ |
139 | 155 | public String remapFilterSql(String scope, String sql) |
140 | 156 | { |
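
Reviewer note: the method body is not shown in this hunk, so the following is only a sketch of how the regex-based remapping documented in the Javadoc could work for a single column. The class `RangeRemapSketch`, the `remap` signature, and the numeric-literal pattern are hypothetical. The actual `remapFilterSql` takes a scope and looks up the mappings itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: rewrites comparisons on one data column into
// comparisons on its lower-bound and upper-bound metadata columns.
final class RangeRemapSketch
{
    private RangeRemapSketch() {}

    static String remap(String sql, String column, String lowerBound, String upperBound)
    {
        // Matches: "<column>" <op> <number>. Two-character operators are
        // listed first so that ">=" is not consumed as ">".
        Pattern pattern = Pattern.compile(
                "\"" + Pattern.quote(column) + "\"\\s*(>=|<=|>|<|=)\\s*(-?\\d+(?:\\.\\d+)?)");
        Matcher matcher = pattern.matcher(sql);
        // StringBuffer keeps this compatible with Java 8's appendReplacement.
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String op = matcher.group(1);
            String value = matcher.group(2);
            String replacement;
            switch (op) {
                case ">":
                case ">=":
                    // A split can contain matching rows only if its upper bound passes.
                    replacement = upperBound + " " + op + " " + value;
                    break;
                case "<":
                case "<=":
                    // A split can contain matching rows only if its lower bound passes.
                    replacement = lowerBound + " " + op + " " + value;
                    break;
                default:
                    // "=": the value must fall within [lowerBound, upperBound].
                    replacement = "(" + lowerBound + " <= " + value
                            + " AND " + upperBound + " >= " + value + ")";
            }
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    public static void main(String[] args)
    {
        // Prints: end_timestamp >= 1234 AND begin_timestamp <= 5678
        System.out.println(remap(
                "\"msg.timestamp\" >= 1234 AND \"msg.timestamp\" <= 5678",
                "msg.timestamp", "begin_timestamp", "end_timestamp"));
    }
}
```

The rewritten predicate selects every split whose `[lowerBound, upperBound]` range could overlap the requested values, so it yields a superset of the true result, which is why the pruning is safe. The sketch emits unquoted metadata column names, following the method-level Javadoc. The class-level Javadoc example quotes them.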
|