Commit 1384a4f
feat(core): Add support for _file column (#1824)
## Which issue does this PR close?
- Closes #1766.
## What changes are included in this PR?
Integrates virtual field handling for the `_file` metadata column into
`RecordBatchTransformer` using a pre-computed constants map, eliminating
post-processing and duplicate lookups.
## Key Changes
**New `metadata_columns.rs` module**: Centralized utilities for metadata
columns
- Constants: `RESERVED_FIELD_ID_FILE`, `RESERVED_COL_NAME_FILE`
- Helper functions: `get_metadata_column_name()`,
`get_metadata_field_id()`, `is_metadata_field()`,
`is_metadata_column_name()`
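
As a rough illustration of what such a module could look like, here is a minimal sketch. The constant and function names come from the list above, but the signatures, the `Option` return types, and the numeric value of the reserved ID (taken from the Iceberg spec's reserved ID for `_file`, `Integer.MAX_VALUE - 1`) are assumptions, not the code in this commit.

```rust
/// Reserved field ID for the `_file` metadata column.
/// Assumption: follows the Iceberg spec's reserved ID (i32::MAX - 1).
pub const RESERVED_FIELD_ID_FILE: i32 = i32::MAX - 1;

/// Reserved column name for the file-path metadata column.
pub const RESERVED_COL_NAME_FILE: &str = "_file";

/// Maps a reserved field ID to its column name, if the ID is a known metadata field.
/// Hypothetical signature: the real helper may return a `Result` instead.
pub fn get_metadata_column_name(field_id: i32) -> Option<&'static str> {
    match field_id {
        RESERVED_FIELD_ID_FILE => Some(RESERVED_COL_NAME_FILE),
        _ => None,
    }
}

/// Maps a metadata column name to its reserved field ID, if the name is known.
pub fn get_metadata_field_id(column_name: &str) -> Option<i32> {
    match column_name {
        RESERVED_COL_NAME_FILE => Some(RESERVED_FIELD_ID_FILE),
        _ => None,
    }
}

/// Returns true if `field_id` belongs to a known metadata column.
pub fn is_metadata_field(field_id: i32) -> bool {
    get_metadata_column_name(field_id).is_some()
}

/// Returns true if `column_name` names a known metadata column.
pub fn is_metadata_column_name(column_name: &str) -> bool {
    get_metadata_field_id(column_name).is_some()
}
```

Keeping both lookups in one place means adding another metadata column later should only require a new constant and one extra match arm in each direction.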
**Enhanced `RecordBatchTransformer`**:
- Added `constant_fields: HashMap<i32, (DataType, PrimitiveLiteral)>` -
pre-computed during initialization
- New `with_constant()` method - computes Arrow type once during setup
- Updated to use pre-computed types and values (avoids duplicate
lookups)
- Handles `DataType::RunEndEncoded` for constant strings (memory
efficient)
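
The idea behind the constants map and the run-end encoding can be sketched with standard-library types only. Everything below is a simplified stand-in: `DataType`, `PrimitiveLiteral`, `ConstantFields`, `RunEndEncodedStrings`, and `constant_string_column` are hypothetical illustrations, not the Arrow or Iceberg types used in the commit.

```rust
use std::collections::HashMap;

/// Simplified stand-in for Arrow's `DataType` (only the variants needed here).
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    /// Run-end encoded: a single run can describe many rows.
    RunEndEncoded,
}

/// Simplified stand-in for Iceberg's `PrimitiveLiteral`.
#[derive(Debug, Clone, PartialEq)]
pub enum PrimitiveLiteral {
    Int(i32),
    String(String),
}

/// Run-end encoded string column: `run_ends[i]` is the exclusive end row of run `i`.
#[derive(Debug, PartialEq)]
pub struct RunEndEncodedStrings {
    pub run_ends: Vec<i32>,
    pub values: Vec<String>,
}

/// Constants keyed by field ID, with the output type computed once at setup.
#[derive(Default)]
pub struct ConstantFields {
    fields: HashMap<i32, (DataType, PrimitiveLiteral)>,
}

impl ConstantFields {
    /// Registers a constant and derives its output type here, not per batch.
    pub fn with_constant(mut self, field_id: i32, value: PrimitiveLiteral) -> Self {
        let data_type = match &value {
            PrimitiveLiteral::String(_) => DataType::RunEndEncoded,
            PrimitiveLiteral::Int(_) => DataType::Int32,
        };
        self.fields.insert(field_id, (data_type, value));
        self
    }

    /// Looks up the pre-computed type and value for a field ID.
    pub fn get(&self, field_id: i32) -> Option<&(DataType, PrimitiveLiteral)> {
        self.fields.get(&field_id)
    }
}

/// Encodes a constant string for `num_rows` rows as a single run,
/// so the string is stored once regardless of batch size.
pub fn constant_string_column(value: &str, num_rows: usize) -> RunEndEncodedStrings {
    if num_rows == 0 {
        return RunEndEncodedStrings { run_ends: vec![], values: vec![] };
    }
    RunEndEncodedStrings {
        run_ends: vec![num_rows as i32],
        values: vec![value.to_string()],
    }
}
```

In this sketch a 1024-row batch carrying the same file path holds one string and one run end, rather than 1024 copies of the path.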
**Simplified `reader.rs`**:
- Pass the full `project_field_ids` (including virtual fields) to
`RecordBatchTransformer`
- Single `with_constant()` call to register `_file` column
- Removed post-processing loop
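
To show why passing the full ID list removes the need for a post-processing step, here is a minimal sketch of a single-pass projection plan. `OutputColumn` and `plan_projection` are hypothetical names, the constants map is reduced to `HashMap<i32, String>`, and treating an unknown ID as an error is a simplification made for this example.

```rust
use std::collections::HashMap;

/// Where each projected output column comes from.
#[derive(Debug, PartialEq)]
pub enum OutputColumn {
    /// Column read from the data file, identified by field ID.
    FromFile(i32),
    /// Column filled with one constant value for every row.
    Constant(String),
}

/// Builds the output plan in the order the caller asked for.
/// Virtual fields are resolved from `constants`; all others must exist in the file.
pub fn plan_projection(
    project_field_ids: &[i32],
    file_field_ids: &[i32],
    constants: &HashMap<i32, String>,
) -> Result<Vec<OutputColumn>, String> {
    project_field_ids
        .iter()
        .map(|id| {
            if let Some(value) = constants.get(id) {
                Ok(OutputColumn::Constant(value.clone()))
            } else if file_field_ids.contains(id) {
                Ok(OutputColumn::FromFile(*id))
            } else {
                Err(format!("field id {id} not found in file or constants"))
            }
        })
        .collect()
}
```

Because the virtual field takes part in the same pass as regular fields, its position in the output follows directly from its position in `project_field_ids`.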
**Updated `scan/mod.rs`**:
- Use `is_metadata_column_name()` and `get_metadata_field_id()` instead
of hardcoded checks
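
A sketch of how a scan might resolve selected column names to field IDs through these helpers follows. The helper bodies, `resolve_field_ids`, and the `HashMap<&str, i32>` schema are all hypothetical simplifications for illustration; the real scan code works against the Iceberg schema types.

```rust
use std::collections::HashMap;

// Assumed values, redefined here so the example is self-contained.
const RESERVED_FIELD_ID_FILE: i32 = i32::MAX - 1;
const RESERVED_COL_NAME_FILE: &str = "_file";

fn get_metadata_field_id(name: &str) -> Option<i32> {
    if name == RESERVED_COL_NAME_FILE {
        Some(RESERVED_FIELD_ID_FILE)
    } else {
        None
    }
}

fn is_metadata_column_name(name: &str) -> bool {
    get_metadata_field_id(name).is_some()
}

/// Resolves selected column names to field IDs, preserving selection order.
/// Metadata columns go through the helpers; everything else uses the schema.
pub fn resolve_field_ids(
    selected: &[&str],
    schema: &HashMap<&str, i32>,
) -> Result<Vec<i32>, String> {
    selected
        .iter()
        .map(|&name| {
            if is_metadata_column_name(name) {
                get_metadata_field_id(name)
                    .ok_or_else(|| format!("unknown metadata column: {name}"))
            } else {
                schema
                    .get(name)
                    .copied()
                    .ok_or_else(|| format!("column not found in schema: {name}"))
            }
        })
        .collect()
}
```

The scan logic never compares against the literal `"_file"` string, so the set of metadata columns can grow without touching this function.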
## Are these changes tested?
Yes, comprehensive tests have been added to verify the functionality:
### New Tests (7 tests added)
#### Table Scan API Tests (7 tests)
1. **`test_select_with_file_column`** - Verifies basic functionality of
selecting `_file` with regular columns
2. **`test_select_file_column_position`** - Verifies column ordering is
preserved
3. **`test_select_file_column_only`** - Tests selecting only the `_file`
column
4. **`test_file_column_with_multiple_files`** - Tests multiple data
files scenario
5. **`test_file_column_at_start`** - Tests `_file` at position 0
6. **`test_file_column_at_end`** - Tests `_file` at the last position
7. **`test_select_with_repeated_column_names`** - Tests repeated column
selection

1 parent 26b9839 commit 1384a4f
File tree
7 files changed, +1031 -213 lines changed
- crates/iceberg/src
  - arrow
  - scan