@@ -359,7 +359,7 @@ datafusion.execution.parquet.metadata_size_hint NULL (reading) If specified, the
 datafusion.execution.parquet.pruning true (reading) If true, the parquet reader attempts to skip entire row groups based on the predicate in the query and the metadata (min/max values) stored in the parquet file
 datafusion.execution.parquet.pushdown_filters false (reading) If true, filter expressions are be applied during the parquet decoding operation to reduce the number of rows decoded. This optimization is sometimes called "late materialization".
 datafusion.execution.parquet.reorder_filters false (reading) If true, filter expressions evaluated during the parquet decoding operation will be reordered heuristically to minimize the cost of evaluation. If false, the filters are applied in the same order as written in the query
-datafusion.execution.parquet.schema_force_view_types false (reading) If true, parquet reader will read columns of `Utf8/Utf8Large` with `Utf8View`, and `Binary/BinaryLarge` with `BinaryView`.
+datafusion.execution.parquet.schema_force_view_types true (reading) If true, parquet reader will read columns of `Utf8/Utf8Large` with `Utf8View`, and `Binary/BinaryLarge` with `BinaryView`.
 datafusion.execution.parquet.skip_arrow_metadata false (writing) Skip encoding the embedded arrow metadata in the KV_meta This is analogous to the `ArrowWriterOptions::with_skip_arrow_metadata`. Refer to <https://docs.rs/parquet/53.3.0/parquet/arrow/arrow_writer/struct.ArrowWriterOptions.html#method.with_skip_arrow_metadata>
 datafusion.execution.parquet.skip_metadata true (reading) If true, the parquet reader skip the optional embedded metadata that may be in the file Schema. This setting can help avoid schema conflicts when querying multiple parquet files with schemas containing compatible types but different metadata
 datafusion.execution.parquet.statistics_enabled page (writing) Sets if statistics are enabled for any column Valid values are: "none", "chunk", and "page" These values are not case sensitive. If NULL, uses default parquet writer setting
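The hunk above flips the default of `datafusion.execution.parquet.schema_force_view_types` from `false` to `true`. As a minimal sketch (not part of this diff), a session that wants the earlier behavior could presumably set the option back explicitly. The table name `t`, the file path, and the column `str_col` are placeholders assumed for illustration only.

```sql
-- Sketch only: `t`, 'data.parquet', and `str_col` are hypothetical names.
-- Restore the previous default so string columns are not read as Utf8View.
SET datafusion.execution.parquet.schema_force_view_types = false;

CREATE EXTERNAL TABLE t
STORED AS PARQUET
LOCATION 'data.parquet';

-- arrow_typeof reports the Arrow type the reader produced for the column.
SELECT arrow_typeof(str_col) FROM t LIMIT 1;
```

With the option left at its new default, the same query would be expected to report `Utf8View` for a string column, in line with the updated expected outputs in the hunks that follow.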
# Run an explain plan to show the cast happens in the plan (a CAST is needed for the predicates)
 query TT
@@ -405,11 +405,11 @@ EXPLAIN
 binaryview_col LIKE '%a%';
 ----
 logical_plan
-01)Filter: CAST(binary_as_string_default.binary_col AS Utf8) LIKE Utf8("%a%") AND CAST(binary_as_string_default.largebinary_col AS Utf8) LIKE Utf8("%a%") AND CAST(binary_as_string_default.binaryview_col AS Utf8) LIKE Utf8("%a%")
-02)--TableScan: binary_as_string_default projection=[binary_col, largebinary_col, binaryview_col], partial_filters=[CAST(binary_as_string_default.binary_col AS Utf8) LIKE Utf8("%a%"), CAST(binary_as_string_default.largebinary_col AS Utf8) LIKE Utf8("%a%"), CAST(binary_as_string_default.binaryview_col AS Utf8) LIKE Utf8("%a%")]
+01)Filter: CAST(binary_as_string_default.binary_col AS Utf8View) LIKE Utf8View("%a%") AND CAST(binary_as_string_default.largebinary_col AS Utf8View) LIKE Utf8View("%a%") AND CAST(binary_as_string_default.binaryview_col AS Utf8View) LIKE Utf8View("%a%")
+02)--TableScan: binary_as_string_default projection=[binary_col, largebinary_col, binaryview_col], partial_filters=[CAST(binary_as_string_default.binary_col AS Utf8View) LIKE Utf8View("%a%"), CAST(binary_as_string_default.largebinary_col AS Utf8View) LIKE Utf8View("%a%"), CAST(binary_as_string_default.binaryview_col AS Utf8View) LIKE Utf8View("%a%")]
 physical_plan
 01)CoalesceBatchesExec: target_batch_size=8192
-02)--FilterExec: CAST(binary_col@0 AS Utf8) LIKE %a% AND CAST(largebinary_col@1 AS Utf8) LIKE %a% AND CAST(binaryview_col@2 AS Utf8) LIKE %a%
+02)--FilterExec: CAST(binary_col@0 AS Utf8View) LIKE %a% AND CAST(largebinary_col@1 AS Utf8View) LIKE %a% AND CAST(binaryview_col@2 AS Utf8View) LIKE %a%
 04)------DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/parquet/binary_as_string.parquet]]}, projection=[binary_col, largebinary_col, binaryview_col], file_type=parquet, predicate=CAST(binary_col@0 AS Utf8View) LIKE %a% AND CAST(largebinary_col@1 AS Utf8View) LIKE %a% AND CAST(binaryview_col@2 AS Utf8View) LIKE %a%
@@ -432,15 +432,15 @@ select
 arrow_typeof(binaryview_col), binaryview_col
 FROM binary_as_string_option;
 ----
-Utf8 aaa Utf8 aaa Utf8 aaa
-Utf8 bbb Utf8 bbb Utf8 bbb
-Utf8 ccc Utf8 ccc Utf8 ccc
-Utf8 ddd Utf8 ddd Utf8 ddd
-Utf8 eee Utf8 eee Utf8 eee
-Utf8 fff Utf8 fff Utf8 fff
-Utf8 ggg Utf8 ggg Utf8 ggg
-Utf8 hhh Utf8 hhh Utf8 hhh
-Utf8 iii Utf8 iii Utf8 iii
+Utf8View aaa Utf8View aaa Utf8View aaa
+Utf8View bbb Utf8View bbb Utf8View bbb
+Utf8View ccc Utf8View ccc Utf8View ccc
+Utf8View ddd Utf8View ddd Utf8View ddd
+Utf8View eee Utf8View eee Utf8View eee
+Utf8View fff Utf8View fff Utf8View fff
+Utf8View ggg Utf8View ggg Utf8View ggg
+Utf8View hhh Utf8View hhh Utf8View hhh
+Utf8View iii Utf8View iii Utf8View iii

 # Run an explain plan to show the cast happens in the plan (there should be no casts)
 query TT
@@ -453,8 +453,8 @@ EXPLAIN
 binaryview_col LIKE '%a%';
 ----
 logical_plan
-01)Filter: binary_as_string_option.binary_col LIKE Utf8("%a%") AND binary_as_string_option.largebinary_col LIKE Utf8("%a%") AND binary_as_string_option.binaryview_col LIKE Utf8("%a%")
-02)--TableScan: binary_as_string_option projection=[binary_col, largebinary_col, binaryview_col], partial_filters=[binary_as_string_option.binary_col LIKE Utf8("%a%"), binary_as_string_option.largebinary_col LIKE Utf8("%a%"), binary_as_string_option.binaryview_col LIKE Utf8("%a%")]
+01)Filter: binary_as_string_option.binary_col LIKE Utf8View("%a%") AND binary_as_string_option.largebinary_col LIKE Utf8View("%a%") AND binary_as_string_option.binaryview_col LIKE Utf8View("%a%")
+02)--TableScan: binary_as_string_option projection=[binary_col, largebinary_col, binaryview_col], partial_filters=[binary_as_string_option.binary_col LIKE Utf8View("%a%"), binary_as_string_option.largebinary_col LIKE Utf8View("%a%"), binary_as_string_option.binaryview_col LIKE Utf8View("%a%")]
 physical_plan
 01)CoalesceBatchesExec: target_batch_size=8192
 02)--FilterExec: binary_col@0 LIKE %a% AND largebinary_col@1 LIKE %a% AND binaryview_col@2 LIKE %a%
# date_col > '2023-01-01' AND date_col > '2023-02-01' should simplify to date_col > '2023-02-01'
@@ -120,7 +120,7 @@ WHERE int_col > 5
 AND float_col BETWEEN 1 AND 100;
 ----
 logical_plan
-01)Filter: test_data.str_col LIKE Utf8("A%") AND test_data.float_col >= Float32(1) AND test_data.float_col <= Float32(100) AND test_data.int_col > Int32(10)
+01)Filter: test_data.str_col LIKE Utf8View("A%") AND test_data.float_col >= Float32(1) AND test_data.float_col <= Float32(100) AND test_data.int_col > Int32(10)