[Feature Request]: Enable Nested Column support in Column Pruning in Iceberg IO #37486

@barunkumaracharya

Description

What would you like to happen?

This PR enabled column pruning support in Managed Iceberg IO.
As users, we can now use the "keep" / "drop" properties to choose which columns to fetch from Iceberg.
Is it possible to specify a nested column in these properties, so that only the nested field is fetched from Iceberg rather than the entire struct? If not, can that be enabled in Managed Iceberg IO in Apache Beam?

Something like this:

  • keep: ["colA.colB", "colE.colC"]
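To make the requested behavior concrete, here is a minimal sketch of what pruning by dotted paths could mean. It is plain Python over a nested dict and is not Beam or Iceberg code: the `prune_schema` helper, the dict-based schema, and the field names `colD`, `colF`, and `id` are all hypothetical and exist only for illustration.

```python
import copy


def prune_schema(schema, keep):
    """Return a new nested dict holding only the dotted paths listed in `keep`.

    `schema` is a hypothetical stand-in for a table schema: struct columns are
    dicts, leaf columns are type-name strings.
    """
    pruned = {}
    for path in keep:
        src, dst = schema, pruned
        parts = path.split(".")
        for i, name in enumerate(parts):
            if not isinstance(src, dict) or name not in src:
                raise ValueError(f"unknown field: {path}")
            if i == len(parts) - 1:
                # Last segment: copy the selected field (leaf or whole struct).
                dst[name] = copy.deepcopy(src[name])
            else:
                # Intermediate segment: descend, creating the parent struct.
                dst = dst.setdefault(name, {})
                src = src[name]
    return pruned


# Hypothetical table layout used only for this example.
table_schema = {
    "colA": {"colB": "string", "colD": "int"},
    "colE": {"colC": "long", "colF": "double"},
    "id": "long",
}

projected = prune_schema(table_schema, ["colA.colB", "colE.colC"])
print(projected)
```

With the `keep` list above, the projection retains `colA` and `colE` as structs but with only `colB` and `colC` inside them, which is the result this request is asking for.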

When I tried the above, I got the following error:
26/02/03 11:34:23 ERROR IcebergIO: Error reading from Iceberg tables: Invalid source configuration: 'keep' specifies unknown field(s): [data.entry.field1.field2]
java.lang.IllegalArgumentException: Invalid source configuration: 'keep' specifies unknown field(s): [data.entry.field1.field2]
    at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:143)
    at org.apache.beam.sdk.io.iceberg.IcebergScanConfig.validate(IcebergScanConfig.java:332)
    at org.apache.beam.sdk.io.iceberg.IcebergIO$ReadRows.expand(IcebergIO.java:608)
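The error is consistent with a validation step that compares each `keep` entry against top-level column names only. That is my assumption from the message, not something I have confirmed in `IcebergScanConfig.validate`. The sketch below contrasts such a check with one that walks dotted paths. Both functions and the dict-based schema are hypothetical illustrations, not the connector's code.

```python
def unknown_top_level(schema, keep):
    """Flag every entry that is not a top-level column name."""
    return [field for field in keep if field not in schema]


def unknown_nested(schema, keep):
    """Flag only entries whose dotted path cannot be resolved in the schema."""
    unknown = []
    for path in keep:
        node = schema
        for name in path.split("."):
            if not isinstance(node, dict) or name not in node:
                unknown.append(path)
                break
            node = node[name]
    return unknown


# Hypothetical schema shaped like the path in the error message.
schema = {"data": {"entry": {"field1": {"field2": "string"}}}, "id": "long"}
keep = ["data.entry.field1.field2"]

rejected_flat = unknown_top_level(schema, keep)
rejected_nested = unknown_nested(schema, keep)
print(rejected_flat)
print(rejected_nested)
```

On the Java side, Iceberg's `Schema.findField(String)` resolves dotted names, so a nested-aware check could plausibly delegate to it. Whether that fits the connector's validation and read path would need to be confirmed by the maintainers.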

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
