Skip to content

[Bug]: Varchar logical type fails Iceberg Pipelines #36864

@tarun-google

Description

@tarun-google

What happened?

While writing a pipeline for Postgres->Iceberg the pipeline fails when the table contains VARCHAR fields. This is because it is not added as part of BeamSchemaLogical type here. Expectation of this Bug is to review more such JDBC Datatypes that needs to be supported for JDBC->Iceberg and fix them.

Root cause: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: Unsupported Beam logical type beam:logical_type:var_char:v1 at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39) at org.apache.beam.sdk.io.iceberg.WriteUngroupedRowsToFiles$WriteUngroupedRowsToFilesDoFn$DoFnInvoker.invokeProcessElement(Unknown Source) at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForWindowObservingParDo(FnApiDoFnRunner.java:639) at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:349) at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:276) at org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1319) at org.apache.beam.fn.harness.FnApiDoFnRunner.access$2700(FnApiDoFnRunner.java:139) at org.apache.beam.fn.harness.FnApiDoFnRunner$WindowObservingProcessBundleContext.output(FnApiDoFnRunner.java:1729) at org.apache.beam.sdk.io.iceberg.AssignDestinations$1.processElement(AssignDestinations.java:66) at org.apache.beam.sdk.io.iceberg.AssignDestinations$1$DoFnInvoker.invokeProcessElement(Unknown Source) at org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForWindowObservingParDo(FnApiDoFnRunner.java:639) at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:349) at org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:276) at org.apache.beam.fn.harness.BeamFnDataReadRunner.forwardElementToConsumer(BeamFnDataReadRunner.java:228) at org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.multiplexElements(BeamFnDataInboundObserver.java:232) at org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:527) at org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:150) at org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:115) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at org.apache.beam.sdk.util.UnboundedScheduledExecutorService$ScheduledFutureTask.run(UnboundedScheduledExecutorService.java:163) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: java.lang.RuntimeException: Unsupported Beam logical type beam:logical_type:var_char:v1 at org.apache.beam.sdk.io.iceberg.IcebergUtils.beamFieldTypeToIcebergFieldType(IcebergUtils.java:182) at org.apache.beam.sdk.io.iceberg.IcebergUtils.beamSchemaToIcebergSchema(IcebergUtils.java:268) at org.apache.beam.sdk.io.iceberg.RecordWriterManager.getOrCreateTable(RecordWriterManager.java:321) at org.apache.beam.sdk.io.iceberg.RecordWriterManager.lambda$write$0(RecordWriterManager.java:354) at java.base/java.util.HashMap.computeIfAbsent(HashMap.java:1134) at org.apache.beam.sdk.io.iceberg.RecordWriterManager.write(RecordWriterManager.java:350) at org.apache.beam.sdk.io.iceberg.WriteUngroupedRowsToFiles$WriteUngroupedRowsToFilesDoFn.processElement(WriteUngroupedRowsToFiles.java:244)

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions