
Add Instrumentation for Apache Iceberg #15113

@ghareeb-falazi


Is your feature request related to a problem? Please describe.

Apache Iceberg is a table format for data lakes. Iceberg provides an SDK that makes it easy for engines to implement its specifications. The SDK also emits metrics about table scan planning and commit handling (see: https://iceberg.apache.org/docs/nightly/metrics-reporting/).
This feature would introduce OpenTelemetry instrumentation for the Apache Iceberg SDK.

Describe the solution you'd like

[Figure: how the Iceberg SDK handles metrics reporting]

The figure shows how the Iceberg SDK handles metrics reporting. Custom metrics reporters can be associated with TableScan objects at runtime, where they consume the emitted metrics (MetricsReport). The desired solution would allow database engines that use the Iceberg SDK to consume these metrics and convert them into OpenTelemetry metrics.
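To make the idea concrete, here is a minimal sketch of the adapter such an instrumentation could be built around. It deliberately avoids both the Iceberg and OpenTelemetry dependencies so that it runs on its own: `MetricsReport`, `MetricsReporter`, and `ScanReport` are simplified stand-ins loosely modeled on the Iceberg types of the same name, while `CounterSink`, `InMemorySink`, `OtelStyleReporter`, and the metric names are hypothetical and only illustrate where an OpenTelemetry counter would sit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in (assumption) for the report type emitted by the SDK.
interface MetricsReport {}

// Stand-in (assumption) for the reporter callback that consumes reports.
interface MetricsReporter {
    void report(MetricsReport report);
}

// Simplified scan report; a real report carries many more fields.
final class ScanReport implements MetricsReport {
    final String tableName;
    final long resultDataFiles;

    ScanReport(String tableName, long resultDataFiles) {
        this.tableName = tableName;
        this.resultDataFiles = resultDataFiles;
    }
}

// Hypothetical stand-in for an OpenTelemetry-style counter with a table attribute.
interface CounterSink {
    void add(String metricName, String table, long value);
}

// In-memory sink so the sketch runs without an OpenTelemetry SDK.
final class InMemorySink implements CounterSink {
    private final Map<String, AtomicLong> totals = new ConcurrentHashMap<>();

    private static String key(String metricName, String table) {
        return metricName + "{table=" + table + "}";
    }

    @Override
    public void add(String metricName, String table, long value) {
        totals.computeIfAbsent(key(metricName, table), k -> new AtomicLong())
              .addAndGet(value);
    }

    long total(String metricName, String table) {
        AtomicLong value = totals.get(key(metricName, table));
        return value == null ? 0L : value.get();
    }
}

// The adapter: receives reports and forwards selected values as counters.
final class OtelStyleReporter implements MetricsReporter {
    private final CounterSink sink;

    OtelStyleReporter(CounterSink sink) {
        this.sink = sink;
    }

    @Override
    public void report(MetricsReport report) {
        if (report instanceof ScanReport) {
            ScanReport scan = (ScanReport) report;
            // Metric names below are illustrative, not an agreed convention.
            sink.add("iceberg.scan.count", scan.tableName, 1L);
            sink.add("iceberg.scan.result_data_files", scan.tableName, scan.resultDataFiles);
        }
        // Other report kinds (for example commit reports) are ignored in this sketch.
    }
}

final class IcebergOtelSketch {
    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        MetricsReporter reporter = new OtelStyleReporter(sink);
        reporter.report(new ScanReport("db.events", 12));
        reporter.report(new ScanReport("db.events", 3));
        System.out.println(sink.total("iceberg.scan.result_data_files", "db.events"));
    }
}
```

In an actual instrumentation, the sink would presumably be backed by OpenTelemetry instruments, and the reporter would be attached to scans by the instrumentation rather than by the engine's own code. Durations would likely be better recorded as histograms than as counters.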

Describe alternatives you've considered

Adding an OpenTelemetry-based metrics reporter directly to the Iceberg SDK. However, this would not benefit SDK versions released before such an addition.

Additional context

Several database engines that implement the Iceberg specs already use OpenTelemetry, e.g., Snowflake, Trino, Starburst, and Doris. These engines, and others, would benefit from the proposed feature.



Labels: enhancement, needs triage
