Avoid duplicate bucket ID projection in native write paths

## Overview
Native write paths in Spark 3.2/3.3 and the ClickHouse MergeTree writer
recompute the `__bucket_value__` expression even when a precomputed
attribute already exists. This adds unnecessary overhead and complicates
downstream projections.

## Steps to Reproduce
- Write bucketed data using the native writer.
- Inspect the execution plan; the bucket ID is projected multiple times
  instead of reusing a single attribute.

## Expected Behavior
The bucket ID should be computed once (e.g., in an initial `ProjectExec`)
and reused by subsequent stages.
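
The compute-once behavior can be illustrated with a toy model. This is only a sketch under stated assumptions: rows are plain dictionaries, `project_bucket_once` and `partition_by_bucket` are hypothetical helpers, and `id % 4` stands in for the real bucket expression. None of it is the actual Spark or native-writer code.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]
BUCKET_COL = "__bucket_value__"  # column name taken from this issue

def project_bucket_once(rows: List[Row], bucket_fn: Callable[[Row], int]) -> List[Row]:
    """Initial projection: evaluate the bucket expression once per row."""
    return [{**row, BUCKET_COL: bucket_fn(row)} for row in rows]

def partition_by_bucket(rows: List[Row]) -> Dict[int, List[Row]]:
    """Downstream stage: read the stored attribute, never re-evaluate it."""
    buckets: Dict[int, List[Row]] = {}
    for row in rows:
        buckets.setdefault(row[BUCKET_COL], []).append(row)
    return buckets

calls = {"n": 0}  # counts how often the bucket expression is evaluated

def bucket_of(row: Row) -> int:
    # Stand-in for the real bucket expression (an assumption for illustration).
    calls["n"] += 1
    return row["id"] % 4

rows = [{"id": i} for i in range(8)]
projected = project_bucket_once(rows, bucket_of)
buckets = partition_by_bucket(projected)
```

In this model the evaluation count equals the row count, however many downstream stages read `__bucket_value__`.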

## Actual Behavior
Every stage re-evaluates the bucket expression, leading to redundant
projections and performance overhead.

## Impact
- Increased CPU time for bucketed writes.
- Harder-to-read execution plans with repetitive projections.

## Proposed Fix
- Guard the projections in the Spark 3.2/3.3 shims and the ClickHouse
  MergeTree writer so they append `__bucket_value__` only when the input
  does not already provide it.
- Store the computed bucket ID in an attribute and reuse it downstream.
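
The guard described above could look roughly like the following. It is a minimal sketch, not the shim code: the projection list is modeled as `(name, expression)` pairs, and `ensure_bucket_column` and the sample bucket expression string are hypothetical names introduced for illustration.

```python
from typing import List, Tuple

Column = Tuple[str, str]  # (output name, expression text); a simplified model

BUCKET_COL = "__bucket_value__"  # column name taken from this issue

def ensure_bucket_column(columns: List[Column], bucket_expr: str) -> List[Column]:
    """Append the bucket column only when no column with that name exists."""
    if any(name == BUCKET_COL for name, _ in columns):
        # Already projected upstream: reuse it and leave the list unchanged.
        return list(columns)
    return list(columns) + [(BUCKET_COL, bucket_expr)]

# Hypothetical plan output and bucket expression, for illustration only.
plan = [("id", "id"), ("name", "name")]
expr = "pmod(hash(id), 8)"

once = ensure_bucket_column(plan, expr)
twice = ensure_bucket_column(once, expr)  # second call is a no-op
```

Because the guard is idempotent, applying it in more than one write path cannot produce a second `__bucket_value__` projection.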

Avoid duplicate bucket ID projection in native write paths #10359