[Bug] ArrowFieldWriters.TimeWriter ignores startIndex and reads incorrect rows #6767

@guluo2016

Search before asking

  • I searched in the issues and found nothing similar.

Paimon version

master

Compute Engine

None

Minimal reproduce step

Description:
In ArrowFieldWriters.TimeWriter, the doWrite method uses the wrong index, i, instead of row when reading from the source columnVector, so startIndex is effectively ignored.

Impact:
When startIndex > 0, the writer reads incorrect values from the source column vector, causing data corruption in the Arrow output.

Reproduction:

public static void main(String[] args) {
        RowType rowType = new RowType(
                Collections.singletonList(new DataField(0, "time_field", DataTypes.TIME())));

        int startIndex = 2;
        final int batchRows = 3;

        try (RootAllocator allocator = new RootAllocator();
             VectorSchemaRoot vsr = ArrowUtils.createVectorSchemaRoot(rowType, allocator)) {

            ArrowFieldWriter[] fieldWriters = ArrowUtils.createArrowFieldWriters(vsr, rowType);

            IntColumnVector timeVec = new IntColumnVector() {
                final int[] values = new int[] {0, 1000, 2000, 3000, 4000};
                @Override public int getInt(int i) { return values[i]; }
                @Override public boolean isNullAt(int i) { return false; }
            };

            fieldWriters[0].write(timeVec, null, startIndex, batchRows);
            vsr.setRowCount(batchRows);

            TimeMilliVector timeMilliVector = (TimeMilliVector) vsr.getVector("time_field");

            for (int i = 0; i < batchRows; i++) {
                System.out.printf("paimonValue: %s, <--> arrowValue: %s%n",
                        timeVec.getInt(i + startIndex), timeMilliVector.get(i));
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

Print output:

paimonValue: 2000, <--> arrowValue: 0
paimonValue: 3000, <--> arrowValue: 1000
paimonValue: 4000, <--> arrowValue: 2000
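
The shifted output above is what you would expect if the source were read at the loop index i while startIndex is ignored. Here is a minimal, dependency-free sketch of that pattern, using plain int arrays in place of the Paimon column vector and the Arrow vector. The class and method names (StartIndexSketch, copyIgnoringStartIndex, copyFromStartIndex) are hypothetical and only illustrate the indexing; this is not the actual TimeWriter code.

```java
import java.util.Arrays;

public class StartIndexSketch {

    // Mirrors the behavior described in this issue: the source is read at the
    // loop index i, so startIndex has no effect on which rows are copied.
    static int[] copyIgnoringStartIndex(int[] source, int startIndex, int batchRows) {
        int[] out = new int[batchRows];
        for (int i = 0; i < batchRows; i++) {
            out[i] = source[i];
        }
        return out;
    }

    // Reads the source at row = startIndex + i, while still writing to
    // output position i. (Assumption: no row selection array is involved.)
    static int[] copyFromStartIndex(int[] source, int startIndex, int batchRows) {
        int[] out = new int[batchRows];
        for (int i = 0; i < batchRows; i++) {
            int row = startIndex + i;
            out[i] = source[row];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {0, 1000, 2000, 3000, 4000};
        // Same inputs as the reproduction: startIndex = 2, batchRows = 3.
        System.out.println(Arrays.toString(copyIgnoringStartIndex(values, 2, 3)));
        System.out.println(Arrays.toString(copyFromStartIndex(values, 2, 3)));
    }
}
```

With the inputs from the reproduction, the first call yields [0, 1000, 2000], matching the arrowValue column above, and the second yields [2000, 3000, 4000], matching the paimonValue column.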

What doesn't meet your expectations?

None

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
