feat: [presto][iceberg] Add Iceberg V3 deletion vector write path with DV page sink and compaction procedure #27395

Open
apurva-meta wants to merge 5 commits into prestodb:master from apurva-meta:export-D97531549

Conversation


@apurva-meta apurva-meta commented Mar 21, 2026

Summary:

  • Add IcebergDeletionVectorPageSink for writing DV files during table maintenance
  • Add RewriteDeleteFilesProcedure for DV compaction
  • Wire DV page sink through IcebergCommonModule, IcebergAbstractMetadata, IcebergPageSourceProvider
  • Add IcebergUpdateablePageSource for DV-aware page source
  • Update CommitTaskData, IcebergUtil for DV support
  • Add test coverage in TestIcebergV3

== RELEASE NOTES ==

General Changes
  • Upgrade Apache Iceberg library from 1.10.0 to 1.10.1.

Hive Connector Changes
  • Add Iceberg V3 deletion vector (DV) support using Puffin-encoded roaring bitmaps, including a DV reader, writer, page sink, and compaction procedure.
  • Add an Iceberg equality delete file reader with sequence number conflict resolution per the Iceberg V2+ spec: equality deletes skip when deleteFileSeqNum <= dataFileSeqNum; positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum; sequence number 0 (V1 legacy) never skips.
  • Wire dataSequenceNumber through the Presto protocol layer (Java → C++) to enable server-side sequence number conflict resolution for all delete file types.
  • Add PUFFIN file format support for deletion vector discovery, enabling the coordinator to locate DV files during split creation.
  • Add Iceberg V3 deletion vector write path with DV page sink and rewrite_delete_files compaction procedure for DV maintenance.
  • Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 tables.
  • Add Variant type support for Iceberg V3, enabling semi-structured data columns in Iceberg tables.
  • Eagerly collect delete files during split creation with improved logging for easier debugging of Iceberg delete file resolution.
  • Improve IcebergSplitReader error handling and fix test file handle leaks.
  • Add end-to-end integration tests for Iceberg V3 covering the snapshot lifecycle (INSERT, DELETE with equality/positional/DV deletes, UPDATE, MERGE, time-travel) and all 99 TPC-DS queries.
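The sequence-number rules above can be collapsed into a single predicate. The sketch below is an illustrative model of the stated skip rules only; the class and method names are hypothetical, not the PR's actual API.

```java
// Hypothetical predicate modeling the skip rules quoted in the release
// notes above; names are illustrative, not the PR's actual code.
public final class DeleteFileApplicability
{
    private DeleteFileApplicability() {}

    public static boolean shouldApply(long deleteFileSeqNum, long dataFileSeqNum, boolean isEqualityDelete)
    {
        if (deleteFileSeqNum == 0) {
            // Sequence number 0 (V1 legacy) never skips: always apply.
            return true;
        }
        if (isEqualityDelete) {
            // Equality deletes skip when deleteFileSeqNum <= dataFileSeqNum.
            return deleteFileSeqNum > dataFileSeqNum;
        }
        // Positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum.
        return deleteFileSeqNum >= dataFileSeqNum;
    }
}
```

Note the asymmetry: a positional delete or DV written at the same sequence number as the data file still applies, while an equality delete at the same sequence number does not.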

Differential Revision: D97531549

@apurva-meta apurva-meta requested review from a team, ZacBlanco and hantangwangd as code owners March 21, 2026 06:49

sourcery-ai bot commented Mar 21, 2026

Reviewer's Guide

Implements Iceberg V3 deletion vector (DV) write and compaction support across Presto Java and native (Velox) stacks: adds a Puffin-based DV page sink and DV-aware metadata/commit path, wires DV and dataSequenceNumber through split/protocol handling, enables V3 row-level operations and default values, introduces a DV compaction procedure, and expands V3 integration tests around DVs, schema evolution, and partition transforms.

Sequence diagram for Iceberg V3 deletion vector write path

sequenceDiagram
    actor User
    participant Coordinator as PrestoCoordinator
    participant PageSourceProvider as IcebergPageSourceProvider
    participant UpdateableSource as IcebergUpdateablePageSource
    participant DVPageSink as IcebergDeletionVectorPageSink
    participant Metadata as IcebergAbstractMetadata
    participant Iceberg as IcebergTable

    User->>Coordinator: Execute DELETE on Iceberg v3 table
    Coordinator->>PageSourceProvider: createPageSource(tableFormatVersion=3)
    PageSourceProvider->>PageSourceProvider: detect format-version >= 3
    PageSourceProvider->>PageSourceProvider: create deleteSinkSupplier() -> IcebergDeletionVectorPageSink
    PageSourceProvider-->>Coordinator: IcebergUpdateablePageSource with deleteSinkSupplier

    loop during delete execution
        Coordinator->>UpdateableSource: updateRows/deleteRows
        UpdateableSource->>DVPageSink: appendPage(Page rowIds)
        DVPageSink->>DVPageSink: collect row positions
    end

    Coordinator->>UpdateableSource: finish()
    UpdateableSource->>DVPageSink: finish()
    DVPageSink->>DVPageSink: sort positions and serialize Roaring bitmap
    DVPageSink->>DVPageSink: write Puffin DV file via PuffinWriter
    DVPageSink-->>UpdateableSource: CommitTaskData JSON (FileFormat=PUFFIN, content=POSITION_DELETES, contentOffset, contentSizeInBytes, recordCount, referencedDataFile)

    UpdateableSource-->>Coordinator: fragments containing CommitTaskData
    Coordinator->>Metadata: finishDeleteWithOutput(fragments)
    Metadata->>Metadata: decode CommitTaskData
    Metadata->>Metadata: if FileFormat=PUFFIN
    Metadata->>Metadata: set recordCount, contentOffset, contentSizeInBytes
    Metadata->>Iceberg: FileMetadata.deleteFileBuilder()
    Iceberg-->>Metadata: DeleteFile (format=PUFFIN, content=POSITION_DELETES with DV metadata)
    Metadata->>Iceberg: commit delete file to table snapshot
    Iceberg-->>Coordinator: commit success
    Coordinator-->>User: DELETE completed

Class diagram for Iceberg V3 deletion vector page sink and commit metadata

classDiagram
    class CommitTaskData {
        +String path
        +long fileSizeInBytes
        +MetricsWrapper metrics
        +int partitionSpecId
        +Optional~String~ partitionDataJson
        +FileFormat fileFormat
        +Optional~String~ referencedDataFile
        +FileContent content
        +OptionalLong contentOffset
        +OptionalLong contentSizeInBytes
        +OptionalLong recordCount
        +CommitTaskData(path, fileSizeInBytes, metrics, partitionSpecId, partitionDataJson, fileFormat, referencedDataFile, content, contentOffset, contentSizeInBytes, recordCount)
        +CommitTaskData(path, fileSizeInBytes, metrics, partitionSpecId, partitionDataJson, fileFormat, referencedDataFile, content)
        +String getPath()
        +long getFileSizeInBytes()
        +MetricsWrapper getMetrics()
        +int getPartitionSpecId()
        +Optional~String~ getPartitionDataJson()
        +FileFormat getFileFormat()
        +Optional~String~ getReferencedDataFile()
        +FileContent getContent()
        +OptionalLong getContentOffset()
        +OptionalLong getContentSizeInBytes()
        +OptionalLong getRecordCount()
    }

    class IcebergDeletionVectorPageSink {
        -PartitionSpec partitionSpec
        -Optional~PartitionData~ partitionData
        -LocationProvider locationProvider
        -HdfsEnvironment hdfsEnvironment
        -HdfsContext hdfsContext
        -JsonCodec~CommitTaskData~ jsonCodec
        -ConnectorSession session
        -String dataFile
        -List~Integer~ collectedPositions
        +IcebergDeletionVectorPageSink(partitionSpec, partitionDataAsJson, locationProvider, hdfsEnvironment, hdfsContext, jsonCodec, session, dataFile)
        +long getCompletedBytes()
        +long getSystemMemoryUsage()
        +long getValidationCpuNanos()
        +CompletableFuture appendPage(Page page)
        +CompletableFuture~Collection~ finish()
        +void abort()
        -static byte[] serializeRoaringBitmap(List~Integer~ sortedPositions)
    }

    class RewriteDeleteFilesProcedure {
        -IcebergMetadataFactory metadataFactory
        +RewriteDeleteFilesProcedure(IcebergMetadataFactory metadataFactory)
        +Procedure get()
        +void rewriteDeleteFiles(ConnectorSession clientSession, String schemaName, String tableName)
        -void readDeletionVectorPositions(Table table, DeleteFile dv, Set~Integer~ positions)
        -DeleteFile writeMergedDeletionVector(Table table, DeleteFile templateDv, String dataFilePath, Set~Integer~ mergedPositions)
        -static void deserializeRoaringBitmap(ByteBuffer buffer, Set~Integer~ positions)
        -static byte[] serializeRoaringBitmap(List~Integer~ sortedPositions)
    }

    class DeleteFile {
        +FileContent content
        +String path
        +FileFormat format
        +long recordCount
        +long fileSizeInBytes
        +List~Integer~ equalityFieldIds
        +Map~Integer, byte[]~ lowerBounds
        +Map~Integer, byte[]~ upperBounds
        +long dataSequenceNumber
        +static DeleteFile fromIceberg(org.apache.iceberg.DeleteFile deleteFile)
        +DeleteFile(content, path, format, recordCount, fileSizeInBytes, equalityFieldIds, lowerBounds, upperBounds, dataSequenceNumber)
        +FileContent getContent()
        +String getPath()
        +FileFormat getFormat()
        +long getRecordCount()
        +long getFileSizeInBytes()
        +List~Integer~ getEqualityFieldIds()
        +Map~Integer, byte[]~ getLowerBounds()
        +Map~Integer, byte[]~ getUpperBounds()
        +long getDataSequenceNumber()
        +String toString()
    }

    class IcebergAbstractMetadata {
        +Optional~ConnectorOutputMetadata~ finishDeleteWithOutput(ConnectorSession session, ConnectorTableHandle tableHandle, List~ColumnHandle~ columns, List~Slice~ fragments)
    }

    class IcebergPageSourceProvider {
        +ConnectorPageSource createPageSource(ConnectorTransactionHandle transaction, ConnectorSession session, ConnectorSplit split, ConnectorTableHandle table, List~ColumnHandle~ columns, SplitContext splitContext)
    }

    class IcebergUpdateablePageSource {
        -Supplier~ConnectorPageSink~ deleteSinkSupplier
        -ConnectorPageSink positionDeleteSink
        +IcebergUpdateablePageSource(ConnectorPageSource delegate, List~IcebergColumnHandle~ delegateColumns, Supplier~ConnectorPageSink~ deleteSinkSupplier, Supplier~Optional~RowPredicate~~ deletePredicate, Supplier~List~DeleteFilter~~ deleteFilters, Supplier~IcebergPageSink~ updatedRowPageSinkSupplier, ConnectorSession session)
        +CompletableFuture~Collection~ finish()
    }

    IcebergDeletionVectorPageSink ..> CommitTaskData : creates
    IcebergDeletionVectorPageSink ..> PartitionSpec
    IcebergDeletionVectorPageSink ..> PartitionData
    IcebergDeletionVectorPageSink ..|> ConnectorPageSink

    IcebergUpdateablePageSource ..> ConnectorPageSink : deleteSinkSupplier
    IcebergUpdateablePageSource o--> IcebergDeletionVectorPageSink : for formatVersion>=3

    IcebergPageSourceProvider ..> IcebergDeletionVectorPageSink : constructs for v3 tables
    IcebergPageSourceProvider ..> IcebergUpdateablePageSource

    IcebergAbstractMetadata ..> CommitTaskData : consumes
    IcebergAbstractMetadata ..> DeleteFile : builds Iceberg delete files

    RewriteDeleteFilesProcedure ..|> Provider~Procedure~
    RewriteDeleteFilesProcedure ..> IcebergMetadataFactory
    RewriteDeleteFilesProcedure ..> IcebergAbstractMetadata
    RewriteDeleteFilesProcedure ..> DeleteFile

    DeleteFile ..> FileContent
    DeleteFile ..> FileFormat

    CommitTaskData ..> FileFormat
    CommitTaskData ..> FileContent
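The compaction core attributed to RewriteDeleteFilesProcedure above — group DVs by referenced data file and merge their position sets — can be sketched with plain collections. The types here are illustrative stand-ins for already-deserialized DV bitmaps, not the PR's classes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy sketch of DV compaction grouping: union all deletion-vector position
// sets that reference the same data file, so each data file ends up with a
// single merged DV. Input/output types are illustrative, not the PR's.
public final class DvMergeSketch
{
    private DvMergeSketch() {}

    public static Map<String, SortedSet<Integer>> mergeByDataFile(Map<String, List<Set<Integer>>> dvsByDataFile)
    {
        Map<String, SortedSet<Integer>> merged = new HashMap<>();
        dvsByDataFile.forEach((dataFile, bitmaps) -> {
            SortedSet<Integer> union = new TreeSet<>();  // sorted, duplicate-free positions
            bitmaps.forEach(union::addAll);
            merged.put(dataFile, union);
        });
        return merged;
    }
}
```

The sorted set keeps positions in the ascending order a roaring-bitmap serializer would need downstream.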

File-Level Changes

Change Details Files
Add Puffin-based deletion vector page sink and plumb DV-specific metadata into the commit path for V3 tables.
  • Introduce IcebergDeletionVectorPageSink that collects row positions, serializes them into a Roaring bitmap in a Puffin blob, writes a .puffin file, and emits CommitTaskData with DV metadata (content offset, size, record count, referenced data file).
  • Extend CommitTaskData with optional contentOffset, contentSizeInBytes, and recordCount fields and expose them via JSON for DV delete files.
  • Adjust IcebergAbstractMetadata.finishDeleteWithOutput to build Iceberg DeleteFile metadata differently for PUFFIN DVs (using recordCount/content offset/size and referenced data file) vs other delete formats (which still use Metrics).
  • Update IcebergFileFormat enum to support PUFFIN and map Iceberg PUFFIN format into Presto FileFormat so DV .puffin files are recognized.
presto-iceberg/src/main/java/com/facebook/presto/iceberg/delete/IcebergDeletionVectorPageSink.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/CommitTaskData.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergAbstractMetadata.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/FileFormat.java
Enable Iceberg V3 row-level operations and DV-aware update/delete flow in the Java connector, including DV compaction via a new procedure.
  • Allow V3 row-level operations by raising MAX_FORMAT_VERSION_FOR_ROW_LEVEL_OPERATIONS to 3 and removing the validation that rejects v3 column default values.
  • Switch IcebergPageSourceProvider’s delete sink selection to use IcebergDeletionVectorPageSink for format-version >= 3, while keeping IcebergDeletePageSink for older tables, and generalize IcebergUpdateablePageSource to work with a generic ConnectorPageSink delete sink.
  • Introduce RewriteDeleteFilesProcedure that scans V3 tables for Puffin-based positional delete files, groups DVs by referenced data file, merges their Roaring bitmap payloads, writes consolidated Puffin DV files, and commits a RewriteFiles operation swapping old DVs for new ones.
  • Register RewriteDeleteFilesProcedure in IcebergCommonModule so it is available as CALL system.rewrite_delete_files(schema, table).
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUtil.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergPageSourceProvider.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergUpdateablePageSource.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergCommonModule.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/procedure/RewriteDeleteFilesProcedure.java
Wire deletion vector and equality-delete semantics through the native (Velox) split path, including PUFFIN format support and dataSequenceNumber propagation for delete conflict resolution.
  • Extend the protocol-side Iceberg enums and structs to support PUFFIN FileFormat, an explicit EQUALITY_DELETES FileContent, and a dataSequenceNumber field on DeleteFile; update JSON (de)serialization accordingly.
  • Update the Presto-to-Velox connector split conversion to map PUFFIN to a placeholder DWRF format, reclassify PUFFIN positional delete files as kDeletionVector so IcebergSplitReader routes them to DeletionVectorReader, propagate delete-file dataSequenceNumber into Velox IcebergDeleteFile, and pass the split’s dataSequenceNumber through both infoColumns and the HiveIcebergSplit constructor.
  • Add a mapping for EQUALITY_DELETES in toVeloxFileContent so equality delete files are correctly tagged in Velox.
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.h
presto-native-execution/presto_cpp/presto_protocol/connector/iceberg/presto_protocol_iceberg.cpp
presto-native-execution/presto_cpp/main/connectors/IcebergPrestoToVeloxConnector.cpp
Remove coordinator-side rejection of Puffin deletion vectors and propagate delete-file dataSequenceNumber through the Java split model.
  • Drop the NOT_SUPPORTED guard in IcebergSplitSource that previously rejected PUFFIN deletion vectors during split enumeration so DV metadata can flow to workers.
  • Extend the Java DeleteFile model to include a dataSequenceNumber field derived from Iceberg’s DeleteFile.dataSequenceNumber, expose it via JSON, and include it in the string representation for debugging.
presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergSplitSource.java
presto-iceberg/src/main/java/com/facebook/presto/iceberg/delete/DeleteFile.java
Expand Iceberg V3 integration tests to cover DV read/write, DV metadata, compaction, schema evolution, partition transforms, and enabling DELETE/UPDATE/MERGE on V3 tables.
  • Convert prior negative tests for DELETE/UPDATE/MERGE on V3 tables into positive tests that execute the operations and, for deletes, verify that Puffin-format DV files are produced with sensible metadata.
  • Add multiple end-to-end DV tests that hand-craft Puffin deletion vector files with specific Roaring bitmaps, attach them to tables via the Iceberg API, and validate coordinator behavior (split enumeration, metadata fields, multi-file scenarios, and multi-snapshot behavior).
  • Add a comprehensive write-read round-trip test that exercises DV-backed DELETEs interleaved with INSERTs and validates both query correctness and DV metadata across multiple data files.
  • Introduce tests for V3 schema evolution (add/rename columns and defaults) and for multi-argument partition transforms (bucket and truncate) to ensure V3 tables are fully usable.
  • Add tests for the rewrite_delete_files procedure on V3 tables (compacting multiple DVs into fewer Puffin files) and on V2 tables (no-op behavior with correctness preserved).
presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java

Possibly linked issues

  • #native(Iceberg): PR implements Iceberg V3 DV reading and applying (plus writing/compaction), directly satisfying the deletion-vector support issue.


@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 3 issues, and left some high-level feedback:

  • The deletion vector roaring bitmap encoders/decoders (IcebergDeletionVectorPageSink.serializeRoaringBitmap, RewriteDeleteFilesProcedure.serializeRoaringBitmap/deserializeRoaringBitmap, and the test helper) are hand-rolled and in some cases assume a single container (positions < 2^16) and cast long to int without bounds checks; consider either leveraging an existing roaring implementation (e.g. Iceberg/roaringbitmap utilities) or at least validating position ranges and supporting multi-container bitmaps to avoid subtle corruption for larger row positions.
  • Both the DV page sink and the rewrite_delete_files compaction procedure write Puffin blobs with hardcoded snapshotId/sequenceNumber values of 0; if downstream readers or metadata tooling rely on these fields for DV discovery or conflict resolution, it would be safer to plumb through the actual snapshot/sequence information (or explicitly document and enforce that they are unused).
  • In RewriteDeleteFilesProcedure.rewriteDeleteFiles, you build filesToRemove/filesToAdd by iterating planFiles() and then call icebergTable.newRewrite().rewriteFiles(...).commit(); you never remove any data files here, only delete files—consider using the more specific rewriteFiles(rewrittenDeleteFiles, newDeleteFiles) overload (or adding a comment) so it’s clear that this operation is only compacting delete files and not touching data files.
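On the first point, a minimal bounds check before the long-to-int narrowing might look like the sketch below. The class and method names are hypothetical; the 65535 limit mirrors the single-container assumption called out in the comment.

```java
// Hypothetical guard for the unchecked long -> int cast the review flags:
// reject positions the hand-rolled single-container encoder cannot
// represent instead of silently truncating them.
public final class DvPositionChecks
{
    // Largest position representable in one roaring container (uint16).
    private static final long MAX_SINGLE_CONTAINER_POSITION = 0xFFFF; // 65535

    private DvPositionChecks() {}

    public static int checkedPosition(long position)
    {
        if (position < 0 || position > MAX_SINGLE_CONTAINER_POSITION) {
            throw new IllegalArgumentException(
                    "row position " + position + " outside single-container range [0, 65535]");
        }
        return (int) position;
    }
}
```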
## Individual Comments

### Comment 1
<location path="presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java" line_range="150-154" />
<code_context>
     }

     @Test
-    public void testDeleteOnV3TableNotSupported()
+    public void testDeleteOnV3Table()
</code_context>
<issue_to_address>
**suggestion (testing):** Add coverage for rewrite_delete_files on a V3 table that does not require compaction (0 or 1 DV per data file).

This currently covers the compaction case (multiple DVs per data file). Please also add a test where a V3 table has no DVs or exactly one DV per data file so that `CALL system.rewrite_delete_files` is a no-op, to confirm we never drop the only DV on a file and that the metadata commit path is safe when there’s nothing to rewrite.

Suggested implementation:

```java
    }

    @Test
    public void testRewriteDeleteFilesNoOpOnV3Table()
    {
        String tableName = "test_v3_rewrite_delete_noop";
        try {
            assertUpdate("CREATE TABLE " + tableName + " (id BIGINT, name VARCHAR, salary DOUBLE) " +
                    "WITH (format = 'PARQUET', format_version = 3)", 0);

            assertUpdate("INSERT INTO " + tableName + " VALUES " +
                    "(1, 'Alice', 100.0), (2, 'Bob', 200.0), (3, 'Charlie', 300.0)", 3);
            assertQuery("SELECT * FROM " + tableName + " ORDER BY id",
                    "VALUES (1, 'Alice', 100.0), (2, 'Bob', 200.0), (3, 'Charlie', 300.0)");

            // Issue deletes expected to create at most one delete vector per data file
            // and ensure the remaining visible row is correct.
            assertUpdate("DELETE FROM " + tableName + " WHERE id IN (1, 3)", 2);
            assertQuery("SELECT * FROM " + tableName,
                    "VALUES (2, 'Bob', 200.0)");

            // rewrite_delete_files should be a no-op: data must remain unchanged
            // and deleted rows must stay deleted.
            assertUpdate("CALL system.rewrite_delete_files(table => '" + tableName + "')");
            assertQuery("SELECT * FROM " + tableName,
                    "VALUES (2, 'Bob', 200.0)");
        }
        finally {
            assertUpdate("DROP TABLE IF EXISTS " + tableName);
        }
    }

    @Test
    public void testDeleteOnV3Table()

```

1. Align the `CALL system.rewrite_delete_files` invocation with the existing tests in this file:
   - If other tests pass catalog and schema separately (e.g., `CALL system.rewrite_delete_files('iceberg', 'tpch', '<table>')` or similar), update the call in `testRewriteDeleteFilesNoOpOnV3Table` accordingly.
   - If tests use fully-qualified table names including schema (e.g., `"CALL system.rewrite_delete_files(table => 'tpch." + tableName + "')"`), mirror that pattern instead of the bare `tableName` used above.
2. If the V3 table creation in other tests uses a connector-specific `WITH` clause (e.g., `format_version = 3, partitioning`, or different format), adjust the `CREATE TABLE` statement to match those conventions so that the DV layout matches production expectations.
3. If there is a helper to generate unique table names or to qualify them with a schema/catalog, use it here instead of the raw `tableName` string.
</issue_to_address>

### Comment 2
<location path="presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java" line_range="629-638" />
<code_context>
+    /**
+     * Serializes a roaring bitmap in the portable "no-run" format (cookie = 12346).
+     * This produces the exact binary format expected by Velox's DeletionVectorReader.
+     * Only supports positions within a single container (all < 65536).
+     */
+    private static byte[] serializeRoaringBitmapNoRun(int[] positions)
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a small self-checking test around serializeRoaringBitmapNoRun to guard against encoding regressions.

This helper encodes a specific RoaringBitmap "no-run" portable format expected by Velox and is currently only exercised indirectly. Please add a focused unit test that builds a small bitmap, calls this helper, and validates the cookie/structure (or round-trips through an Iceberg/Velox reader) so encoding changes are caught early and the constraints (e.g., positions < 65536) are documented in executable form.
</issue_to_address>
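A self-checking sketch along the lines the review asks for could re-implement the simplified layout that the helper's comment describes (cookie, container count, key/cardinality pair, sorted uint16 values) and assert on its structure. This mirrors only the commented layout; it is not guaranteed to match the PR's helper or the complete RoaringBitmap portable format byte-for-byte.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Sketch of the simplified single-container "no-run" layout described in
// the snippet's comment; assumes all positions < 65536. Field offsets are
// this sketch's own, for illustration of a structural self-check.
public final class RoaringNoRunSketch
{
    static final int SERIAL_COOKIE_NO_RUNCONTAINER = 12346;

    private RoaringNoRunSketch() {}

    static byte[] serialize(int[] positions)
    {
        int[] sorted = positions.clone();
        Arrays.sort(sorted);
        ByteBuffer buffer = ByteBuffer.allocate(4 + 4 + 4 + 2 * sorted.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        buffer.putInt(SERIAL_COOKIE_NO_RUNCONTAINER); // cookie
        buffer.putInt(1);                             // container count
        buffer.putShort((short) 0);                   // container key (high 16 bits of position)
        buffer.putShort((short) (sorted.length - 1)); // cardinality - 1, per roaring convention
        for (int position : sorted) {
            buffer.putShort((short) position);        // sorted uint16 values
        }
        return buffer.array();
    }

    // Structural self-checks of the kind the review suggests.
    static int cookieOf(byte[] bytes)
    {
        return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt(0);
    }

    static int cardinalityOf(byte[] bytes)
    {
        return (ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getShort(10) & 0xFFFF) + 1;
    }
}
```

A round-trip through an Iceberg or Velox reader, as the comment suggests, would be stronger still; the structural asserts at least pin the cookie and cardinality encoding in executable form.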

### Comment 3
<location path="presto-iceberg/src/test/java/com/facebook/presto/iceberg/TestIcebergV3.java" line_range="357-362" />
<code_context>
+            try {
+                computeActual("SELECT * FROM " + tableName);
+            }
+            catch (RuntimeException e) {
+                // Verify the error is NOT the old "PUFFIN not supported" rejection.
+                // Other failures (e.g., fake .puffin file not on disk) are acceptable.
+                assertFalse(
+                        e.getMessage().contains("Iceberg deletion vectors") && e.getMessage().contains("not supported"),
+                        "PUFFIN deletion vectors should be accepted, not rejected: " + e.getMessage());
+            }
         }
</code_context>
<issue_to_address>
**nitpick:** Defensively handle a null exception message in PUFFIN DV acceptance tests.

Here and in similar catch blocks, `e.getMessage()` is dereferenced directly in `contains` checks and the assertion message. If the message is null, the test will throw a `NullPointerException` instead of reporting the original failure. Using `String.valueOf(e.getMessage())` (for both the checks and the message) or adding an explicit null guard would make these tests more robust.
</issue_to_address>
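The null-safe variant suggested here can be factored into a small helper so every catch block uses the same guard. The class and method names below are illustrative, not from the PR.

```java
// Illustrative null-safe message check per the review suggestion: wrap
// getMessage() with String.valueOf so a null message cannot NPE the test.
public final class SafeMessages
{
    private SafeMessages() {}

    public static boolean isOldPuffinRejection(RuntimeException e)
    {
        String message = String.valueOf(e.getMessage()); // "null" when the message is absent
        return message.contains("Iceberg deletion vectors") && message.contains("not supported");
    }
}
```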


Comment on lines 150 to 154
@Test
public void testDeleteOnV3TableNotSupported()
public void testDeleteOnV3Table()
{
String tableName = "test_v3_delete";
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Add coverage for rewrite_delete_files on a V3 table that does not require compaction (0 or 1 DV per data file).

This currently covers the compaction case (multiple DVs per data file). Please also add a test where a V3 table has no DVs or exactly one DV per data file so that CALL system.rewrite_delete_files is a no-op, to confirm we never drop the only DV on a file and that the metadata commit path is safe when there’s nothing to rewrite.

Suggested implementation:

    }

    @Test
    public void testRewriteDeleteFilesNoOpOnV3Table()
    {
        String tableName = "test_v3_rewrite_delete_noop";
        try {
            assertUpdate("CREATE TABLE " + tableName + " (id BIGINT, name VARCHAR, salary DOUBLE) " +
                    "WITH (format = 'PARQUET', format_version = 3)", 0);

            assertUpdate("INSERT INTO " + tableName + " VALUES " +
                    "(1, 'Alice', 100.0), (2, 'Bob', 200.0), (3, 'Charlie', 300.0)", 3);
            assertQuery("SELECT * FROM " + tableName + " ORDER BY id",
                    "VALUES (1, 'Alice', 100.0), (2, 'Bob', 200.0), (3, 'Charlie', 300.0)");

            // Issue deletes expected to create at most one delete vector per data file
            // and ensure the remaining visible row is correct.
            assertUpdate("DELETE FROM " + tableName + " WHERE id IN (1, 3)", 2);
            assertQuery("SELECT * FROM " + tableName,
                    "VALUES (2, 'Bob', 200.0)");

            // rewrite_delete_files should be a no-op: data must remain unchanged
            // and deleted rows must stay deleted.
            assertUpdate("CALL system.rewrite_delete_files(table => '" + tableName + "')");
            assertQuery("SELECT * FROM " + tableName,
                    "VALUES (2, 'Bob', 200.0)");
        }
        finally {
            assertUpdate("DROP TABLE IF EXISTS " + tableName);
        }
    }

    @Test
    public void testDeleteOnV3Table()
  1. Align the CALL system.rewrite_delete_files invocation with the existing tests in this file:
    • If other tests pass catalog and schema separately (e.g., CALL system.rewrite_delete_files('iceberg', 'tpch', '<table>') or similar), update the call in testRewriteDeleteFilesNoOpOnV3Table accordingly.
    • If tests use fully-qualified table names including schema (e.g., "CALL system.rewrite_delete_files(table => 'tpch." + tableName + "')"), mirror that pattern instead of the bare tableName used above.
  2. If the V3 table creation in other tests uses a connector-specific WITH clause (e.g., format_version = 3, partitioning, or different format), adjust the CREATE TABLE statement to match those conventions so that the DV layout matches production expectations.
  3. If there is a helper to generate unique table names or to qualify them with a schema/catalog, use it here instead of the raw tableName string.

Comment on lines +629 to +638
/**
* Serializes a roaring bitmap in the portable "no-run" format (cookie = 12346).
* This produces the exact binary format expected by Velox's DeletionVectorReader.
* Only supports positions within a single container (all < 65536).
*/
private static byte[] serializeRoaringBitmapNoRun(int[] positions)
{
// Cookie (12346) + numContainers (1)
// + 1 container key-cardinality pair (4 bytes)
// + sorted uint16 values (2 bytes each)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (testing): Consider adding a small self-checking test around serializeRoaringBitmapNoRun to guard against encoding regressions.

This helper encodes a specific RoaringBitmap "no-run" portable format expected by Velox and is currently only exercised indirectly. Please add a focused unit test that builds a small bitmap, calls this helper, and validates the cookie/structure (or round-trips through an Iceberg/Velox reader) so encoding changes are caught early and the constraints (e.g., positions < 65536) are documented in executable form.

Comment on lines +357 to +362
catch (RuntimeException e) {
    // Verify the error is NOT the old "PUFFIN not supported" rejection.
    // Other failures (e.g., fake .puffin file not on disk) are acceptable.
    assertFalse(
            e.getMessage().contains("Iceberg deletion vectors") && e.getMessage().contains("not supported"),
            "PUFFIN deletion vectors should be accepted, not rejected: " + e.getMessage());

nitpick: Defensively handle a null exception message in PUFFIN DV acceptance tests.

Here and in similar catch blocks, e.getMessage() is dereferenced directly in contains checks and the assertion message. If the message is null, the test will throw a NullPointerException instead of reporting the original failure. Using String.valueOf(e.getMessage()) (for both the checks and the message) or adding an explicit null guard would make these tests more robust.
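A minimal demonstration of the `String.valueOf` guard (the exception and checks are stand-ins for the test's real assertions):

```java
public class NullMessageGuardDemo
{
    public static void main(String[] args)
    {
        // A RuntimeException built with no argument has a null message.
        RuntimeException e = new RuntimeException();

        // String.valueOf turns a null message into the string "null",
        // so the contains() checks below can never throw an NPE.
        String message = String.valueOf(e.getMessage());

        boolean rejected = message.contains("Iceberg deletion vectors")
                && message.contains("not supported");
        if (rejected) {
            throw new AssertionError("should not be flagged as a PUFFIN rejection: " + message);
        }
        System.out.println("ok");
    }
}
```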

apurva-meta added a commit to apurva-meta/presto that referenced this pull request Mar 21, 2026
…h DV page sink and compaction procedure (prestodb#27395)

Summary:

- Add IcebergDeletionVectorPageSink for writing DV files during table maintenance
- Add RewriteDeleteFilesProcedure for DV compaction
- Wire DV page sink through IcebergCommonModule, IcebergAbstractMetadata, IcebergPageSourceProvider
- Add IcebergUpdateablePageSource for DV-aware page source
- Update CommitTaskData, IcebergUtil for DV support
- Add test coverage in TestIcebergV3

== RELEASE NOTES ==
General Changes
* Upgrade Apache Iceberg library from 1.10.0 to 1.10.1.
Hive Connector Changes
* Add Iceberg V3 deletion vector (DV) support using Puffin-encoded roaring bitmaps, including a DV reader, writer, page sink, and compaction procedure.
* Add Iceberg equality delete file reader with sequence number conflict resolution per the Iceberg V2+ spec: equality deletes skip when deleteFileSeqNum <= dataFileSeqNum; positional deletes and DVs skip when deleteFileSeqNum < dataFileSeqNum; sequence number 0 (V1 legacy) never skips.
* Wire dataSequenceNumber through the Presto protocol layer (Java → C++) to enable server-side sequence number conflict resolution for all delete file types.
* Add PUFFIN file format support for deletion vector discovery, enabling the coordinator to locate DV files during split creation.
* Add Iceberg V3 deletion vector write path with DV page sink and rewrite_delete_files compaction procedure for DV maintenance.
* Add nanosecond timestamp (TIMESTAMP_NANO) type support for Iceberg V3 tables.
* Add Variant type support for Iceberg V3, enabling semi-structured data columns in Iceberg tables.
* Eagerly collect delete files during split creation with improved logging for easier debugging of Iceberg delete file resolution.
* Improve IcebergSplitReader error handling and fix test file handle leaks.
* Add end-to-end integration tests for Iceberg V3 covering snapshot lifecycle (INSERT, DELETE with equality/positional/DV deletes, UPDATE, MERGE, time-travel) and all 99 TPC-DS queries.

Differential Revision: D97531549
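The sequence-number conflict rules in the release notes above reduce to a small predicate. A sketch (the enum and method names here are illustrative; the real logic lives in the native IcebergSplitReader):

```java
public class DeleteFileConflictRule
{
    enum DeleteKind { EQUALITY, POSITIONAL, DELETION_VECTOR }

    /**
     * Returns true when a delete file must be applied to a data file,
     * per the Iceberg V2+ sequence-number rules described above.
     */
    static boolean applies(DeleteKind kind, long deleteFileSeqNum, long dataFileSeqNum)
    {
        if (deleteFileSeqNum == 0) {
            return true; // V1 legacy: sequence number 0 never skips
        }
        if (kind == DeleteKind.EQUALITY) {
            // Equality deletes skip when deleteFileSeqNum <= dataFileSeqNum,
            // i.e. they only apply to strictly older data files.
            return deleteFileSeqNum > dataFileSeqNum;
        }
        // Positional deletes and DVs skip only when deleteFileSeqNum < dataFileSeqNum,
        // so they also apply to data files with the same sequence number.
        return deleteFileSeqNum >= dataFileSeqNum;
    }

    public static void main(String[] args)
    {
        if (applies(DeleteKind.EQUALITY, 5, 5)) throw new AssertionError("equality skips when <=");
        if (!applies(DeleteKind.POSITIONAL, 5, 5)) throw new AssertionError("positional applies when ==");
        if (!applies(DeleteKind.DELETION_VECTOR, 6, 5)) throw new AssertionError("DV applies when >");
        if (!applies(DeleteKind.EQUALITY, 0, 7)) throw new AssertionError("seq 0 never skips");
        System.out.println("ok");
    }
}
```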
@meta-codesync meta-codesync bot changed the title feat: [presto][iceberg] Add Iceberg V3 deletion vector write path with DV page sink and compaction procedure feat: [presto][iceberg] Add Iceberg V3 deletion vector write path with DV page sink and compaction procedure (#27395) Mar 21, 2026
…or extensibility (prestodb#27391)

Summary:

- Reformat FileContent enum in presto_protocol_iceberg.h from single-line
  to multi-line for better readability and future extension.
- Add blank line for visual separation before infoColumns initialization.

Protocol files are auto-generated from Java sources via chevron. The manual
edits here mirror what the generator would produce once the Java changes
are landed and the protocol is regenerated.


Differential Revision: D97531548
… for equality delete conflict resolution (prestodb#27392)

Summary:

Wire the dataSequenceNumber field from the Java Presto protocol to the
C++ Velox connector layer, enabling server-side sequence number conflict
resolution for equality delete files.

Changes:
- Add dataSequenceNumber field to IcebergSplit protocol (Java + C++)
- Parse dataSequenceNumber in IcebergPrestoToVeloxConnector and pass it
  through HiveIcebergSplit to IcebergSplitReader
- Add const qualifiers to local variables for code clarity
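For illustration, the wiring above amounts to carrying one extra field on the split. A trimmed-down, hypothetical shape (the real class is IcebergSplit in the presto-iceberg protocol, with many more fields):

```java
// Hypothetical, minimal split carrying the new dataSequenceNumber field,
// which the C++ side parses in IcebergPrestoToVeloxConnector and passes
// through HiveIcebergSplit to IcebergSplitReader.
public class IcebergSplitSketch
{
    private final String path;
    private final long dataSequenceNumber;

    public IcebergSplitSketch(String path, long dataSequenceNumber)
    {
        this.path = path;
        this.dataSequenceNumber = dataSequenceNumber;
    }

    public String getPath()
    {
        return path;
    }

    public long getDataSequenceNumber()
    {
        return dataSequenceNumber;
    }

    public static void main(String[] args)
    {
        IcebergSplitSketch split = new IcebergSplitSketch("s3://bucket/data/file.parquet", 7);
        if (split.getDataSequenceNumber() != 7) throw new AssertionError();
        System.out.println("ok");
    }
}
```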


Differential Revision: D97531547
…ector discovery (prestodb#27393)

Summary:

Iceberg V3 introduces deletion vectors stored as blobs inside Puffin files.
Previously, the coordinator's IcebergSplitSource rejected PUFFIN-format delete
files with a NOT_SUPPORTED error, preventing V3 deletion vectors from being
discovered and sent to workers.

This diff:
1. Adds PUFFIN to the FileFormat enum (both presto-trunk and
   presto-facebook-trunk) so fromIcebergFileFormat() can convert
   Iceberg's PUFFIN format to Presto's FileFormat.PUFFIN.
2. Removes the PUFFIN rejection check in presto-trunk's
   IcebergSplitSource.toIcebergSplit(), allowing deletion vector
   files to flow through to workers.
3. Updates TestIcebergV3 to verify PUFFIN files are accepted rather
   than rejected at split enumeration time.

The C++ worker-side changes (protocol enum + connector conversion) will
follow in a separate diff.

Differential Revision: D97531557
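The enum-plus-conversion change described above can be sketched as follows (the enum values and method are illustrative stand-ins for Presto's FileFormat and fromIcebergFileFormat()):

```java
import java.util.Locale;

public class FileFormatSketch
{
    // Hypothetical mirror of Presto's FileFormat enum with PUFFIN added.
    enum FileFormat { ORC, PARQUET, AVRO, PUFFIN }

    // Sketch of fromIcebergFileFormat(): maps Iceberg's format name
    // to Presto's enum instead of rejecting PUFFIN.
    static FileFormat fromIcebergFileFormat(String icebergFormat)
    {
        switch (icebergFormat.toUpperCase(Locale.ROOT)) {
            case "ORC": return FileFormat.ORC;
            case "PARQUET": return FileFormat.PARQUET;
            case "AVRO": return FileFormat.AVRO;
            case "PUFFIN": return FileFormat.PUFFIN; // previously raised NOT_SUPPORTED
            default: throw new IllegalArgumentException("Unexpected format: " + icebergFormat);
        }
    }

    public static void main(String[] args)
    {
        if (fromIcebergFileFormat("puffin") != FileFormat.PUFFIN) {
            throw new AssertionError("PUFFIN should be accepted");
        }
        System.out.println("ok");
    }
}
```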
…ocol and connector layer (prestodb#27394)

Summary:

This is the C++ counterpart to the Java PUFFIN support diff. It wires
the PUFFIN file format through the Prestissimo protocol and connector
conversion layer so that Iceberg V3 deletion vector files can be
deserialized and handled by native workers.

Changes:
1. Adds PUFFIN to the C++ protocol FileFormat enum and its JSON
   serialization table in presto_protocol_iceberg.{h,cpp}.
2. Handles PUFFIN in toVeloxFileFormat() in
   IcebergPrestoToVeloxConnector.cpp, mapping it to DWRF as a
   placeholder since DeletionVectorReader reads raw binary and
   does not use the DWRF/Parquet reader infrastructure.


Differential Revision: D97531555
…h DV page sink and compaction procedure (prestodb#27395)

Summary:

- Add IcebergDeletionVectorPageSink for writing DV files during table maintenance
- Add RewriteDeleteFilesProcedure for DV compaction
- Wire DV page sink through IcebergCommonModule, IcebergAbstractMetadata, IcebergPageSourceProvider
- Add IcebergUpdateablePageSource for DV-aware page source
- Update CommitTaskData, IcebergUtil for DV support
- Add test coverage in TestIcebergV3


Differential Revision: D97531549