@brfrn169 (Collaborator) commented Nov 21, 2025

Description

This PR adds support for transaction metadata decoupling in Consensus Commit. This feature allows transaction metadata to be stored separately from user data, enabling the use of virtual tables that reference existing database tables without copying data.

Implementation details

  • Transaction metadata decoupling is enabled when transaction_metadata_decoupling=true is specified in the options

  • ConsensusCommitAdmin.createTable() with transaction metadata decoupling

    • The names of newly created tables:
      • Virtual table: <table_name>
      • Data table: <table_name>_data
      • Transaction metadata table: <table_name>_tx_metadata
    • Implementation
      1. Storage Compatibility Check
        • Validates via throwIfTransactionMetadataDecouplingNotSupportedStorage()
          • Cross-table atomic mutations are possible (NAMESPACE or STORAGE level atomicity)
          • Consistent virtual table reads are guaranteed
      2. Create Data Table
        • Named <table_name>_data, stores only user data
      3. Create Transaction Metadata Table
        • Named <table_name>_tx_metadata, contains:
          • Primary key columns
          • Transaction metadata columns (tx_id, tx_state, tx_version, before columns, etc.)
          • Before-image columns (prefixed with before_* for non-primary key columns)
      4. Create Virtual Table
        • Named <table_name>, joins data table and metadata table with INNER JOIN
        • Appears as a single table to users
  • ConsensusCommitAdmin.importTable() with transaction metadata decoupling

    • The names of imported tables
      • Original table: <table_name> (used as data table)
      • Transaction metadata table: <table_name>_tx_metadata
      • Virtual table: <table_name>_scalardb
    • Implementation
      1. Storage Compatibility Check
        • Validates via throwIfTransactionMetadataDecouplingNotSupportedStorage()
      2. Import Existing Table
        • Imports with original name (<table_name>) as the data table
      3. Create Transaction Metadata Table
        • Creates <table_name>_tx_metadata
      4. Create Virtual Table
        • Named <table_name>_scalardb, joins data and metadata tables with LEFT OUTER JOIN
        • LEFT OUTER JOIN allows reading records not yet processed by transactions (no metadata exists yet)
  • ConsensusCommitAdmin.dropTable()

    • If the table is a virtual table, drop the data table, the metadata table, and the virtual table
  • Unsupported administrative operations:

    • repairTable()
    • addNewColumnToTable()
    • dropColumnFromTable()
    • renameColumn()
    • alterColumnType()
    • renameTable()
  • Added checks for all CRUD operations in ConsensusCommitOperationChecker

    1. Check if the target table is a virtual table
    2. Get storage information and validate consistent reads: Ensures the storage guarantees consistent reads across the joined tables (data table + metadata table)
  • Currently, to determine whether a table is a transaction metadata–decoupling table, we simply check whether it is a virtual table. In the future, we may need to introduce dedicated metadata for this in Consensus Commit.
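The naming rules and the two join types above can be sketched as follows. This is a minimal illustration, not the PR's implementation: DecouplingSketch and all of its members are hypothetical names, and only the table-name suffixes and the INNER JOIN / LEFT OUTER JOIN distinction come from the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: these names are hypothetical, not ScalarDB APIs.
// Only the suffixes and the join types are taken from the PR description.
final class DecouplingSketch {

  /** The three table names that back one logical table. */
  static final class Names {
    final String virtualTable;
    final String dataTable;
    final String txMetadataTable;

    Names(String virtualTable, String dataTable, String txMetadataTable) {
      this.virtualTable = virtualTable;
      this.dataTable = dataTable;
      this.txMetadataTable = txMetadataTable;
    }
  }

  /** createTable(): the virtual table takes the requested name. */
  static Names forCreateTable(String table) {
    return new Names(table, table + "_data", table + "_tx_metadata");
  }

  /** importTable(): the existing table keeps its name and serves as the data table. */
  static Names forImportTable(String table) {
    return new Names(table + "_scalardb", table, table + "_tx_metadata");
  }

  /**
   * Simulates a read through the virtual table. Both maps are keyed by primary key;
   * txMetadata maps a key to its tx_state. With leftOuter=false (INNER JOIN) rows
   * without metadata are hidden; with leftOuter=true they appear with a null tx_state.
   */
  static List<String> read(
      Map<String, String> data, Map<String, String> txMetadata, boolean leftOuter) {
    List<String> rows = new ArrayList<>();
    for (Map.Entry<String, String> e : data.entrySet()) {
      boolean hasMetadata = txMetadata.containsKey(e.getKey());
      if (hasMetadata || leftOuter) {
        rows.add(e.getKey() + "=" + e.getValue() + ",tx_state=" + txMetadata.get(e.getKey()));
      }
    }
    return rows;
  }

  public static void main(String[] args) {
    Map<String, String> data = new LinkedHashMap<>();
    data.put("k1", "v1"); // already touched by a transaction
    data.put("k2", "v2"); // pre-existing row with no metadata yet
    Map<String, String> txMetadata = new LinkedHashMap<>();
    txMetadata.put("k1", "COMMITTED");

    System.out.println(read(data, txMetadata, false)); // INNER JOIN view
    System.out.println(read(data, txMetadata, true)); // LEFT OUTER JOIN view
  }
}
```

In the simulated read, the row without metadata is visible only through the LEFT OUTER JOIN view, which is why imported tables can expose records that no transaction has processed yet.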

Related issues and/or PRs

Changes made

  • Core Functionality

    • ConsensusCommitAdmin: Added support for creating and managing virtual tables with metadata decoupling
      • Validates storage compatibility with transaction metadata decoupling
      • Checks if storage guarantees consistent reads for virtual tables
      • Implements virtual table creation and management logic
    • ConsensusCommitUtils: Added utility methods for virtual table operations
    • ConsensusCommitOperationChecker: Added validation logic to prevent unsupported operations on virtual tables (e.g., mutations on read-only virtual tables)
  • Schema Loader

    • Updated to support importing tables with metadata decoupling enabled
  • Added comprehensive integration tests for metadata decoupling:

    • ConsensusCommitWithMetadataDecouplingIntegrationTestBase
    • ConsensusCommitImportTableWithMetadataDecouplingIntegrationTestBase
    • ConsensusCommitAdminImportTableWithMetadataDecouplingIntegrationTestBase
    • ConsensusCommitSpecificWithMetadataDecouplingIntegrationTestBase
    • SchemaLoaderWithMetadataDecouplingIntegrationTestBase
    • SchemaLoaderImportWithMetadataDecouplingIntegrationTestBase

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

N/A

Release notes

Added transaction metadata decoupling support in Consensus Commit. This feature enables users to perform Consensus Commit ScalarDB transactions on pre-existing data without schema modifications or data migration.

@brfrn169 brfrn169 self-assigned this Nov 21, 2025
Copilot AI review requested due to automatic review settings November 21, 2025 11:40
@brfrn169 brfrn169 added the enhancement New feature or request label Nov 21, 2025
@gemini-code-assist
Contributor

Summary of Changes

Hello @brfrn169, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances Consensus Commit by introducing transaction metadata decoupling. This feature allows for a more flexible and potentially optimized storage of transaction metadata by separating it from the main data. The implementation involves substantial changes to how tables are created and imported, leveraging virtual tables to join data and metadata. Robust validation has been added to ensure operations on these new table structures are consistent and supported by the underlying storage. The changes are thoroughly tested with new integration tests covering various database types and schema loading scenarios.

Highlights

  • Transaction Metadata Decoupling: Introduced support for transaction metadata decoupling in Consensus Commit, allowing transaction metadata to be stored separately from the actual data. This is achieved by creating distinct data and transaction metadata tables, which are then joined via a virtual table.
  • ConsensusCommitAdmin Enhancements: The ConsensusCommitAdmin now handles createTable and importTable operations with metadata decoupling enabled. For createTable, it creates a data table, a transaction metadata table, and a virtual table. For importTable, it imports the original table as a data table, creates a transaction metadata table, and a virtual table with a LEFT_OUTER join. It also includes checks to prevent unsupported administrative operations on these decoupled tables.
  • Operation Validation: The ConsensusCommitOperationChecker has been updated to validate operations on virtual tables, ensuring that the underlying storage guarantees consistent reads, throwing an IllegalArgumentException if not.
  • JDBC Database Operations on Virtual Tables: The JdbcDatabase now intelligently handles Put and Delete operations on virtual tables that use LEFT_OUTER joins. Specifically, if all conditions for the right source table are IS_NULL, it will use PutIfNotExists or DeleteIfExists for efficiency and correctness.
  • Isolation Level Configuration for JDBC: JDBC data sources used for table metadata and admin operations can now have their default transaction isolation level configured, ensuring appropriate consistency settings.
  • Extensive Integration Tests: Numerous new integration tests have been added across JDBC storage and SchemaLoader to thoroughly validate the transaction metadata decoupling feature, covering various scenarios including table creation, import, data manipulation, and recovery, with specific considerations for different RDB engines.
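As a rough illustration of the "all conditions for the right source table are IS_NULL" check mentioned in the highlights, the sketch below covers the detection step only. Every type and method name here is hypothetical and does not mirror JdbcDatabase. Treating an operation with no right-source conditions as not matching is an assumption, and how a match maps to PutIfNotExists or DeleteIfExists is left out because the summary does not spell it out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: these types only mimic the idea in the summary above.
final class LeftOuterConditionSketch {
  enum Operator { EQ, IS_NULL, IS_NOT_NULL }

  /** A single condition on one column of the virtual table. */
  static final class Condition {
    final String column;
    final Operator operator;

    Condition(String column, Operator operator) {
      this.column = column;
      this.operator = operator;
    }
  }

  /**
   * Returns true when at least one condition targets a right-source column and every
   * such condition is IS_NULL. Conditions on left-source columns are ignored.
   * Returning false when there are no right-source conditions is an assumption.
   */
  static boolean allRightSourceConditionsAreIsNull(
      List<Condition> conditions, Set<String> rightSourceColumns) {
    boolean sawRightSourceCondition = false;
    for (Condition c : conditions) {
      if (!rightSourceColumns.contains(c.column)) {
        continue; // condition on the left (data) table
      }
      sawRightSourceCondition = true;
      if (c.operator != Operator.IS_NULL) {
        return false;
      }
    }
    return sawRightSourceCondition;
  }

  public static void main(String[] args) {
    Set<String> rightColumns = new HashSet<>(Arrays.asList("tx_id", "tx_state"));
    List<Condition> conditions =
        Arrays.asList(
            new Condition("tx_id", Operator.IS_NULL), new Condition("balance", Operator.EQ));
    System.out.println(allRightSourceConditionsAreIsNull(conditions, rightColumns));
  }
}
```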

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for transaction metadata decoupling in Consensus Commit, a significant feature that separates transaction metadata from application data. The changes are extensive, touching core logic and adding a substantial number of integration tests. The implementation appears solid, but there are several areas with code duplication, particularly in the test setup, that could be refactored to improve maintainability. I've provided specific suggestions for these refactorings.

Copilot finished reviewing on behalf of brfrn169 November 21, 2025 12:04
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces support for transaction metadata decoupling in Consensus Commit, enabling storage of transaction metadata in separate tables from user data. This is a significant architectural enhancement that improves performance and scalability for certain storage backends.

Key Changes:

  • Added transaction metadata decoupling feature with separate data and metadata tables joined via virtual tables
  • Refactored SchemaOperator.importTables() to retrieve options from each ImportTableSchema rather than passing a global options parameter
  • Extended integration test suites with new base classes for metadata decoupling scenarios
  • Added validation to ensure metadata decoupling is only used with supported storage backends (JDBC with proper isolation levels)
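The storage validation can be pictured with the sketch below. It assumes the two requirements listed in the PR description (cross-table atomic mutations at NAMESPACE or STORAGE level, and consistent virtual table reads). The class, the enum, its RECORD/PARTITION/TABLE constants, the method signature, and the exception type are all assumptions made for illustration, not the actual throwIfTransactionMetadataDecouplingNotSupportedStorage() code.

```java
// Hypothetical sketch of the compatibility check; not the ScalarDB implementation.
final class DecouplingSupportSketch {

  // Only NAMESPACE and STORAGE are named in the PR description; the other
  // constants are assumed here to have something to reject.
  enum AtomicityUnit { RECORD, PARTITION, TABLE, NAMESPACE, STORAGE }

  /**
   * Throws if the storage cannot mutate the data table and the metadata table
   * atomically, or cannot read the joined virtual table consistently.
   */
  static void throwIfNotSupported(AtomicityUnit unit, boolean consistentVirtualTableReads) {
    boolean crossTableAtomic = unit == AtomicityUnit.NAMESPACE || unit == AtomicityUnit.STORAGE;
    if (!crossTableAtomic) {
      throw new IllegalArgumentException(
          "Cross-table atomic mutations are required, but the atomicity unit is " + unit);
    }
    if (!consistentVirtualTableReads) {
      throw new IllegalArgumentException("Consistent virtual table reads are required");
    }
  }

  public static void main(String[] args) {
    throwIfNotSupported(AtomicityUnit.STORAGE, true); // passes silently
    try {
      throwIfNotSupported(AtomicityUnit.TABLE, true);
    } catch (IllegalArgumentException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```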

Reviewed Changes

Copilot reviewed 43 out of 43 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| ConsensusCommitAdmin.java | Implements create/drop/import table operations for metadata decoupling with virtual tables |
| ConsensusCommitUtils.java | Adds utility to build transaction metadata table schema from data table metadata |
| ConsensusCommitOperationChecker.java | Validates operations on virtual tables require consistent reads from storage |
| SchemaOperator.java | Refactored to use per-table options from ImportTableSchema.getOptions() |
| JdbcDatabase.java | Handles LEFT_OUTER join conditions for virtual tables with IS_NULL checks |
| JdbcUtils.java | Configures transaction isolation levels for JDBC data sources |
| Integration test bases | New test base classes for metadata decoupling scenarios across different storage backends |
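To make the "build transaction metadata table schema from data table metadata" step concrete, here is a minimal sketch based on the column rules in the PR description: primary key columns, transaction metadata columns, and before_-prefixed columns for non-primary-key columns. The class and method names are hypothetical, the column order is an assumption, and only the three metadata columns named in the description are included (the description ends that list with "etc.").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the ConsensusCommitUtils implementation.
final class TxMetadataSchemaSketch {

  // Only the metadata columns named in the PR description; the real list is longer.
  static final List<String> TX_COLUMNS = Arrays.asList("tx_id", "tx_state", "tx_version");

  /** Derives the column names of <table_name>_tx_metadata from the data table's columns. */
  static List<String> txMetadataColumns(List<String> dataColumns, Set<String> primaryKey) {
    List<String> result = new ArrayList<>();
    for (String column : dataColumns) {
      if (primaryKey.contains(column)) {
        result.add(column); // primary key columns are copied so the tables can be joined
      }
    }
    result.addAll(TX_COLUMNS);
    for (String column : dataColumns) {
      if (!primaryKey.contains(column)) {
        result.add("before_" + column); // before image of each non-primary-key column
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> dataColumns = Arrays.asList("id", "name", "balance");
    Set<String> primaryKey = new HashSet<>(Arrays.asList("id"));
    System.out.println(txMetadataColumns(dataColumns, primaryKey));
  }
}
```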

@brfrn169 brfrn169 force-pushed the support-transaction-metadata-decoupling-in-consensus-commit branch from a93e918 to 629633b Compare November 21, 2025 13:53
@brfrn169 brfrn169 marked this pull request as draft November 22, 2025 22:40
@brfrn169 brfrn169 force-pushed the support-transaction-metadata-decoupling-in-consensus-commit branch 2 times, most recently from 5363a25 to b82c7b3 Compare November 23, 2025 12:04
@brfrn169 brfrn169 force-pushed the support-transaction-metadata-decoupling-in-consensus-commit branch from b82c7b3 to a7f4cd8 Compare November 24, 2025 14:11
@brfrn169 brfrn169 marked this pull request as ready for review November 25, 2025 06:11
@brfrn169 brfrn169 requested a review from Copilot November 25, 2025 06:11
@brfrn169
Collaborator Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces transaction metadata decoupling, a significant feature that allows ScalarDB to work with existing data without migration. The implementation looks solid, with new logic in ConsensusCommitAdmin to handle the creation, import, and dropping of decoupled tables. The changes are well-supported by a comprehensive set of new integration tests.

I have two main points of feedback:

  1. The behavior of dropTable for imported tables is destructive as it drops the user's original data table. This should be reconsidered to avoid unexpected data loss.
  2. There is some duplicated code in the new test classes for setting the JDBC isolation level, which could be refactored for better maintainability.

Overall, this is a great enhancement to ScalarDB.

Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot finished reviewing on behalf of brfrn169 November 25, 2025 07:21
Contributor

@feeblefakie feeblefakie left a comment

Overall, looking good. Thank you!
Left some minor comments. PTAL!

@brfrn169 brfrn169 requested a review from feeblefakie November 25, 2025 08:04
Contributor

@Torch3333 Torch3333 left a comment

LGTM, thank you!

@brfrn169 brfrn169 requested a review from komamitsu November 25, 2025 08:20
Contributor

@komamitsu komamitsu left a comment

LGTM! 👍

Contributor

@feeblefakie feeblefakie left a comment

LGTM! Thank you!

@brfrn169 brfrn169 merged commit e0777a8 into master Nov 25, 2025
139 of 140 checks passed
@brfrn169 brfrn169 deleted the support-transaction-metadata-decoupling-in-consensus-commit branch November 25, 2025 12:35