payments/db: remap legacy zero attempt IDs #10799

Open: ziggie1984 wants to merge 1 commit into lightningnetwork:master from ziggie1984:fix-kv-payment-migration

Conversation

@ziggie1984
Collaborator

Legacy KV payments can contain HTLC attempts with attempt ID zero. The SQL payments schema requires payment_htlc_attempts.attempt_index to be globally unique, so migrating multiple such attempts can fail with a UNIQUE constraint violation.

This PR allocates synthetic attempt indexes for legacy zero-ID attempts from the switch payment ID sequencer horizon, keeps nonzero attempt IDs unchanged, and advances the switch sequence once after migration validation succeeds.

It also wraps HTLC insert errors with the attempted index and payment hash so future migration failures identify the problematic row.
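
The error wrapping described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `insertAttempt`, `migrateAttempt`, `isUniqueViolation`, and `errUniqueViolation` are hypothetical names, and a map stands in for the UNIQUE column on `payment_htlc_attempts.attempt_index`.

```go
package main

import (
	"errors"
	"fmt"
)

// errUniqueViolation stands in for the driver's UNIQUE constraint error.
var errUniqueViolation = errors.New("UNIQUE constraint failed")

// insertAttempt is a hypothetical stand-in for the SQL insert. It rejects an
// attempt index that was already inserted, like a UNIQUE column would.
func insertAttempt(seen map[uint64]struct{}, attemptIndex uint64) error {
	if _, ok := seen[attemptIndex]; ok {
		return errUniqueViolation
	}
	seen[attemptIndex] = struct{}{}

	return nil
}

// migrateAttempt wraps an insert failure with the attempt index and payment
// hash while keeping the original error reachable through errors.Is.
func migrateAttempt(seen map[uint64]struct{}, attemptIndex uint64,
	paymentHash [32]byte) error {

	err := insertAttempt(seen, attemptIndex)
	if err != nil {
		return fmt.Errorf("insert HTLC attempt (attempt_index=%d, "+
			"payment_hash=%x): %w", attemptIndex, paymentHash[:],
			err)
	}

	return nil
}

// isUniqueViolation reports whether err wraps the stand-in UNIQUE error.
func isUniqueViolation(err error) bool {
	return errors.Is(err, errUniqueViolation)
}

func main() {
	seen := make(map[uint64]struct{})
	hash := [32]byte{0xab}

	// Two legacy attempts that both carry attempt ID zero: the first
	// insert succeeds, the second collides and reports which row failed.
	fmt.Println(migrateAttempt(seen, 0, hash))
	fmt.Println(migrateAttempt(seen, 0, hash))
}
```

Wrapping with `%w` keeps the underlying error available to callers, so the extra context does not hide the constraint violation itself.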

Fixes #10797

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where legacy KV payments containing HTLC attempts with an ID of zero cause unique constraint violations when migrating to the SQL schema. By implementing a synthetic ID allocation strategy that leverages the existing switch payment ID sequencer, the migration process can now safely assign unique identifiers to these legacy records. Additionally, the PR improves diagnostic logging and error reporting for the migration process to better identify and resolve potential data conflicts.

Highlights

  • Legacy Attempt ID Remapping: Introduced an attemptIDAllocator to generate synthetic, unique attempt indexes for legacy HTLC attempts that previously used a zero ID, preventing SQL unique constraint violations during migration.
  • Sequencer Synchronization: Ensures the switch payment ID sequencer is advanced after migration to avoid future ID collisions with the newly allocated legacy attempt indexes.
  • Improved Error Reporting: Enhanced HTLC insertion error messages to include the problematic attempt index and payment hash, facilitating easier debugging of migration failures.

@ziggie1984 ziggie1984 self-assigned this May 11, 2026
@ziggie1984 ziggie1984 added the no-changelog, backport-v0.21.x-branch (this label triggers a backport to branch `v0.21.x-branch`), and sql labels May 11, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the payment migration logic to handle legacy HTLC attempts with zero IDs by allocating unique identifiers from the switch sequencer's horizon, preventing SQL unique constraint collisions. The changes introduce an attemptIDAllocator and a collectMigrationState function to manage this process. Feedback was provided to ensure the new implementation adheres to the repository style guide, specifically regarding the requirement for function documentation to begin with the function name and the use of in-body comments to explain the intention of logical code blocks.

@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch from 5c1be14 to c89f2ac Compare May 11, 2026 14:24
@github-actions github-actions Bot added the severity-medium (Focused review required) label May 11, 2026
@github-actions

🟡 PR Severity: MEDIUM

Automated classification | 1 file (non-test) | 212 lines changed

Medium (1 file):

  • payments/db/migration1/sql_migration.go - payments package SQL migration

Excluded from scoring (1 file):

  • payments/db/migration1/sql_migration_test.go - test file

Analysis: This PR modifies a SQL migration under payments/db/migration1/. The payments/* path maps to MEDIUM severity. The always-CRITICAL migration override covers channeldb/migration*, sqldb/, and wtdb/ -- this path does not match. Bump thresholds: 1 non-test file (threshold: >20) and 212 lines changed (threshold: >500) -- no bump applied.

To override, add a severity-override-{critical,high,medium,low} label.

@ziggie1984 ziggie1984 added this to the v0.21.0 milestone May 11, 2026
@ziggie1984 ziggie1984 added this to v0.21 May 11, 2026
@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch 9 times, most recently from 43c48bc to 5458424 Compare May 11, 2026 16:00
@ziggie1984
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a mechanism to handle legacy KV payments with attempt ID zero during SQL migration by using an attemptIDAllocator to generate unique synthetic IDs from the switch sequencer horizon. It also includes logic to update the sequencer post-migration and adds validation helpers. Feedback focuses on a critical bug where the allocator starts at the last used sequence ID instead of the next available one, potentially causing collisions. Additionally, the reviewer noted several locations where missing nil checks on SessionKey() could lead to panics during migration and validation.

@ziggie1984
Collaborator Author

Regarding the review suggestion to start the synthetic attempt ID allocator at nextSwitchPaymentID + 1: I think we should keep this at nextSwitchPaymentID.

The switch sequencer does not use the generic bbolt NextSequence() semantics here. In htlcswitch/sequencer.go, it reads Sequence() as the first ID in the next range, then immediately persists nextID + defaultSequenceBatchSize as the new horizon before returning nextID.

So if the stored value is 2000, then 2000 has not been handed out yet. If the switch ever hands out 2000, it first advances the persisted value to 3000. That means the stored sequence value is already beyond all IDs that were handed out by the switch, and starting synthetic migration IDs at that value is safe. Starting at +1 would only skip one otherwise valid unused ID.
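
The horizon semantics described in this comment can be modelled with a small in-memory sketch. It is a simplified model written for illustration, not the code in htlcswitch/sequencer.go: the `store` and `sequencer` types are hypothetical, the batch size of 1000 is taken from the 2000 to 3000 example above, and the runtime sequencer's handling of ID 0 is left out.

```go
package main

import "fmt"

// batchSize mirrors the 2000 -> 3000 example above. The real constant is
// defaultSequenceBatchSize; its value here is an assumption.
const batchSize = 1000

// store models the persisted sequence value, i.e. the horizon.
type store struct {
	seq uint64
}

// sequencer hands out IDs from an in-memory batch and persists the new
// horizon before the first ID of that batch is returned.
type sequencer struct {
	db      *store
	nextID  uint64
	horizon uint64
}

// NextID returns the next unused ID, reserving a fresh batch when the
// current one is exhausted.
func (s *sequencer) NextID() uint64 {
	if s.nextID == s.horizon {
		// The stored value is the first ID of the next range. Move
		// the persisted horizon past the whole batch first.
		s.nextID = s.db.seq
		s.horizon = s.nextID + batchSize
		s.db.seq = s.horizon
	}

	id := s.nextID
	s.nextID++

	return id
}

func main() {
	db := &store{seq: 2000}
	s := &sequencer{db: db}

	// Before any call, the stored value 2000 has not been handed out.
	id := s.NextID()
	fmt.Println(id, db.seq) // prints "2000 3000"
}
```

Under this model the persisted value is always greater than every ID already handed out, which is why seeding the migration allocator at the stored value, rather than at the stored value plus one, does not collide with switch IDs.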

@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch 2 times, most recently from 7f0c728 to 3dfdd4f Compare May 11, 2026 16:17
@ziggie1984 ziggie1984 requested a review from yyforyongyu May 11, 2026 16:18
@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch 2 times, most recently from dbd2261 to eb7ce77 Compare May 11, 2026 19:57
@ziggie1984 ziggie1984 requested a review from Copilot May 11, 2026 20:05

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR updates the payments KV→SQL migration to remap legacy HTLC attempts that have attempt ID 0 by allocating synthetic, globally-unique SQL attempt indexes derived from the switch payment ID sequencer horizon, preventing UNIQUE constraint collisions during migration.

Changes:

  • Add attempt-index remapping for legacy zero attempt IDs and advance the switch sequencer once migration succeeds.
  • Improve migration error context for HTLC attempt inserts (include attempted index + payment hash).
  • Add migration validation/test normalization so KV vs SQL comparisons account for expected remapped attempt IDs.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| payments/db/migration1/sql_migration.go | Introduces legacy attempt ID allocator, uses it during attempt migration, and persists updated switch sequencer horizon after successful migration. |
| payments/db/migration1/migration_validation.go | Normalizes legacy zero attempt IDs (KV side) during deep-compare so expected remaps don’t cause false mismatches. |
| payments/db/migration1/sql_migration_test.go | Adds test coverage for legacy zero attempt IDs and refactors test helpers to allow fixed attempt IDs. |
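
One way to picture the validation-side normalization is the sketch below. It is an assumption about the general idea only: the `attempt` type and `attemptsMatch` helper are hypothetical, attempts are assumed to be compared in the same order on both sides, and the actual comparison in migration_validation.go may work differently.

```go
package main

import "fmt"

// attempt is a hypothetical, reduced view of an HTLC attempt.
type attempt struct {
	ID     uint64
	Amount int64
}

// attemptsMatch compares KV attempts against their migrated SQL
// counterparts. A zero ID on the KV side is expected to have been remapped,
// so it adopts the SQL ID as long as that ID is nonzero.
func attemptsMatch(kv, sql []attempt) bool {
	if len(kv) != len(sql) {
		return false
	}

	for i := range kv {
		want := kv[i]
		if want.ID == 0 {
			// A legacy attempt must never stay at zero in SQL.
			if sql[i].ID == 0 {
				return false
			}
			want.ID = sql[i].ID
		}

		if want != sql[i] {
			return false
		}
	}

	return true
}

func main() {
	kv := []attempt{{ID: 0, Amount: 5}, {ID: 7, Amount: 9}}
	sql := []attempt{{ID: 10, Amount: 5}, {ID: 7, Amount: 9}}

	fmt.Println(attemptsMatch(kv, sql))
}
```

Nonzero IDs are still compared strictly, so only the expected remap is tolerated.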

Comment thread payments/db/migration1/sql_migration.go Outdated
// attempt ID.
nextID uint64

// persistNextID is the final next-unused ID to persist after
Member

Do we actually need both of these variables?

IIUC the main thing we need to know is if we had to allocate any values at all from this range.

Collaborator Author

Good point. I'm changing it, since we can be sure that the migration runs exclusively.

@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch from 128995a to d370b98 Compare May 12, 2026 01:34
@ziggie1984 ziggie1984 requested a review from Roasbeef May 12, 2026 01:35
// It uses the in-memory counter initialized from the switch payment ID
// sequencer horizon.
func (a *attemptIDAllocator) allocateLegacyAttemptID() (uint64, error) {
if a.nextID == 0 {
Member

This still fails the migration when legacy zero-ID HTLCs exist but the switch sequencer is at its initial horizon. collectMigrationState leaves nextSwitchPaymentID as 0 when the bucket is absent or when it exists but has never allocated a batch, and then this path returns "cannot allocate legacy attempt ID: switch payment ID sequence is zero". A node with old zero-ID attempts and no later outgoing payment after the sequencer was introduced would still be unable to migrate.

The runtime sequencer handles this initial state by skipping ID 0 and issuing 1, so the migration should do the same and create/advance the sequencer bucket as needed. I confirmed this locally with a temporary regression test that creates the sequencer bucket at sequence 0 plus a zero-ID payment; the migration fails here.

Collaborator Author

OK, done. It now starts with 1 if the sequence was zero in the first place.

Collaborator Author

Good catch. The migration should mirror the runtime sequencer here instead of failing on the initial horizon.

The adjusted approach is to keep seeding the allocator from the switch sequence bucket, but when the allocator is actually asked to issue a replacement ID and the cursor is 0, it skips 0 and starts at 1, matching htlcswitch/sequencer.go. After migration, the switch sequence is advanced to the final next-unused value, creating/advancing the sequencer bucket as needed.

I also added coverage for both cases:

  • initial sequence 0: legacy attempts get 1, 2, final sequence 3
  • advanced sequence 10: legacy attempts get 10, 11, final sequence 12
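
The two cases listed above can be reproduced with a small sketch of the adjusted approach. The `allocator` type and `remapAttemptID` helper are hypothetical simplifications, not the PR's `attemptIDAllocator`, and overflow handling and persistence of the final value are left out.

```go
package main

import "fmt"

// allocator is a simplified, hypothetical version of the allocator discussed
// in this thread. nextID is seeded from the switch sequence value.
type allocator struct {
	nextID uint64

	// used records whether any synthetic ID was issued, which decides
	// whether the switch sequence needs to be advanced afterwards.
	used bool
}

// allocate issues the next synthetic ID, skipping 0 like the runtime
// sequencer does on its initial horizon.
func (a *allocator) allocate() uint64 {
	if a.nextID == 0 {
		a.nextID = 1
	}

	id := a.nextID
	a.nextID++
	a.used = true

	return id
}

// remapAttemptID keeps nonzero attempt IDs unchanged and replaces a legacy
// zero ID with a synthetic one.
func remapAttemptID(a *allocator, attemptID uint64) uint64 {
	if attemptID != 0 {
		return attemptID
	}

	return a.allocate()
}

func main() {
	for _, seed := range []uint64{0, 10} {
		a := &allocator{nextID: seed}

		first := remapAttemptID(a, 0)
		second := remapAttemptID(a, 0)
		kept := remapAttemptID(a, 42)

		// a.nextID is the final next-unused value to persist.
		fmt.Println(first, second, kept, a.nextID)
	}
}
```

With a seed of 0 the two legacy attempts receive 1 and 2 and the final sequence is 3; with a seed of 10 they receive 10 and 11 and the final sequence is 12. The nonzero ID 42 passes through unchanged in both runs.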

sessionKeyBytes := sessionKey.Serialize()

attemptID := htlc.AttemptID
if attemptID == 0 {
Member

This remaps every zero-ID attempt, including attempts that still have no settle/fail resolution. If such an attempt is genuinely in flight, the switch/circuit/result state may still know it as attempt ID 0, while SQL will resume it using the synthetic ID assigned here. That would make startup unable to match the live attempt/result. If legacy zero-ID attempts are guaranteed to be terminal historical records, then this is fine, but otherwise the migration should either preserve/handle live zero-ID attempts specially or explicitly guard/fail with a clear error before remapping them.

I think we should just mark these payments as failed if they are still pending now - they must have been failed already given how old they are.

Collaborator Author

OK, I'm now moving the payment and the attempt to failed when we find a case like this. This increased the scope a bit.
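
A rough sketch of that handling is shown below. The `attempt` and `payment` types and the `failUnresolvedLegacyAttempts` helper are hypothetical, and the rule for when the payment itself becomes failed (no settled attempt and nothing else in flight) is an assumption made for the example, not necessarily what the PR implements.

```go
package main

import "fmt"

// attempt is a hypothetical, reduced view of an HTLC attempt.
type attempt struct {
	ID      uint64
	Settled bool
	Failed  bool
}

// payment is a hypothetical, reduced view of a payment.
type payment struct {
	Attempts []attempt
	Failed   bool
}

// failUnresolvedLegacyAttempts marks every zero-ID attempt that has neither
// settled nor failed as failed, and returns how many attempts it changed.
func failUnresolvedLegacyAttempts(p *payment) int {
	var (
		numFailed   int
		hasSettled  bool
		hasInFlight bool
	)

	for i := range p.Attempts {
		a := &p.Attempts[i]

		switch {
		case a.Settled:
			hasSettled = true

		case a.Failed:
			// Already terminal, nothing to do.

		case a.ID == 0:
			// Unresolved legacy attempt: treat it as failed.
			a.Failed = true
			numFailed++

		default:
			// A nonzero-ID attempt that is still in flight.
			hasInFlight = true
		}
	}

	// Assumption: the payment only becomes failed if nothing settled
	// and no other attempt is still pending.
	if numFailed > 0 && !hasSettled && !hasInFlight {
		p.Failed = true
	}

	return numFailed
}

func main() {
	p := &payment{Attempts: []attempt{
		{ID: 0},
		{ID: 0, Failed: true},
	}}

	n := failUnresolvedLegacyAttempts(p)
	fmt.Println(n, p.Failed) // prints "1 true"
}
```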

@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch from d370b98 to 76584b5 Compare May 12, 2026 14:29
@ziggie1984
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements a strategy for migrating legacy HTLC attempts with zero IDs from KV to SQL. It introduces a synthetic ID allocator that utilizes the switch sequencer's horizon to generate unique attempt indices and updates the migration process to fail unresolved legacy attempts. The changes also include logic to synchronize these IDs during post-migration validation and provide new unit tests for these edge cases. Reviewer feedback focused on adhering to the repository style guide by using newlines to separate logical stanzas for better readability.

@saubyk saubyk moved this to In review in v0.21 May 12, 2026
@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch from 76584b5 to 3a21b75 Compare May 12, 2026 16:03
Legacy KV payments can contain HTLC attempts with attempt ID zero. The
SQL payments schema requires payment_htlc_attempts.attempt_index to be
globally unique, so migrating multiple such attempts can fail with a
UNIQUE constraint violation.

Allocate synthetic attempt indexes for legacy zero-ID attempts from the
switch payment ID sequencer horizon. Keep nonzero attempt IDs unchanged
and advance the switch sequence once after migration validation succeeds.

This preserves the SQL uniqueness invariant and prevents future switch
IDs from colliding with migrated attempts. It also wraps HTLC insert
errors with the attempted index and payment hash so future migration
failures identify the problematic row.
@ziggie1984 ziggie1984 force-pushed the fix-kv-payment-migration branch from 3a21b75 to c11a292 Compare May 12, 2026 16:19
@ziggie1984 ziggie1984 requested a review from yyforyongyu May 12, 2026 16:45

Labels

  • backport-v0.21.x-branch: This label triggers a backport to branch `v0.21.x-branch`
  • no-changelog
  • severity-medium: Focused review required
  • sql

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

[bug]: UNIQUE constraint failed when applying kv_payments_migration

5 participants