
Fix resuming a retroactive time partitioning upon a master swing. #5784

Merged: dorinhogea merged 1 commit into bloomberg:main from dorinhogea:retropartrstrt2 on Mar 17, 2026
Conversation

@dorinhogea
Contributor

^title

It requires the multitable_ddl option.
It preserves the partitioning work already done.
It determines the sc_genids values as the maximum genids for each shard and stripe.
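
As a rough illustration of the last point, the sketch below picks one resume point per shard and stripe by taking the highest genid seen. It is not the Comdb2 implementation: the function name `resume_genids`, the tuple layout, and the use of plain integers for genids are all assumptions made for the example.

```python
def resume_genids(converted):
    """Return {(shard, stripe): highest genid} for (shard, stripe, genid) tuples.

    Hypothetical helper: genids are modeled as plain integers, and
    `converted` stands in for the records already moved before the swing.
    """
    highest = {}
    for shard, stripe, genid in converted:
        key = (shard, stripe)
        # Keep only the largest genid seen for this shard/stripe pair.
        if key not in highest or genid > highest[key]:
            highest[key] = genid
    return highest
```

Under these assumptions, a resumed conversion would continue each stripe of each shard from the returned value instead of starting over.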

@roborivers roborivers left a comment

Cbuild submission: Success ✓.
Regression testing: Success ✓.

The first 10 failing tests are:
sc_truncate
sc_resume_logicalsc_generated
cldeadlock
consumer_non_atomic_default_consumer_generated
sc_transactional_rowlocks_generated
remsql_locks_rte_connect_generated
remsql_locks
reco-ddlk-sql

@roborivers roborivers left a comment

Cbuild submission: Success ✓.
Regression testing: Success ✓.

The first 10 failing tests are:
scindex_logicalsc_generated
consumer_non_atomic_default_consumer_generated
remsql_locks_rte_connect_generated
remsql_locks
truncatesc_offline_generated
reco-ddlk-sql

@roborivers roborivers left a comment

Cbuild submission: Error ⚠.
Regression testing: Success ✓.

The first 10 failing tests are:
scindex
sc_resume_logicalsc_generated
consumer_non_atomic_default_consumer_generated
sc_transactional_rowlocks_generated
truncatesc_offline_generated

@dorinhogea
Contributor Author

cdb2test Mar 4 17:36:29 2026 success retropartrstrt2.R20260304.9

@riverszhang89
Contributor

I had a couple questions, somewhat related to this pull request:

  1. how are live-writes handled? Will they always operate on the current sc_to? Or do they also go through the retro-route logic?

  2. the retro-routing logic seems a bit off, too:

# There're 3 rows inserted at 2026-03-12T160734
$ cdb2sql mydb2 local "select comdb2_rowtimestamp from t1"
(rowid="2026-03-12T160734.000 America/New_York")
(rowid="2026-03-12T160734.000 America/New_York")
(rowid="2026-03-12T160734.000 America/New_York")

# The table then is partitioned 2-way starting at "2026-03-11T16:30:30"
$ cdb2sql mydb2 local "alter table t1 partitioned by time period 'daily' retention 2 start '2026-03-11T16:30:30 America/New_York' retroactively"

# shards have the correct lo/hi timestamp:
$ cdb2sql mydb2 local "select shardname, cast(low as datetime) as begin, cast(high as datetime) as end from comdb2_timepartshards"
(shardname='$0_A2BDF977', begin="2026-03-10T163030.000 America/New_York", end="2026-03-11T163030.000 America/New_York")
(shardname='$1_226446BF', begin="2026-03-11T163030.000 America/New_York", end="2026-03-12T163030.000 America/New_York")

# However, rows are rebuilt into shard 0, not shard 1
$ cdb2sql mydb2 local "select comdb2_rowtimestamp from '\$0_A2BDF977'"
(rowid="2026-03-12T160734.000 America/New_York")
(rowid="2026-03-12T160734.000 America/New_York")
(rowid="2026-03-12T160734.000 America/New_York")

@dorinhogea
Contributor Author

To answer the questions:

  1. Yes, live writes properly follow the retro logic; this is covered by the test introduced with the feature in the previous PR.
  2. The start of a retro partition is the time of the next rollout, and it has to be in the future. If the start is in the past, newer rows are inserted into shard 0 (the newest shard), in the same way that rows older than the full retention are merged into the oldest shard (which would be shard 1). We choose not to drop rows during alter.
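
The clamping described in point 2 can be modeled roughly as below. This is a simplified sketch, not Comdb2's routing code: `route_row` is a hypothetical name, shard 0 is assumed to cover the period ending at `start`, and each older shard is assumed to cover the period before it.

```python
from datetime import datetime, timedelta

def route_row(row_ts, start, period, retention):
    """Return a shard index for row_ts: 0 is the newest, retention - 1 the oldest.

    Assumed model: shard i covers [start - (i + 1) * period, start - i * period).
    Rows outside the covered range are clamped, never dropped.
    """
    if row_ts >= start:
        return 0  # at or after the next rollout: clamp to the newest shard
    # Number of whole periods between the row and start; the 1 microsecond
    # adjustment keeps a row exactly one period back inside shard 0.
    periods_back = (start - row_ts - timedelta(microseconds=1)) // period
    return min(periods_back, retention - 1)  # too old: clamp to the oldest shard

start = datetime(2026, 3, 11, 16, 30, 30)
print(route_row(datetime(2026, 3, 12, 16, 7, 34), start, timedelta(days=1), 2))
```

With the values from the question (daily period, retention 2, start in the past), this model sends the three rows to shard 0, which matches the observed result.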

@dorinhogea
Contributor Author

I have added PR #5815 to reject alters that use start as the first rollout instead of the next rollout.
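
A minimal sketch of such a check is shown here, assuming the rule is simply that start must lie in the future. The actual condition in #5815 is not shown in this conversation, and `check_retro_start` is a hypothetical name.

```python
from datetime import datetime

def check_retro_start(start, now):
    """Reject a retroactive alter whose start is not in the future.

    Hypothetical validation: start is treated as the next rollout time,
    so any value at or before `now` is refused.
    """
    if start <= now:
        raise ValueError("start must be the next rollout time, in the future")
    return start
```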

riverszhang89 previously approved these changes Mar 16, 2026
Contributor

@riverszhang89 riverszhang89 left a comment

Thanks for explaining. This makes sense.

Signed-off-by: Dorin Hogea <dhogea@bloomberg.net>
@dorinhogea
Contributor Author

@riverszhang89 thank you!

@dorinhogea dorinhogea merged commit b345adb into bloomberg:main Mar 17, 2026
4 checks passed
@dorinhogea dorinhogea deleted the retropartrstrt2 branch March 17, 2026 14:12

3 participants