Skip to content

feat: perm ledger expiry for testnet#1201

Open
roberts-pumpurs wants to merge 50 commits intomasterfrom
rob/perm-ledger-expiry
Open

feat: perm ledger expiry for testnet#1201
roberts-pumpurs wants to merge 50 commits intomasterfrom
rob/perm-ledger-expiry

Conversation

@roberts-pumpurs
Copy link
Contributor

@roberts-pumpurs roberts-pumpurs commented Mar 9, 2026

Describe the changes
Perm ledger can expire if the config parameter is properly set for it.

  • no refunds for users
  • no shadow txs when expired
    generally speaking, the partitions get reprocessed, but the funds are not released on expiry like it is for term ledgers (mimicking the fact that perm does not expire on mainnet either).

Adds optional expiration for Publish (permanent) ledger slots via publish_ledger_epoch_length: Option<u64> in EpochConfig. Mainnet is unaffected (None = never expire). Testnet can enable expiry (e.g. Some(168) ≈ 7 days).

How It Works

At each epoch boundary, expire_partitions() checks both ledger types. For Publish, if configured, slots where last_height <= epoch_height - (epoch_length * num_blocks_in_epoch) are marked expired and their partitions returned to the capacity pool. The last slot is never expired to prevent a bootstrapping deadlock (no active slots → no partition assignment → no data ingress → no new slot creation).

No fees are distributed and no user balances change on Publish expiry — unlike Submit expiry which splits term fees among miners.

Term vs Perm Expiry

Aspect Term (Submit) Perm (Publish)
Expiry Always enabled Optional via config
Fee distribution Splits term fees among miners None
Shadow transactions TermFeeReward generated None
User refunds Possible for unpromoted txs Never
Last-slot protection Yes Yes

Checklist

  • Tests have been added/updated for the changes.
  • Documentation has been updated for the changes (if applicable).
  • The code follows Rust's style guidelines.

Summary by CodeRabbit

  • New Features

    • Configurable epoch-based expiry for permanent (publish) ledger slots (disabled on mainnet by default), never-expire safeguard for last slot, and read-only preview of expiring partitions.
    • New consensus config option to control publish-ledger expiry with validation.
  • Bug Fixes

    • Tightened expiry and fee-distribution control flow to avoid processing non-target ledgers and inconsistent state.
  • Tests

    • Added unit and integration tests covering expiry scenarios and config validation.
  • Documentation

    • Consolidated developer guidance into unified execution instructions and a structured codebase research workflow.

roberts-pumpurs and others added 27 commits March 6, 2026 18:22
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename expire_term_partitions() to expire_partitions() and add publish
ledger expiry logic. When publish_ledger_epoch_length is configured,
perm slots whose last_height is older than (epoch_height - epoch_length
* num_blocks_in_epoch) are expired, with the last slot always protected.
Includes 4 unit tests covering disabled, enabled, last-slot-protection,
and not-enough-blocks scenarios.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename get_expiring_term_partitions() to get_expiring_partitions() and
add perm ledger slot expiry logic that mirrors expire_partitions() but
without mutation. This ensures the read-only path used by EpochSnapshot
reports the same expiring partitions as the mutating path.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…too)

Update callers in epoch_snapshot.rs to use the renamed methods:
- expire_term_ledger_slots() -> expire_ledger_slots()
- expire_term_partitions() -> expire_partitions()
- get_expiring_term_partitions() -> get_expiring_partitions()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add !slot.is_expired check to PermanentLedger::get_slot_needs(),
matching the existing behavior in TermLedger::get_slot_needs().
This prevents expired permanent ledger slots from being offered
for new partition assignments.

Includes a test that verifies expired slots are excluded.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Clarify that the DataLedger::Publish bail prevents accidental fee
calculation, not that publish ledger cannot expire at all.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…_partitions

Prevents a Publish partition state inconsistency from blocking Submit fee
distribution. Previously, get_assignment() was called for ALL expired
partitions before the ledger type check — a missing Publish partition would
bail the entire function.

Fixes security review Finding 1 (Medium-High).
Moves the unreachable bail guard from collect_expired_partitions to a
debug_assert at the calculate_expired_ledger_fees entry point. This
catches misuse during development without runtime overhead in release.

Fixes security review Finding 4 (Low).
All 3 locations computing epoch_length * num_blocks_in_epoch now use
checked_mul with a descriptive panic. This prevents silent overflow
in release builds with extreme config values.

Applied to TermLedger::get_expired_slot_indexes, Ledgers::expire_partitions,
and Ledgers::get_expiring_partitions for consistency.

Fixes security review Finding 3 (Low).
Verifies:
- Perm slots are marked is_expired after expiry height
- No TermFeeReward shadow txs in the expiry epoch block
- Expired partitions are returned to capacity pool

Fixes security review Finding 2 (Medium).
@roberts-pumpurs roberts-pumpurs self-assigned this Mar 9, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Implements optional epoch-based expiry for publish (perm) ledger slots via a new publish_ledger_epoch_length config; generalizes expiry orchestration from term-only to all ledgers, updates epoch-snapshot and actor fee-distribution logic, adds perm-ledger expiry integration tests, and adds codex/research workflow docs.

Changes

Cohort / File(s) Summary
Configuration & Validation
crates/types/src/config/consensus.rs, crates/types/src/config/mod.rs
Add publish_ledger_epoch_length: Option<u64> to EpochConfig with serde rules; set defaults for mainnet/testing/testnet; validate >0 and guard against multiplication overflow.
Core Ledger Expiry & API
crates/database/src/data_ledger.rs
Replace term-only expiry with unified expire_partitions, add get_expiring_partitions, wire optional publish expiry and blocks-per-epoch, ensure last slot never expires, and update get_slots/get_slot_needs to accept a DataLedger selector.
Epoch Snapshot Integration
crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs
Rename expire_term_ledger_slotsexpire_ledger_slots; call ledgers.expire_partitions / get_expiring_partitions; early-return when nothing to expire; update call sites.
Actor Fee-Distribution & Guards
crates/actors/src/block_producer/ledger_expiry.rs
Restrict fee-distribution to Submit ledgers, defer partition lookup until ledger type confirmed, and only collect miner addresses for matching ledger types to avoid processing Publish.
Tests — Unit & Integration
crates/actors/tests/epoch_snapshot_tests.rs, crates/chain-tests/src/lib.rs, crates/chain-tests/src/perm_ledger_expiry/mod.rs
Set publish_ledger_epoch_length: None in actor tests; add perm_ledger_expiry integration module with five tests covering perm/term expiry scenarios, boundaries, disabled-expiry behavior, capacity pool reassignments, shadow txs, and balance assertions.
Docs & Workflow Guidance
.claude/commands/codex.md, .claude/commands/research-codebase.md
Restructure Codex workflow into step-based guidance and add comprehensive research-codebase doc defining parallel sub-agent research workflow, metadata/frontmatter conventions, and output formatting rules.
Small Refactor / Call-site Updates
crates/domain/..., crates/database/...
Rename and rewire internal methods and calls to reflect generalized expiry (e.g., expire_term_partitionsexpire_partitions) and propagate signature changes for expiry-aware APIs.

Sequence Diagram

sequenceDiagram
    participant EpochSnapshot as EpochSnapshot
    participant Ledgers as Ledgers
    participant Config as ConsensusConfig
    participant BlockProducer as BlockProducer
    participant CapacityPool as CapacityPool

    EpochSnapshot->>EpochSnapshot: perform_epoch_tasks(new_epoch_block)
    EpochSnapshot->>EpochSnapshot: expire_ledger_slots(new_epoch_block)
    EpochSnapshot->>Ledgers: expire_partitions(epoch_height)
    Ledgers->>Config: read publish_ledger_epoch_length & num_blocks_in_epoch
    alt publish_ledger_epoch_length is Some(n)
        Ledgers->>Ledgers: compute expiry_height = epoch_height - (n * num_blocks_in_epoch)
        Ledgers->>Ledgers: identify expired perm slots (skip last slot)
        Ledgers->>CapacityPool: return expired partitions
    else publish_ledger_epoch_length is None
        Ledgers->>Ledgers: skip perm expiry (no-op)
    end
    Ledgers-->>EpochSnapshot: Vec<ExpiringPartitionInfo>
    EpochSnapshot->>BlockProducer: notify expired partitions (distribution/cleanup)
    BlockProducer->>CapacityPool: update capacity pool / reclaim partitions
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • JesseTheRobot
  • antouhou
  • DanMacDonald
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: implementing configurable expiry for permanent ledger slots on testnet.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch rob/perm-ledger-expiry
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@roberts-pumpurs roberts-pumpurs marked this pull request as ready for review March 9, 2026 15:18
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs (1)

472-495: ⚠️ Potential issue | 🟠 Major

Clear expired_partition_infos on epochs with no expiries.

When expire_partitions() returns empty, this exits without resetting self.expired_partition_infos. After the first expiry, later snapshots can keep exposing stale partition hashes and trigger duplicate expiry broadcasts/repacking against already-reassigned partitions.

Suggested fix
     fn expire_ledger_slots(&mut self, new_epoch_block: &IrysBlockHeader) {
         let epoch_height = new_epoch_block.height;
         let expired_partitions: Vec<ExpiringPartitionInfo> =
             self.ledgers.expire_partitions(epoch_height);
+
+        self.expired_partition_infos = None;
 
         // Return early if there's no more work to do
         if expired_partitions.is_empty() {
             return;
         }
@@
 
         self.expired_partition_infos = Some(expired_partitions);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs` around lines
472 - 495, In expire_ledger_slots, when
self.ledgers.expire_partitions(epoch_height) returns an empty Vec you must
clear/reset self.expired_partition_infos to avoid exposing stale partition data;
update the early-return branch in the expire_ledger_slots function to set
self.expired_partition_infos = None (or equivalent cleared state) before
returning so subsequent snapshots don't reuse old partition hashes from previous
expiries (refer to expire_partitions, expired_partition_infos, and
return_expired_partition_to_capacity to locate the related logic).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/research-codebase.md:
- Around line 21-24: The fenced code blocks must include a language tag and be
surrounded by blank lines to satisfy markdownlint rules MD031/MD040; locate the
example response block containing "I'm ready to research the codebase..." and
change the opening fence to "```text" and ensure there is a blank line before
and after the block, and similarly update the fenced YAML/markdown example (the
block starting with "---" under "Structure the document with YAML frontmatter
followed by content:") to have a blank line before the opening "```markdown" and
the fence should remain labeled "markdown"; apply these same fixes for all other
occurrences referenced (around lines 99-155) so every fenced block has an
appropriate language tag and surrounding blank lines.
- Around line 10-17: Add a single blank line after each Markdown heading that is
immediately followed by a list to satisfy markdownlint MD022; specifically
update the heading "## CRITICAL: YOUR ONLY JOB IS TO DOCUMENT AND EXPLAIN THE
CODEBASE AS IT EXISTS TODAY" and the other headings called out in the comment by
inserting one empty line between the heading line and the following list items
(i.e., where a heading is directly followed by lines starting with "-" or
numbered list markers). Search the file for headings immediately adjacent to
lists and add the blank line so each heading is separated from its list content.
- Around line 97-119: Update the template references that incorrectly point to
step 4: change the string "Use the metadata gathered in step 4" and any mentions
inside the YAML frontmatter block (e.g., the placeholders for date, researcher,
git_commit, branch, repository, last_updated_by) so they reference step 5 (where
metadata is actually collected); ensure the descriptive text around the YAML
frontmatter and the inline "Date/Researcher/Git Commit/Branch/Repository" lines
all cite step 5 instead of step 4 to keep the workflow consistent.

In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs`:
- Around line 450-463: The test currently computes expiry using perm_slots[0]
which lets the single-slot branch pass; require the multi-slot path by asserting
perm_slots.len() > 1 (or otherwise guard) and use a non-last publish slot (e.g.,
perm_slots[1] or perm_slots.get(1).unwrap()) when computing
slot_last_height/earliest_expiry so the test exercises the non-last-slot
boundary case; apply the same change to the other occurrence around the 495-515
region. Ensure references: node.get_canonical_epoch_snapshot(), perm_slots,
DataLedger::Publish, PUBLISH_LEDGER_EPOCH_LENGTH, and BLOCKS_PER_EPOCH.
- Around line 155-164: The test currently silently skips missing assignments
because it uses `if let Some(assignment) =
partition_assignments.get_assignment(*partition_hash) { ... }`; change that to
explicitly unwrap the assignment with `expect` or `unwrap_or_else` so the test
fails if an assignment is missing (e.g., `let assignment =
partition_assignments.get_assignment(*partition_hash).unwrap_or_else(||
panic!("missing partition assignment {:?} at slot {}", partition_hash,
slot_index));`) and then assert `assignment.ledger_id.is_none()` (or
`.is_some()` where appropriate) using the existing error message; apply the same
pattern to the other occurrences referenced (the checks around lines ~362-368,
~378-384, ~616-622, ~772-778) so missing entries cause a test failure rather
than being silently ignored.
- Around line 591-624: The current check uses perm_slots.last() at the end which
may be a newer slot; before the five-epoch wait capture the original overdue
slot (e.g. store its slot_id and partitions in a variable like original_slot_id
/ original_partitions from
epoch_snapshot.ledgers.get_slots(DataLedger::Publish)[orig_index]) or assert the
test precondition perm_slots.len() == 1, then after the wait locate the same
slot by matching slot_id (or partitions) in the fresh epoch_snapshot instead of
using perm_slots.last(), and run the same is_expired and assignment assertions
against that matched slot (and use partition_assignments.get_assignment on
original_partitions) to prove the original slot remained assigned and unexpired.

In `@crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs`:
- Around line 1199-1202: The method get_expiring_partition_info currently clones
self.ledgers unnecessarily; remove the clone and call
Ledgers::get_expiring_partitions by reference to avoid copying large vectors.
Update get_expiring_partition_info to invoke
get_expiring_partitions(&self.ledgers, epoch_height) (i.e.,
self.ledgers.get_expiring_partitions(epoch_height)) so the read-only path
borrows the ledgers instead of cloning them.

In `@crates/types/src/config/mod.rs`:
- Around line 149-156: The current validate() only checks
publish_ledger_epoch_length > 0 but allows values that later overflow in
crates/database/src/data_ledger.rs where publish_ledger_epoch_length is
multiplied and unwrapped with checked_mul(...).expect(...); update the
validation in mod.rs to also reject values that would overflow by computing the
same multiplier used in data_ledger.rs and ensuring publish_ledger_epoch_length
<= u64::MAX / multiplier (or use checked_mul there to test and ensure! on None),
e.g., capture the multiplier from the data_ledger logic, perform a bounds check
(n <= std::u64::MAX / multiplier) and fail validation with a clear message if it
would overflow.

---

Outside diff comments:
In `@crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs`:
- Around line 472-495: In expire_ledger_slots, when
self.ledgers.expire_partitions(epoch_height) returns an empty Vec you must
clear/reset self.expired_partition_infos to avoid exposing stale partition data;
update the early-return branch in the expire_ledger_slots function to set
self.expired_partition_infos = None (or equivalent cleared state) before
returning so subsequent snapshots don't reuse old partition hashes from previous
expiries (refer to expire_partitions, expired_partition_infos, and
return_expired_partition_to_capacity to locate the related logic).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e0eb039d-e29d-4c9d-966c-41d30a906253

📥 Commits

Reviewing files that changed from the base of the PR and between 021e28f and 592f5d4.

📒 Files selected for processing (10)
  • .claude/commands/codex.md
  • .claude/commands/research-codebase.md
  • crates/actors/src/block_producer/ledger_expiry.rs
  • crates/actors/tests/epoch_snapshot_tests.rs
  • crates/chain-tests/src/lib.rs
  • crates/chain-tests/src/perm_ledger_expiry/mod.rs
  • crates/database/src/data_ledger.rs
  • crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs
  • crates/types/src/config/consensus.rs
  • crates/types/src/config/mod.rs

- Harden test assertions: replace silent `if let Some` with explicit
  unwrap_or_else so missing partition assignments fail tests instead of
  being silently skipped (5 locations)
- Require multi-slot precondition in exact-boundary test and remove
  single-slot fallback that could mask regressions
- Assert single-slot precondition in last-slot-never-expires test
- Remove unnecessary Ledgers clone in get_expiring_partition_info
- Add overflow validation for publish_ledger_epoch_length * num_blocks_in_epoch
  at config startup
- Fix markdown lint issues and step references in research-codebase command
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs (1)

472-495: ⚠️ Potential issue | 🟠 Major

Clear expired_partition_infos before the empty-return path.

When expire_partitions() returns no entries, this method exits without resetting self.expired_partition_infos, so the previous epoch's expirations leak into later snapshots. That makes downstream consumers read stale expiries as if they happened again.

Suggested fix
 fn expire_ledger_slots(&mut self, new_epoch_block: &IrysBlockHeader) {
         let epoch_height = new_epoch_block.height;
+        self.expired_partition_infos = None;
         let expired_partitions: Vec<ExpiringPartitionInfo> =
             self.ledgers.expire_partitions(epoch_height);
 
         // Return early if there's no more work to do
         if expired_partitions.is_empty() {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs` around lines
472 - 495, In expire_ledger_slots, ensure self.expired_partition_infos is
cleared before the early-return when expire_partitions(epoch_height) yields an
empty Vec; specifically, after computing expired_partitions (via
self.ledgers.expire_partitions) check if it's empty and set
self.expired_partition_infos = None (or an empty sentinel) before returning so
previous epoch expirations don't leak; keep the existing loop and assignment
behavior when expired_partitions is non-empty.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/commands/research-codebase.md:
- Line 30: Change the heading string "## Steps to follow after receiving the
research query:" to the shorter "## Steps to follow" by updating the heading
line in the file (replace the exact text "## Steps to follow after receiving the
research query:" with "## Steps to follow"); no other content needs to be
altered.
- Around line 22-26: The fenced code blocks in the response example and the
template block are missing language tags and surrounding blank lines, causing
markdownlint MD031/MD040 failures; update the example block that contains "I'm
ready to research the codebase..." to use a fenced block with a language tag
(e.g., ```text) and ensure there's a blank line before and after the block, and
similarly update the template block (the triple-fenced YAML/markdown template
around the later section) to include the appropriate language tag (e.g.,
```markdown) and blank lines so both fenced blocks conform to MD031/MD040;
changes should be made where the response example text and the template header
(the block starting with --- in the template) are defined.
- Line 10: Add a blank line immediately after markdown headings so list items do
not follow a heading on the next line (fix MD022); specifically insert an empty
line after the heading "## CRITICAL: YOUR ONLY JOB IS TO DOCUMENT AND EXPLAIN
THE CODEBASE AS IT EXISTS TODAY" and likewise add a blank line after the other
offending heading referenced in the comment so each heading is followed by one
blank line before the subsequent list items.

In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs`:
- Around line 150-172: The current loop over perm_slots checks slot.partitions
after expiry, which is empty so assertions are vacuous; instead, iterate
epoch_snapshot.expired_partition_infos (or capture each slot.partitions into a
local variable before driving expiry) to obtain the expired partition hashes and
then call epoch_snapshot.partition_assignments.get_assignment(partition_hash)
(or get_assignment on the captured hash) and assert
assignment.ledger_id.is_none() for each expired partition; update the same logic
used at lines referenced (also applies to the similar block around perm_slots
handling at 362-405) to use epoch_snapshot.expired_partition_infos or pre-expiry
captured partitions and maintain the same panic/assert messages.
- Around line 623-639: The assertions currently only check
assignment.ledger_id.is_some(), which allows incorrect ledger kinds (e.g.,
Submit); update the checks to assert equality with Some(DataLedger::Publish as
u32) instead of is_some(). Locate the partition loop using last_slot.partitions
and partition_assignments (and the other similar block later) and replace the
is_some() assertion with a comparison against Some(DataLedger::Publish as u32)
so the test enforces the Publish ledger specifically.
- Around line 475-478: The info! call references an undeclared variable
num_slots; replace that with perm_slots.len() so the log reads the actual number
of permanent slots—update the info! invocation that currently uses num_slots and
slot0_last_height to use perm_slots.len() and slot0_last_height instead.

In `@crates/types/src/config/mod.rs`:
- Around line 849-880: Add a regression case in
test_publish_ledger_epoch_length_validation that triggers the checked_mul
overflow path in Config::validate: create a NodeConfig via
NodeConfig::testing(), set consensus.get_mut().epoch.publish_ledger_epoch_length
= Some(u64::MAX) and set consensus.get_mut().epoch.num_blocks_in_epoch = 2 (or
any value >1) so that publish_ledger_epoch_length * num_blocks_in_epoch
overflows, build the Config with Config::new_with_random_peer_id and assert that
config.validate().is_err() to cover the checked_mul failure branch.

---

Outside diff comments:
In `@crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs`:
- Around line 472-495: In expire_ledger_slots, ensure
self.expired_partition_infos is cleared before the early-return when
expire_partitions(epoch_height) yields an empty Vec; specifically, after
computing expired_partitions (via self.ledgers.expire_partitions) check if it's
empty and set self.expired_partition_infos = None (or an empty sentinel) before
returning so previous epoch expirations don't leak; keep the existing loop and
assignment behavior when expired_partitions is non-empty.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4cc88d7b-47ce-4f9e-9276-226bffac7918

📥 Commits

Reviewing files that changed from the base of the PR and between 592f5d4 and fb0c820.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
  • .claude/commands/research-codebase.md
  • crates/chain-tests/src/perm_ledger_expiry/mod.rs
  • crates/domain/src/snapshots/epoch_snapshot/epoch_snapshot.rs
  • crates/types/src/config/mod.rs

- Fix num_slots compilation error from round 1 refactor
- Assert Publish ledger specifically instead of just is_some()
- Add overflow regression test for publish_ledger_epoch_length validation
- Fix remaining markdown lint issues
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs`:
- Line 624: The assertion numbering in the test comments is off: change the
comment string "Assertion 3: All last-slot partitions still have Publish ledger
assignment" to "Assertion 2: All last-slot partitions still have Publish ledger
assignment" and likewise renumber the later "Assertion 4" (near the call/assert
around line with function/test block that follows) to "Assertion 3" so the
sequence follows "Assertion 1", "Assertion 2", "Assertion 3"; update the two
comment lines in the perm_ledger_expiry test accordingly (search for the exact
comment text "All last-slot partitions still have Publish ledger assignment" and
the subsequent "Assertion 4" comment) to keep numbering consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6068b2dd-2ec7-4e9a-9036-a1bd9602c9c2

📥 Commits

Reviewing files that changed from the base of the PR and between fb0c820 and 4a6c89c.

📒 Files selected for processing (3)
  • .claude/commands/research-codebase.md
  • crates/chain-tests/src/perm_ledger_expiry/mod.rs
  • crates/types/src/config/mod.rs

Add partition counters to all capacity-pool assertion loops so they
fail if no partitions were actually checked. Prevents silent passes
if slot.partitions is ever empty after expiry.
@roberts-pumpurs roberts-pumpurs force-pushed the rob/perm-ledger-expiry branch from 742dc38 to 876a19b Compare March 10, 2026 12:41
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/chain-tests/src/perm_ledger_expiry/mod.rs (1)

651-669: ⚠️ Potential issue | 🟠 Major

Guard these Publish-assignment assertions against empty partition lists.

Line 653 and Line 816 can both become vacuous: if a regression leaves the slot metadata alive but clears slot.partitions, the loop never runs and the test still logs success. Add a checked_partitions > 0 guard here too.

🔍 Add a non-vacuous partition count
     let partition_assignments = &epoch_snapshot.partition_assignments;
+    let mut checked_last_slot_partitions = 0_usize;
     for partition_hash in &last_slot.partitions {
         let assignment = partition_assignments
             .get_assignment(*partition_hash)
@@
         assert!(
             assignment.ledger_id == Some(DataLedger::Publish as u32),
             "Last-slot partition {:?} should still be assigned to Publish, but has ledger_id={:?}",
             partition_hash,
             assignment.ledger_id
         );
+        checked_last_slot_partitions += 1;
     }
+    assert!(
+        checked_last_slot_partitions > 0,
+        "Expected last publish slot to retain at least one partition"
+    );

Apply the same pattern to the multi-slot loop in heavy_perm_expiry_disabled_nothing_expires.

Also applies to: 813-832

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs` around lines 651 - 669, The
test currently iterates last_slot.partitions and the multi-slot loop in
heavy_perm_expiry_disabled_nothing_expires without asserting that partitions is
non-empty, so a cleared slot.partitions would make the loop vacuous; update both
places (the loop using epoch_snapshot.partition_assignments.get_assignment(...)
over last_slot.partitions and the analogous multi-slot loop in
heavy_perm_expiry_disabled_nothing_expires) to first check a counted variable
(checked_partitions > 0) and fail the test if zero, ensuring the assertions are
guarded against empty partition lists.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs`:
- Around line 286-295: The current logic computes a single target_height using
earliest.max(...) then rounds up, which can hide cases where the two ledgers
would round to different epoch boundaries; update the code to compute each
ledger's rounded expiry separately by computing perm_target =
(perm_earliest.div_ceil(BLOCKS_PER_EPOCH) * BLOCKS_PER_EPOCH) and submit_target
= (submit_earliest.div_ceil(BLOCKS_PER_EPOCH) * BLOCKS_PER_EPOCH), then assert
perm_target == submit_target (or panic/fail the test) before proceeding and use
that common value as target_height; reference the variables perm_earliest,
submit_earliest, perm_target, submit_target, BLOCKS_PER_EPOCH, and div_ceil.

---

Duplicate comments:
In `@crates/chain-tests/src/perm_ledger_expiry/mod.rs`:
- Around line 651-669: The test currently iterates last_slot.partitions and the
multi-slot loop in heavy_perm_expiry_disabled_nothing_expires without asserting
that partitions is non-empty, so a cleared slot.partitions would make the loop
vacuous; update both places (the loop using
epoch_snapshot.partition_assignments.get_assignment(...) over
last_slot.partitions and the analogous multi-slot loop in
heavy_perm_expiry_disabled_nothing_expires) to first check a counted variable
(checked_partitions > 0) and fail the test if zero, ensuring the assertions are
guarded against empty partition lists.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2dc05744-29ca-4335-8f7d-9d855022b74c

📥 Commits

Reviewing files that changed from the base of the PR and between 742dc38 and 876a19b.

📒 Files selected for processing (1)
  • crates/chain-tests/src/perm_ledger_expiry/mod.rs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants