
feat: make sure mithril package files are stable#722

Merged
jeluard merged 1 commit into main from jeluard/better-mithril
Mar 12, 2026

Conversation

@jeluard
Contributor

@jeluard jeluard commented Mar 12, 2026

Improve mithril package creation so that package files are stable and no overlapping packages are created.

Summary by CodeRabbit

  • New Features

    • Resume block packaging from the last checkpoint to continue interrupted operations.
    • In-memory archive creation with deterministic archive naming and zeroed timestamps for reproducible archives.
    • Packaging now operates on structured block identifiers to improve archive organization and resume behavior.
    • Enhanced logging around resume points and archive handling.
  • Bug Fixes

    • Prevents duplicate archives and correctly retains or replaces tail archives during packaging.
  • Tests

    • Added tests for archive naming, boundary parsing, latest-selection, and resume-point logic.
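The summary above mentions deterministic archive naming keyed on structured block identifiers. A minimal std-only sketch of that idea, assuming a `{first}__{last}.tar.gz` naming convention and numeric slot keys (the helper name and format are illustrative, not the PR's actual code):

```rust
use std::collections::BTreeMap;

// Hypothetical naming helper: with numeric keys, the BTreeMap iterates in
// slot order, so the first/last bounds (and thus the archive name) are stable.
fn archive_name_for_blocks(blocks: &BTreeMap<u64, Vec<u8>>) -> Option<String> {
    let first = blocks.keys().next()?;
    let last = blocks.keys().next_back()?;
    Some(format!("{first}__{last}.tar.gz"))
}

fn main() {
    let mut blocks: BTreeMap<u64, Vec<u8>> = BTreeMap::new();
    blocks.insert(100_000, vec![1]);
    blocks.insert(99_999, vec![0]);
    // Numeric keys keep slot order stable regardless of digit width.
    assert_eq!(
        archive_name_for_blocks(&blocks).as_deref(),
        Some("99999__100000.tar.gz")
    );
    assert_eq!(archive_name_for_blocks(&BTreeMap::new()), None);
    println!("ok");
}
```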

Copilot AI review requested due to automatic review settings March 12, 2026 11:00

Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.



@coderabbitai
Contributor

coderabbitai bot commented Mar 12, 2026

Walkthrough

Adds in-memory tar.gz archive construction and archive metadata parsing, computes resume points from existing archives, and implements resumable block packaging with tail-archive replacement, duplicate avoidance, and enhanced logging and tests.

Changes

  • Mithril Ledger Archive Management — crates/amaru/src/bin/ledger/cmd/mithril.rs
    Introduces BLOCKS_PER_ARCHIVE, switches to GzBuilder for in-memory tar.gz creation, uses Point keys (BTreeSet/BTreeMap), builds archives via build_archive_bytes, sets header mtime=0, and adds archive naming/path helpers. Adds resume-point computation from existing archives and integrates resume logic into the block reading and packaging loop (skip resume marker, replace tail archive, avoid duplicates). Logging is extended to include resume_point and archive actions.
  • Archive Utilities & Parsing — crates/amaru/src/bin/ledger/cmd/mithril.rs (new helpers)
    Adds archive utilities: blocks_dir, archive_name_for_blocks, archive_path_for_blocks, list_existing_archives, parse_archive_point, ArchiveMetadata, parse_archive_metadata, parse_archive_bounds, latest_archive_end_point, sorted_archives, latest_archive, and resume_point_for_archives.
  • Packaging Flow & Resume Integration — crates/amaru/src/bin/ledger/cmd/mithril.rs (run loop changes)
    Reworks the packaging loop to read blocks as BTreeMap<Point, &Vec<u8>>, compute archive names from Point, handle tail vs non-tail batch logic, replace tail archives when needed, and integrate from_chunk_for_resume_point.
  • Tests — crates/amaru/src/bin/ledger/cmd/mithril.rs (tests added)
    Adds tests covering archive naming, bounds/metadata parsing, latest-archive selection, and resume-point calculation.
  • Imports & Small Adjustments — crates/amaru/src/bin/ledger/cmd/mithril.rs
    Adds a BTreeSet import, replaces GzEncoder usage with GzBuilder, and makes minor logging/variable adjustments (+263/−29 lines).

Sequence Diagram(s)

sequenceDiagram
    participant System
    participant Archives
    participant BlockLoader
    participant Archiver
    participant Storage

    System->>Archives: list_existing_archives()
    Archives-->>System: archive metadata list
    System->>System: parse_archive_bounds() / resume_point_for_archives()

    rect rgba(100,150,200,0.5)
    Note over System,BlockLoader: Resume from computed point
    System->>BlockLoader: read_blocks(from=resume_point)
    BlockLoader-->>System: batch of blocks (BTreeMap<Point, Vec<u8>>)
    end

    System->>Archiver: prepare batch (skip resume marker if present)
    Archiver->>Archiver: group into BLOCKS_PER_ARCHIVE, archive_name_for_blocks()
    Archiver->>Archiver: build_archive_bytes() (tar.gz in-memory)

    rect rgba(150,200,100,0.5)
    Note over Archiver,Storage: Archive Management
    Archiver->>Storage: check archive exists
    alt tail archive should be replaced
        Storage-->>Archiver: exists (tail)
        Archiver->>Storage: replace archive
    else new archive
        Storage-->>Archiver: not found
        Archiver->>Storage: write new archive
    end
    end

    System->>System: update resume_point & log progress

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

A jaunty tar rolls up, in-memory and neat,
Resume marks like savegames keep our progress sweet,
Tail swapped, no dupes — the archive dance is chic,
Blocks march in order, headers frozen, oh so sleek,
Logs wink like NPCs: "All good, mate — continue the streak."

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 39.13%, which is below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'feat: make sure mithril package files are stable' accurately captures the main objective of improving mithril package creation stability and preventing overlapping packages.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@jeluard jeluard force-pushed the jeluard/better-mithril branch from facf993 to e5aad49 Compare March 12, 2026 11:05
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru/src/bin/ledger/cmd/mithril.rs`:
- Around line 201-207: archive_name_for_blocks currently derives ordering from
BTreeMap<String, &Vec<u8>> which sorts lexicographically and breaks numeric slot
ordering; change the map key to a structured ordering key (e.g. Point or a tuple
like (slot: u64, hash: String) used elsewhere) and use that for first/last
lookup in archive_name_for_blocks and the same related code at the other
occurrence (lines referenced 452-459), then only format the human-readable
filename (format!("{first}__{last}.tar.gz")) at the point of writing the archive
so ordering is based on numeric slot/hash rather than rendered filenames.
- Around line 164-173: The current package_blocks implementation writes the new
archive directly then removes the old tail, which can leave a missing or partial
archive if the process dies in between; change package_blocks (and the other
archive-writing sites in this file that use
archive_path_for_blocks/blocks_dir/build_archive_bytes) to a safe write pattern:
create a temporary file in the same directory (e.g., archive_path + ".tmp" or
use a unique suffix), write compressed bytes to that temp file, flush and sync
the file to disk, atomically rename (fs::rename) the temp file to the final
archive_path, and only after the rename remove any stale tail file; ensure you
use the same blocks_dir/ archive_path_for_blocks logic so rename is on the same
filesystem and replicate this pattern for the other places that write .tar.gz
archives.
- Around line 430-433: The calculation of from_chunk can underflow when
resume_point is Point::Origin or in chunk 0; change the subtraction logic in the
block around resume_point_for_archives, get_latest_chunk and
infer_chunk_from_slot so you never subtract 1 from 0 — use a saturating
subtraction (e.g., chunk.saturating_sub(1)) or a checked/conditional branch to
clamp at 0 when calling
infer_chunk_from_slot(resume_point.slot_or_default().into()) before assigning to
from_chunk; update the assignment of from_chunk to use that safe value instead
of the raw `- 1`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bfd8e5fe-5eae-4945-ad9b-a9798cffa51f

📥 Commits

Reviewing files that changed from the base of the PR and between acc113f and facf993.

📒 Files selected for processing (1)
  • crates/amaru/src/bin/ledger/cmd/mithril.rs

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
crates/amaru/src/bin/ledger/cmd/mithril.rs (3)

164-173: ⚠️ Potential issue | 🟠 Major

Make the tail swap crash-safe.

Lines 170-171 write straight to the final archive, Lines 220-224 trust any *.tar.gz filename, and Lines 474-476 delete the old tail before the replacement is durable. If the process cops it there, the next resume can happily treat a truncated file as valid. Write to a temp file in the same directory, sync_all, rename, then remove the stale tail only after the rename lands.

🛠️ Safer write path
 fn package_blocks(network: &NetworkName, blocks: &BTreeMap<String, &Vec<u8>>) -> io::Result<String> {
     let compressed = build_archive_bytes(blocks)?;

     let dir = blocks_dir(*network);
     fs::create_dir_all(&dir)?;
     let archive_path = archive_path_for_blocks(network, blocks).expect("blocks map is non-empty here by construction");
-    let mut file = File::create(&archive_path)?;
-    file.write_all(&compressed)?;
+    let tmp_path = format!("{archive_path}.tmp");
+    let mut file = File::create(&tmp_path)?;
+    file.write_all(&compressed)?;
+    file.sync_all()?;
+    fs::rename(&tmp_path, &archive_path)?;

     Ok(archive_path)
 }

Then move the remove_file() block so it runs only after package_blocks() succeeds.

Also applies to: 214-224, 473-477

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/bin/ledger/cmd/mithril.rs` around lines 164 - 173,
package_blocks currently writes the compressed archive directly to the final
archive path which can lead to truncated files on crash; change it to write to a
temporary file in the same directory (use blocks_dir and archive_path_for_blocks
to derive path), flush and sync the temp file (call sync_all on the File or its
parent dir), then atomically rename the temp into place before returning the
final path; also move any remove_file(old_tail) logic so it executes only after
package_blocks returns successfully (i.e., after the rename has landed) to
ensure the stale tail is deleted only when the new archive is durably installed.
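The temp-write + sync + rename pattern described above can be exercised as a self-contained std-only sketch; `write_archive_atomically` and the paths are illustrative stand-ins, not the PR's actual helpers:

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

fn write_archive_atomically(dir: &Path, name: &str, bytes: &[u8]) -> io::Result<PathBuf> {
    let final_path = dir.join(name);
    // Temp file in the same directory keeps the rename on one filesystem.
    let tmp_path = dir.join(format!("{name}.tmp"));
    let mut file = File::create(&tmp_path)?;
    file.write_all(bytes)?;
    file.sync_all()?; // make contents durable before exposing the final name
    fs::rename(&tmp_path, &final_path)?; // atomic replace on POSIX filesystems
    Ok(final_path)
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("mithril_atomic_demo");
    fs::create_dir_all(&dir)?;
    let path = write_archive_atomically(&dir, "0__9999.tar.gz", b"archive bytes")?;
    // A crash before the rename leaves only a .tmp file; the final name is
    // either absent or complete, never truncated.
    assert_eq!(fs::read(&path)?, b"archive bytes");
    println!("wrote {}", path.display());
    Ok(())
}
```

Only after `write_archive_atomically` returns Ok would the caller remove the stale tail archive, as the review suggests.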

201-207: ⚠️ Potential issue | 🟠 Major

Don’t use rendered filenames as the ordering key.

Because this is a BTreeMap<String, _>, 99999...cbor and 100000...cbor sort like strings, not slots. That can skew archive bounds, tar member order, and the tail-match check once the slot width changes. Point already has an Ord, so keying by Point (or (slot, hash)) and formatting the filename only when writing keeps the ordering stable.

Also applies to: 452-459

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/bin/ledger/cmd/mithril.rs` around lines 201 - 207, The
archive_name_for_blocks function currently uses rendered filename strings as
BTreeMap keys which orders lexicographically (e.g., "99999.cbor" <
"100000.cbor") and breaks slot ordering; change the map key to a typed ordering
(use Point or a (slot, hash) tuple) instead of String (e.g., BTreeMap<Point,
&Vec<u8>>), update any callers that build or iterate the map (including the
related logic at the other mentioned block range) to insert and query by Point,
and only format the archive member names with .cbor when creating the filename
in archive_name_for_blocks (or equivalent writer) so ordering, tail-match
checks, and tar member order use the natural Ord of Point. Ensure function
signatures that reference archive_name_for_blocks and the map type are updated
accordingly.
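A std-only demonstration of the ordering bug this comment describes — String keys compare lexicographically, so the later slot 100000 sorts before 99999, while a numeric (or Point-like) key preserves slot order:

```rust
use std::collections::BTreeMap;

fn main() {
    let mut by_name: BTreeMap<String, ()> = BTreeMap::new();
    by_name.insert("99999.cbor".to_string(), ());
    by_name.insert("100000.cbor".to_string(), ());
    // '1' < '9' as bytes, so the lexicographically first key is the later slot.
    assert_eq!(by_name.keys().next().map(String::as_str), Some("100000.cbor"));

    let mut by_slot: BTreeMap<u64, ()> = BTreeMap::new();
    by_slot.insert(99_999, ());
    by_slot.insert(100_000, ());
    // Numeric keys restore the intended slot order.
    assert_eq!(by_slot.keys().next().copied(), Some(99_999));
    println!("ok");
}
```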

287-295: ⚠️ Potential issue | 🔴 Critical

Clamp both chunk rewinds at zero.

Line 294 and Line 433 both do a raw - 1. On chunk 0 or Point::Origin, that either panics with overflow checks or wraps to the moon, which is a proper boss fight for the bootstrap path. saturating_sub(1) in both places keeps the rewind sane.

🧯 Minimal fix
     if immutable_dir.try_exists()? {
         return Ok(fs::read_dir(immutable_dir)?
             .filter_map(Result::ok)
             .filter_map(|entry| entry.path().file_name()?.to_str().map(|s| s.to_owned()))
             .filter_map(|name| name.strip_suffix(".chunk").and_then(|id| id.parse::<u64>().ok()))
             .max()
-            .map(|n| n - 1)); // Last immutable might not be finalized (hint from JP from Mithril team)
+            .map(|n| n.saturating_sub(1))); // Last immutable might not be finalized (hint from JP from Mithril team)
     }
@@
-    let from_chunk = latest_chunk.unwrap_or(infer_chunk_from_slot(resume_point.slot_or_default().into()) - 1);
+    let from_chunk = latest_chunk.unwrap_or_else(|| {
+        infer_chunk_from_slot(resume_point.slot_or_default().into()).saturating_sub(1)
+    });

Also applies to: 430-433

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/bin/ledger/cmd/mithril.rs` around lines 287 - 295, Replace
the raw subtraction that can underflow with saturating subtraction in both
places: in get_latest_chunk() change the closure .map(|n| n - 1) to .map(|n|
n.saturating_sub(1)), and make the same change where the code rewinds a
Point/chunk near the use of Point::Origin (the other raw - 1 at the second
occurrence) to use saturating_sub(1) so a zero value clamps to zero instead of
underflowing.
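A tiny runnable illustration of why `saturating_sub(1)` is the safe rewind here: unsigned subtraction below zero panics in debug builds and wraps in release, whereas saturating arithmetic clamps at zero:

```rust
fn main() {
    let chunk: u64 = 0;
    assert_eq!(chunk.saturating_sub(1), 0); // clamps at zero instead of underflowing
    assert_eq!(7u64.saturating_sub(1), 6);  // behaves like plain `- 1` otherwise
    assert_eq!(0u64.checked_sub(1), None);  // checked_sub is the explicit alternative
    println!("ok");
}
```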
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/amaru/src/bin/ledger/cmd/mithril.rs`:
- Around line 503-506: Remove the unused imports BLOCKS_PER_ARCHIVE,
build_archive_bytes, and parse_archive_metadata from the use super{...} list in
mithril.rs so the compiler warning is resolved; locate the use statement that
currently imports ArchiveMetadata, BLOCKS_PER_ARCHIVE, archive_name_for_blocks,
build_archive_bytes, latest_archive, latest_archive_end_point,
parse_archive_bounds, parse_archive_metadata, resume_point_for_archives and
delete only the three unused symbols (BLOCKS_PER_ARCHIVE, build_archive_bytes,
parse_archive_metadata).
- Around line 276-281: Change resume_point_for_archives so that when there are
zero or one parsed archives it returns Point::Origin instead of falling back to
tip: update the function resume_point_for_archives to use Point::Origin as the
default fallback (instead of tip) and ensure any code that reads from that point
and then calls skip(1) (the consumer around the resume logic) will have at least
the genesis available; add a regression test exercising 0-archive and 1-archive
cases to assert blocks are produced (no empty iterator/crash) and document that
if rebuilding from origin is undesirable a persisted boundary must be introduced
before the first archive.
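The Origin-fallback suggestion above can be sketched with simplified types; the `Point` enum and the "resume from the boundary before the tail" rule below are illustrative assumptions, not the PR's actual definitions:

```rust
#[derive(Debug, PartialEq, Clone)]
enum Point {
    Origin,
    Specific(u64), // slot number, simplified from the real Point type
}

// Given the end points of existing archives in order, pick where to resume.
// With zero or one archive there is no safe prior boundary, so fall back to
// Origin rather than tip (the review's suggested behavior).
fn resume_point_for_archives(end_points: &[Point]) -> Point {
    if end_points.len() < 2 {
        Point::Origin
    } else {
        // Resume from the second-to-last boundary so the possibly partial
        // tail archive gets rebuilt.
        end_points[end_points.len() - 2].clone()
    }
}

fn main() {
    assert_eq!(resume_point_for_archives(&[]), Point::Origin);
    assert_eq!(resume_point_for_archives(&[Point::Specific(10)]), Point::Origin);
    assert_eq!(
        resume_point_for_archives(&[Point::Specific(10), Point::Specific(20)]),
        Point::Specific(10)
    );
    println!("ok");
}
```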


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7b600536-f0c9-403a-969b-c53cd5e653f7

📥 Commits

Reviewing files that changed from the base of the PR and between facf993 and e5aad49.

📒 Files selected for processing (1)
  • crates/amaru/src/bin/ledger/cmd/mithril.rs

@codecov

codecov bot commented Mar 12, 2026

Codecov Report

❌ Patch coverage is 58.28571% with 73 lines in your changes missing coverage. Please review.

Files with missing lines:
  • crates/amaru/src/bin/ledger/cmd/mithril.rs — patch coverage 58.28%, 73 lines missing ⚠️; file coverage 32.07% <58.28%> (+32.07% ⬆️)

... and 5 files with indirect coverage changes


Signed-off-by: jeluard <jeluard@users.noreply.github.com>
@jeluard jeluard force-pushed the jeluard/better-mithril branch from e5aad49 to b92f9f8 Compare March 12, 2026 11:55
Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
crates/amaru/src/bin/ledger/cmd/mithril.rs (1)

164-173: ⚠️ Potential issue | 🟠 Major

Make the tail swap atomic.

Still a bit of a Dark Souls checkpoint, mate: the old tail gets deleted before the replacement is safely on disk, and package_blocks writes straight to the final path. If the process cops a crash in that window, the next resume can see either no tail or a half-written archive with the final filename. Write to a temp file in the same directory, sync_all, rename, and only then remove the stale tail.

Also applies to: 474-477

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/amaru/src/bin/ledger/cmd/mithril.rs` around lines 164 - 173,
package_blocks currently writes the final archive directly and deletes the old
tail before ensuring the new file is safely on disk; change it to write to a
uniquely-named temp file in the same directory returned by blocks_dir(network),
write the compressed bytes there, call file.sync_all() and then directory
handle.sync_all() (open the dir with File::open(&dir) for syncing), atomically
rename the temp file to the path returned by archive_path_for_blocks(network,
blocks) using std::fs::rename, and only after a successful rename remove the old
tail; apply the same temp-write + sync + rename pattern to any other call sites
that write archives via archive_path_for_blocks to avoid half-written files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3b9b597b-e8d7-43e0-a638-1f96bee0b666

📥 Commits

Reviewing files that changed from the base of the PR and between e5aad49 and b92f9f8.

📒 Files selected for processing (1)
  • crates/amaru/src/bin/ledger/cmd/mithril.rs

@jeluard jeluard merged commit 1de248b into main Mar 12, 2026
40 of 41 checks passed
@jeluard jeluard deleted the jeluard/better-mithril branch March 12, 2026 12:27
