@yamingk yamingk commented Nov 7, 2025

Problem Statement:

Replaying logs after a graceful shutdown is unnecessary and delays HomeBlks/HomeStore from opening up IO traffic. When HomeBlks/HomeStore is shut down gracefully, we want to avoid log replay on the next recovery boot.

Solution:

Add a check in solo repl dev on_log_found to avoid replaying the log if the lsn is already committed and cp-flushed.
Add an assert in HomeBlks (PR in a different repo, eBay/HomeBlocks#146) that no log replay happens after a graceful shutdown.

Also fixed an issue in log truncation: in-memory-only should be set to false.
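
The replay-skip check in the first item above can be illustrated with a minimal standalone sketch. This is not the actual solo_repl_dev.cpp code: `ReplayState`, `last_cp_flushed_lsn`, `needs_replay`, and `count_replayed` are hypothetical names, and the sketch assumes that "committed and cp-flushed" can be summarized by a single LSN watermark recovered at boot.

```cpp
#include <cstdint>

// Hypothetical stand-in for state recovered at boot (not a HomeStore type).
// Assumption: every LSN <= last_cp_flushed_lsn was committed and flushed by a
// checkpoint (CP) before shutdown; -1 means nothing was flushed.
struct ReplayState {
    int64_t last_cp_flushed_lsn;
};

// Returns true when the log entry at `lsn` still has to be replayed,
// i.e. it is newer than the committed and cp-flushed watermark.
inline bool needs_replay(const ReplayState& st, int64_t lsn) {
    return lsn > st.last_cp_flushed_lsn;
}

// Counts how many entries in [first_lsn, last_lsn] would be replayed.
// After a graceful shutdown this is expected to be zero, which is the
// condition the HomeBlks-side assert checks for.
inline int64_t count_replayed(const ReplayState& st, int64_t first_lsn, int64_t last_lsn) {
    int64_t replayed = 0;
    for (int64_t lsn = first_lsn; lsn <= last_lsn; ++lsn) {
        if (needs_replay(st, lsn)) { ++replayed; }
    }
    return replayed;
}
```

Under these assumptions, a log-found callback would return early whenever `needs_replay` is false, so a boot after a graceful shutdown (where all logged LSNs are at or below the watermark) replays nothing.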

Testing:

Without the fix, running the command below consistently fails on the restart boot (because of a known race that is not relevant to this PR; a fix for that issue is being worked on).
With the fix, it can pass, and we can verify that no log replay happened.

./Debug/src/lib/volume/tests/test_volume_io --gtest_filter=VolumeIOTest.LongRunningRandomIO --num_restarts=4 --num_vols=32 --write_num_io=300 --read_num_io=300 --dev_size_mb=1024000 --run_time=300

I am also running longer tests with --num_restarts set to 200 and --write_num_io=999999 for more aggressive testing.

codecov-commenter commented Nov 19, 2025

Codecov Report

❌ Patch coverage is 63.63636% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.61%. Comparing base (1a0cef8) to head (2c2f32b).
⚠️ Report is 291 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/lib/replication/repl_dev/solo_repl_dev.cpp | 40.00% | 1 Missing and 2 partials ⚠️ |
| src/lib/homestore.cpp | 80.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #827      +/-   ##
==========================================
- Coverage   56.51%   49.61%   -6.91%     
==========================================
  Files         108      110       +2     
  Lines       10300    11308    +1008     
  Branches     1402     5327    +3925     
==========================================
- Hits         5821     5610     -211     
+ Misses       3894     2085    -1809     
- Partials      585     3613    +3028     

@sanebay sanebay left a comment

lgtm

HS_REL_ASSERT_EQ(entry->major_version, repl_journal_entry::JOURNAL_ENTRY_MAJOR,
"Mismatched version of journal entry found");
// HS_LOG(DEBUG, solorepl, "SoloReplDev found journal entry at lsn={}", lsn);
LOGINFO("SoloReplDev log replay found journal entry at lsn={}", lsn);

DEBUG LOG ?

@yamingk yamingk merged commit cf805b9 into eBay:master Dec 2, 2025
40 of 41 checks passed
@yamingk yamingk deleted the yk_cp branch December 2, 2025 00:44