@wqfish wqfish commented Dec 19, 2025

Previously, to find all changes between two States, we could simply "diff" the two map layers and get every KV that changed.

This is insufficient with tiered storage. For instance, consider the following sequence of events:

  1. At the end of block N, key K is read and promoted to hot; its value is V0 in both hot state and cold state.
  2. At the end of block N+1, its value is updated to V1 in hot state.
  3. At the end of block N+2, K is evicted from hot state and its value in cold state is updated to V1.
  4. At the end of block N+3, K is read and promoted into hot state again.

If we simply look at the StateDelta in the current implementation, we'd see the value change in hot state, but miss the change from V0 to V1 in the cold state.

So we need to expose all the inner layers in order to accurately compute what changed between two points in time. This PR simply adds the API on LayeredMap.
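
To make the sequence above concrete, here is a minimal sketch under stated assumptions. It is a toy model, not the actual `LayeredMap` or `StateDelta` code: `HotSlot`, `Layer`, `scenario`, `view_at`, `endpoint_diff` and `inner_writes` are all hypothetical names, and each block is modeled as one layer of hot-state writes. It illustrates one reading of the problem: comparing only the two endpoints reports the net hot-state change, while walking every layer in `(base, top]` also surfaces the eviction at block `N+2`, which is when the cold state would be written.

```rust
use std::collections::BTreeMap;

/// Hypothetical hot-state slot: either a hot value or an eviction marker.
#[derive(Clone, Debug, PartialEq)]
enum HotSlot {
    Hot(&'static str),
    Evicted,
}

/// One layer holds the hot-state writes made at the end of one block.
type Layer = BTreeMap<&'static str, HotSlot>;

/// The four blocks N..=N+3 from the description, as layers 0..=3.
fn scenario() -> Vec<Layer> {
    vec![
        Layer::from([("K", HotSlot::Hot("V0"))]), // block N: promoted with V0
        Layer::from([("K", HotSlot::Hot("V1"))]), // block N+1: updated to V1
        Layer::from([("K", HotSlot::Evicted)]),   // block N+2: evicted
        Layer::from([("K", HotSlot::Hot("V1"))]), // block N+3: promoted again
    ]
}

/// Flattened view at layer `idx`: the newest write for each key wins.
fn view_at(layers: &[Layer], idx: usize) -> Layer {
    let mut view = Layer::new();
    for layer in &layers[..=idx] {
        for (k, v) in layer {
            view.insert(*k, v.clone());
        }
    }
    view
}

/// Endpoint diff: keys whose slot differs between the `base` and `top` views.
fn endpoint_diff(layers: &[Layer], base: usize, top: usize) -> Vec<(&'static str, HotSlot)> {
    let old = view_at(layers, base);
    view_at(layers, top)
        .into_iter()
        .filter(|(k, v)| old.get(k) != Some(v))
        .collect()
}

/// Per-layer walk: every write recorded in layers (base, top], oldest first.
fn inner_writes(layers: &[Layer], base: usize, top: usize) -> Vec<(usize, &'static str, HotSlot)> {
    layers[base + 1..=top]
        .iter()
        .enumerate()
        .flat_map(|(i, layer)| layer.iter().map(move |(k, v)| (base + 1 + i, *k, v.clone())))
        .collect()
}

fn main() {
    let layers = scenario();
    // Only the net change is visible here; the eviction is hidden.
    println!("endpoint diff: {:?}", endpoint_diff(&layers, 0, 3));
    // The walk also shows the eviction recorded in layer 2 (block N+2).
    for (idx, key, slot) in inner_writes(&layers, 0, 3) {
        println!("layer {idx}: {key} -> {slot:?}");
    }
}
```

In this toy model the endpoint diff contains a single entry for `K`, whereas the per-layer walk yields three entries, including the `Evicted` marker.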


Note

Introduces per-layer views and parent linking to support inspecting all changes between two layers.

  • Add parent: Weak<Self> to LayerInner and set it in new_family/spawn; children remain tracked for drop safety
  • Expose MapLayer::parent() to navigate layer ancestry
  • Add LayeredMap::inner_maps(), which returns the inner maps in the range (base_layer, top_layer], in order
  • Extend tests: add the naive_view_layer helper and verify inner_maps() correctness for get() and iter()
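
The parent-linking idea in the bullets above can be sketched roughly as follows. This is a simplified illustration, not the PR's code: the `Layer` struct, its `id` field, and the free function `inner_layers` are hypothetical, and the signatures of `new_family`, `spawn` and `parent` here are assumptions made for the example. Each child holds a `Weak` pointer to its parent, and the walk from the top layer back toward the base collects every layer in `(base, top]`.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Weak};

/// Hypothetical layer: its own writes plus a weak link to its parent.
struct Layer {
    id: u64,
    writes: BTreeMap<&'static str, &'static str>,
    parent: Weak<Layer>,
}

impl Layer {
    /// Root layer of a new family; it has no parent.
    fn new_family() -> Arc<Layer> {
        Arc::new(Layer { id: 0, writes: BTreeMap::new(), parent: Weak::new() })
    }

    /// Create a child layer that records a weak link back to `self`.
    fn spawn(self: &Arc<Self>, writes: BTreeMap<&'static str, &'static str>) -> Arc<Layer> {
        Arc::new(Layer { id: self.id + 1, writes, parent: Arc::downgrade(self) })
    }

    /// Navigate to the parent, if it is still alive.
    fn parent(&self) -> Option<Arc<Layer>> {
        self.parent.upgrade()
    }
}

/// Collect the layers in (base, top], oldest first, by following parent links.
/// The walk stops early if an ancestor has already been dropped.
fn inner_layers(base: &Arc<Layer>, top: &Arc<Layer>) -> Vec<Arc<Layer>> {
    let mut out = Vec::new();
    let mut cur = Some(Arc::clone(top));
    while let Some(layer) = cur {
        if layer.id <= base.id {
            break; // reached the base layer, which is excluded
        }
        cur = layer.parent();
        out.push(layer);
    }
    out.reverse(); // collected top-down, returned base-up
    out
}

fn main() {
    let base = Layer::new_family();
    let l1 = base.spawn(BTreeMap::from([("K", "V1")]));
    let l2 = l1.spawn(BTreeMap::new());
    let top = l2.spawn(BTreeMap::from([("K", "V2")]));
    for layer in inner_layers(&base, &top) {
        println!("layer {}: {:?}", layer.id, layer.writes);
    }
}
```

Because the link to the parent is weak, the caller in this sketch keeps every intermediate layer alive in a local variable; how the real implementation guarantees ancestor liveness is not shown here.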

Written by Cursor Bugbot for commit 651ec78. This will update automatically on new commits.

@wqfish wqfish mentioned this pull request Dec 19, 2025
@wqfish wqfish changed the title inner layers [Layered Map] Expose inner layers Dec 26, 2025
@wqfish wqfish force-pushed the pr18353 branch 4 times, most recently from 5ac051f to 03d935c Compare December 26, 2025 05:01
@wqfish wqfish marked this pull request as ready for review December 26, 2025 05:03
@wqfish wqfish requested a review from zekun000 December 26, 2025 05:03
@wqfish wqfish changed the title [Layered Map] Expose inner layers inner layers Dec 26, 2025
@wqfish wqfish changed the title inner layers [Layered Map] Expose inner layers Dec 26, 2025
@wqfish wqfish force-pushed the pr18353 branch 3 times, most recently from bbdf3e8 to c9185c1 Compare January 1, 2026 20:26
@wqfish wqfish force-pushed the pr18353 branch 2 times, most recently from ad6e4f6 to 11b3d5a Compare January 9, 2026 20:19
@wqfish wqfish enabled auto-merge (rebase) January 14, 2026 07:11
@github-actions

✅ Forge suite compat success on a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570

Compatibility test results for a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570 (PR)
1. Check liveness of validators at old version: a3ff6eef75d8e0a24caea52de8522a4e28bd1873
compatibility::simple-validator-upgrade::liveness-check : committed: 12720.15 txn/s, latency: 2726.82 ms, (p50: 2800 ms, p70: 3000, p90: 3400 ms, p99: 3700 ms), latency samples: 418520
2. Upgrading first Validator to new version: 651ec782214b910fc1418f928edfcfd1b41f2570
compatibility::simple-validator-upgrade::single-validator-upgrade : committed: 5788.76 txn/s, latency: 5839.16 ms, (p50: 6500 ms, p70: 6600, p90: 6700 ms, p99: 6800 ms), latency samples: 200180
3. Upgrading rest of first batch to new version: 651ec782214b910fc1418f928edfcfd1b41f2570
compatibility::simple-validator-upgrade::half-validator-upgrade : committed: 5896.11 txn/s, latency: 5755.49 ms, (p50: 6300 ms, p70: 6600, p90: 6700 ms, p99: 6700 ms), latency samples: 203360
4. upgrading second batch to new version: 651ec782214b910fc1418f928edfcfd1b41f2570
compatibility::simple-validator-upgrade::rest-validator-upgrade : committed: 9930.14 txn/s, latency: 3276.34 ms, (p50: 3100 ms, p70: 3700, p90: 4400 ms, p99: 5200 ms), latency samples: 333840
5. check swarm health
Compatibility test for a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570 passed
Test Ok

@github-actions

✅ Forge suite realistic_env_max_load success on 651ec782214b910fc1418f928edfcfd1b41f2570

Forge report malformed: Expecting property name enclosed in double quotes: line 11 column 1 (char 235)
'{\n  "metrics": [\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "submitted_txn",\n      "value": 5099820.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "expired_txn",\n[2026-01-14T07:50:34Z INFO  aptos_forge::report] Test Ok\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_tps",\n      "value": 13695.224805313106\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "avg_latency",\n      "value": 2757.99514551494\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p50_latency",\n      "value": 2700.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p90_latency",\n      "value": 3000.0\n    },\n    {\n      "test_name": "two traffics test: inner traffic",\n      "metric": "p99_latency",\n      "value": 3500.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "submitted_txn",\n      "value": 42680.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "expired_txn",\n      "value": 0.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_tps",\n      "value": 100.00503419156242\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "avg_latency",\n      "value": 761.4880952380952\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p50_latency",\n      "value": 700.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p90_latency",\n      "value": 900.0\n    },\n    {\n      "test_name": "two traffics test",\n      "metric": "p99_latency",\n      "value": 1700.0\n    }\n  ],\n  "text": "two traffics test: inner traffic : committed: 13695.22 txn/s, latency: 2758.00 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3500 ms), latency samples: 5099820\\ntwo traffics test : committed: 100.01 txn/s, 
latency: 761.49 ms, (p50: 700 ms, p70: 800, p90: 900 ms, p99: 1700 ms), latency samples: 1680\\nLatency breakdown for phase 0: [\\"MempoolToBlockCreation: max: 2.248, avg: 2.177\\", \\"ConsensusProposalToOrdered: max: 0.167, avg: 0.166\\", \\"ConsensusOrderedToCommit: max: 0.043, avg: 0.040\\", \\"ConsensusProposalToCommit: max: 0.210, avg: 0.206\\"]\\nMax non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.43s no progress at version 5592041 (avg 0.07s) [limit 15].\\nMax epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.33s no progress at version 2444708 (avg 0.33s) [limit 16].\\nTest Ok"\n}'
Trailing Log Lines:
networkchaos.chaos-mesh.org "4-gcp--as-southeast1-to-3-gcp--us-east4-netem" deleted
test CompositeNetworkTest ... ok
Test Statistics: 
two traffics test: inner traffic : committed: 13695.22 txn/s, latency: 2758.00 ms, (p50: 2700 ms, p70: 2900, p90: 3000 ms, p99: 3500 ms), latency samples: 5099820
two traffics test : committed: 100.01 txn/s, latency: 761.49 ms, (p50: 700 ms, p70: 800, p90: 900 ms, p99: 1700 ms), latency samples: 1680
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 2.248, avg: 2.177", "ConsensusProposalToOrdered: max: 0.167, avg: 0.166", "ConsensusOrderedToCommit: max: 0.043, avg: 0.040", "ConsensusProposalToCommit: max: 0.210, avg: 0.206"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.43s no progress at version 5592041 (avg 0.07s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.33s no progress at version 2444708 (avg 0.33s) [limit 16].
Test Ok

=== BEGIN JUNIT ===
<?xml version="1.0" encoding="UTF-8"?>
<testsuites name="forge" tests="1" failures="0" errors="0" uuid="9167dc0a-9af3-4206-a59c-f3390ed564ce">
    <testsuite name="local" tests="1" disabled="0" errors="0" failures="0">
        <testcase name="CompositeNetworkTest(network:multi-region-network-emulation(two traffics test)) with ">
        </testcase>
    </testsuite>
</testsuites>
=== END JUNIT ===
[2026-01-14T07:50:34Z INFO  aptos_forge::backend::k8s::cluster_helper] Deleting namespace forge-e2e-pr-18353: Some(NamespaceStatus { conditions: None, phase: Some("Terminating") })
[2026-01-14T07:50:34Z INFO  aptos_forge::backend::k8s::cluster_helper] aptos-node resources for Forge removed in namespace: forge-e2e-pr-18353

test result: ok. 1 passed; 0 soft failed; 0 hard failed; 0 filtered out

Debugging output:
NAME                                         READY   STATUS      RESTARTS   AGE
aptos-node-0-fullnode-eforge50d734b7-0       1/1     Running     0          12m
aptos-node-0-validator-0                     1/1     Running     0          12m
aptos-node-1-fullnode-eforge50d734b7-0       1/1     Running     0          12m
aptos-node-1-validator-0                     1/1     Running     0          12m
aptos-node-2-fullnode-eforge50d734b7-0       1/1     Running     0          12m
aptos-node-2-validator-0                     1/1     Running     0          12m
aptos-node-3-fullnode-eforge50d734b7-0       1/1     Running     0          12m
aptos-node-3-validator-0                     1/1     Running     0          12m
aptos-node-4-fullnode-eforge50d734b7-0       1/1     Running     0          12m
aptos-node-4-validator-0                     1/1     Running     0          12m
aptos-node-5-validator-0                     1/1     Running     0          12m
aptos-node-6-validator-0                     1/1     Running     0          12m
forge-testnet-deployer-zphpm                 0/1     Completed   0          12m
genesis-aptos-genesis-eforge50d734b7-rvlcx   0/1     Completed   0          12m

@github-actions

✅ Forge suite framework_upgrade success on a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570

Compatibility test results for a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570 (PR)
Upgrade the nodes to version: 651ec782214b910fc1418f928edfcfd1b41f2570
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2265.64 txn/s, submitted: 2271.54 txn/s, failed submission: 5.90 txn/s, expired: 5.90 txn/s, latency: 1295.41 ms, (p50: 1200 ms, p70: 1500, p90: 1500 ms, p99: 2300 ms), latency samples: 207420
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2304.71 txn/s, submitted: 2312.01 txn/s, failed submission: 7.31 txn/s, expired: 7.31 txn/s, latency: 1280.65 ms, (p50: 1200 ms, p70: 1500, p90: 1500 ms, p99: 2700 ms), latency samples: 208201
5. check swarm health
Compatibility test for a3ff6eef75d8e0a24caea52de8522a4e28bd1873 ==> 651ec782214b910fc1418f928edfcfd1b41f2570 passed
Upgrade the remaining nodes to version: 651ec782214b910fc1418f928edfcfd1b41f2570
framework_upgrade::framework-upgrade::full-framework-upgrade : committed: 2498.54 txn/s, submitted: 2508.30 txn/s, failed submission: 9.76 txn/s, expired: 9.76 txn/s, latency: 1182.37 ms, (p50: 1200 ms, p70: 1200, p90: 1500 ms, p99: 2100 ms), latency samples: 220183
Test Ok

@wqfish wqfish merged commit ecb203d into main Jan 14, 2026
119 of 132 checks passed
@wqfish wqfish deleted the pr18353 branch January 14, 2026 08:03