
Conversation

@Tristan-Wilson
Member

This test validates the dynamic batch sizing behavior introduced in commit 05ac6df, which allows DA providers to signal that a message is too large without triggering a fallback to the next writer.

When a DA provider returns ErrMessageTooLarge, the batch poster should:

  1. Re-query GetMaxMessageSize() to learn the new size limit
  2. Rebuild a smaller batch that fits within the limit
  3. Post to the SAME DA provider (not fall back to calldata)

The test has two phases:

  • Phase 1: Posts ~10KB batches with initial max size of 10KB
  • Phase 2: Reduces max size to 5KB mid-stream, verifying that subsequent batches are rebuilt smaller rather than falling back

fixes NIT-4158
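The intended retry flow can be sketched as below. This is a simplified stand-in, not the actual nitro batch poster: `daWriter`, `postBatch`, and the truncation step (standing in for a real batch rebuild) are hypothetical; only the `ErrMessageTooLarge` / `GetMaxMessageSize` names come from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrMessageTooLarge mirrors the sentinel error a DA provider returns
// when a batch exceeds its current size limit.
var ErrMessageTooLarge = errors.New("message too large")

// daWriter is a simplified stand-in for a DA provider.
type daWriter struct {
	maxSize int
}

func (w *daWriter) Store(msg []byte) error {
	if len(msg) > w.maxSize {
		return ErrMessageTooLarge
	}
	return nil
}

func (w *daWriter) GetMaxMessageSize() int { return w.maxSize }

// postBatch retries against the SAME writer on ErrMessageTooLarge,
// re-querying the limit and truncating (a stand-in for rebuilding a
// smaller batch) instead of falling back to calldata.
func postBatch(w *daWriter, batch []byte) (int, error) {
	for {
		err := w.Store(batch)
		if err == nil {
			return len(batch), nil
		}
		if !errors.Is(err, ErrMessageTooLarge) {
			return 0, err
		}
		limit := w.GetMaxMessageSize() // step 1: re-query the size limit
		if limit <= 0 || limit >= len(batch) {
			return 0, fmt.Errorf("cannot shrink batch to limit %d", limit)
		}
		batch = batch[:limit] // step 2: rebuild a smaller batch
		// step 3: loop and post to the same writer; no fallback
	}
}

func main() {
	w := &daWriter{maxSize: 5_000}
	posted, err := postBatch(w, make([]byte, 10_000))
	fmt.Println(posted, err) // 5000 <nil>
}
```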

@codecov

codecov bot commented Dec 30, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 34.57%. Comparing base (893facb) to head (89dd964).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4183      +/-   ##
==========================================
+ Coverage   32.69%   34.57%   +1.88%     
==========================================
  Files         476      476              
  Lines       56785    56785              
==========================================
+ Hits        18564    19633    +1069     
+ Misses      34986    33673    -1313     
- Partials     3235     3479     +244     

@github-actions
Contributor

github-actions bot commented Dec 30, 2025

❌ 5 Tests Failed:

Tests completed: 4026 | Failed: 5 | Passed: 4021 | Skipped: 0
View the top 3 failed tests by shortest run time
TestArbOSVersion50
Stack Traces | 5.920s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
TRACE[01-28|19:07:18.311] Performed indexed log search             begin=28 end=28 "true matches"=1 "false positives"=0 elapsed="80.13µs"
    precompile_inclusion_test.go:94: goroutine 410360 [running]:
        runtime/debug.Stack()
        	/opt/hostedtoolcache/go/1.25.6/x64/src/runtime/debug/stack.go:26 +0x5e
        github.com/offchainlabs/nitro/util/testhelpers.RequireImpl({0x4096910, 0xc006610380}, {0x40536e0, 0xc0169115c0}, {0x0, 0x0, 0x0})
        	/home/runner/work/nitro/nitro/util/testhelpers/testhelpers.go:29 +0x55
        github.com/offchainlabs/nitro/system_tests.Require(0xc006610380, {0x40536e0, 0xc0169115c0}, {0x0, 0x0, 0x0})
        	/home/runner/work/nitro/nitro/system_tests/common_test.go:2049 +0x5d
        github.com/offchainlabs/nitro/system_tests.testPrecompiles(0xc006610380, 0x32, {0xc006553e88, 0x3, 0x5c0f820?})
        	/home/runner/work/nitro/nitro/system_tests/precompile_inclusion_test.go:94 +0x371
        github.com/offchainlabs/nitro/system_tests.TestArbOSVersion50(0xc006610380?)
        	/home/runner/work/nitro/nitro/system_tests/precompile_inclusion_test.go:75 +0x3ef
        testing.tRunner(0xc006610380, 0x3cd3ed0)
        	/opt/hostedtoolcache/go/1.25.6/x64/src/testing/testing.go:1934 +0xea
        created by testing.(*T).Run in goroutine 1
        	/opt/hostedtoolcache/go/1.25.6/x64/src/testing/testing.go:1997 +0x465
        
    precompile_inclusion_test.go:94: [] execution aborted (timeout = 5s)
TRACE[01-28|19:07:24.187] Handled RPC response                     reqid=9    duration="2.304µs"
--- FAIL: TestArbOSVersion50 (5.92s)
TestArbOSVersion60
Stack Traces | 6.020s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
DEBUG[01-28|19:07:24.466] Executing EVM call finished              runtime="154.701µs"
DEBUG[01-28|19:07:24.466] Served eth_call                          reqid=7706  duration="193.203µs"
TRACE[01-28|19:07:24.466] Handled RPC response                     reqid=7706  duration="1.292µs"
DEBUG[01-28|19:07:24.466] Executing EVM call finished              runtime="109.716µs"
DEBUG[01-28|19:07:24.466] Served eth_call                          reqid=7707  duration="137.268µs"
TRACE[01-28|19:07:24.466] Handled RPC response                     reqid=7707  duration=962ns
DEBUG[01-28|19:07:24.466] Pushed sync data from consensus to execution synced=true  maxMessageCount=278 updatedAt=2026-01-28T19:07:24+0000 hasProgressMap=false
DEBUG[01-28|19:07:24.466] Served eth_getBalance                    reqid=7705  duration="693.365µs"
TRACE[01-28|19:07:24.466] Handled RPC response                     reqid=7705  duration="1.062µs"
INFO [01-28|19:07:24.466] New local node record                    seq=1,769,627,244,466 id=87d3df85c3c40d24   ip=127.0.0.1 udp=0 tcp=0
INFO [01-28|19:07:24.466] Started P2P networking                   self=enode://b6abdf2a541ad3230d5704bc6f0fabce0e08196a1f8d94fe2e75bbd3b9f8e385ccf6f40e1a24ed0bcc2a9b4058fb2b0395268ae00b3be72ba83c56882806105b@127.0.0.1:0
DEBUG[01-28|19:07:24.466] Executing EVM call finished              runtime="274.857µs"
DEBUG[01-28|19:07:24.466] Served eth_getTransactionCount           reqid=7709  duration="47.99µs"
DEBUG[01-28|19:07:24.466] Served eth_call                          reqid=7708  duration="309.502µs"
INFO [01-28|19:07:24.466] Started log indexer
TRACE[01-28|19:07:24.466] Handled RPC response                     reqid=7709  duration="1.343µs"
TRACE[01-28|19:07:24.466] Handled RPC response                     reqid=7708  duration="1.463µs"
WARN [01-28|19:07:24.466] Getting file info                        dir= error="stat : no such file or directory"
DEBUG[01-28|19:07:24.466] Served eth_maxPriorityFeePerGas          reqid=7710  duration="18.675µs"
TRACE[01-28|19:07:24.467] Handled RPC response                     reqid=7710  duration="1.342µs"
TestOutOfGasInStorageCacheFlush
Stack Traces | 14.350s run time
... [CONTENT TRUNCATED: Keeping last 20 lines]
INFO [01-28|19:16:04.293] Blockchain stopped
INFO [01-28|19:16:04.290] InboxTracker                             sequencerBatchCount=309 messageCount=4447 l1Block=338  l1Timestamp=2026-01-28T19:16:02+0000
INFO [01-28|19:16:04.290] Submitted transaction                    hash=0x4bb12cf9f9be51a0626bde43a74e235bae5c08488621495061dbfb7ac0a8687d from=0xb386a74Dcab67b66F8AC07B4f08365d37495Dd23 nonce=308  recipient=0x226174C8697A62b20210008CFBb2Ed246ccFfC22 value=0
INFO [01-28|19:16:04.294] DataPoster sent transaction              nonce=308  hash=4bb12c..a8687d feeCap=500,000,080    tipCap=50,000,000    blobFeeCap=<nil> gas=175,472
INFO [01-28|19:16:04.294] HTTP server stopped                      endpoint=127.0.0.1:36323
TRACE[01-28|19:16:04.294] P2P networking is spinning down
INFO [01-28|19:16:04.294] BatchPoster: batch sent                  sequenceNumber=309 from=4447 to=4463 prevDelayed=287 currentDelayed=288 totalSegments=19 numBlobs=0
INFO [01-28|19:16:04.292] Starting work on payload                 id=0x03ad8818cf6adc26
INFO [01-28|19:16:04.293] DataPoster sent transaction              nonce=75   hash=7448c8..df8000 feeCap=501,336,220    tipCap=50,000,000    blobFeeCap=<nil> gas=161,204
INFO [01-28|19:16:04.294] BatchPoster: batch sent                  sequenceNumber=76  from=214  to=218  prevDelayed=54  currentDelayed=55  totalSegments=8  numBlobs=0
INFO [01-28|19:16:04.296] Starting work on payload                 id=0x03d2f1f0c62b5b9e
INFO [01-28|19:16:04.297] Updated payload                          id=0x03ad8818cf6adc26 number=339  hash=bb54e7..c50276 txs=1   withdrawals=0 gas=163,011    fees=8.15055e-06    root=c57a09..f10660 elapsed=1.639ms
INFO [01-28|19:16:04.298] Updated payload                          id=0x03d2f1f0c62b5b9e number=108  hash=ede98e..19624c txs=1   withdrawals=0 gas=148,855    fees=7.44275e-06    root=7e0c4d..ba747d elapsed=1.443ms
INFO [01-28|19:16:04.298] Stopping work on payload                 id=0x03ad8818cf6adc26 reason=delivery
INFO [01-28|19:16:04.302] Stopping work on payload                 id=0x03d2f1f0c62b5b9e reason=delivery
INFO [01-28|19:16:04.303] Imported new potential chain segment     number=108  hash=ede98e..19624c blocks=1   txs=1   mgas=0.149  elapsed=1.371ms      mgasps=108.498  triediffs=775.46KiB  triedirty=0.00B
INFO [01-28|19:16:04.304] Chain head was updated                   number=108  hash=ede98e..19624c root=7e0c4d..ba747d elapsed="103.724µs"
INFO [01-28|19:16:04.304] Imported new potential chain segment     number=339  hash=bb54e7..c50276 blocks=1   txs=1   mgas=0.163  elapsed=6.047ms      mgasps=26.956   triediffs=1.34MiB    triedirty=226.83KiB
INFO [01-28|19:16:04.304] Chain head was updated                   number=339  hash=bb54e7..c50276 root=c57a09..f10660 elapsed="113.663µs"
--- FAIL: TestOutOfGasInStorageCacheFlush (14.35s)


@pmikolajczyk41 (Member) left a comment

If I understand correctly: when the DA writer returns ErrMessageTooLarge, we shouldn't fall back, even if other DA systems (calldata, AnyTrust) are available; we should just rebuild batches for the lower limit. Therefore, I think we should actually enable the batch poster's fallback to e.g. calldata, and ensure that it didn't fall back.

Comment on lines 1129 to 1131
// Verify follower synced
_, err = l2B.Client.BlockNumber(ctx)
Require(t, err)
Member

Q1: how does this ensure that the second node actually synced?
Q2: why don't we have a similar check in the 2nd phase?

Member Author

Addressed in 89dd964

Comment on lines 1119 to 1127
// Create L1 blocks to trigger batch posting
for i := 0; i < 30; i++ {
	SendWaitTestTransactions(t, ctx, builder.L1.Client, []*types.Transaction{
		builder.L1Info.PrepareTx("Faucet", "User", 30000, big.NewInt(1e12), nil),
	})
}

// Wait for batch to post
time.Sleep(time.Second * 2)
Member

Comment 1: we can use the AdvanceL1 utility instead of a loop here.
Comment 2 (covering also the follower node syncing): could we reuse the checkBatchPosting method here? I think it would be beneficial to have the batch posting check as standardized as possible. If I'm not mistaken, checkBatchPosting currently relies on MaxDelay=0, but maybe we can force posting with builder.L2.ConsensusNode.BatchPoster.MaybePostSequencerBatch(ctx) and some other config flag?

Member Author

Addressed in 89dd964

	if payloadSize > 6_000 {
		t.Errorf("Phase 2: CustomDA payload size %d exceeds expected max ~5KB", payloadSize)
	}
} else if daprovider.IsBrotliMessageHeaderByte(headerByte) {
Member

I think we should have just else here and fail in case there's a batch that is not AltDA; here and in the first phase loop
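The suggested check might look like the sketch below; it is self-contained for illustration, so the header-byte constants and the classifyBatch helper are hypothetical, standing in for the real daprovider header checks and the test's t.Errorf/t.Fatalf calls.

```go
package main

import "fmt"

// Hypothetical header bytes; the real values and checks live in the
// daprovider package (e.g. IsBrotliMessageHeaderByte).
const (
	customDAHeaderByte byte = 0x63
	brotliHeaderByte   byte = 0x00
)

// classifyBatch implements the "just else and fail" shape: any batch
// that is not CustomDA is an immediate failure, and CustomDA payloads
// above the Phase 2 tolerance are flagged too.
func classifyBatch(headerByte byte, payloadSize int) error {
	if headerByte != customDAHeaderByte {
		return fmt.Errorf("unexpected non-CustomDA batch (header byte 0x%x)", headerByte)
	}
	if payloadSize > 6_000 {
		return fmt.Errorf("CustomDA payload size %d exceeds expected max ~5KB", payloadSize)
	}
	return nil
}

func main() {
	fmt.Println(classifyBatch(customDAHeaderByte, 4_800)) // within limit: no error
	fmt.Println(classifyBatch(brotliHeaderByte, 4_800))   // brotli batch: fails outright
}
```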

Member Author

Addressed in 89dd964

Tristan-Wilson and others added 2 commits January 28, 2026 19:21
The test for ErrMessageTooLarge batch resizing had several issues
identified in code review. This commit addresses them to make the test
more robust and better verify the intended behavior.

Key changes:
- Enable fallback (DisableDapFallbackStoreDataOnChain = false) instead
  of disabling it, so the test proves the batch poster *chooses* not to
  fall back when receiving ErrMessageTooLarge, rather than being unable to
- Replace manual L1 block advancement loops with AdvanceL1 helper
- Use WaitForTx to properly verify follower sync instead of just checking
  BlockNumber (which only confirms the node is responding)
- Add follower sync verification to Phase 2 (was missing)
- Fail immediately on any non-CustomDA batch type instead of only
  tracking brotli batches
- Allow one oversized batch in Phase 2 to handle the race condition
  between changing the max size limit and batches already in flight
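The last bullet (tolerating one in-flight oversized batch) could be implemented roughly as follows; oversizedWithinTolerance is a hypothetical helper, not code from this PR.

```go
package main

import "fmt"

// oversizedWithinTolerance reports whether the number of payloads above
// the new limit stays within the allowed tolerance. Tolerance 1 covers
// the race where a batch built against the old 10KB limit is already in
// flight when the limit drops to 5KB.
func oversizedWithinTolerance(sizes []int, limit, tolerance int) bool {
	oversized := 0
	for _, s := range sizes {
		if s > limit {
			oversized++
		}
	}
	return oversized <= tolerance
}

func main() {
	fmt.Println(oversizedWithinTolerance([]int{9_800, 4_900, 4_700}, 6_000, 1)) // true: one in-flight batch
	fmt.Println(oversizedWithinTolerance([]int{9_800, 9_500, 4_700}, 6_000, 1)) // false: two oversized batches
}
```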