
feat(rpc): derive proposal scan head from head L1 origin #100

Merged
davidtaikocha merged 5 commits into main from eop-update on Jan 27, 2026

Conversation

@davidtaikocha (Collaborator)

ref: taikoxyz/taiko-geth#505

Summary

  • Start proposal scan from the L2 block referenced by the stored head L1 origin to avoid tip preconfirmations.
  • Return NotFound when the head L1 origin is missing.
  • Add tests for missing head L1 origin and uncertain-at-head behavior.

@claude

claude bot commented Jan 27, 2026


Code Review Complete

I've reviewed this PR, which introduces logic to derive the proposal scan head from the stored head L1 origin. Here's my detailed feedback:


✅ Strengths

  1. Good test coverage - The PR includes two comprehensive tests (returns_not_found_when_head_l1_origin_missing and returns_uncertain_when_match_at_head_without_mapping) that cover the new edge cases
  2. Clear error handling - New error types (ProposalLastBlockUncertain and GethNotFound) with descriptive messages
  3. Database error resilience - The is_missing_table_error helper properly handles missing table scenarios
  4. Well-structured logic - The separation of concerns between finding blocks and resolving them is clean

🔴 Critical Issues

1. Resource exhaustion risk in find_last_block_number_by_batch_id (crates/rpc/src/eth/eth.rs:72-139)

The function scans backwards from head_number but has no safeguards against scanning the entire chain:

while let Some(block) = current_block {
    // ... scan logic
    current_block = self.provider.block_by_number_or_tag(...)?;
}

Problem: If the batch ID doesn't exist or is very old, this will scan backwards block by block until genesis, performing a database read and loading a full block for every step — potentially millions of reads on a long chain.

Fix: Add a scan depth limit:

const MAX_SCAN_DEPTH: u64 = 1000; // or configurable
let min_scan_block = head_number.saturating_sub(MAX_SCAN_DEPTH);

while let Some(block) = current_block {
    let block_number = block.header().number();
    if block_number <= min_scan_block {
        break; // Stop scanning if we've gone too far
    }
    // ... rest of logic
}
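The depth-limited loop can be exercised in isolation. The following is a minimal sketch assuming an in-memory chain, with `Block` and `find_last_block_for_batch` as hypothetical stand-ins for the provider-backed scan:

```rust
// Hypothetical sketch of the bounded backward scan over an in-memory chain;
// `Block` and `find_last_block_for_batch` are illustrative, not the PR's API.
const MAX_SCAN_DEPTH: u64 = 1000;

struct Block {
    number: u64,
    proposal_id: Option<u64>, // None models undecodable extra_data
}

/// Walk backwards from `head`, visiting at most `MAX_SCAN_DEPTH` blocks,
/// and return the number of the highest block tagged with `batch_id`.
fn find_last_block_for_batch(chain: &[Block], head: u64, batch_id: u64) -> Option<u64> {
    let min_scan_block = head.saturating_sub(MAX_SCAN_DEPTH);
    let mut current = head;
    loop {
        let block = chain.get(current as usize)?;
        if block.proposal_id == Some(batch_id) {
            return Some(block.number);
        }
        if current == 0 || current <= min_scan_block {
            return None; // hit genesis or the depth limit without a match
        }
        current -= 1;
    }
}
```

The same bound also caps the DoS exposure discussed under Security Considerations, since a query for a non-existent batch ID now costs at most `MAX_SCAN_DEPTH` reads.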

2. Logic error: early break on missing proposal ID (crates/rpc/src/eth/eth.rs:114-118)

let Some(proposal_id) = decode_shasta_proposal_id(block.header().extra_data().as_ref()).map(U256::from) else {
    break;
};

Problem: This breaks on the first block without a valid proposal ID, but older blocks in the scan range might still have the target batch_id. This could cause false negatives.

Fix: Advance to the parent block and keep scanning instead of breaking (note that a bare continue would skip the cursor update at the bottom of the loop and spin forever). Keep the break only if the format guarantees that once one block lacks a proposal ID, all older blocks do too:

let Some(proposal_id) = ... else {
    // Move on to the parent block; break instead only if older blocks
    // are guaranteed to lack proposal IDs as well.
    current_block = self.provider.block_by_number_or_tag(...)?;
    continue;
};

3. Database transaction not committed (crates/rpc/src/eth/eth.rs:76-92)

let head_lookup = provider.into_tx().get::<StoredL1HeadOriginTable>(STORED_L1_HEAD_ORIGIN_KEY);

The into_tx() call converts the provider into a transaction, but there is no explicit commit or other handling of the transaction lifecycle. Read-only transactions may auto-commit, yet the pattern is inconsistent with the write paths in auth.rs, which explicitly call tx.commit().

Fix: Be explicit about transaction handling or document why it's not needed for read operations.


⚠️ High Priority Issues

4. Inconsistent error handling between cached and scanned results (crates/rpc/src/eth/eth.rs:143-162)

if let Ok(provider) = self.provider.database_provider_ro() {
    let batch_lookup = provider.into_tx().get::<BatchToLastBlock>(batch_id.to());
    if let Ok(Some(block_number)) = batch_lookup {
        return Ok(U256::from(block_number));
    }
    if let Err(error) = batch_lookup && !Self::is_missing_table_error(&error) {
        return Err(EthApiError::InternalEthError.into());
    }
}

Problem: If the cached lookup returns an error (other than missing table), it falls back to scanning. This could mask database corruption or other serious issues. The scan might succeed even though there's a real database problem.

Fix: Return early on database errors rather than falling through to the scan.
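The suggested early return can be sketched as follows, with `DbError` and `Lookup` as hypothetical stand-ins for the real error and result types:

```rust
// Hypothetical error-handling sketch: fall through to a chain scan only
// when the lookup table is missing, never on arbitrary database errors.
#[derive(Debug, PartialEq)]
enum DbError {
    MissingTable,
    Corruption(String),
}

#[derive(Debug, PartialEq)]
enum Lookup {
    Cached(u64),
    NeedsScan,
}

fn cached_or_scan(lookup: Result<Option<u64>, DbError>) -> Result<Lookup, DbError> {
    match lookup {
        Ok(Some(block_number)) => Ok(Lookup::Cached(block_number)),
        // Cache miss, or the table hasn't been created yet: fall back to scanning.
        Ok(None) | Err(DbError::MissingTable) => Ok(Lookup::NeedsScan),
        // Any other database error is surfaced, not masked by a scan.
        Err(e) => Err(e),
    }
}
```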

5. Missing validation of head_number (crates/rpc/src/eth/eth.rs:94-100)

let mut current_block = self.provider.block_by_number_or_tag(BlockNumberOrTag::Number(head_number))...?;
if current_block.is_none() {
    return Ok(LastBlockSearchResult::NotFound);
}

Problem: If head_number points to a non-existent block, the function returns NotFound. However, this might indicate database corruption (the head origin points to a block that doesn't exist), which should be an error, not just "not found".

Fix: Distinguish between "head exists but batch not found" vs "head itself is missing":

if current_block.is_none() {
    return Err(EthApiError::InternalEthError); // Head should always exist
}

📋 Medium Priority Issues

6. Inefficient block loading (crates/rpc/src/eth/eth.rs:94-136)

Each iteration calls block_by_number_or_tag which likely loads the full block including transactions, bodies, etc. You only need:

  • First transaction's input
  • Block header's extra_data and number

Fix: Use a more targeted query like header_by_number + transactions_by_block or create a specialized query.

7. Magic error code in is_missing_table_error (crates/rpc/src/eth/eth.rs:57)

info.code == -30798

Problem: Magic number with no explanation. What database system uses -30798? Is this MDBX-specific?

Fix: Define a constant:

const MDBX_NOTFOUND: i32 = -30798;
if info.code == MDBX_NOTFOUND || ...
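For reference, -30798 is libmdbx's MDBX_NOTFOUND code (inherited from LMDB's MDB_NOTFOUND). A self-contained sketch of the helper with the constant named; `ErrorInfo` is a hypothetical stand-in for the actual error type:

```rust
/// libmdbx returns this code when a requested key or table (sub-database)
/// does not exist (MDBX_NOTFOUND, value shared with LMDB's MDB_NOTFOUND).
const MDBX_NOTFOUND: i32 = -30798;

// Hypothetical stand-in for the database error info type.
struct ErrorInfo {
    code: i32,
}

fn is_missing_table_error(info: &ErrorInfo) -> bool {
    info.code == MDBX_NOTFOUND
}
```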

8. Potential panic in decode_shasta_proposal_id (crates/primitives/src/extra_data.rs:17-18)

buf[2..].copy_from_slice(&extra[1..7]);

Problem: The existing length check (extra.len() >= SHASTA_EXTRA_DATA_LEN, i.e. at least 7 bytes) guarantees that extra[1..7] is exactly 6 bytes, which matches the 6-byte destination buf[2..]. So this cannot panic today, but the size invariant is implicit and silently breaks if either constant changes.

Fix: Keep the length check, and make both slice bounds explicit so the invariant is visible:

if extra.len() < SHASTA_EXTRA_DATA_LEN {
    return None;
}
let bytes = &extra[1..7]; // Guaranteed to be 6 bytes
buf[2..8].copy_from_slice(bytes);
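Put together, a defensive version might look like the sketch below. The byte layout (one prefix byte followed by a 6-byte big-endian proposal ID) is inferred from the review and should be checked against the actual extra_data.rs:

```rust
// Illustrative defensive decode; the layout assumption (1 prefix byte,
// then a 6-byte big-endian proposal id) may differ from the real code.
const SHASTA_EXTRA_DATA_LEN: usize = 7;

fn decode_shasta_proposal_id(extra: &[u8]) -> Option<u64> {
    if extra.len() < SHASTA_EXTRA_DATA_LEN {
        return None;
    }
    let mut buf = [0u8; 8];
    // Bytes 1..7 hold the 6-byte big-endian proposal id; writing both
    // ranges explicitly makes the length invariant visible at a glance.
    buf[2..8].copy_from_slice(&extra[1..7]);
    Some(u64::from_be_bytes(buf))
}
```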

🔵 Code Quality & Best Practices

9. Inconsistent variable naming (crates/rpc/src/eth/eth.rs:108)

let input = first_tx.input();
let input = input.as_ref();

Shadowing input twice in quick succession reduces readability.

Fix: Combine or use clearer names:

let input = first_tx.input().as_ref();

10. Missing documentation on new DB table (crates/db/src/model.rs:79-82)

The new BatchToLastBlock table has no doc comments explaining:

  • What it maps (batch ID to last block number?)
  • When it's populated
  • Why it's needed

Fix: Add doc comments.
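A hypothetical sketch of the kind of doc comments that would help, written here as a plain struct rather than the crate's table macro; the stated semantics are inferred from the review and must be verified against the PR:

```rust
// Hypothetical doc-comment sketch; the mapping semantics are inferred
// from the review, not taken from the PR's actual table definition.

/// Maps a Shasta batch (proposal) ID to the number of the last L2 block
/// produced for that batch.
///
/// Populated when a batch's blocks are imported, and consulted by the RPC
/// layer as a fast path before falling back to a backward chain scan.
struct BatchToLastBlock {
    /// Batch (proposal) identifier, as decoded from block extra data.
    batch_id: u64,
    /// Number of the last L2 block belonging to the batch.
    last_block_number: u64,
}
```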

11. Test setup is overly complex (crates/rpc/src/eth/eth.rs:235-269)

The create_taiko_test_provider_factory function is ~35 lines and mixes multiple concerns. Consider extracting to a test utility module if used across multiple test files.


🟢 Minor Observations

  1. Removed code from consensus/validation.rs - The removed code (line 37 in the diff) suggests simplification of extra_data handling. This is good, but ensure no functionality was lost.

  2. New dependencies in rpc/Cargo.toml - The PR adds dependencies but they're not visible in the provided diff. Verify these are necessary and don't bloat the binary.

  3. Error message clarity (crates/rpc/src/eth/error.rs:9-10) - The error message is verbose but clear. Consider if "BatchToLastBlockID" should be "BatchToLastBlock" for consistency with the table name.


🧪 Test Coverage Assessment

Good:

  • Tests cover the "not found" case when head L1 origin is missing
  • Tests cover the "uncertain at head" case
  • Tests validate the proposal ID parsing

Missing:

  • No test for successful batch ID lookup (happy path)
  • No test for the backwards scan stopping at genesis (block 0)
  • No test for the cached lookup path in resolve_last_block_number_by_batch_id
  • No test for malformed extra_data during scan
  • No test for scan behavior when first transaction is not an anchor

🔒 Security Considerations

  1. DoS vector: The unbounded backward scan (issue #1) is a potential DoS vector. An attacker could query non-existent batch IDs to force expensive chain scans.

  2. Database error information leakage: Returning InternalEthError is good, but ensure stack traces or detailed errors aren't leaked to RPC clients.


📊 Performance Considerations

  1. Database reads in hot path: The scan performs O(n) database reads where n is the distance from head to the target block. Consider:

    • Adding an index for batch_id → block_number
    • Implementing exponential backoff (check blocks 1, 2, 4, 8, ... back from head)
    • Caching recent scans
  2. Memory usage: Loading full blocks during scan could consume significant memory. If scanning 1000 blocks at ~1MB each, that's 1GB of memory.


Recommendations Summary

Must Fix Before Merge:

  1. Add scan depth limit (#1)
  2. Fix early break on missing proposal ID (#2)
  3. Fix inconsistent error handling in cached lookup (#4)
  4. Validate head_number existence properly (#5)

Should Fix Before Merge:
6. Optimize block loading (#6)
7. Document magic error code (#7)
10. Document new DB table (#10)

Consider for Follow-up:
8. More defensive extra_data decoding (#8)
9. Variable naming cleanup (#9)
11. Extract test setup to a shared test utility (#11)

@davidtaikocha davidtaikocha merged commit 3f2d51d into main Jan 27, 2026
5 checks passed
@davidtaikocha davidtaikocha deleted the eop-update branch January 27, 2026 06:35