Skip to content

Feature/custom block range#115

Merged
erwin-wee merged 7 commits intomainfrom
feature/custom-block-range
Aug 21, 2025
Merged

Feature/custom block range#115
erwin-wee merged 7 commits intomainfrom
feature/custom-block-range

Conversation

@erwin-wee
Copy link
Collaborator

@erwin-wee erwin-wee commented Aug 21, 2025

Summary by CodeRabbit

  • New Features
    • Added --start-block and --end-block options to define a custom block range for test data.
  • Improvements
    • Validates block-range inputs, with sensible defaults and boundary clamping when using latest blocks.
    • More realistic EVM trace replay by selecting a random block instead of always “latest.”
    • Enhanced HTTP error logging to include the request body.
  • Documentation
    • README updated with usage details for the new block-range options.
  • Chores
    • Updated dependencies (Locust, orjson) and lint config.
    • Docker image now includes curl.

@coderabbitai
Copy link

coderabbitai bot commented Aug 21, 2025

Caution

Review failed

The pull request is closed.

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between bdbb137 and 8e8c7fe.

📒 Files selected for processing (1)
  • chainbench/user/http.py (1 hunks)
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/custom-block-range

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@erwin-wee erwin-wee added the enhancement New feature or request label Aug 21, 2025
@erwin-wee erwin-wee merged commit aab843f into main Aug 21, 2025
6 checks passed
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
chainbench/test_data/evm.py (1)

527-533: Bug: get_random_block_range can produce a start before start_block_number; also off-by-one on inclusive ranges.

When n >= (end - start), the code picks end from [end - n, end] and returns [end - n, end]. If end == end - n, start falls below the configured range. Additionally, inclusive ranges should use +1 when computing window size.

Use a window-based calculation that respects inclusive bounds:

-        if n >= (self.end_block_number - self.start_block_number):
-            end = rng.random.randint(self.end_block_number - n, self.end_block_number)
-            return BlockRange(end - n, end)
-        else:
-            start = rng.random.randint(self.start_block_number, self.end_block_number - n)
-            return BlockRange(start, start + n)
+        total = self.end_block_number - self.start_block_number + 1
+        window = max(1, min(n, total))
+        start_min = self.start_block_number
+        start_max = self.end_block_number - window + 1
+        start = rng.random.randint(start_min, start_max)
+        return BlockRange(start, start + window - 1)
🧹 Nitpick comments (8)
Dockerfile (1)

22-22: Reduce image size and tighten runtime surface (apt best practices).

Install only what's needed at runtime and clean up apt lists to avoid cache bloat. Unless you specifically need interactive tooling in production, drop htop/nano. Also add --no-install-recommends.

Apply this diff:

-RUN apt-get update && apt-get install -y tini htop nano curl
+RUN set -eux; \
+    apt-get update; \
+    apt-get install -y --no-install-recommends tini curl; \
+    rm -rf /var/lib/apt/lists/*

Additionally, to keep debug tools available when needed, you can gate them behind a build-arg:

ARG DEBUG_TOOLS=0
RUN if [ "$DEBUG_TOOLS" = "1" ]; then apt-get update && apt-get install -y --no-install-recommends htop nano && rm -rf /var/lib/apt/lists/*; fi

Note: README states “Poetry 1.7+”, while this Dockerfile sets POETRY_VERSION=1.5.0. Consider aligning:

-ENV POETRY_VERSION=1.5.0
+ENV POETRY_VERSION=1.7.1
chainbench/util/event.py (1)

60-73: Enrich help text to match README and main.py validation; add inclusivity note.

The new flags look good. Small help-text improvements reduce ambiguity and keep CLI self-contained.

Apply this diff:

-    parser.add_argument(
-        "--start-block",
-        type=int,
-        default=None,
-        help="Starting block number to be used for fetching test data.",
-        include_in_web_ui=False,
-    )
+    parser.add_argument(
+        "--start-block",
+        type=int,
+        default=None,
+        help="Starting block number (inclusive) for test data. Must be used together with --end-block. Ignored if --use-latest-blocks is set.",
+        include_in_web_ui=False,
+    )
-    parser.add_argument(
-        "--end-block",
-        type=int,
-        default=None,
-        help="Last block number to be used for fetching test data.",
-        include_in_web_ui=False,
-    )
+    parser.add_argument(
+        "--end-block",
+        type=int,
+        default=None,
+        help="Ending block number (inclusive) for test data. Must be >= --start-block. Ignored if --use-latest-blocks is set.",
+        include_in_web_ui=False,
+    )

Optional: consider metavar="N" for both to make help output cleaner.

chainbench/util/cli.py (1)

201-207: Guard against half-specified custom ranges when building the command.

Even though main.py validates the pair, tightening here avoids accidental misuse if LocustOptions is constructed elsewhere. Only emit both flags when both are present.

-        if self.use_latest_blocks:
-            command += " --use-latest-blocks True"
-        else:
-            if self.start_block is not None:
-                command += f" --start-block {self.start_block}"
-            if self.end_block is not None:
-                command += f" --end-block {self.end_block}"
+        if self.use_latest_blocks:
+            command += " --use-latest-blocks True"
+        elif self.start_block is not None and self.end_block is not None:
+            command += f" --start-block {self.start_block} --end-block {self.end_block}"
chainbench/user/protocol/evm.py (1)

88-92: Clarify block identifier for trace_replayBlockTransactions across different clients

The JSON-RPC method you’re targeting behaves differently depending on the Ethereum client in use:

  • Erigon and OpenEthereum/Parity
    trace_replayBlockTransactions (Parity-style) accepts a hex block number (Quantity) or a block tag (earliest/latest/pending), not a block hash.
  • Nethermind
    – Exposes trace_replayBlockTransactions(blockParameter, traceTypes), where blockParameter may be a block number, tag, or hash.
  • Geth (go-ethereum)
    – Upstream does not implement trace_replayBlockTransactions; instead it provides
    debug_traceBlockByNumber (accepts block number/tag)
    debug_traceBlockByHash (accepts block hash)
    – Some forks (e.g., core-geth) add Parity-style aliases mapping trace_replayBlockTransactions to those debug methods.

Given this, your current implementation using a hex block number is correct for Erigon and OpenEthereum/Parity, and also works with Nethermind if you only need numbers. However, if you intend to support block‐hash–only RPCs (or want full client interoperability), you should optionally generate a block hash and detect which form to send.

Locations needing attention:

• File chainbench/user/protocol/evm.py, method _trace_replay_block_transaction_params_factory

Optional refactor (if you decide to support hashes):

     def _trace_replay_block_transaction_params_factory(self, rng: RNG) -> list[str | list[str]]:
-        return [
-            hex(self.test_data.get_random_block_number(rng)),
+        # Use a block hash if targeting clients that require it (e.g., Nethermind’s hash path,
+        # Geth’s debug_traceBlockByHash, or forked clients with Parity-style RPC)
+        return [
+            self.test_data.get_random_block_hash(rng),
             ["vmTrace", "trace", "stateDiff"],
         ]
chainbench/test_data/evm.py (2)

485-508: Range calculation and clamping logic — generally solid; minor logging nit.

Good defaults (network start, latest if end missing), use-latest override, and bounds checks. Minor nit: the warning string uses implicit concatenation which is easy to miss in code reviews.

-        if self.start_block_number < 0:
-            logger.warning("start_block is before genesis block," "setting to 0.")
+        if self.start_block_number < 0:
+            logger.warning("start_block is before genesis block, setting to 0.")

495-507: Optional: clamp use-latest start to network.start_block, not to 0.

Clamping to 0 is fine for chains with genesis at 0. For networks where usable history starts later (e.g., your configured network.start_block), consider clamping to network.start_block instead to avoid empty/invalid ranges on nodes with pruned history.

-            self.start_block_number = 0
+            self.start_block_number = getattr(self.network, "start_block", 0)
chainbench/main.py (1)

218-219: Clarify option help: state precedence with --use-latest-blocks

When users pass --start-block/--end-block together with --use-latest-blocks, the custom range is effectively ignored by downstream logic. Make that explicit in the help text to prevent confusion.

-@click.option("--start-block", default=None, help="Start block for test data", type=int)
-@click.option("--end-block", default=None, help="End block for test data", type=int)
+@click.option(
+    "--start-block",
+    default=None,
+    help="Start block for test data. Ignored if --use-latest-blocks is set.",
+    type=int,
+)
+@click.option(
+    "--end-block",
+    default=None,
+    help="End block for test data. Ignored if --use-latest-blocks is set.",
+    type=int,
+)
chainbench/test_data/solana.py (1)

111-117: Minor: fix log message spacing and logger consistency

  • Messages are concatenated without a space after the comma (two adjacent string literals). Merge into a single string or add a leading space.
  • Elsewhere in this class, errors use self._logger; prefer consistent logger usage.
-            logger.warning(
-                "start_block is before earliest_available_block_number," "setting to earliest_available_block_number."
-            )
+            self._logger.warning(
+                "start_block is before earliest_available_block_number, setting to earliest_available_block_number."
+            )
-            logger.warning("end_block is after latest_block_number," "setting to latest_block_number.")
+            self._logger.warning("end_block is after latest_block_number, setting to latest_block_number.")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 16dc35d and bdbb137.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • Dockerfile (1 hunks)
  • README.md (1 hunks)
  • chainbench/main.py (3 hunks)
  • chainbench/test_data/blockchain.py (2 hunks)
  • chainbench/test_data/ethereum.py (1 hunks)
  • chainbench/test_data/evm.py (1 hunks)
  • chainbench/test_data/solana.py (1 hunks)
  • chainbench/user/http.py (1 hunks)
  • chainbench/user/protocol/evm.py (2 hunks)
  • chainbench/util/cli.py (2 hunks)
  • chainbench/util/event.py (1 hunks)
  • pyproject.toml (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
chainbench/util/cli.py (1)
chainbench/test_data/evm.py (1)
  • start_block (59-60)
chainbench/test_data/evm.py (4)
chainbench/test_data/ethereum.py (1)
  • fetch_latest_block_number (133-134)
chainbench/test_data/solana.py (1)
  • fetch_latest_block_number (64-66)
chainbench/test_data/starknet.py (1)
  • fetch_latest_block_number (52-53)
chainbench/test_data/blockchain.py (5)
  • start_block_number (216-217)
  • start_block_number (220-221)
  • end_block_number (224-225)
  • end_block_number (228-229)
  • data (137-140)
chainbench/user/protocol/evm.py (2)
chainbench/util/rng.py (2)
  • RNG (5-9)
  • get_rng (17-27)
chainbench/test_data/blockchain.py (2)
  • get_random_block_number (58-59)
  • get_random_block_number (210-213)
chainbench/test_data/ethereum.py (2)
chainbench/test_data/blockchain.py (6)
  • _get_start_and_end_blocks (176-183)
  • end_block_number (224-225)
  • end_block_number (228-229)
  • start_block_number (216-217)
  • start_block_number (220-221)
  • data (137-140)
chainbench/test_data/solana.py (2)
  • _get_start_and_end_blocks (100-118)
  • fetch_latest_block_number (64-66)
chainbench/test_data/blockchain.py (2)
chainbench/test_data/evm.py (1)
  • start_block (59-60)
chainbench/main.py (1)
  • start (221-455)
chainbench/main.py (1)
chainbench/test_data/evm.py (1)
  • start_block (59-60)
chainbench/test_data/solana.py (4)
chainbench/test_data/blockchain.py (6)
  • _get_start_and_end_blocks (176-183)
  • end_block_number (224-225)
  • end_block_number (228-229)
  • start_block_number (216-217)
  • start_block_number (220-221)
  • data (137-140)
chainbench/test_data/ethereum.py (2)
  • _get_start_and_end_blocks (136-142)
  • fetch_latest_block_number (133-134)
chainbench/test_data/evm.py (2)
  • _get_start_and_end_blocks (484-507)
  • fetch_latest_block_number (462-463)
chainbench/test_data/starknet.py (1)
  • fetch_latest_block_number (52-53)
🪛 GitHub Actions: Checks
chainbench/user/http.py

[error] 64-64: Mypy error: Item 'None' of 'FastRequest | None' has no attribute 'body' [union-attr]. Step: 'poetry run mypy .'

🔇 Additional comments (10)
pyproject.toml (3)

22-22: orjson ^3.11.2 looks good.

No obvious breaking changes expected for common dumps/loads usage. Keep an eye on manylinux wheels availability in CI for 3.11.x.


49-49: flake8 extend-ignore W503 aligns with Black.

Ignoring W503 is standard with Black’s formatting of binary operators. Consistent with E203 ignore.


15-15: Verify environment setup and Locust bump to ^2.39.0

It looks like the automated checks failed because locust isn’t installed in the current environment and poetry isn’t available. Before proceeding, please ensure your project dependencies are properly installed so you can confirm plugin/API compatibility:

• Install dependencies (via Poetry or pip):

poetry install           # or pip install -e .

• Confirm installed versions:

python -c "import locust; print('locust', locust.__version__); from importlib.metadata import version; print('locust-plugins', version('locust-plugins'))"

• Grep for potentially impacted APIs:

rg -nP -C2 --type=py '\bResponseContextManager\b|\bFastHttpUser\b|\bevents\.\w+\.add_listener' chainbench

• Dry-run the CLI to catch missing-attribute regressions:

poetry run chainbench start --help >/dev/null

Optionally run a quick smoke test (e.g., a headless single-user run) to catch any runtime issues. Verify that locust-plugins ^4.5.3 and your usages of ResponseContextManager, FastHttpUser, and events.*.add_listener remain fully compatible with Locust 2.39.0.

README.md (1)

131-133: Update README block-range docs to match code (inclusive and >=)

The validation in chainbench/main.py only raises an error when end_block < start_block (line 261), which means end_block == start_block is permitted and the range is inclusive. The documentation should be updated to reflect this behavior.

  • File: README.md
    Lines: 131–133
- `--start-block`: Starting block number to use when generating test data. Must be used together with `--end-block`.
- `--end-block`: Ending block number to use when generating test data. Must be greater than `--start-block`. Note: if `--use-latest-blocks` is set, this custom range is ignored.
+ `--start-block`: Starting block number to use when generating test data. Must be used together with `--end-block`. The interval is inclusive.
+ `--end-block`: Ending block number to use when generating test data. Must be greater than or equal to (`>=`) `--start-block`. Note: if `--use-latest-blocks` is set, this custom range is ignored.
chainbench/util/cli.py (1)

148-149: All LocustOptions instantiations are centralized in chainbench/main.py
Verified that the only call to LocustOptions(…) occurs at line 373 of chainbench/main.py, so the start/end block validation in main.start cannot be bypassed. Approving these additions as-is.

chainbench/user/protocol/evm.py (1)

405-407: Call site updated to pass RNG — LGTM.

The factory signature change is reflected correctly at the call site.

chainbench/test_data/blockchain.py (2)

219-222: Property setter for start_block_number — LGTM.

Simple and clear encapsulation around the internal BlockRange.


227-230: Property setter for end_block_number — LGTM.

Consistent with start setter and keeps the state centralized.

chainbench/main.py (2)

254-256: Parameter additions LGTM

Typed option plumbing for start_block/end_block looks correct and aligns with LocustOptions changes.


400-402: Forwarding options to LocustOptions is correct

Passing start_block/end_block through to LocustOptions maintains a single source of truth for rendering Locust CLI arguments.

Comment on lines +257 to +263
if start_block is not None or end_block is not None:
if start_block is None or end_block is None:
raise ValueError("Both start-block and end-block are required for specifying custom block range.")
else:
if end_block < start_block:
raise ValueError("end-block must be greater than start-block.")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use Click exceptions; block equality; and forbid mixing with --use-latest-blocks

  • Prefer click.* exceptions over ValueError so Click formats usage errors cleanly.
  • Consider allowing a single-block range (end == start) unless there’s a strong reason not to.
  • If --use-latest-blocks is set, reject or warn when custom block options are provided to avoid silently ignoring user input.
-    if start_block is not None or end_block is not None:
-        if start_block is None or end_block is None:
-            raise ValueError("Both start-block and end-block are required for specifying custom block range.")
-        else:
-            if end_block < start_block:
-                raise ValueError("end-block must be greater than start-block.")
+    # Validate custom block range usage vs. --use-latest-blocks
+    if use_latest_blocks and (start_block is not None or end_block is not None):
+        raise click.UsageError(
+            "--start-block/--end-block cannot be used together with --use-latest-blocks."
+        )
+    # If either custom block option is provided, require both
+    if (start_block is not None) != (end_block is not None):
+        raise click.UsageError("Both --start-block and --end-block must be provided together.")
+    # Ordering check (allow equality for single-block tests)
+    if start_block is not None and end_block is not None and end_block < start_block:
+        raise click.BadOptionUsage("--end-block", "--end-block must be greater than or equal to --start-block.")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if start_block is not None or end_block is not None:
if start_block is None or end_block is None:
raise ValueError("Both start-block and end-block are required for specifying custom block range.")
else:
if end_block < start_block:
raise ValueError("end-block must be greater than start-block.")
# Validate custom block range usage vs. --use-latest-blocks
if use_latest_blocks and (start_block is not None or end_block is not None):
raise click.UsageError(
"--start-block/--end-block cannot be used together with --use-latest-blocks."
)
# If either custom block option is provided, require both
if (start_block is not None) != (end_block is not None):
raise click.UsageError("Both --start-block and --end-block must be provided together.")
# Ordering check (allow equality for single-block tests)
if start_block is not None and end_block is not None and end_block < start_block:
raise click.BadOptionUsage("--end-block", "--end-block must be greater than or equal to --start-block.")
🤖 Prompt for AI Agents
In chainbench/main.py around lines 257 to 263, replace the ValueError checks
with Click exceptions, allow end_block equal to start_block (use end_block <
start_block as the failing condition), and add a guard that if the
--use-latest-blocks flag is set and either start_block or end_block is provided,
raise a click.UsageError (or click.BadParameter) to reject mixing modes; ensure
the error messages are user-friendly and use Click exceptions so Click formats
them properly.

Comment on lines 176 to 184
def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
raise NotImplementedError
if parsed_options.start_block is not None:
self.start_block_number = parsed_options.start_block
else:
self.start_block_number = 1
if parsed_options.end_block is not None:
self.end_block_number = parsed_options.end_block
return self.data.block_range

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Base _get_start_and_end_blocks leaves end unset (0) unless provided — produces invalid ranges for subclasses that rely on super().

As written, if end_block is not provided, end remains 0 (from BlockchainData default), while start defaults to 1. Subclasses like EthBeaconTestData call super() and may return this invalid range when not using latest blocks.

Set a sane default so the base always returns a valid range (degenerate one-block range when end is not specified):

-    def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
-        if parsed_options.start_block is not None:
-            self.start_block_number = parsed_options.start_block
-        else:
-            self.start_block_number = 1
-        if parsed_options.end_block is not None:
-            self.end_block_number = parsed_options.end_block
-        return self.data.block_range
+    def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
+        start = parsed_options.start_block if parsed_options.start_block is not None else 1
+        end = parsed_options.end_block if parsed_options.end_block is not None else start
+        self.start_block_number = start
+        self.end_block_number = end
+        return self.data.block_range
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
raise NotImplementedError
if parsed_options.start_block is not None:
self.start_block_number = parsed_options.start_block
else:
self.start_block_number = 1
if parsed_options.end_block is not None:
self.end_block_number = parsed_options.end_block
return self.data.block_range
def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
start = parsed_options.start_block if parsed_options.start_block is not None else 1
end = parsed_options.end_block if parsed_options.end_block is not None else start
self.start_block_number = start
self.end_block_number = end
return self.data.block_range
🤖 Prompt for AI Agents
In chainbench/test_data/blockchain.py around lines 176 to 184, the base
_get_start_and_end_blocks sets start_block to 1 when missing but never assigns
end_block_number unless provided, leaving end at its default (0) and producing
invalid ranges; update the method so that when parsed_options.end_block is None
you set self.end_block_number to the same value as self.start_block_number
(creating a one-block degenerate range) before returning self.data.block_range;
ensure both start_block_number and end_block_number are always assigned in this
method.

Comment on lines +137 to 143
super()._get_start_and_end_blocks(parsed_options)
if parsed_options.use_latest_blocks:
start_block_number = end_block_number - self.data.size.blocks_len + 1
else:
start_block_number = 1
logger.info("Using blocks from %s to %s as test data", start_block_number, end_block_number)
return BlockRange(start_block_number, end_block_number)
self.end_block_number = self.fetch_latest_block_number()
self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
logger.info("Using blocks from %s to %s as test data", self.start_block_number, self.end_block_number)
return self.data.block_range

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

End block can remain 0 when not using latest and no end_block is provided.

This method defers to the base, which (as currently implemented) doesn’t set end unless provided. Unless the base is fixed, EthBeaconTestData may end up with an invalid range (start > end). Make this method robust by defaulting end to latest when end is unset; still respect the use-latest override.

     def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
-        super()._get_start_and_end_blocks(parsed_options)
-        if parsed_options.use_latest_blocks:
-            self.end_block_number = self.fetch_latest_block_number()
-            self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
-        logger.info("Using blocks from %s to %s as test data", self.start_block_number, self.end_block_number)
-        return self.data.block_range
+        super()._get_start_and_end_blocks(parsed_options)
+        # Default end to latest if not provided
+        if getattr(parsed_options, "end_block", None) is None:
+            self.end_block_number = self.fetch_latest_block_number()
+        # Override with latest window if requested
+        if parsed_options.use_latest_blocks:
+            self.end_block_number = self.fetch_latest_block_number()
+            self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
+        logger.info("Using blocks from %s to %s as test data", self.start_block_number, self.end_block_number)
+        return self.data.block_range

If you adopt the base-class fix suggested in blockchain.py, the first conditional becomes unnecessary, but keeping it is harmless and makes this code resilient.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
super()._get_start_and_end_blocks(parsed_options)
if parsed_options.use_latest_blocks:
start_block_number = end_block_number - self.data.size.blocks_len + 1
else:
start_block_number = 1
logger.info("Using blocks from %s to %s as test data", start_block_number, end_block_number)
return BlockRange(start_block_number, end_block_number)
self.end_block_number = self.fetch_latest_block_number()
self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
logger.info("Using blocks from %s to %s as test data", self.start_block_number, self.end_block_number)
return self.data.block_range
def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
super()._get_start_and_end_blocks(parsed_options)
# Default end to latest if not provided
if getattr(parsed_options, "end_block", None) is None:
self.end_block_number = self.fetch_latest_block_number()
# Override with latest window if requested
if parsed_options.use_latest_blocks:
self.end_block_number = self.fetch_latest_block_number()
self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
logger.info("Using blocks from %s to %s as test data", self.start_block_number, self.end_block_number)
return self.data.block_range
🤖 Prompt for AI Agents
In chainbench/test_data/ethereum.py around lines 137 to 143, the
end_block_number can remain 0 if the base class didn't set it, causing an
invalid range (start > end); ensure the method defaults end to the latest block
when end_block_number is falsy unless use_latest explicitly overrides, by
calling fetch_latest_block_number to set end_block_number when not provided,
then compute start_block_number as end - self.data.size.blocks_len + 1 and
clamp/start-check to ensure start_block_number <= end_block_number; keep the
existing use_latest branch (harmless) so behavior is respected whether the
base-class fix is applied or not.

Comment on lines +101 to 119
super()._get_start_and_end_blocks(parsed_options)
earliest_available_block_number = self._fetch_first_available_block()

latest_block_number = self.fetch_latest_block_number()
# factor in run_time and add 10% buffer to ensure blocks used in test data are
# not removed from the ledger
earliest_available_block_number += int((parsed_options.run_time / self.BLOCK_TIME) * 1.1)
start_block_number = earliest_available_block_number

if parsed_options.use_latest_blocks:
start_block_number = end_block_number - self.data.size.blocks_len + 1

if start_block_number < earliest_available_block_number:
raise ValueError(
f"Earliest available block (with buffer) is {earliest_available_block_number}, "
f"but start block is {start_block_number}"
self.end_block_number = latest_block_number
self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
if self.start_block_number < earliest_available_block_number:
logger.warning(
"start_block is before earliest_available_block_number," "setting to earliest_available_block_number."
)

return BlockRange(start_block_number, end_block_number)
self.start_block_number = earliest_available_block_number
if self.end_block_number > latest_block_number:
logger.warning("end_block is after latest_block_number," "setting to latest_block_number.")
self.end_block_number = latest_block_number
return self.data.block_range

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Guard against start > end after clamping; handle end before earliest

If a user supplies a custom range where end < earliest_available_block_number, the current logic only clamps start upward. That can yield start_block_number > end_block_number, causing downstream range calculations to misbehave.

Add symmetric handling for end below earliest, and a final sanity check to ensure start <= end after all clamps.

     def _get_start_and_end_blocks(self, parsed_options: Namespace) -> BlockRange:
         super()._get_start_and_end_blocks(parsed_options)
         earliest_available_block_number = self._fetch_first_available_block()
         latest_block_number = self.fetch_latest_block_number()
         # factor in run_time and add 10% buffer to ensure blocks used in test data are
         # not removed from the ledger
         earliest_available_block_number += int((parsed_options.run_time / self.BLOCK_TIME) * 1.1)
         if parsed_options.use_latest_blocks:
             self.end_block_number = latest_block_number
             self.start_block_number = self.end_block_number - self.data.size.blocks_len + 1
-        if self.start_block_number < earliest_available_block_number:
-            logger.warning(
-                "start_block is before earliest_available_block_number," "setting to earliest_available_block_number."
-            )
-            self.start_block_number = earliest_available_block_number
+        # Clamp start below earliest
+        if self.start_block_number < earliest_available_block_number:
+            self._logger.warning(
+                "start_block is before earliest_available_block_number, setting to earliest_available_block_number."
+            )
+            self.start_block_number = earliest_available_block_number
+        # Clamp end below earliest as well (symmetric case)
+        if self.end_block_number < earliest_available_block_number:
+            self._logger.warning(
+                "end_block is before earliest_available_block_number, setting to earliest_available_block_number."
+            )
+            self.end_block_number = earliest_available_block_number
-        if self.end_block_number > latest_block_number:
-            logger.warning("end_block is after latest_block_number," "setting to latest_block_number.")
-            self.end_block_number = latest_block_number
+        # Clamp end above latest
+        if self.end_block_number > latest_block_number:
+            self._logger.warning("end_block is after latest_block_number, setting to latest_block_number.")
+            self.end_block_number = latest_block_number
+        # Final sanity: ensure start <= end after all clamps
+        if self.start_block_number > self.end_block_number:
+            self._logger.warning(
+                "start_block is after end_block after clamping, setting start_block to end_block."
+            )
+            self.start_block_number = self.end_block_number
         return self.data.block_range

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In chainbench/test_data/solana.py around lines 101 to 119, after computing
earliest_available_block_number and applying the existing start clamp, also
clamp end_block_number to be no less than earliest_available_block_number (e.g.,
if end_block_number < earliest_available_block_number set end_block_number =
earliest_available_block_number), and then add a final sanity check to ensure
start_block_number <= end_block_number; if start_block_number >
end_block_number, reset start_block_number to
max(earliest_available_block_number, end_block_number -
self.data.size.blocks_len + 1) so the range is valid.

"""Check the response for errors."""
if response.status_code != 200:
self.logger.error(f"Request failed with {response.status_code} code")
self.logger.error(f"Request body of failed request: {response.request.body}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

⚠️ Potential issue

Fix mypy error and avoid leaking sensitive request bodies in logs.

response.request can be None in Locust 2.39 (FastRequest | None), causing mypy union-attr error and potential AttributeError at runtime. Also, logging raw request bodies at error level risks exposing secrets. Guard for None, decode safely, truncate, and consider redaction.

Apply this diff:

-            self.logger.error(f"Request body of failed request: {response.request.body}")
+            # Guard against Optional[FastRequest], avoid AttributeError and excessive/secret logging
+            req = getattr(response, "request", None)
+            body = getattr(req, "body", None) if req is not None else None
+            if body:
+                try:
+                    preview = body.decode("utf-8", "replace") if isinstance(body, (bytes, bytearray)) else str(body)
+                except Exception:
+                    preview = "<unprintable-body>"
+                max_len = 2048
+                self.logger.error(
+                    f"Request body of failed request (truncated to {max_len} chars): {preview[:max_len]}"
+                )

Optionally, add basic redaction for common secret fields and make body logging opt-in (env toggle). I can provide a follow-up patch if you want this hardened further.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
self.logger.error(f"Request body of failed request: {response.request.body}")
# Guard against Optional[FastRequest], avoid AttributeError and excessive/secret logging
req = getattr(response, "request", None)
body = getattr(req, "body", None) if req is not None else None
if body:
try:
preview = (
body.decode("utf-8", "replace")
if isinstance(body, (bytes, bytearray))
else str(body)
)
except Exception:
preview = "<unprintable-body>"
max_len = 2048
self.logger.error(
f"Request body of failed request (truncated to {max_len} chars): {preview[:max_len]}"
)
🧰 Tools
🪛 GitHub Actions: Checks

[error] 64-64: Mypy error: Item 'None' of 'FastRequest | None' has no attribute 'body' [union-attr]. Step: 'poetry run mypy .'

@coderabbitai coderabbitai bot mentioned this pull request Nov 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant