fix: Fix incomplete renaming of tool-specific methods, and imports in benchall.py. by PacificViking · Pull Request #5 · y-scope/log-archival-bench

PacificViking · 2025-07-28T06:09:16Z

Description

In #1, we renamed some of the asset directories but didn't make the corresponding changes to the imports in benchall.py. In addition, we didn't rename all the tool-specific methods for consistency.

Checklist

The PR satisfies the contribution guidelines.
This is a breaking change and that has been indicated in the PR title, OR this isn't a
breaking change.
Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Ran scripts/benchall.py successfully.

Summary by CodeRabbit

Refactor
- Updated benchmark class names for consistency across modules.
- Enhanced error handling by allowing attachment on error during benchmark runs.

coderabbitai · 2025-07-28T06:09:28Z

Walkthrough

This change renames benchmark classes in four asset modules for consistency and updates all corresponding instantiations and imports. The benchmark orchestration script is updated to reflect the new class names, adjusts the import paths, adds a new parameter to the run function for conditional Docker attachment on error, and modifies the benchmark execution loop to call run with attach_on_error=True while leaving previous calls commented out.

Changes

Cohort / File(s)	Change Summary
Benchmark class renaming (ClickHouse) `assets/clickhouse/main.py`	Renamed class `clickhouse_native_json_bench` to `clickhouse_bench` and updated instantiation in `main()`.
Benchmark class renaming (CLP) `assets/clp/main.py`	Renamed class `clp_s_bench` to `clp_bench` and updated instantiation in `main()`.
Benchmark class renaming (Presto CLP) `assets/presto_clp/main.py`	Renamed class `clp_presto_bench` to `presto_clp_bench` and updated instantiation in `main()`.
Benchmark class renaming (Presto Parquet) `assets/presto_parquet/main.py`	Renamed class `parquet_bench` to `presto_parquet_bench` and updated instantiation in `main()`.
Benchmark orchestration and imports `scripts/benchall.py`	Updated imports and benchmark list to use new class names and paths; modified `run` function to add `attach_on_error` parameter; changed error handling to use this parameter; replaced unconditional `run` calls with calls using `attach_on_error=True`; left previous calls and dataset filtering commented out.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~7 minutes

Note

⚡️ Unit Test Generation is now available in beta!

Learn more here, or try it out under "Finishing Touches" below.

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d6d1ceb and 3b4aa8e.

📒 Files selected for processing (2)

assets/presto_parquet/main.py (2 hunks)
scripts/benchall.py (4 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (1)

assets/presto_parquet/main.py (1)

src/template.py (1)

Benchmark (55-420)

🪛 Ruff (0.12.2)

scripts/benchall.py

41-41: Trailing comma missing

Add trailing comma

(COM812)

44-44: Trailing comma missing

Add trailing comma

(COM812)

55-55: Missing return type annotation for public function run

Add return type annotation: None

(ANN201)

55-55: Missing type annotation for function argument bencher

(ANN001)

55-55: Missing type annotation for function argument kwargs

(ANN001)

55-55: Missing type annotation for function argument bench_target

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach_on_error

(ANN001)

🔇 Additional comments (5)

assets/presto_parquet/main.py (2)

18-18: LGTM! Class renamed for consistency.

The class name change from parquet_bench to presto_parquet_bench aligns with the naming conventions established for other benchmark classes in the project.

148-148: LGTM! Instantiation updated correctly.

The class instantiation properly reflects the new class name, maintaining consistency with the class definition change.

scripts/benchall.py (3)

3-4: LGTM! Imports updated correctly.

The import statements have been properly updated to reflect the renamed benchmark classes, maintaining consistency across the codebase.

Also applies to: 6-6, 9-9

72-72: LGTM! Error handling logic enhanced.

The updated condition properly incorporates the new attach_on_error parameter, allowing attachment on error even when attach is False.

84-85: LGTM! Execution flow updated appropriately.

The main execution loop has been updated to use attach_on_error=True, which provides better debugging capabilities when benchmarks encounter errors.

✨ Finishing Touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Explain this complex logic.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai explain this code block.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and explain its main purpose.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai generate unit tests to generate unit tests for this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 2

🔭 Outside diff range comments (2)

assets/overhead_test/Dockerfile (1)
5-11: Add cleanup flags to keep the Ubuntu image slim

After switching to ubuntu:jammy, the layer now includes the full APT cache, which bloats the final image by ~100 MB. Append an apt-get clean plus removal of /var/lib/apt/lists (and optionally --no-install-recommends) so the benchmark container pulls faster and uses less disk.
-RUN apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
-    tmux \
-    vim
+RUN apt-get update \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
+       tmux \
+       vim \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
assets/presto_clp/main.py (1)
19-23: Class name violates PEP 8 – prefer PascalCase

Python classes should use CapWords style. Renaming presto_clp_bench to PrestoClpBench improves readability and matches the rest of the ecosystem.
-class presto_clp_bench(Benchmark):
+class PrestoClpBench(Benchmark):
If you adopt this change, remember to update the instantiation at line 105 as well:
-bench = presto_clp_bench(sys.argv[1])
+bench = PrestoClpBench(sys.argv[1])

♻️ Duplicate comments (3)

assets/template/Dockerfile (1)

5-11: Mirror the image-size cleanup used elsewhere

Same remark as in assets/overhead_test/Dockerfile: add --no-install-recommends, apt-get clean and cache removal to avoid shipping unnecessary packages and APT metadata.
Keeping all Dockerfiles consistent also simplifies maintenance.

assets/clp/main.py (1)

8-10: Consistent class casing

Like the Presto benchmark, clp_bench should be ClpBench to follow PEP 8.
(This is the same naming concern; no functional impact.)

assets/clickhouse/main.py (1)

16-19: Apply PEP 8 class-name casing

Change clickhouse_bench → ClickhouseBench (and update usages) for consistency with standard Python conventions and the other benchmarks once renamed.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22783a0 and 0834906.

📒 Files selected for processing (9)

assets/clickhouse/main.py (2 hunks)
assets/clp/main.py (2 hunks)
assets/elasticsearch/Dockerfile (1 hunks)
assets/gzip/Dockerfile (1 hunks)
assets/gzip/main.py (1 hunks)
assets/overhead_test/Dockerfile (1 hunks)
assets/presto_clp/main.py (2 hunks)
assets/template/Dockerfile (1 hunks)
scripts/benchall.py (5 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (3)

assets/clp/main.py (1)

src/template.py (1)

Benchmark (55-420)

assets/gzip/main.py (1)

src/template.py (2)

docker_execute (240-272)

bench_start (293-307)

assets/presto_clp/main.py (1)

src/template.py (1)

Benchmark (55-420)

🪛 Checkov (3.2.334)

assets/elasticsearch/Dockerfile

[LOW] 1-24: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)

[LOW] 1-24: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🪛 Hadolint (2.12.0)

assets/gzip/Dockerfile

[warning] 8-8: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)

[info] 8-8: Delete the apt-get lists after installing something

(DL3009)

[info] 8-8: Avoid additional packages by specifying --no-install-recommends

(DL3015)

🪛 Ruff (0.12.2)

assets/gzip/main.py

40-40: Function call with shell=True parameter identified, security issue

(S604)

42-42: Missing return type annotation for public function search

Add return type annotation: int

(ANN201)

42-42: Missing type annotation for function argument query

(ANN001)

46-46: Use of assert detected

(S101)

48-48: Function call with shell=True parameter identified, security issue

(S604)

50-50: Missing return type annotation for public function bench_search

Add return type annotation: None

(ANN201)

50-50: Boolean default positional argument in function definition

(FBT002)

50-50: Missing type annotation for function argument cold

(ANN001)

56-56: Function call with shell=True parameter identified, security issue

(S604)

scripts/benchall.py

41-41: Trailing comma missing

Add trailing comma

(COM812)

55-55: Missing return type annotation for public function run

Add return type annotation: None

(ANN201)

55-55: Missing type annotation for function argument bencher

(ANN001)

55-55: Missing type annotation for function argument kwargs

(ANN001)

55-55: Missing type annotation for function argument bench_target

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach_on_error

(ANN001)

🔇 Additional comments (11)

assets/gzip/Dockerfile (1)

5-5: Verify compatibility with the new base image.

The change from the custom ghcr.io/y-scope/clp/clp-core-dependencies-x86-ubuntu-jammy:main image to the standard ubuntu:jammy image removes dependencies that may have been included in the custom image. Ensure that all required tools and libraries for the gzip benchmark are still available.

assets/elasticsearch/Dockerfile (1)

1-1: Consistent base image update aligns with project standardization.

The change from the custom CLP image to ubuntu:jammy is consistent with similar changes across other asset Dockerfiles. This standardization simplifies maintenance and reduces dependencies on custom images.

assets/gzip/main.py (4)

40-40: Appropriate use of shell=True for output redirection.

The addition of shell=True is necessary for the shell redirection (>) to work properly within the Docker container. This change aligns with the base image update in the Dockerfile, which may have affected shell behaviour.

48-48: Consistent shell parameter usage for redirection commands.

The shell=True parameter is correctly added to handle the output redirection. This maintains consistency with the other shell commands in the file.

56-56: Proper shell handling for decompression command.

The addition of shell=True ensures the gunzip command with output redirection executes correctly in the Docker environment.

40-84: Security consideration: shell=True usage is justified.

While static analysis flags the use of shell=True as a security concern, it's necessary here for shell features like output redirection. The commands use controlled file paths and don't include user input, mitigating the security risk.

The security risk is acceptable in this context because:

Commands use predefined file paths (GZIP_FILE_PATH, DECOMPRESSED_FILE_PATH)

No user input is directly interpolated into shell commands

Shell features (redirection) are required for functionality

scripts/benchall.py (5)

3-9: Import fixes successfully address renamed benchmark classes.

The updated imports correctly reflect the renamed benchmark classes, which aligns perfectly with the PR objective "Fix renamed assets import in benchall.py". The changes maintain consistency across the benchmarking framework.

36-53: Benchmark list updated to use new class names.

The benchmarks list correctly references the renamed benchmark functions, ensuring the orchestration script works with the updated class names.

55-55: Enhanced error handling with attach_on_error parameter.

The addition of the attach_on_error parameter provides better debugging capabilities by allowing Docker container attachment specifically when errors occur, even if general attachment is disabled.

73-77: Improved exception handling logic.

The updated condition if attach or attach_on_error: properly implements the new error-specific attachment behaviour, enhancing debugging capabilities without always requiring full attachment.

83-86: No change needed for dataset restriction
The data/ directory currently contains only the mongod subdirectory, so the if dataset_name != 'mongod' check has no effect on today’s benchmarks.

• If you add more datasets in the future, you may revisit or remove this filter then.

coderabbitai · 2025-07-28T06:13:17Z

assets/gzip/Dockerfile

 RUN apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \
    tmux \
    vim \
    gzip


🧹 Nitpick (assertive)

Consider Docker best practices for package management.

While not critical for functionality, consider implementing these Docker best practices:

Pin package versions for reproducible builds

Clean up apt cache to reduce image size

Use --no-install-recommends to avoid unnecessary packages

RUN apt-get update \ - && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ - tmux \ - vim \ - gzip + && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \ + tmux=3.2a-4build1 \ + vim=2:8.2.3458-2ubuntu2.2 \ + gzip=1.10-4ubuntu4.1 \ + && apt-get clean \ + && rm -rf /var/lib/apt/lists/*

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

RUN apt-get update \

&& DEBIAN_FRONTEND=noninteractive apt-get install -y \

&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \

tmux \

vim \

gzip

RUN apt-get update \

&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \

tmux=3.2a-4build1 \

vim=2:8.2.3458-2ubuntu2.2 \

gzip=1.10-4ubuntu4.1 \

&& apt-get clean \

&& rm -rf /var/lib/apt/lists/*

🧰 Tools

🪛 Checkov (3.2.334)

[LOW] 1-12: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)

[LOW] 1-12: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🪛 Hadolint (2.12.0)

[warning] 8-8: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>

(DL3008)

[info] 8-8: Delete the apt-get lists after installing something

(DL3009)

[info] 8-8: Avoid additional packages by specifying --no-install-recommends

(DL3015)

🤖 Prompt for AI Agents

In assets/gzip/Dockerfile around lines 8 to 12, improve the package installation by pinning package versions to ensure reproducible builds, adding the --no-install-recommends flag to avoid installing unnecessary packages, and cleaning up the apt cache after installation to reduce the final image size. Modify the RUN command to include these changes accordingly.

coderabbitai · 2025-07-28T06:13:17Z

assets/gzip/Dockerfile

 # Install necessary packages
 RUN apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \


🧹 Nitpick (assertive)

Consider removing the --allow-downgrades flag if not necessary.

The addition of --allow-downgrades suggests potential package version conflicts. This flag should only be used if package downgrades are explicitly required for compatibility.

- && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \ + && DEBIAN_FRONTEND=noninteractive apt-get install -y \

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \

&& DEBIAN_FRONTEND=noninteractive apt-get install -y \

🤖 Prompt for AI Agents

In assets/gzip/Dockerfile at line 9, the use of the --allow-downgrades flag in the apt-get install command may be unnecessary and could introduce package version conflicts. Review the package requirements and if no downgrades are explicitly needed for compatibility, remove the --allow-downgrades flag from the apt-get install command to avoid unintended downgrades.

--allow-downgrades allows the benchmark to run on the specified version even when ubuntu has a newer version.

@PacificViking Thank you for the clarification! I understand now that the --allow-downgrades flag is intentionally used to ensure the benchmark can run with specific package versions for consistency, even when Ubuntu's repositories have newer versions available. This makes perfect sense in a benchmarking context where version consistency is crucial for reliable results.

I appreciate you explaining the rationale behind this decision - it helps me provide better suggestions in the future.

/| /| ( :v: ) |(_)|

✏️ Learnings added

Learnt from: PacificViking PR: y-scope/log-archival-bench#5 File: assets/gzip/Dockerfile:0-0 Timestamp: 2025-07-30T13:55:45.493Z Learning: In benchmarking Docker files, the --allow-downgrades flag in apt-get install commands is intentionally used to ensure benchmarks can run on specified package versions even when Ubuntu has newer versions, maintaining consistency for reliable benchmark results.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0834906 and d6d1ceb.

📒 Files selected for processing (4)

assets/clickhouse/main.py (2 hunks)
assets/clp/main.py (2 hunks)
assets/presto_clp/main.py (2 hunks)
scripts/benchall.py (5 hunks)

🧰 Additional context used

🧬 Code Graph Analysis (2)

assets/presto_clp/main.py (1)

src/template.py (1)

Benchmark (55-420)

assets/clickhouse/main.py (1)

src/template.py (1)

Benchmark (55-420)

🪛 Ruff (0.12.2)

scripts/benchall.py

41-41: Trailing comma missing

Add trailing comma

(COM812)

55-55: Missing return type annotation for public function run

Add return type annotation: None

(ANN201)

55-55: Missing type annotation for function argument bencher

(ANN001)

55-55: Missing type annotation for function argument kwargs

(ANN001)

55-55: Missing type annotation for function argument bench_target

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach_on_error

(ANN001)

🔇 Additional comments (9)

assets/presto_clp/main.py (2)

19-19: LGTM! Class renamed for consistency.

The class name change from clp_presto_bench to presto_clp_bench aligns with the standardization effort across benchmark modules and follows a consistent naming pattern.

105-105: LGTM! Instantiation updated correctly.

The class instantiation correctly uses the new class name presto_clp_bench.

assets/clp/main.py (2)

8-8: LGTM! Class renamed for consistency.

The class name change from clp_s_bench to clp_bench simplifies the naming and aligns with the standardization effort across benchmark modules.

64-64: LGTM! Instantiation updated correctly.

The class instantiation correctly uses the new class name clp_bench.

assets/clickhouse/main.py (2)

16-16: LGTM! Class renamed for consistency.

The class name change from clickhouse_native_json_bench to clickhouse_bench simplifies the naming and aligns with the standardization effort across benchmark modules.

157-157: LGTM! Instantiation updated correctly.

The class instantiation correctly uses the new class name clickhouse_bench.
scripts/benchall.py (3)

3-9: LGTM! Import statements updated correctly.

All import statements have been properly updated to reflect the renamed benchmark classes across the different modules.

73-73: LGTM! Exception handling logic improved.

The updated condition correctly implements the new attach_on_error functionality, allowing Docker attachment on errors even when attach is False.

83-86: Clarify the ‘mongod’-only filter in benchall.py

I noticed that under scripts/benchall.py (lines 83–86), the code now skips all datasets except ‘mongod’:
if dataset_name != 'mongod':  # only use mongod for now
    continue
run(bencher, kwargs, bench_target, attach_on_error=True)
However, your data/ directory contains multiple datasets (cockroachdb, elasticsearch, mongod, postgresql, spark-event-logs).
Please confirm whether restricting to ‘mongod’ is a deliberate, permanent change. If it’s only for testing, consider one of the following:

Add a TODO comment indicating it’s temporary.

Make the filter configurable (e.g. via a CLI flag or config file).

Document the rationale in the README.

[scripts/benchall.py:83–86]

coderabbitai · 2025-07-28T19:17:22Z

scripts/benchall.py

+        (clp_bench, {}),
+        (clickhouse_bench, {
            'manual_column_names': False,
            'keys': [],
            'additional_order_by': [],
            'timestamp_key': True
            }),
-        (clp_presto_bench, {
+        (presto_clp_bench, {
            'dataset_variation': "cleaned_log"
            }),


🧹 Nitpick (assertive)

LGTM! Benchmark list updated correctly.

The benchmarks list has been properly updated to use the new class names, maintaining the same configuration parameters.

Consider adding a trailing comma after the dictionary on line 44 for consistency:

(presto_clp_bench, { 'dataset_variation': "cleaned_log" - }), + }),

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 Ruff (0.12.2)

41-41: Trailing comma missing

Add trailing comma

(COM812)

44-44: Trailing comma missing

Add trailing comma

(COM812)

🤖 Prompt for AI Agents

In scripts/benchall.py around lines 36 to 45, add a trailing comma after the dictionary on line 44 to maintain consistency in the list formatting. This means placing a comma after the closing brace of the dictionary assigned to 'dataset_variation' in the presto_clp_bench tuple.

coderabbitai · 2025-07-28T19:17:22Z

scripts/benchall.py

    ]

-def run(bencher, kwargs, bench_target, attach=False):
+def run(bencher, kwargs, bench_target, attach=False, attach_on_error=False):


🛠️ Refactor suggestion

Add type annotations and consider parameter naming.

The function signature has been extended appropriately with the attach_on_error parameter. However, several improvements could be made for code quality.

Apply this diff to add type annotations and improve the function signature:

-def run(bencher, kwargs, bench_target, attach=False, attach_on_error=False): +def run(bencher: type, kwargs: dict, bench_target: Path, attach: bool = False, attach_on_error: bool = False) -> None:

You'll need to add this import at the top of the file:

+from typing import Optional

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools

🪛 Ruff (0.12.2)

55-55: Missing return type annotation for public function run

Add return type annotation: None

(ANN201)

55-55: Missing type annotation for function argument bencher

(ANN001)

55-55: Missing type annotation for function argument kwargs

(ANN001)

55-55: Missing type annotation for function argument bench_target

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach

(ANN001)

55-55: Boolean default positional argument in function definition

(FBT002)

55-55: Missing type annotation for function argument attach_on_error

(ANN001)

🤖 Prompt for AI Agents

In scripts/benchall.py at line 55, the function run lacks type annotations and could benefit from clearer parameter naming. Add appropriate type annotations to the function signature for all parameters, including bencher, kwargs, bench_target, attach, and attach_on_error. Also, rename parameters if needed to improve clarity and maintain consistency. Import any necessary typing modules at the top of the file to support these annotations.

kirkrodrigues · 2025-07-30T17:27:25Z

scripts/benchall.py

There's a reference to clp_presto in a comment on line 60.

kirkrodrigues · 2025-07-30T17:27:58Z

scripts/benchall.py

+from assets.clickhouse.main import clickhouse_bench
 from assets.sparksql.main import sparksql_bench
-from assets.parquet.main import parquet_bench
+from assets.presto_parquet.main import parquet_bench


Can we rename parquet_bench to presto_parquet_bench for consistency?

kirkrodrigues · 2025-07-30T17:29:48Z

scripts/benchall.py

+        if dataset_name != 'mongod': # only use mongod for now
+            continue


Did you mean to commit this?

kirkrodrigues · 2025-07-30T17:30:02Z

scripts/benchall.py

-        run(bencher, kwargs, bench_target)
+        if dataset_name != 'mongod': # only use mongod for now
+            continue
+        #run(bencher, kwargs, bench_target)


Can we delete this?

I have a change for it but I can't cherry pick it from the initial commit, and it doesn't fit in the scope of this PR. I'll open a separate PR for it when these get merged (or I can put it here).

Bill-hbrhbr

Will be fixed one by one in future PRs.

PacificViking requested a review from a team as a code owner July 28, 2025 06:09

coderabbitai bot reviewed Jul 28, 2025

View reviewed changes

PacificViking changed the title ~~Fix renamed assets import in benchall.py~~ fix: Fix renamed assets import in benchall.py Jul 28, 2025

PacificViking changed the title ~~fix: Fix renamed assets import in benchall.py~~ fix: Fix renamed assets import in benchall.py. Jul 28, 2025

fix assets renaming

d6d1ceb

PacificViking force-pushed the john_fix_assets_rename branch from 0834906 to d6d1ceb Compare July 28, 2025 19:14

coderabbitai bot reviewed Jul 28, 2025

View reviewed changes

coderabbitai bot mentioned this pull request Jul 30, 2025

build: Standardize asset containers to use official Ubuntu base images. #3

Open

3 tasks

kirkrodrigues changed the title ~~fix: Fix renamed assets import in benchall.py.~~ fix(: Fix incomplete renaming of imports in benchall.py. Jul 30, 2025

kirkrodrigues changed the title ~~fix(: Fix incomplete renaming of imports in benchall.py.~~ fix: Fix incomplete renaming of imports in benchall.py. Jul 30, 2025

kirkrodrigues changed the title ~~fix: Fix incomplete renaming of imports in benchall.py.~~ fix: Fix incomplete renaming of tool-specific methods, and imports in benchall.py. Jul 30, 2025

kirkrodrigues reviewed Jul 30, 2025

View reviewed changes

make changes according to code review

3b4aa8e

Bill-hbrhbr requested changes Sep 24, 2025

View reviewed changes

-RUN apt-get update \
-    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
-    && DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \
-    tmux \
-    vim \
-    gzip
+RUN apt-get update \
+    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
+    tmux=3.2a-4build1 \
+    vim=2:8.2.3458-2ubuntu2.2 \
+    gzip=1.10-4ubuntu4.1 \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*

	&& DEBIAN_FRONTEND=noninteractive apt-get install -y --allow-downgrades \
	&& DEBIAN_FRONTEND=noninteractive apt-get install -y \

		if dataset_name != 'mongod': # only use mongod for now
		continue

Conversation

PacificViking commented Jul 28, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Validation performed

Summary by CodeRabbit

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Chat

Support

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

PacificViking Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 28, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jul 28, 2025

Choose a reason for hiding this comment

Uh oh!

kirkrodrigues Jul 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kirkrodrigues Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

kirkrodrigues Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

PacificViking Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

kirkrodrigues Jul 30, 2025

Choose a reason for hiding this comment

Uh oh!

PacificViking Jul 31, 2025

Choose a reason for hiding this comment

Uh oh!

Bill-hbrhbr left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

PacificViking commented Jul 28, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jul 28, 2025 •

edited

Loading

coderabbitai bot Jul 28, 2025 •

edited

Loading

kirkrodrigues Jul 30, 2025 •

edited

Loading