fix(consume): consume_direct.sh files contained invalid commands because value after --filter flag was not enclosed in parenthesis (issue reported by flcl42) #1987

Open: wants to merge 3 commits into base: main
24 changes: 22 additions & 2 deletions src/ethereum_clis/clis/geth.py
@@ -124,8 +124,28 @@ def _consume_debug_dump(
         fixture_path: Path,
         debug_output_path: Path,
     ):
-        debug_fixture_path = debug_output_path / "fixtures.json"
-        consume_direct_call = " ".join(command[:-1]) + f" {debug_fixture_path}"
+        # our assumption is that each command element is a string
+        assert all(isinstance(x, str) for x in command), (
+            f"Not all elements of 'command' list are strings: {command}"
+        )
+
+        # remove element that holds path to cached_downloads json file
+        command = command[:-1]
+
+        # for the now last element (value of --run) ensure it is wrapped in quotations (only relevant for blocktest) # noqa: E501
+        if "blocktest" in command:
+            if command[-1][0] not in {"'", '"'}:
+                command[-1] = '"' + command[-1] + '"'
+
+        # instead add path to fixtures.json file
+        debug_fixture_path = str(debug_output_path / "fixtures.json")
+        # but only after fixing the unescaped brackets
+        # (str.replace returns a new string, so the result must be assigned)
+        debug_fixture_path = debug_fixture_path.replace("[", r"\[").replace("]", r"\]")
+        command.append(debug_fixture_path)
+
+        # now turn list into a command string
+        consume_direct_call = " ".join(command)
+
         consume_direct_script = textwrap.dedent(
             f"""\
             #!/bin/bash
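A note on the bracket-escaping step in the geth hunk above: Python strings are immutable, so `str.replace` returns a new string, and the result has to be assigned for the escaping to take effect. A minimal sketch, assuming a hypothetical fixture path (not taken from the PR):

```python
# Hypothetical fixture path containing pytest-style brackets.
path = "debug/test_example[fork_Cancun]/fixtures.json"

# Calling replace without assigning the result leaves `path` untouched.
path.replace("[", r"\[").replace("]", r"\]")
unchanged = path

# Assigning the result is what actually produces the escaped path.
escaped = path.replace("[", r"\[").replace("]", r"\]")
print(escaped)
```

The escaped form is only needed because the path is pasted unquoted into a bash script, where `[` and `]` would otherwise be treated as glob characters.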
20 changes: 19 additions & 1 deletion src/ethereum_clis/clis/nethermind.py
@@ -55,13 +55,31 @@ def _consume_debug_dump(
         result: subprocess.CompletedProcess,
         debug_output_path: Path,
     ):
-        consume_direct_call = " ".join(command)
+        # our assumption is that each command element is a string
+        assert all(isinstance(x, str) for x in command), (
+            f"Not all elements of 'command' list are strings: {command}"
+        )
+
+        # ensure that the --filter flag value is wrapped in double-quotes
+        consume_direct_call = ""
+        prev_command_was_filter_flag = False
+        for s in command:
Member

Afaik, all parameters that contain spaces need to be enclosed in double-quotes, not only the ones after filter, and not only for the nethermind commands.

Should we look into pre-processing the command parameter before it's passed to this function?

Member

A quick gpt-query yields that we should use shlex.quote on every parameter to handle this automatically:

import shlex

args = ["my command", "--option", "value with spaces", "file$(rm -rf /)"]

command = " ".join(shlex.quote(arg) for arg in args)
print(command)

Output:

'my command' --option 'value with spaces' 'file$(rm -rf /)'

And shlex is part of the Python standard library, so there are no new package dependencies.
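To make the suggestion concrete, here is a minimal sketch of quoting an argument list that contains a pytest-style test ID with brackets and a space. The argument values are made up for illustration and are not taken from the project; `shlex.join` (Python 3.8+) is equivalent to joining `shlex.quote` over every element.

```python
import shlex

# Hypothetical argument list; the values are illustrative only.
args = [
    "evm",
    "blocktest",
    "--run",
    "tests/test_foo.py::test_bar[fork_Prague-case 1]",
    "fixtures.json",
]

# shlex.join applies shlex.quote to each element and joins with spaces.
command_line = shlex.join(args)

# Splitting with POSIX shell rules recovers the original list exactly,
# which is a cheap way to check that nothing was mangled by quoting.
round_tripped = shlex.split(command_line)
print(command_line)
```

Arguments made up only of safe characters are left bare, so the generated line stays readable; only the element with brackets and a space ends up wrapped in single quotes.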

Collaborator Author

@felix314159 felix314159 Aug 5, 2025

I have not heard of shlex before, so I think the code will be harder to read if someone sees shlex.quote(), because chances are the reader is not familiar with that function. The current solution, by contrast, is just a few lines of trivial code. I also saw in the shlex docs that the function seems to be incompatible with Windows. I know that Windows support isn't a priority, but why make potential future support harder for no reason?

We could add pre-processing to command, but we need separate logic for fixing the sh command produced by geth's evm anyway, so it's simpler to have separate fixes for nethtest and geth and then decide not to add support for more execution clients. For geth's evm specifically, we create faulty .sh files for blockchain tests; I will add another commit to fix that too. All of these problems result from our decision to allow weird chars like :, [ and ] in filenames, which makes escaping names ugly and error-prone.

Collaborator Author

@felix314159 felix314159 Aug 5, 2025

Working on this is taking me longer than expected because I can't get logging to work for some reason. Edit: found a temporary workaround.

Member

> from our decision to allow weird chars like :, [ and ] in filenames

Just for clarification, they are part of the test names, not file names; this is an issue intrinsic to pytest.

> I also saw in shlex docs that that function seems to be incompatible with windows

This will end up in a .sh file anyway, which needs bash to run. If required, we could build a Windows script in the future, but I feel like that's out of scope for this PR.

> I think it will be harder to read the code if someone sees shlex.quote() because chances are the reader is not familiar with that function.

That's true, I hadn't heard of it before either, but (a) this is deep in the weeds of the code anyway (I would be more concerned if it were part of a test), and (b) it's really easy to look up.

Collaborator Author

@felix314159 felix314159 Aug 7, 2025

Yeah, good point about the .sh lol. I would not be against shlex if I hadn't already built a solution. I can redo this PR from scratch (and use shlex) if you want.
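For reference, a shlex-based version of the script generation might look roughly like the sketch below. `build_consume_direct_script` is a hypothetical helper, not code from this PR, and the argument values (including the `--input` flag) are illustrative only.

```python
import shlex
import textwrap
from typing import List


def build_consume_direct_script(command: List[str]) -> str:
    """Return bash script text that re-runs `command` (hypothetical helper)."""
    # Quote every element so spaces, brackets and colons survive the shell.
    consume_direct_call = shlex.join(command)
    return textwrap.dedent(
        f"""\
        #!/bin/bash
        {consume_direct_call}
        """
    )


# Illustrative arguments only; real invocations may use different flags.
example_command = [
    "nethtest",
    "--filter",
    "test_example[fork_Prague-state_test]",
    "--input",
    "fixtures.json",
]
script = build_consume_direct_script(example_command)
print(script)
```

With this approach the same helper would serve both clients, since no flag-specific handling (such as tracking `--filter` or `--run`) is needed; the trade-off is that the output is only guaranteed to be correct for POSIX shells.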

+            if prev_command_was_filter_flag:
+                if s[0] != '"':
+                    s = '"' + s + '"'
+                prev_command_was_filter_flag = False
+            consume_direct_call += s + " "
+            if s.strip() == "--filter":
+                prev_command_was_filter_flag = True
+        consume_direct_call = consume_direct_call.strip()

         consume_direct_script = textwrap.dedent(
             f"""\
             #!/bin/bash
             {consume_direct_call}
             """
         )

         dump_files_to_directory(
             str(debug_output_path),
             {