Skip to content

Conversation

@ishan372or
Copy link

Description

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other (documentation reorganization)

Why is this PR needed?

Some subprocess calls in the rclone utilities relied on shell=True, which has security and portability implications and can behave inconsistently across platforms, especially on Windows. Issue #639 recommends reducing the use of shell=True where possible and handling platform-specific requirements explicitly.

What does this PR do?

  • Refactors rclone-related subprocess calls to prefer argument-list execution (shell=False)

  • Adds explicit Windows-only fallbacks to shell=True where PATH resolution requires a shell

  • Preserves existing behavior while improving safety and clarity

  • Does not introduce any user-facing or API changes

References

How has this PR been tested?

  • Ran pytest locally on Windows.

  • Test collection and unit tests run successfully.

  • Integration and TUI tests requiring rclone fail locally due to missing rclone installation and Textual environment differences on Windows.

  • No test behavior was modified; CI is expected to validate rclone-dependent tests.

Is this a breaking change?

No.
This PR does not break any existing functionality and only refactors internal subprocess execution.

Does this PR require an update to the documentation?

No.
There are no user-facing changes that require documentation updates.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit (not applicable for docs-only changes)

@JoeZiminski
Copy link
Member

Hi @ishan372or thanks for this, this makes sense. To summarise and add some notes because I never quite understood this:

  • shell=False will typically error out with FileNotFoundError on Windows. This is because shell=False will just search a set of paths for a rclone.exe. Typically these do not include the conda Scripts directory (when you install conda on Windows it suggests not adding it to the system path, which actually becomes the source of many issues). When shell=True will run rclone through cmd.exe which can check other locations / run .bat set up scripts so rclone can be found. On Linux, I think this is less of an issue but in theory still could be a problem if the environment is not set up properly.
  • If shell=True then the command should be a single line. If shell=False then it should be split into a list.

To be honest, things are so flakey with shell=False that I am tempted to continue to use shell=True unless the security implications are serious. Would you be interested in researching how shell=True can lead to security issues and check if any are applicable in the datashuttle case? I think that because users are running on their own machines, and we do not give them the option to write the commands themselves, then we are not running any security risk. My understanding is if we had an input box 'enter the command' and then executed this command with shell=True on some remote server containing sensitive information, then we could be at risk. But we don't do anything like this. If this is the case, I think we can continue to use shell=True and make a note of this somewhere, either in the code or documentation. Thanks for your work on this!

@ishan372or
Copy link
Author

Hi @JoeZiminski, i am happy to see the if there are any serious security implications in datashuttle for using shell=True.
I will report back with a summary and recommendation on whether to keep using shell=True. Thanks for the guidance.

@ishan372or
Copy link
Author

@JoeZiminski Thanks for the patience. I confirmed shell=True is low risk here since we don't expose arbitrary command input to users—the inputs are internal paths we control. I also looked into that without shell=True, datashuttle definitely fails for Windows users unless they manually add rclone to their global PATH (which most Conda users won't have done).

I initially thought keeping the fallback logic (trying shell=False first) might be safer or cleaner for some environments, but I agree that since shell=True is required for Windows stability anyway, the fallback just adds unnecessary complexity. I will revert to a clean shell=True implementation and add the explanatory note about the security context and Windows requirements as you suggested. If thats fine i will push the changes.

@JoeZiminski
Copy link
Member

Hi @ishan372or thanks a lot for this, that's great to know we came to the same conclusion, I think then this is a low-risk case and we can continue to use shell=True as you say. I'll also bring this up with the time to triple-check. I agree with your proposed way to proceed, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants