Skip to content

Comments

Fix generate_sub_moved_events corrupting paths with unbounded str.replace#1158

Merged
BoboTiG merged 2 commits intogorakhargosh:masterfrom
bysiber:fix/sub-moved-events-replace-all
Feb 20, 2026
Merged

Fix generate_sub_moved_events corrupting paths with unbounded str.replace#1158
BoboTiG merged 2 commits intogorakhargosh:masterfrom
bysiber:fix/sub-moved-events-replace-all

Conversation

@bysiber
Copy link
Contributor

@bysiber bysiber commented Feb 20, 2026

generate_sub_moved_events uses str.replace() to convert destination paths back to source paths when generating subevents for directory moves. Since str.replace() without a count argument replaces all occurrences of the substring, this corrupts paths when the directory name being moved also appears elsewhere in the subtree.

Example:

Moving /tmp/bar/ (containing data/bar/file.txt) to /tmp/foo/:

full_path = "/tmp/foo/data/bar/file.txt"
dest_dir_path = "/tmp/foo"
src_dir_path = "/tmp/bar"

# Before (WRONG) — replaces both occurrences of "foo":
full_path.replace(dest_dir_path, src_dir_path)
# → depending on overlap, can mangle the interior path components

# This is especially visible with short or common directory names.

Fix: replace only the leading prefix using string slicing instead of str.replace():

renamed_path = src_dir_path + full_path[len(dest_dir_path):] if src_dir_path else ""

This ensures only the prefix is substituted, leaving the rest of the path intact.

str.replace() without a count argument replaces ALL occurrences of
the destination directory name in the path, not just the leading
prefix. If the directory name appears in a subdirectory name within
the tree, sub-paths get silently corrupted.

For example, moving /tmp/bar (containing /tmp/bar/data/bar/file.txt)
to /tmp/foo would produce src_path /tmp/foo/data/foo/file.txt instead
of /tmp/foo/data/bar/file.txt.

Fix: use string slicing to replace only the prefix.
@BoboTiG
Copy link
Collaborator

BoboTiG commented Feb 20, 2026

Thanks @bysiber!

Can you update the changelog and add a test?

Add a test that exercises the case where a directory name appears
multiple times in the path, verifying only the prefix is replaced.
Also add a changelog entry.
@bysiber
Copy link
Contributor Author

bysiber commented Feb 20, 2026

Added a regression test and a changelog entry. The test creates a directory tree where the directory name appears twice in the path (e.g. data/data/file.txt) and verifies only the prefix gets swapped.

@BoboTiG BoboTiG merged commit dc34524 into gorakhargosh:master Feb 20, 2026
28 of 30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants