Skip to content

Unify behavior of source code search#332

Merged
disinvite merged 6 commits intoisledecomp:masterfrom
disinvite:walk-dir-compatibility
Mar 8, 2026
Merged

Unify behavior of source code search#332
disinvite merged 6 commits intoisledecomp:masterfrom
disinvite:walk-dir-compatibility

Conversation

@disinvite
Copy link
Copy Markdown
Collaborator

@disinvite disinvite commented Mar 6, 2026

This started with fixing a problem observed in #331 and evolved from there.

Fixing compatibility issues in walk_source_dir

walk_source_dir iterates the root source code directory for a decomp project. #279 changed this to use Path.walk() instead of os.walk(). The problem is that Path.walk() is only available on Python 3.12 and higher, but we still support a few versions earlier than that. This was missed because no unit tests targeted walk_source_dir. We didn't hear anything from users so hopefully the explanation is that no one is using reccmp with older Python versions.

This restores the older os.walk() implementation so we can keep compatibility with Python 3.10 and 3.11. We now have unit tests to verify the behavior of walk_source_dir.

Returning source code paths in a consistent order

decomplint already supported multiple source code search paths at the command line, and we will add this option in #331. In the main reccmp script, there is the potential for data to be overwritten if we read metadata for the same address in multiple files. Therefore, we need to ensure:

  1. Paths are deduplicated so we do not scan the same file twice
  2. Paths are scanned in the same order across platforms

The second point is an issue because pathlib.PosixPath is sorted in case-sensitive order, but pathlib.WindowsPath is not. The paths are now always sorted in case-insensitive order so the output will be the same across platforms.

To bring this all together, we now have a source_code_search function. Tests are included.

@disinvite disinvite changed the title Restore compatibility for walk_source_dir Unify behavior of source code search Mar 7, 2026
@disinvite disinvite marked this pull request as ready for review March 7, 2026 17:10
Copy link
Copy Markdown
Collaborator

@jonschz jonschz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the test coverage is nice. I didn't check for regressions in great detail, though

@disinvite
Copy link
Copy Markdown
Collaborator Author

Looks good, the test coverage is nice. I didn't check for regressions in great detail, though

The only new thing should be the ordering of paths, but I don't expect a change in behavior unless both are true:

  1. The paths/files in a project aren't named consistently (i.e. not all lowercase).
  2. The same address is used in an annotation across multiple files.

I checked with a few projects on Windows and didn't see a diff. No diff for LEGO1 between Mac and Windows either.

@disinvite disinvite merged commit 3897cfd into isledecomp:master Mar 8, 2026
17 checks passed
@disinvite disinvite deleted the walk-dir-compatibility branch March 8, 2026 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants