Expand and fix auto-generated file filtering in is_valid_file()#2289
Conversation
Split exact generated filenames from suffix matches so lockfiles are matched by basename, while generated assets such as minified files and source maps use suffix matching. Normalize path separators before basename matching so nested paths are handled consistently.
Review Summary by QodoExpand auto-generated file filtering with lockfiles and assets
WalkthroughsDescription• Expanded lockfile detection with 8 additional package managers • Split exact filename matching from suffix-based filtering logic • Added minified and source map file filtering (.min.js, .min.css, .js.map, .ts.map, .css.map) • Normalized path separators for consistent nested path handling Diagramflowchart LR
A["is_valid_file function"] --> B["Check exact lockfile names"]
A --> C["Check generated asset suffixes"]
B --> D["Normalize path separators"]
D --> E["Match basename against set"]
C --> F["Match file endings"]
E --> G["Return False if matched"]
F --> G
File Changes1. pr_agent/algo/language_handler.py
|
Code Review by Qodo
1. Single quotes in is_valid_file()
|
JiwaniZakir
left a comment
There was a problem hiding this comment.
The switch from endswith to an exact basename comparison in is_valid_file() correctly fixes a subtle bug where the old code would have falsely excluded files like not-a-package-lock.json. The filename.replace('\\', '/').split('/')[-1] approach works but os.path.basename() (after normalizing separators) would be more idiomatic and immediately readable to Python developers.
The auto_generated_suffixes endswith check still operates on the raw, un-normalized filename rather than the extracted basename. On Windows paths, foo\\bar.min.js ends with .min.js so it works incidentally, but for consistency with the exact-name check above it, it would be cleaner to apply endswith to the same normalized basename value rather than the full path string.
The new exact-match set is case-sensitive, which means Cargo.Lock or GEMFILE.LOCK (plausible on case-insensitive filesystems or when filenames come from certain APIs) would slip through. It may be worth either documenting the assumption that filenames are lowercased upstream, or applying .lower() before the set lookup.
|
Thanks @PeterDaveHello ! |
Split exact generated filenames from suffix matches so lockfiles are matched by basename, while generated assets such as minified files and source maps use suffix matching. Normalize path separators before basename matching so nested paths are handled consistently.
GitHub Copilot PR summary: