Expand and fix auto-generated file filtering in is_valid_file() #2289

Merged
naorpeled merged 1 commit into qodo-ai:main from PeterDaveHelloKitchen:update-auto-generated-file-filters
Mar 31, 2026

@PeterDaveHello
Contributor

Split exact generated filenames from suffix matches so lockfiles are matched by basename, while generated assets such as minified files and source maps use suffix matching. Normalize path separators before basename matching so nested paths are handled consistently.

GitHub Copilot PR summary:

This pull request updates the logic in the is_valid_file function to more accurately filter out auto-generated files and certain file types. The main improvement is a more comprehensive and robust check for files that should be ignored, such as lock files and minified or source map files.

File filtering improvements:

  • Expanded the list of auto-generated files to ignore by using a set (auto_generated_files_exact) that now includes additional lock files like pnpm-lock.yaml, go.sum, .terraform.lock.hcl, uv.lock, Cargo.lock, Pipfile.lock, mix.lock, pubspec.lock, and bun.lockb. The check now matches the exact filename rather than just suffixes.
  • Added a tuple (auto_generated_suffixes) to filter out files ending with .min.js, .min.css, .js.map, .ts.map, and .css.map, ensuring minified and source map files are ignored.
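
The split described above can be sketched as a standalone predicate. This is a minimal illustration rather than the merged code: the helper name `is_auto_generated` and the module-level constant names are hypothetical, while the filename and suffix lists are copied from this PR's diff.

```python
# Hypothetical helper; names and structure are illustrative only.
AUTO_GENERATED_FILES_EXACT = {
    "package-lock.json", "yarn.lock", "pnpm-lock.yaml", "composer.lock", "Gemfile.lock",
    "poetry.lock", "go.sum", ".terraform.lock.hcl", "uv.lock",
    "Cargo.lock", "Pipfile.lock", "mix.lock", "pubspec.lock", "bun.lockb",
}
AUTO_GENERATED_SUFFIXES = (".min.js", ".min.css", ".js.map", ".ts.map", ".css.map")


def is_auto_generated(filename: str) -> bool:
    """Return True for lockfiles (exact basename) and generated assets (suffix)."""
    # Normalize Windows separators so nested paths yield the same basename.
    basename = filename.replace("\\", "/").split("/")[-1]
    if basename in AUTO_GENERATED_FILES_EXACT:
        return True
    # str.endswith accepts a tuple and is True if any one suffix matches.
    return filename.endswith(AUTO_GENERATED_SUFFIXES)
```

With this sketch, `frontend/yarn.lock` and `web\dist\app.min.js` are both treated as generated, while `src/main.py` is not.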

@qodo-free-for-open-source-projects
Contributor

Review Summary by Qodo

Expand auto-generated file filtering with lockfiles and assets

✨ Enhancement

Walkthroughs

Description
• Expanded lockfile detection with 9 additional lockfile names
• Split exact filename matching from suffix-based filtering logic
• Added minified and source map file filtering (.min.js, .min.css, .js.map, .ts.map, .css.map)
• Normalized path separators for consistent nested path handling
Diagram
flowchart LR
  A["is_valid_file function"] --> B["Check exact lockfile names"]
  A --> C["Check generated asset suffixes"]
  B --> D["Normalize path separators"]
  D --> E["Match basename against set"]
  C --> F["Match file endings"]
  E --> G["Return False if matched"]
  F --> G

File Changes

1. pr_agent/algo/language_handler.py ✨ Enhancement +10/-4

Enhanced auto-generated file filtering logic

• Converted lockfile list to set auto_generated_files_exact with 14 entries
• Added auto_generated_suffixes tuple for minified and source map files
• Implemented path separator normalization using replace('\\', '/').split('/')[-1] for basename extraction
• Separated exact filename matching from suffix-based filtering into distinct checks

@qodo-free-for-open-source-projects
Contributor

qodo-free-for-open-source-projects bot commented Mar 26, 2026

Code Review by Qodo

🐞 Bugs (0) 📘 Rule violations (1) 📎 Requirement gaps (0) 📐 Spec deviations (0)


Action required

1. Single quotes in is_valid_file() 📘 Rule violation ⚙ Maintainability
Description
New string literals use single quotes, conflicting with the repo’s Ruff style requirement to prefer
double quotes. This may cause CI/lint failures or inconsistent formatting across the codebase.
Code

pr_agent/algo/language_handler.py[R23-32]

+    auto_generated_files_exact = {
+        'package-lock.json', 'yarn.lock', 'pnpm-lock.yaml', 'composer.lock', 'Gemfile.lock',
+        'poetry.lock', 'go.sum', '.terraform.lock.hcl', 'uv.lock',
+        'Cargo.lock', 'Pipfile.lock', 'mix.lock', 'pubspec.lock', 'bun.lockb',
+    }
+    auto_generated_suffixes = ('.min.js', '.min.css', '.js.map', '.ts.map', '.css.map')
+    if filename.replace('\\', '/').split('/')[-1] in auto_generated_files_exact:
+        return False
+    if filename.endswith(auto_generated_suffixes):
+        return False
Evidence
PR Compliance ID 6 requires Ruff style including double quotes for strings; the added code
introduces multiple single-quoted string literals (e.g., in auto_generated_files_exact,
auto_generated_suffixes, and filename.replace('\\', '/')).

AGENTS.md
pr_agent/algo/language_handler.py[23-32]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new code in `is_valid_file()` introduces single-quoted string literals, but the repository Ruff style requires using double quotes for strings.

## Issue Context
This may trigger Ruff formatting/lint failures and creates inconsistent quoting style in the modified section.

## Fix Focus Areas
- pr_agent/algo/language_handler.py[23-32]

@JiwaniZakir left a comment

The switch from endswith to an exact basename comparison in is_valid_file() correctly fixes a subtle bug where the old code would have falsely excluded files like not-a-package-lock.json. The filename.replace('\\', '/').split('/')[-1] approach works, but os.path.basename() (after normalizing separators) would be more idiomatic and immediately readable to Python developers.
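
To make the difference concrete, the sketch below contrasts a suffix check with an exact basename check. The suffix-based predicate is an assumption reconstructed from this comment, not the repository's previous code, and both function names are hypothetical. It uses `posixpath.basename` instead of `os.path.basename` so the result is the same on every platform once backslashes have been normalized.

```python
import posixpath

LOCKFILES = {"package-lock.json", "yarn.lock"}  # shortened list for illustration


def matches_by_suffix(filename: str) -> bool:
    # Assumed old behavior: any path ending in a lockfile name counts as generated.
    return filename.endswith(tuple(LOCKFILES))


def matches_by_basename(filename: str) -> bool:
    # Exact comparison against the final path component only.
    return posixpath.basename(filename.replace("\\", "/")) in LOCKFILES
```

Here `matches_by_suffix("not-a-package-lock.json")` is true, which is the false positive described above, while `matches_by_basename` rejects that name and still accepts `web\package-lock.json`.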

The auto_generated_suffixes endswith check still operates on the raw, un-normalized filename rather than the extracted basename. On Windows paths, foo\\bar.min.js ends with .min.js so it works incidentally, but for consistency with the exact-name check above it, it would be cleaner to apply endswith to the same normalized basename value rather than the full path string.
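
One way to apply the suffix check to the same normalized value is sketched below. The function name `has_generated_suffix` is hypothetical; the suffix tuple is taken from the PR's diff.

```python
SUFFIXES = (".min.js", ".min.css", ".js.map", ".ts.map", ".css.map")


def has_generated_suffix(filename: str) -> bool:
    # Reuse the same normalized basename as the exact-name check.
    basename = filename.replace("\\", "/").split("/")[-1]
    return basename.endswith(SUFFIXES)
```

Because none of the suffixes contains a path separator, this returns the same result as calling `endswith` on the raw path; the benefit is consistency between the two checks, not a behavior change.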

The new exact-match set is case-sensitive, which means Cargo.Lock or GEMFILE.LOCK (plausible on case-insensitive filesystems or when filenames come from certain APIs) would slip through. It may be worth either documenting the assumption that filenames arrive with their canonical casing, or lowercasing both the basename and the set entries before the lookup, since the set itself contains mixed-case names such as Cargo.lock.
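
A case-insensitive variant could look like the sketch below. It is an illustration only: the function and constant names are hypothetical and the list is shortened. Note that the set entries are lowercased as well, because names such as `Cargo.lock` and `Gemfile.lock` contain capitals and would never match a lowercased basename otherwise.

```python
# Lowercase the known names once, at definition time.
LOCKFILES_LOWER = {
    name.lower() for name in ("Cargo.lock", "Gemfile.lock", "Pipfile.lock", "yarn.lock")
}


def is_lockfile_case_insensitive(filename: str) -> bool:
    basename = filename.replace("\\", "/").split("/")[-1]
    # Compare lowercase to lowercase so casing differences are ignored.
    return basename.lower() in LOCKFILES_LOWER
```

The trade-off is that on a case-sensitive filesystem a differently cased file, which is technically a distinct file, would also be filtered out.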

@naorpeled merged commit c7da733 into qodo-ai:main on Mar 31, 2026
2 checks passed
@naorpeled
Collaborator

Thanks @PeterDaveHello !
