Check for python content was too broad by rkoumis · Pull Request #2949 · kevoreilly/CAPEv2

rkoumis · 2026-03-18T17:29:21Z

Too many files can contain 'import', for example .eml files.
Reverts part of baa0f5f.

gemini-code-assist · 2026-03-18T17:29:35Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses an issue where the file type detection mechanism was incorrectly classifying various files as Python scripts due to an overly broad check for the presence of the 'import' keyword. By removing this generic content check, the system will now rely on more specific indicators like file extensions (.py, .pyc) or explicit 'Python script' type identification, leading to more accurate file categorization within the analyzer.

Highlights

- Python File Detection Refinement: The logic for identifying Python files in the choose_package function was updated to remove the overly broad check for the byte string 'import' within file content.
- Reversion of Overly Broad Check: This change partially reverts a previous commit (baa0f5f) that introduced the generic 'import' content check, which was causing incorrect file classifications.
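To make the false positive concrete, here is a minimal sketch comparing the removed substring check with the extension/type-only check described above. The names `old_is_python` and `new_is_python` are hypothetical stand-ins, not CAPEv2 functions; the real logic is a branch inside `choose_package`, whose full signature is not shown in this pull request. The file-type string used for the email is also an assumption for illustration.

```python
def old_is_python(file_name: str, file_type: str, file_content: bytes) -> bool:
    """Hypothetical stand-in for the check that included the broad content test."""
    return (
        file_name.endswith((".py", ".pyc"))
        or "Python script" in file_type
        or b"import" in file_content
    )


def new_is_python(file_name: str, file_type: str) -> bool:
    """Hypothetical stand-in for the check once the content test is removed."""
    return file_name.endswith((".py", ".pyc")) or "Python script" in file_type


# An email body that merely mentions the word "import".
eml_body = b"Subject: hello\r\n\r\nDid you import that file I sent you last week?\r\n"
# Assumed file-type string, for illustration only.
eml_type = "RFC 822 mail, ASCII text"

print(old_is_python("message.eml", eml_type, eml_body))  # True (false positive)
print(new_is_python("message.eml", eml_type))            # False
print(new_is_python("script.py", "ASCII text"))          # True
```

Any plain-text format that can contain the word "import" would trip the substring test in the same way, which is why the narrower check relies only on the extension and the reported file type.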

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request addresses an issue where the check for Python content was too broad, leading to incorrect file type classifications. The change removes the problematic b"import" in file_content condition, which is a valid fix. I am suggesting an improvement to this change by replacing the removed condition with a more specific content-based check for a shebang line. This would allow for the detection of Python scripts that lack a standard file extension, but in a more reliable manner that avoids the original issue of false positives.

analyzer/windows/lib/core/packages.py

rkoumis · 2026-03-18T20:39:44Z

analyzer/windows/lib/api/process.py

 )

 if sys.platform == "win32":
+    from ctypes import windll


Tried to run some of these Windows tests under Linux, which isn't really a thing, but this helped (a bit).
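A minimal sketch of the platform-guarded import pattern the hunk above relies on. `ctypes.windll` only exists on Windows, so an unguarded `from ctypes import windll` raises `ImportError` elsewhere. The `else` branch and the `windows_api_available` helper are assumptions added for illustration; they are not part of the diff.

```python
import sys

if sys.platform == "win32":
    # windll is only defined by ctypes on Windows.
    from ctypes import windll
else:
    # Assumed placeholder (not in the PR) so the module still imports on Linux.
    windll = None


def windows_api_available() -> bool:
    """Hypothetical helper: report whether the Windows API bindings were loaded."""
    return windll is not None


print(windows_api_available())
```

With a guard like this, test collection on Linux can at least import the module, even though any code path that actually calls into `windll` still needs Windows.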

rkoumis · 2026-03-18T20:39:57Z

analyzer/windows/lib/core/packages.py

    elif "Macromedia Flash" in file_type or file_name.endswith((".swf", ".fws")):
        return "swf"
-    elif file_name.endswith((".py", ".pyc")) or "Python script" in file_type:
+    elif file_name.endswith((".py", ".pyc")) or "Python script" in file_type or (file_content.startswith(b'#!/') and b'python' in file_content.split(b'\n', 1)[0]):


As suggested by copilot
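The shebang condition added in the hunk above can be read more easily as a standalone predicate. This is a sketch only: `has_python_shebang` is a hypothetical name, and the PR keeps the expression inline in the `elif` rather than extracting a helper.

```python
def has_python_shebang(file_content: bytes) -> bool:
    """Return True if the first line is a '#!/' shebang that mentions 'python'."""
    if not file_content.startswith(b"#!/"):
        return False
    # Only inspect the first line, so 'python' later in the file is ignored.
    first_line = file_content.split(b"\n", 1)[0]
    return b"python" in first_line


print(has_python_shebang(b"#!/usr/bin/env python3\nprint('hi')\n"))  # True
print(has_python_shebang(b"#!/bin/sh\npython script.py\n"))          # False
print(has_python_shebang(b"Did you import that file I sent you?"))   # False
print(has_python_shebang(b""))                                       # False
```

Two limitations of this form are worth noting: a shebang written with a space after `#!` (for example `#! /usr/bin/env python`) is not matched because of the `#!/` prefix test, and any first line whose path happens to contain the substring `python` is matched even if the interpreter is something else.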

rkoumis · 2026-03-18T20:40:39Z

analyzer/windows/tests/test_data/test_email.eml

+
+How's it going?
+
+Did you import that file I sent you last week?


I made sure to include the word "import" in this email.

josh-feather · 2026-03-19T11:00:59Z

@doomedraven FYI, this was causing problems for users who were submitting .eml files and relying on the package autodetection.

Check for python content was too broad

9083e89

- too many files can contain 'import', for example .eml files - reverts part of baa0f5f

Added tests for choose_package

8e9520f

- slight tweaks to make it easier to run the tests under Linux

rkoumis force-pushed the python-check-is-too-broad branch from 4ad0682 to 8e9520f on March 18, 2026 20:45

Reformatting

1929082

rkoumis wants to merge 3 commits into kevoreilly:master from rkoumis:python-check-is-too-broad
