Collaborator

@yzh119 yzh119 commented Oct 6, 2025

📌 Description

Duplicate of #1869 but created from flashinfer/workflow-update-codeowner to make sure we have permission.

This PR introduces a GitHub Action that automatically creates pull requests to update the CODEOWNERS file.

  • Scheduled: Runs weekly by default
  • Manual: Can be triggered on-demand through the Actions tab

This PR also adds two command-line options to codeowner_analyzer.py:

  • --depth: Maximum directory depth for module detection (default: 3)
  • --top-n: Number of top owners to include in CODEOWNERS file (default: 3)

An example of the generated output can be found at https://github.com/flashinfer-ai/flashinfer/pull/1871/files

🔍 Related Issues

🚀 Pull Request Checklist

Thank you for contributing to FlashInfer! Before we review your pull request, please make sure the following items are complete.

✅ Pre-commit Checks

  • I have installed pre-commit by running pip install pre-commit (or used your preferred method).
  • I have installed the hooks with pre-commit install.
  • I have run the hooks manually with pre-commit run --all-files and fixed any reported issues.

If you are unsure about how to set up pre-commit, see the pre-commit documentation.

🧪 Tests

  • Tests have been added or updated as needed.
  • All tests are passing (unittest, etc.).

Reviewer Notes

cc @sricketts @yongwww @nvmbreughe @bkryu @dierksen

yzh119 and others added 13 commits October 5, 2025 20:16
This reverts commit 3934511.
Testing if the token has permission to create PRs from pull_request trigger.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Use workflow_dispatch for testing instead to ensure secrets are accessible.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Contributor

Summary of Changes

Hello @yzh119, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the codeowner_analyzer.py script by introducing new configuration options for module detection and CODEOWNERS file generation. It provides greater flexibility by allowing users to specify the maximum directory depth for identifying modules and to control the number of top owners included in the output, making the automated CODEOWNERS update process more adaptable to different project structures.

Highlights

  • New Configuration Parameters: Introduced max_depth and top_n_owners parameters to the CodeOwnersAnalyzer class, allowing for more granular control over module detection and owner selection.
  • Configurable Module Depth: The module detection logic in get_modules now respects the max_depth parameter, limiting how many parent directories are considered as modules.
  • Dynamic Top Owners Selection: The generate_codeowners_file method now uses the top_n_owners parameter to determine how many top owners are included in the generated CODEOWNERS file, instead of a hardcoded value.
  • Command-Line Arguments: Added --depth and --top-n command-line arguments to the script, enabling users to easily configure these new parameters when running the codeowner_analyzer.py script.
Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/update-codeowners.yml

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds configurability for directory depth and the number of top owners in the codeowner_analyzer.py script. The changes are logical and correctly plumb the new parameters through the script. However, I've found a high-severity issue where negative values for the new command-line arguments can lead to incorrect and unexpected behavior. I've also noted that tests for this new functionality are missing. Please see my detailed comments for suggestions on how to address these points.

Comment on lines +784 to +795
    parser.add_argument(
        "--depth",
        type=int,
        default=3,
        help="Maximum directory depth for module detection (default: 3)",
    )
    parser.add_argument(
        "--top-n",
        type=int,
        default=3,
        help="Number of top owners to include in CODEOWNERS file (default: 3)",
    )
Contributor

high

The new command-line arguments --depth and --top-n are defined to accept any integer. This allows negative values, which can lead to incorrect behavior:

  • A negative --depth will likely make min(len(path_parts), self.max_depth) return a negative number, so no modules are detected and no warning is emitted.
  • A negative --top-n will be interpreted as a slice from the end of the owners list due to Python's slicing behavior (e.g., -1 selects all but the last owner). This is not the intended behavior for selecting the top N owners and will produce an incorrect CODEOWNERS file.
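The two failure modes above can be reproduced with plain Python, independent of the script. The snippet below is only an illustration: the directory components and owner names are made up, and the assumption that module detection loops over `range(1, depth + 1)` is a guess at the shape of the real code, not taken from it.

```python
# Made-up inputs for illustration only.
path_parts = ["pkg", "sub", "leaf"]          # directory components of some file
owners = ["alice", "bob", "carol", "dave"]   # owners ranked best first

# Negative --depth: min() returns the negative bound.
depth = min(len(path_parts), -1)
print(depth)                      # -1

# Assumed loop shape: an empty range means the loop body never runs,
# so no module is recorded and nothing is reported.
print(list(range(1, depth + 1)))  # []

# Negative --top-n: the slice drops owners from the end instead of
# keeping the first N.
print(owners[:-1])                # ['alice', 'bob', 'carol']
```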

To prevent this, you should validate that these arguments are non-negative integers. You can do this by creating a custom type function for argparse.

Here is an example of a validator function you could add before main():

import argparse

def non_negative_int(value):
    """Custom argparse type for non-negative integers."""
    # Keep only the conversion inside the try block so the range check
    # below cannot be confused with a parsing failure.
    try:
        ivalue = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
    if ivalue < 0:
        raise argparse.ArgumentTypeError(f"{value!r} must be a non-negative integer")
    return ivalue

You can then use it in add_argument like this:
type=non_negative_int
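For context, a minimal self-contained sketch of the validator wired into both arguments is shown below. `build_parser` is a hypothetical helper introduced only for this example; the real script builds its parser inside its own code, and the help strings are copied from the diff above.

```python
import argparse


def non_negative_int(value: str) -> int:
    """argparse type that accepts only integers >= 0."""
    try:
        ivalue = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer") from None
    if ivalue < 0:
        raise argparse.ArgumentTypeError(f"{value!r} must be a non-negative integer")
    return ivalue


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the script's parser; only the two new flags.
    parser = argparse.ArgumentParser(description="CODEOWNERS analyzer (sketch)")
    parser.add_argument(
        "--depth",
        type=non_negative_int,
        default=3,
        help="Maximum directory depth for module detection (default: 3)",
    )
    parser.add_argument(
        "--top-n",
        type=non_negative_int,
        default=3,
        help="Number of top owners to include in CODEOWNERS file (default: 3)",
    )
    return parser


args = build_parser().parse_args(["--depth", "2", "--top-n", "5"])
print(args.depth, args.top_n)  # 2 5
```

With this in place, an invocation such as `--depth=-1` makes argparse print a usage error and exit with status 2 instead of silently producing an empty or truncated result.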

Comment on lines +784 to +795
    parser.add_argument(
        "--depth",
        type=int,
        default=3,
        help="Maximum directory depth for module detection (default: 3)",
    )
    parser.add_argument(
        "--top-n",
        type=int,
        default=3,
        help="Number of top owners to include in CODEOWNERS file (default: 3)",
    )
Contributor

medium

While adding these new configuration options is a great improvement, the pull request is missing tests to verify the new functionality. Adding tests is crucial for ensuring the features work as expected and to prevent future regressions.

Please consider adding unit tests that cover:

  • The --depth argument correctly limits the module hierarchy being generated.
  • The --top-n argument correctly selects the specified number of top owners.
  • Edge cases, such as --depth=0 or --top-n=0.
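One possible shape for such tests is sketched below with `unittest`. Because the signatures of `get_modules` and `generate_codeowners_file` are not shown in this conversation, the tests run against two hypothetical stand-in functions (`modules_for`, `top_owners`) that mimic the described behavior; real tests would call the analyzer itself.

```python
import unittest


def modules_for(dir_parts, max_depth):
    # Hypothetical stand-in: parent directories up to max_depth levels deep.
    depth = min(len(dir_parts), max_depth)
    return ["/".join(dir_parts[:i]) for i in range(1, depth + 1)]


def top_owners(ranked, top_n):
    # Hypothetical stand-in: keep the first top_n owners of a ranked list.
    return ranked[:top_n]


class DepthAndTopNTests(unittest.TestCase):
    PARTS = ["pkg", "sub", "leaf"]               # made-up directory components
    OWNERS = ["alice", "bob", "carol", "dave"]   # made-up ranked owners

    def test_depth_limits_hierarchy(self):
        self.assertEqual(modules_for(self.PARTS, 2), ["pkg", "pkg/sub"])

    def test_depth_larger_than_path(self):
        self.assertEqual(
            modules_for(self.PARTS, 10), ["pkg", "pkg/sub", "pkg/sub/leaf"]
        )

    def test_top_n_selects_prefix(self):
        self.assertEqual(top_owners(self.OWNERS, 2), ["alice", "bob"])

    def test_zero_edge_cases(self):
        self.assertEqual(modules_for(self.PARTS, 0), [])
        self.assertEqual(top_owners(self.OWNERS, 0), [])


# Run explicitly so the snippet also works when pasted into a larger script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DepthAndTopNTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```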

@nvmbreughe
Contributor

LGTM

env:
  GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
  python scripts/codeowner_analyzer.py \
Member

Can we use --allowed-users-file here to explicitly control the group of trusted codeowners?

Collaborator Author

Yes, do you want this list to be public or private?

Member

I think public is ok. This will give people an opportunity to either ask to be removed (if they no longer want to be a codeowner) or ask to be included (if they want to be but aren't for some reason).
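As a rough illustration of what an allow-list could do, the sketch below filters a ranked owner list against a public file of trusted usernames. Everything here is an assumption: the file format (one GitHub username per line, optional leading `@`, `#` comments) and the helper names `load_allowed_users` and `filter_owners` are invented for the example and may not match how `--allowed-users-file` is actually parsed.

```python
def load_allowed_users(text: str) -> set[str]:
    """Parse an assumed allow-list format: one username per line, '#' comments."""
    allowed = set()
    for line in text.splitlines():
        name = line.split("#", 1)[0].strip().lstrip("@")
        if name:
            # GitHub usernames are case-insensitive, so compare in lowercase.
            allowed.add(name.lower())
    return allowed


def filter_owners(ranked: list[str], allowed: set[str], top_n: int = 3) -> list[str]:
    """Drop owners not on the allow-list, then keep the first top_n."""
    return [owner for owner in ranked if owner.lower() in allowed][:top_n]


# Made-up file contents and owner ranking, for illustration only.
allow_list = "alice\n@Bob  # maintainer\n\n# retired: carol\n"
allowed = load_allowed_users(allow_list)
print(filter_owners(["carol", "Bob", "alice", "dave"], allowed))  # ['Bob', 'alice']
```

Filtering before truncating to `top_n` means a module still gets up to N trusted owners even when its most active contributors are not on the list.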

@nvmbreughe
Contributor

nvmbreughe commented Oct 6, 2025

Should we add tests/ to the codeowners anyway (noticed it got intentionally excluded)? If a test fails, it may be good to find the person who is most knowledgeable about it. Sometimes it's easy to find the associated module, sometimes it may not be.

Just a nitpick, since we could also use git blame.
