feat: add support for Co-authored-by commit trailers with username correlation #373
base: main
Changes from all commits
e5ddef5
0a5bba8
f206280
9798816
4454fe7
211543e
1fabe84
b14feb0
306e4fc
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -84,20 +84,23 @@ This action can be configured to authenticate with GitHub App Installation or Pe | |
|
|
||
| #### Other Configuration Options | ||
|
|
||
| | field | required | default | description | | ||
| | ------------------- | ----------------------------------------------- | ----------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | ||
| | `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. | | ||
| | `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` | | ||
| | `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` | | ||
| | `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. | | ||
| | `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` | | ||
| | `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" | | ||
| | `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" | | ||
| | field | required | default | description | | ||
| | ----------------------- | ----------------------------------------------- | ----------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | ||
| | `GH_ENTERPRISE_URL` | False | "" | The `GH_ENTERPRISE_URL` is used to connect to an enterprise server instance of GitHub. github.com users should not enter anything here. | | ||
| | `ORGANIZATION` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the GitHub organization which you want the contributor information of all repos from. ie. github.com/github would be `github` | | ||
| | `REPOSITORY` | Required to have `ORGANIZATION` or `REPOSITORY` | | The name of the repository and organization which you want the contributor information from. ie. `github/contributors` or a comma separated list of multiple repositories `github/contributor,super-linter/super-linter` | | ||
| | `START_DATE` | False | Beginning of time | The date from which you want to start gathering contributor information. ie. Aug 1st, 2023 would be `2023-08-01`. | | ||
| | `END_DATE` | False | Current Date | The date at which you want to stop gathering contributor information. Must be later than the `START_DATE`. ie. Aug 2nd, 2023 would be `2023-08-02` | | ||
| | `SPONSOR_INFO` | False | False | If you want to include sponsor information in the output. This will include the sponsor count and the sponsor URL. This will impact action performance. ie. SPONSOR_INFO = "False" or SPONSOR_INFO = "True" | | ||
| | `LINK_TO_PROFILE` | False | True | If you want to link usernames to their GitHub profiles in the output. ie. LINK_TO_PROFILE = "True" or LINK_TO_PROFILE = "False" | | ||
| | `ACKNOWLEDGE_COAUTHORS` | False | True | If you want to include co-authors from commit messages as contributors. Co-authors are identified via the `Co-authored-by:` trailer in commit messages. The action will extract GitHub usernames from GitHub noreply emails (e.g., `username@users.noreply.github.com`) or use the full email address for other email domains. This will impact action performance as it requires scanning all commits. ie. ACKNOWLEDGE_COAUTHORS = "True" or ACKNOWLEDGE_COAUTHORS = "False" | | ||
|
|
||
| **Note**: If `start_date` and `end_date` are specified then the action will determine if the contributor is new. A new contributor is one that has contributed in the date range specified but not before the start date. | ||
|
|
||
| **Performance Note:** Using start and end dates will reduce speed of the action by approximately 63X. ie without dates if the action takes 1.7 seconds, it will take 1 minute and 47 seconds. | ||
|
|
||
| **Co-authors Note:** When `ACKNOWLEDGE_COAUTHORS` is enabled, the action will scan commit messages for `Co-authored-by:` trailers and include those users as contributors. For GitHub noreply email addresses (e.g., `username@users.noreply.github.com`), the username will be extracted. For other email addresses (e.g., `[email protected]`), the full email address will be used as the contributor identifier. See [GitHub's documentation on creating commits with multiple authors](https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors). | ||
|
|
||
| ### Example workflows | ||
|
|
||
| **Be sure to change at least these values: `<YOUR_ORGANIZATION_GOES_HERE>`, `<YOUR_GITHUB_HANDLE_HERE>`** | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1,6 +1,7 @@ | ||||||||||||||||||||||||
| # pylint: disable=broad-exception-caught | ||||||||||||||||||||||||
| """This file contains the main() and other functions needed to get contributor information from the organization or repository""" | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| import re | ||||||||||||||||||||||||
| from typing import List | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| import auth | ||||||||||||||||||||||||
|
|
@@ -27,6 +28,7 @@ def main(): | |||||||||||||||||||||||
| end_date, | ||||||||||||||||||||||||
| sponsor_info, | ||||||||||||||||||||||||
| link_to_profile, | ||||||||||||||||||||||||
| acknowledge_coauthors, | ||||||||||||||||||||||||
| ) = env.get_env_vars() | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Auth to GitHub.com | ||||||||||||||||||||||||
|
|
@@ -46,7 +48,13 @@ def main(): | |||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Get the contributors | ||||||||||||||||||||||||
| contributors = get_all_contributors( | ||||||||||||||||||||||||
| organization, repository_list, start_date, end_date, github_connection, ghe | ||||||||||||||||||||||||
| organization, | ||||||||||||||||||||||||
| repository_list, | ||||||||||||||||||||||||
| start_date, | ||||||||||||||||||||||||
| end_date, | ||||||||||||||||||||||||
| github_connection, | ||||||||||||||||||||||||
| ghe, | ||||||||||||||||||||||||
| acknowledge_coauthors, | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Check for new contributor if user provided start_date and end_date | ||||||||||||||||||||||||
|
|
@@ -60,6 +68,7 @@ def main(): | |||||||||||||||||||||||
| end_date=start_date, | ||||||||||||||||||||||||
| github_connection=github_connection, | ||||||||||||||||||||||||
| ghe=ghe, | ||||||||||||||||||||||||
| acknowledge_coauthors=acknowledge_coauthors, | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
| for contributor in contributors: | ||||||||||||||||||||||||
| contributor.new_contributor = contributor_stats.is_new_contributor( | ||||||||||||||||||||||||
|
|
@@ -103,6 +112,7 @@ def get_all_contributors( | |||||||||||||||||||||||
| end_date: str, | ||||||||||||||||||||||||
| github_connection: object, | ||||||||||||||||||||||||
| ghe: str, | ||||||||||||||||||||||||
| acknowledge_coauthors: bool, | ||||||||||||||||||||||||
| ): | ||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||
| Get all contributors from the organization or repository | ||||||||||||||||||||||||
|
|
@@ -113,6 +123,8 @@ def get_all_contributors( | |||||||||||||||||||||||
| start_date (str): The start date of the date range for the contributor list. | ||||||||||||||||||||||||
| end_date (str): The end date of the date range for the contributor list. | ||||||||||||||||||||||||
| github_connection (object): The authenticated GitHub connection object from PyGithub | ||||||||||||||||||||||||
| ghe (str): The GitHub Enterprise URL to use for authentication | ||||||||||||||||||||||||
| acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Returns: | ||||||||||||||||||||||||
| all_contributors (list): A list of ContributorStats objects | ||||||||||||||||||||||||
|
|
@@ -130,7 +142,14 @@ def get_all_contributors( | |||||||||||||||||||||||
| all_contributors = [] | ||||||||||||||||||||||||
| if repos: | ||||||||||||||||||||||||
| for repo in repos: | ||||||||||||||||||||||||
| repo_contributors = get_contributors(repo, start_date, end_date, ghe) | ||||||||||||||||||||||||
| repo_contributors = get_contributors( | ||||||||||||||||||||||||
| repo, | ||||||||||||||||||||||||
| start_date, | ||||||||||||||||||||||||
| end_date, | ||||||||||||||||||||||||
| ghe, | ||||||||||||||||||||||||
| acknowledge_coauthors, | ||||||||||||||||||||||||
| github_connection, | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
| if repo_contributors: | ||||||||||||||||||||||||
| all_contributors.append(repo_contributors) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
|
@@ -140,20 +159,91 @@ def get_all_contributors( | |||||||||||||||||||||||
| return all_contributors | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| def get_contributors(repo: object, start_date: str, end_date: str, ghe: str): | ||||||||||||||||||||||||
| def get_coauthors_from_message( | ||||||||||||||||||||||||
| commit_message: str, github_connection: object = None | ||||||||||||||||||||||||
| ) -> List[str]: | ||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||
| Extract co-author identifiers from a commit message. | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Co-authored-by trailers follow the format: | ||||||||||||||||||||||||
| Co-authored-by: Name <email> | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| For GitHub noreply emails (username@users.noreply.github.com), extracts the username. | ||||||||||||||||||||||||
| For @github.com emails, extracts the username (part before @). | ||||||||||||||||||||||||
| For other emails, uses GitHub Search Users API to find the username, or falls back to email. | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Args: | ||||||||||||||||||||||||
| commit_message (str): The commit message to parse | ||||||||||||||||||||||||
| github_connection (object): The authenticated GitHub connection object from PyGithub | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Returns: | ||||||||||||||||||||||||
| List[str]: List of co-author identifiers (GitHub usernames or email addresses) | ||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||
| # Match Co-authored-by trailers - case insensitive | ||||||||||||||||||||||||
| # Format: Co-authored-by: Name <email> | ||||||||||||||||||||||||
| pattern = r"Co-authored-by:\s*[^<]*<([^>]+)>" | ||||||||||||||||||||||||
| matches = re.findall(pattern, commit_message, re.IGNORECASE) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| identifiers = [] | ||||||||||||||||||||||||
| for email in matches: | ||||||||||||||||||||||||
| # Check if it's a GitHub noreply email format: username@users.noreply.github.com | ||||||||||||||||||||||||
| noreply_pattern = r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$" | ||||||||||||||||||||||||
| noreply_match = re.match(noreply_pattern, email) | ||||||||||||||||||||||||
| if noreply_match: | ||||||||||||||||||||||||
| # For GitHub noreply emails, extract just the username | ||||||||||||||||||||||||
| identifiers.append(noreply_match.group(2)) | ||||||||||||||||||||||||
| elif email.endswith("@github.com"): | ||||||||||||||||||||||||
| # For @github.com emails, extract the username (part before @) | ||||||||||||||||||||||||
| username = email.split("@")[0] | ||||||||||||||||||||||||
| identifiers.append(username) | ||||||||||||||||||||||||
|
Comment on lines +195 to +198
|
||||||||||||||||||||||||
| else: | ||||||||||||||||||||||||
| # For other emails, try to find GitHub username using Search Users API | ||||||||||||||||||||||||
| if github_connection: | ||||||||||||||||||||||||
| try: | ||||||||||||||||||||||||
| # Search for users by email | ||||||||||||||||||||||||
| search_result = github_connection.search_users(f"email:{email}") | ||||||||||||||||||||||||
| if search_result.totalCount > 0: | ||||||||||||||||||||||||
| # Use the first matching user's login | ||||||||||||||||||||||||
| identifiers.append(search_result[0].login) | ||||||||||||||||||||||||
| else: | ||||||||||||||||||||||||
| # If no user found, fall back to email address | ||||||||||||||||||||||||
| identifiers.append(email) | ||||||||||||||||||||||||
| except Exception: | ||||||||||||||||||||||||
| # If API call fails, fall back to email address | ||||||||||||||||||||||||
| identifiers.append(email) | ||||||||||||||||||||||||
|
Comment on lines +200 to +213
|
||||||||||||||||||||||||
| else: | ||||||||||||||||||||||||
| # If no GitHub connection available, use the full email address | ||||||||||||||||||||||||
| identifiers.append(email) | ||||||||||||||||||||||||
| return identifiers | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| def get_contributors( | ||||||||||||||||||||||||
| repo: object, | ||||||||||||||||||||||||
| start_date: str, | ||||||||||||||||||||||||
| end_date: str, | ||||||||||||||||||||||||
| ghe: str, | ||||||||||||||||||||||||
| acknowledge_coauthors: bool, | ||||||||||||||||||||||||
| github_connection: object, | ||||||||||||||||||||||||
| ): | ||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||
| Get contributors from a single repository and filter by start end dates if present. | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Args: | ||||||||||||||||||||||||
| repo (object): The repository object from PyGithub | ||||||||||||||||||||||||
| start_date (str): The start date of the date range for the contributor list. | ||||||||||||||||||||||||
| end_date (str): The end date of the date range for the contributor list. | ||||||||||||||||||||||||
| ghe (str): The GitHub Enterprise URL to use for authentication | ||||||||||||||||||||||||
| acknowledge_coauthors (bool): Whether to acknowledge co-authors from commit messages | ||||||||||||||||||||||||
| github_connection (object): The authenticated GitHub connection object from PyGithub | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| Returns: | ||||||||||||||||||||||||
| contributors (list): A list of ContributorStats objects | ||||||||||||||||||||||||
| """ | ||||||||||||||||||||||||
| all_repo_contributors = repo.contributors() | ||||||||||||||||||||||||
| contributors = [] | ||||||||||||||||||||||||
| # Track usernames already added as contributors | ||||||||||||||||||||||||
| contributor_usernames = set() | ||||||||||||||||||||||||
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| try: | ||||||||||||||||||||||||
| for user in all_repo_contributors: | ||||||||||||||||||||||||
| # Ignore contributors with [bot] in their name | ||||||||||||||||||||||||
|
|
@@ -187,6 +277,19 @@ def get_contributors(repo: object, start_date: str, end_date: str, ghe: str): | |||||||||||||||||||||||
| "", | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
| contributors.append(contributor) | ||||||||||||||||||||||||
| contributor_usernames.add(user.login) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| # Get co-authors from commit messages if enabled | ||||||||||||||||||||||||
| if acknowledge_coauthors: | ||||||||||||||||||||||||
| coauthor_contributors = get_coauthor_contributors( | ||||||||||||||||||||||||
| repo, | ||||||||||||||||||||||||
| start_date, | ||||||||||||||||||||||||
| end_date, | ||||||||||||||||||||||||
| ghe, | ||||||||||||||||||||||||
| github_connection, | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
| contributors.extend(coauthor_contributors) | ||||||||||||||||||||||||
|
||||||||||||||||||||||||
| contributors.extend(coauthor_contributors) | |
| # Avoid adding duplicate contributors for the same username within this repository | |
| filtered_coauthors = [] | |
| for coauthor in coauthor_contributors: | |
| username = getattr(coauthor, "username", None) or getattr( | |
| coauthor, "login", None | |
| ) | |
| if username and username not in contributor_usernames: | |
| filtered_coauthors.append(coauthor) | |
| contributor_usernames.add(username) | |
| contributors.extend(filtered_coauthors) |
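The de-duplication suggested above can be illustrated in isolation. This is a minimal sketch, not the project's code: the `Contributor` dataclass and the `merge_without_duplicates` helper are hypothetical stand-ins for the action's `ContributorStats` objects, and only the `username` attribute is assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Contributor:
    """Hypothetical stand-in for the action's contributor objects."""

    username: str


def merge_without_duplicates(
    existing: List[Contributor], coauthors: List[Contributor]
) -> List[Contributor]:
    """Append co-authors whose username is not already in the list."""
    # Usernames already present among the regular contributors
    seen = {contributor.username for contributor in existing}
    merged = list(existing)
    for coauthor in coauthors:
        # Skip empty usernames and anyone already counted
        if coauthor.username and coauthor.username not in seen:
            merged.append(coauthor)
            seen.add(coauthor.username)
    return merged
```

Tracking the seen usernames in a set keeps each membership check constant-time, which matters when a repository has many commits with trailers.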
Copilot AI, Dec 31, 2025
Bot accounts listed as co-authors via Co-authored-by: trailers will be included in the contributor list, while bot accounts that are regular contributors are filtered out (line 250). Consider applying the same bot filtering logic to co-authors for consistency by checking if "[bot]" is in the username before adding them to coauthor_counts.
| for username in coauthors: | |
| for username in coauthors: | |
| # Skip bot accounts for consistency with regular contributor filtering | |
| if "[bot]" in username.lower(): | |
| continue |
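As a standalone illustration of the filtering suggested above, the following sketch drops any identifier containing `[bot]`, compared case-insensitively. The helper name `drop_bot_accounts` is hypothetical and does not appear in the pull request.

```python
from typing import Iterable, List


def drop_bot_accounts(usernames: Iterable[str]) -> List[str]:
    """Return the usernames that do not contain "[bot]" (case-insensitive)."""
    return [name for name in usernames if "[bot]" not in name.lower()]
```

Applying the same check to both regular contributors and co-authors would keep the two code paths consistent, which is the point the review comment raises.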
The documentation is incomplete. It mentions that GitHub noreply emails extract the username and other emails use the full email address, but it doesn't document that @github.com email addresses also extract the username (part before @), or that the action attempts to use the GitHub Search Users API to find usernames for other email addresses before falling back to the email address.
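The resolution order described in this comment can be sketched without network access. This is a simplified illustration under assumptions, not the pull request's implementation: the two regular expressions are copied from the diff, while `resolve_identifier`, `coauthors_from_message`, and the `lookup` callable (standing in for the Search Users API call) are hypothetical names.

```python
import re
from typing import Callable, List, Optional

# Patterns as they appear in the diff
TRAILER_RE = re.compile(r"Co-authored-by:\s*[^<]*<([^>]+)>", re.IGNORECASE)
NOREPLY_RE = re.compile(r"^(\d+\+)?([^@]+)@users\.noreply\.github\.com$")

# Hypothetical type: maps an email to a login, or None when nothing matches
Lookup = Callable[[str], Optional[str]]


def resolve_identifier(email: str, lookup: Optional[Lookup] = None) -> str:
    """Resolve one co-author email to a username, falling back to the email."""
    noreply = NOREPLY_RE.match(email)
    if noreply:
        # 1. GitHub noreply address: the username follows the optional "id+"
        return noreply.group(2)
    if email.endswith("@github.com"):
        # 2. @github.com address: use the part before the "@"
        return email.split("@")[0]
    if lookup is not None:
        # 3. Any other address: try the lookup, tolerating failures
        try:
            login = lookup(email)
        except Exception:
            login = None
        if login:
            return login
    # 4. Nothing resolved: keep the full email address
    return email


def coauthors_from_message(
    message: str, lookup: Optional[Lookup] = None
) -> List[str]:
    """Extract and resolve every Co-authored-by trailer in a commit message."""
    return [resolve_identifier(email, lookup) for email in TRAILER_RE.findall(message)]
```

Documenting these four steps in the README in the same order would address the gap the comment points out.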