Open
49 commits
0ebb170
#4036: Clone tool/core into cycle dir. Symlink into python/lib
chrisbillowsMO May 21, 2025
2fc7ecc
#4036: Configure rose to set previous cycle point env var
chrisbillowsMO May 21, 2025
9ebd0f2
#4036: Add git logs to report. Includes dev run local option.
chrisbillowsMO May 21, 2025
3ae7fa9
#4036: Unindent line
chrisbillowsMO May 21, 2025
7ee9dce
#4036: Import DKRZ container vars. Test subprocess cmd
chrisbillowsMO May 21, 2025
529f350
#4036: Run fetch_packages cmd on dkrz
chrisbillowsMO May 21, 2025
57b55fd
#4036: Correct func call
chrisbillowsMO May 21, 2025
237b814
#4036: Pass site env file into script
chrisbillowsMO May 21, 2025
8e221af
#4036: Use site env in command
chrisbillowsMO May 21, 2025
e300455
#4036: Tweak command
chrisbillowsMO May 21, 2025
290ef02
#4036: Update env-file path
chrisbillowsMO May 21, 2025
bc6163b
#4036: Call command in rose conf. Call with sing env in python
chrisbillowsMO May 21, 2025
78b2d16
#4036: Run only version command in rose-app.conf
chrisbillowsMO May 21, 2025
bdcfae2
#4036: Make generate report multisite. Tweak env files. DKRZ get vers…
chrisbillowsMO May 27, 2025
d10ddc8
#4036: Run rose config dump
chrisbillowsMO May 27, 2025
dbb0557
#4036: Remove development/local version
chrisbillowsMO May 28, 2025
aad0813
#4036: WIP commit. Split sha/commit fetching into modules. Refactor l…
chrisbillowsMO May 30, 2025
f7f0918
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO May 30, 2025
8f3f8b2
#4036: WIP commit. Refactored MO/git path working
chrisbillowsMO May 30, 2025
9ed4b06
#4036: Add WIP commits from home machine. Add test main for dkrz (#4079)
chrisbillows Jun 1, 2025
cfd8048
#4036: Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 2, 2025
15ebd45
#4036: DKRZ site test passing
chrisbillowsMO Jun 2, 2025
52f8380
#4036: dkrz tweaks. run every 10mins for testing
chrisbillowsMO Jun 9, 2025
5a6f339
4036: Minor DKRZ fixes (running). Create container versions bash script
chrisbillowsMO Jun 9, 2025
848514c
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 9, 2025
469d32b
#4036: Fix shellcheck errors
chrisbillowsMO Jun 9, 2025
c5bdcda
#4036: Fix dkrz bash script
chrisbillowsMO Jun 10, 2025
87eb896
#4036: Unify imports
chrisbillowsMO Jun 10, 2025
f06ce76
#4036: Refactor sha extraction code
chrisbillowsMO Jun 10, 2025
8e6c5fe
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 11, 2025
625f0ad
#4036: Various minor refactors, minor reformatting etc.
chrisbillowsMO Jun 11, 2025
baaa803
#4036: Improve jinja template layout
chrisbillowsMO Jun 11, 2025
6cf94cf
#4036: Codacy tweaks
chrisbillowsMO Jun 11, 2025
5581222
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 11, 2025
ee0fe53
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 16, 2025
26c3cce
#4036: WIP commit adding GitHub API calls
chrisbillowsMO Jun 17, 2025
a7b4a50
#4036: Add docstrings to tests. Skips some tests while adding gh API
chrisbillowsMO Jun 18, 2025
e7d1148
#4036: MO working
chrisbillowsMO Jun 19, 2025
db931d9
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 19, 2025
55c5d81
#4036: Tweak container regex. Working on dkrz.
chrisbillowsMO Jun 20, 2025
0a1dd53
#4036: Add debug logs to report. Working on MO
chrisbillowsMO Jun 25, 2025
5c046d4
#4036: Fix backslashes. Handle missing esmv debugs. Add report warning
chrisbillowsMO Jun 26, 2025
0f7581c
#4036: Remove development hacks. Add recipe display name
chrisbillowsMO Jun 27, 2025
6f23b85
#4036: Tweaks for DKRZ
chrisbillowsMO Jun 27, 2025
df74b38
#4036: Add documentation
chrisbillowsMO Jun 27, 2025
cabfdbd
Merge branch 'main' into 4036_add_info_to_rtw_status_report
chrisbillowsMO Jun 27, 2025
d7d371e
#4036: Revert cycle to once per day
chrisbillowsMO Jun 27, 2025
801bb16
Merge branch '4036_add_info_to_rtw_status_report' of github.com:ESMVa…
chrisbillowsMO Jun 27, 2025
0e243e0
#4036: Adjust previous cylc env var
chrisbillowsMO Jun 27, 2025
13 changes: 13 additions & 0 deletions doc/sphinx/source/utils/RTW/user_guide/workflow.rst
@@ -65,6 +65,19 @@ The |RTW| performs the following steps:
    Runs each cycle for every recipe defined in the |RTW| after ``process``
    has completed

``generate_report``
  :Description:
    Generates a per-recipe HTML summary of the results of the ``process``
    and ``compare`` jobs.
  :Runs on:
    Localhost
  :Executes:
    The ``generate_html_report.py`` script from the |Rose| app, and other
    helper scripts depending on ``SITE``.
  :Details:
    Runs for every cycle. The report is output to the |Cylc| share/cycle
    directory.

``housekeeping``
  :Description:
    Removes the logs and data (including recipe outputs)
@@ -0,0 +1,242 @@
"""Fetch commit details from the GitHub API."""

import os

import requests

GITHUB_API_URL = "https://api.github.com"
GITHUB_API_PERSONAL_ACCESS_TOKEN = os.environ.get(
    "GITHUB_API_PERSONAL_ACCESS_TOKEN"
)
HEADERS = {
    "authorization": f"token {GITHUB_API_PERSONAL_ACCESS_TOKEN}",
    # Suggested here:
    # https://docs.github.com/en/rest/commits/commits?apiVersion=2022-11-28#get-a-commit--parameters
    # Explanation here:
    # https://docs.github.com/en/rest/overview/resources-in-the-rest-api#http-HEADERS
    "accept": "application/vnd.github+json",
}


def fetch_commit_details_from_github_api(
    shas_by_package_and_day, headers=HEADERS
):
    """
    Fetch commit details from the GitHub API for the given SHAs.

    Parameters
    ----------
    shas_by_package_and_day : dict[str, dict[str, str]]
        A dictionary where keys are the package names and values are
        dictionaries with days as keys and SHAs as values. E.g.
        {"ESMValCore": {"today": "abcd123", "yesterday": "efgh456"}...}.
    headers : dict, optional
        Headers to include in each request. Defaults to ``HEADERS``.

    Returns
    -------
    dict[str, list[dict]]
        A dictionary where keys are the package names and values are lists of
        processed commit details. E.g.
        {"ESMValCore": [{"sha": "abcd123", ...}, ...], "ESMValTool": [...]}
    """
    commit_details_by_package = {}
    for package, shas_by_day in shas_by_package_and_day.items():
        # Fetch only today's commit when there is no previous SHA to compare
        # against, or when the SHA has not changed since yesterday.
        if shas_by_day.get("yesterday") is None or shas_by_day.get(
            "today"
        ) == shas_by_day.get("yesterday"):
            raw_commits = fetch_single_commit(
                package, "ESMValGroup", headers, shas_by_day["today"]
            )
        else:
            raw_commits = fetch_range_of_commits(
                package,
                "ESMValGroup",
                headers,
                newer_sha=shas_by_day["today"],
                older_sha=shas_by_day["yesterday"],
            )
        commit_info = process_commit_info(raw_commits)
        commit_details_by_package[package] = commit_info
    return commit_details_by_package


def make_api_call(url, headers=None, params=None):
    """
    Make a GET request to a given API url.

    Parameters
    ----------
    url : str
        The URL to make the request to.
    headers : dict, optional
        Headers to include in the request.
    params : dict, optional
        Query parameters to include in the request.

    Raises
    ------
    requests.exceptions.HTTPError
        If the response has a status code other than 200. This includes
        rate-limit responses from the API.
    requests.exceptions.Timeout
        If the request times out.
    requests.exceptions.ConnectionError
        If there is a connection error.

    Returns
    -------
    Response
        The raw response from the API call.
    """
    try:
        response = requests.get(
            url, headers=headers, params=params, timeout=10
        )
        if response.status_code != 200:
            # Headers are left out of the message because they carry the
            # personal access token.
            raise requests.exceptions.HTTPError(
                f"Unexpected status code for url={url} "
                f"params={params} - {response.status_code}: {response.text}"
            )
    except requests.exceptions.HTTPError as http_err:
        raise requests.exceptions.HTTPError(
            f"HTTP error occurred: {http_err}"
        ) from http_err
    except requests.exceptions.Timeout as timeout_err:
        raise requests.exceptions.Timeout(
            f"Request timed out: {timeout_err}"
        ) from timeout_err
    except requests.exceptions.ConnectionError as conn_err:
        raise requests.exceptions.ConnectionError(
            f"Connection error occurred: {conn_err}"
        ) from conn_err
    return response


def fetch_single_commit(repo, owner, headers, sha):
    """
    Fetch details of a single commit from the GitHub API.

    Parameters
    ----------
    repo : str
        The name of the repository. E.g. "ESMValTool"
    owner : str
        The owner of the repository. E.g. "ESMValGroup"
    headers : dict
        Headers to include in the request.
    sha : str
        The SHA of the commit to fetch details for.

    Raises
    ------
    HTTPError
        If the commit is not found or if the request fails etc.

    Returns
    -------
    dict
        The raw commit data if found.
    """
    url = f"{GITHUB_API_URL}/repos/{owner}/{repo}/commits/{sha}"
    response = make_api_call(url, headers=headers)
    raw_commit = response.json()
    return raw_commit


def fetch_range_of_commits(repo, owner, headers, newer_sha, older_sha):
    """
    Fetch details for a range of commits from the GitHub API.

    The endpoint returns commits in reverse chronological order, starting at
    the newer SHA and working back towards the older SHA. The function
    fetches batches of 10 commits to avoid hitting the API rate limits, and
    stops after 5 pages. NOTE: An HTTPError is raised if the GitHub API cannot
    find the newer SHA.

    Parameters
    ----------
    repo : str
        The name of the repository. E.g. "ESMValTool"
    owner : str
        The owner of the repository. E.g. "ESMValGroup"
    headers : dict
        Headers to include in the request.
    newer_sha : str
        The SHA of the first commit to start fetching details for.
    older_sha : str
        The SHA of the commit to stop fetching at.

    Raises
    ------
    HTTPError
        If the newer SHA is not found (or if the request fails etc.)
    ValueError
        If the older SHA is not found within 5 pages (50 commits), indicating
        a potential infinite loop.

    Returns
    -------
    list[dict]
        A list of raw commit data for the range of commits from newer_sha to
        older_sha inclusive, newest first.
    """
    url = f"{GITHUB_API_URL}/repos/{owner}/{repo}/commits"
    params = {
        "per_page": 10,
        "sha": newer_sha,
    }
    page = 1
    range_raw_commits = []

    fetched_end_sha = False

    while not fetched_end_sha:
        # Check the limit before requesting, so that finding the older SHA on
        # the fifth page is not reported as an error.
        if page > 5:
            raise ValueError(
                "Too many pages fetched, likely an infinite loop. Check the "
                "newer and older SHAs."
            )
        params["page"] = page
        response = make_api_call(url, headers=headers, params=params)

        page_raw_commits = response.json()
        for raw_commit in page_raw_commits:
            range_raw_commits.append(raw_commit)
            if raw_commit["sha"].startswith(older_sha):
                fetched_end_sha = True
                break

        page += 1

    return range_raw_commits


def process_commit_info(raw_commit_info):
    """
    Extract required commit details.

    Parameters
    ----------
    raw_commit_info : dict | list[dict]
        Raw commit information from the GitHub API. Either a single commit or a
        list of commits.

    Returns
    -------
    list[dict]
        A list of dictionaries containing processed commit information.
    """

    if not isinstance(raw_commit_info, list):
        raw_commit_info = [raw_commit_info]

    processed_commit_info = []
    for raw_commit in raw_commit_info:
        # The top-level "author" is null when the commit author is not linked
        # to a GitHub account, so fall back to no avatar in that case.
        github_author = raw_commit.get("author") or {}
        processed_commit = {
            "sha": raw_commit["sha"][:7],
            "author": raw_commit["commit"]["author"]["name"],
            "message": raw_commit["commit"]["message"],
            "date": raw_commit["commit"]["author"]["date"],
            "url": raw_commit["html_url"],
            "author_avatar": github_author.get("avatar_url"),
        }
        processed_commit_info.append(processed_commit)
    return processed_commit_info
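
The paging loop in `fetch_range_of_commits` can be exercised without network access. The following is a minimal sketch, not part of this change: `collect_until_sha`, `fake_fetch_page` and the SHAs in `FAKE_PAGES` are hypothetical names and data, with `fake_fetch_page` standing in for the HTTP request. It assumes pages arrive newest first and that the search should stop with an error once `max_pages` is exhausted.

```python
"""Offline sketch of paging through commits until a known SHA is reached."""


def collect_until_sha(fetch_page, older_sha, max_pages=5):
    """Collect commits, newest first, until ``older_sha`` is seen.

    ``fetch_page`` is a hypothetical callable that takes a 1-based page
    number and returns a list of commit dicts. It stands in for the request
    made by the real module.
    """
    collected = []
    for page in range(1, max_pages + 1):
        commits = fetch_page(page)
        if not commits:
            # Ran off the end of the history without finding the SHA.
            break
        for commit in commits:
            collected.append(commit)
            if commit["sha"].startswith(older_sha):
                return collected
    raise ValueError(f"{older_sha!r} not found within {max_pages} pages")


# Made-up history: two pages of two commits each, newest first.
FAKE_PAGES = {
    1: [{"sha": "ddd4444"}, {"sha": "ccc3333"}],
    2: [{"sha": "bbb2222"}, {"sha": "aaa1111"}],
}


def fake_fetch_page(page):
    """Return the canned page, or an empty list past the end."""
    return FAKE_PAGES.get(page, [])


commits = collect_until_sha(fake_fetch_page, older_sha="bbb2222")
print([commit["sha"] for commit in commits])
# prints ['ddd4444', 'ccc3333', 'bbb2222']
```

Bounding the loop with `range` means a SHA found on the last permitted page is still returned, and an empty page ends the search early instead of requesting further pages.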