Skip to content

feat: interactive testing environment dashboard on /status (#11506)#12241

Open
mekarpeles wants to merge 3 commits intomasterfrom
11506/feature/pr-deploy-dashboard
Open

feat: interactive testing environment dashboard on /status (#11506)#12241
mekarpeles wants to merge 3 commits intomasterfrom
11506/feature/pr-deploy-dashboard

Conversation

@mekarpeles
Copy link
Copy Markdown
Member

@mekarpeles mekarpeles commented Apr 1, 2026

Summary

Closes #11506

Turns /status into an interactive testing environment management dashboard, available to members of /usergroup/maintainers (or /usergroup/admin). Non-maintainers and logged-out visitors see a read-only view.

  • Commit pinning: PRs are tracked by pinned commit SHA (stored in _testing-prs.json), not branch HEAD. "Pull to Latest" is an explicit action.
  • Dashboard actions: Add PR, Remove, Toggle Active, Pull to Latest, Force Rebuild — all POST endpoints gated behind maintainer usergroup membership (401 for others).
  • Live drift display: GitHub API (no token needed — public repo) fetches current branch HEAD and commit drift on page load.
  • Jenkins integration: _trigger_rebuild() calls Jenkins via jenkins_token config key if present; no-op locally.
  • make-integration-branch.sh: New JSON mode reads active PRs from _testing-prs.json and merges pinned SHAs instead of branch HEADs. Legacy BRANCHES_FILE mode preserved as fallback.

What's not included (follow-up)

  • Wiring the rebuild button to Jenkins end-to-end (requires jenkins_token in olsystem config on testing server)
  • Bridging / deprecating the bookmarklet
  • Auto-cleanup notification of merged PRs (flagged visually; auto-remove is a follow-up)

Test plan

  • Log in as a member of /usergroup/maintainers — Testing Environment section appears with Add PR form
  • Log out / log in as non-maintainer — section hidden or read-only (no action buttons)
  • Add a PR by number — appears in table with pinned commit SHA
  • Add a PR by GitHub URL — parses correctly
  • Toggle active on a PR — active status updates
  • Pull to Latest — pinned commit updates to current branch HEAD
  • Remove a PR — removed from table
  • Confirm _testing-prs.json written to disk after each action
  • Confirm unauthenticated POST requests return 401
  • Confirm /status returns 200 on production (no testing section for non-maintainers without state file)

@mekarpeles mekarpeles force-pushed the 11506/feature/pr-deploy-dashboard branch 3 times, most recently from 0c3477e to c2eef52 Compare April 1, 2026 04:52
@mekarpeles
Copy link
Copy Markdown
Member Author

Confirmed PR input box on /status only appears for logged in patron that is in https://openlibrary.org/usergroup/maintainers group

Turns /status into an interactive deploy management dashboard for
testing.openlibrary.org, available to /usergroup/maintainers and
/usergroup/admin. Non-maintainers see a read-only view.

Changes:
- status.py: add TestingPR dataclass and _testing-prs.json state file
  (same pattern as _dev-merged_status.txt) for persistent tracking of
  deployed PRs with pinned commit SHAs. Add POST endpoints /status/add,
  /status/remove, /status/toggle, /status/pull-latest, /status/rebuild
  gated behind maintainer usergroup. GitHub API integration (no token
  needed — public repo) for live drift detection on page load. Jenkins
  rebuild trigger via jenkins_token config key (no-op if not set).
  /status/add accepts multiple PRs as space/comma separated input.
- status.html: interactive table with checkboxes, action buttons, drift
  display, and merged-PR flagging. Actions hidden for non-maintainers.
- make-integration-branch.sh: when called with a branches file (e.g.
  via bookmarklet), PRs are added to _testing-prs.json pinned to their
  fetched SHA, then the full state file is built. Existing PRs are
  updated to latest SHA (pull to latest). Falls back to legacy mode
  only if no state file and no branches file.

Closes #11506
@mekarpeles mekarpeles force-pushed the 11506/feature/pr-deploy-dashboard branch from c2eef52 to ce8d262 Compare April 1, 2026 05:12
@mekarpeles
Copy link
Copy Markdown
Member Author

Known UX Gaps & Proposed Follow-ups

Logging these while they're fresh for whoever picks up next iterations.


1. Two overlapping sections: "Testing Environment" vs "Staged"

Current behaviour: /status now shows two sections that appear to say similar things:

  • Testing Environment — reads from _testing-prs.json: the desired state (what has been requested to be deployed, with pinned commit SHAs)
  • Staged — reads from _dev-merged_status.txt: the actual last-build state (what make-integration-branch.sh most recently merged and deployed)

Why this is confusing: to a visitor they look redundant. But they diverge meaningfully during a build: Testing Environment updates immediately when you add/remove a PR, while Staged only updates once Jenkins finishes. A visitor has no way to know this distinction.

Proposed fix: Collapse into one view. The Testing Environment table gains a "Last Build" status column per PR (merge succeeded / skipped due to conflict / pending), and the Staged section is removed. This requires make-integration-branch.sh to write per-PR build results back to _testing-prs.json after each build. The "Staged" git log output (currently a <pre> dump) could be preserved as a collapsible "Build log" detail.


2. No feedback after submitting "Add PR" / triggering a rebuild

Current behaviour: clicking "Add PR" or any action button POSTs, updates the state file, fires a Jenkins rebuild (if jenkins_token is configured), and redirects back to /status. There is no indication that:

  • the action succeeded
  • a Jenkins build was triggered
  • whether the build is still running

Proposed fix: after any state-mutating POST, redirect to /status?build_triggered=1 and render a notification bar:

✓ Build triggered. [View Jenkins ↗] (opens in new tab)

Or, if jenkins_token is not configured (local dev):

✓ PRs updated. No Jenkins token configured — rebuild manually.

This requires no polling or websockets — just a query param check in the template and a link to https://jenkins.openlibrary.org/blue/organizations/jenkins/testing-deploy/activity.


3. Button label "Add PR" should be "Add PR(s)"

Minor but the input accepts multiple space/comma-separated values. The label should reflect this.


4. Bootstrapping: first-time transition from bookmarklet to dashboard

Current behaviour: the first time you visit the dashboard, _testing-prs.json doesn't exist. Adding a PR creates it — but if you add only one PR and a rebuild fires, all previously-deployed PRs (added via bookmarklet) are dropped from testing because they're not in the state file yet.

Workaround: use the multi-PR input to add all currently-deployed PRs in one submission before any rebuild fires. After that first seed, the state file is the source of truth.

Proposed fix for clarity: if _testing-prs.json doesn't exist and the user is a maintainer, show a prominent notice:

⚠️ No testing state found. Add all currently-deployed PRs before triggering a rebuild, or use the bookmarklet to seed the list automatically.


5. Bookmarklet behaviour change (intentional, worth documenting)

Old behaviour: bookmarklet replaced the entire testing set on each deploy.

New behaviour: bookmarklet adds submitted PRs to _testing-prs.json, pinned to their fetched commit SHA. PRs already in the state file have their pinned SHA updated ("pull to latest"). The full state file is then built — so the existing set is preserved and extended.

This is the intended design, but anyone using the bookmarklet expecting the old "replace everything" behaviour will be surprised. The bookmarklet deprecation path (per the issue) should eventually remove this ambiguity.

Copy link
Copy Markdown
Collaborator

@RayBB RayBB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about trying to add this PR as a fastapi endpoint?

The one thing you need to know is that for testing we need to update the niginx config to be able to call the endpoints.

Replaces immediate-action model with a staged workflow:
- Add PR(s): new rows highlighted blue until next deploy
- Fetch Latest: queues a commit upgrade (strikethrough + ↑↑), not applied yet
- Enable/Disable: stages active state change (green/red rows)
- Remove: immediately removes row from state (no deploy needed)
- Perform Deploy: only action that triggers Jenkins; applies all pending
  changes atomically, records last_deploy_at, and redirects with a
  Jenkins progress link

State file format updated from bare array to
{"last_deploy_at":"","prs":[...]} with backward-compat read.
Shell script updated to handle new format.
@mekarpeles mekarpeles force-pushed the 11506/feature/pr-deploy-dashboard branch from 21c2fa4 to e1a057f Compare April 2, 2026 21:46
@mekarpeles
Copy link
Copy Markdown
Member Author

@RayBB happy for a followup of this to go into FastAPI. My first priority is landing this so we can tick #11506 of the milestone.

@mekarpeles mekarpeles marked this pull request as ready for review April 2, 2026 22:03
Copilot AI review requested due to automatic review settings April 2, 2026 22:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR turns /status into a testing-environment management dashboard for maintainers, backed by a pinned-commit PR state file (_testing-prs.json) and integrated with the existing “integration branch” build flow.

Changes:

  • Adds maintainer-gated POST endpoints to manage pinned PR commits (add/remove/enable/disable/pull-latest/deploy) and renders a new dashboard section on /status.
  • Fetches live PR drift data from the GitHub API for display on the dashboard.
  • Updates make-integration-branch.sh to build from _testing-prs.json (while preserving the legacy branches-file workflow) and updates i18n message extraction entries.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 12 comments.

File Description
scripts/make-integration-branch.sh Adds JSON-state mode and updates legacy branches-file mode to populate _testing-prs.json before building.
openlibrary/templates/status.html Adds “Testing Environment” dashboard UI and controls to the /status page.
openlibrary/plugins/openlibrary/status.py Implements state management, maintainer authorization checks, GitHub drift fetching, and Jenkins deploy trigger.
openlibrary/i18n/messages.pot Adds new translatable strings introduced in status.html and updates source references.

from openlibrary.utils import get_software_version

status_info: dict[str, Any] = {}
feature_flagso: dict[str, Any] = {}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feature_flagso looks like a typo / leftover; the module actually uses feature_flags (set in setup()), so this annotated global is unused and confusing. Rename it to feature_flags (or remove it) so the declared globals match what status.GET passes to the template.

Suggested change
feature_flagso: dict[str, Any] = {}
feature_flags: dict[str, Any] = {}

Copilot uses AI. Check for mistakes.

def _github_get(path: str) -> dict:
url = f"{_GITHUB_API_BASE}/{path}"
req = urllib.request.Request(url, headers={'Accept': 'application/vnd.github+json'})
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub API requests typically require a User-Agent header; without it, api.github.com can reject the request (and then _get_pr_info/drift will silently fall back). Add an explicit User-Agent (and optionally handle non-2xx HTTPError bodies) in _github_get.

Suggested change
req = urllib.request.Request(url, headers={'Accept': 'application/vnd.github+json'})
req = urllib.request.Request(
url,
headers={
'Accept': 'application/vnd.github+json',
'User-Agent': 'openlibrary-status',
},
)

Copilot uses AI. Check for mistakes.
Comment on lines +75 to +86
for pr_number in pr_numbers:
if pr_number not in existing:
info = _get_pr_info(pr_number)
state.prs.append(
TestingPR(
pr=pr_number,
commit=info['head_sha'],
active=True,
title=info['title'],
added_at=datetime.datetime.now(datetime.UTC).isoformat(),
added_by=user.key.split('/')[-1] if user else '',
)
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If _get_pr_info() fails, it returns head_sha: '', but status_add still persists the PR with an empty commit. That will break downstream deploy/build logic. Consider rejecting the add (400) when head_sha is missing, or skipping that PR and surfacing an error to the user.

Copilot uses AI. Check for mistakes.
try:
gh = _github_get(f"pulls/{pr.pr}")
head_sha = gh['head']['sha']
merged = gh.get('merged') or gh.get('state') == 'closed'
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merged = gh.get('merged') or gh.get('state') == 'closed' will treat any closed-but-not-merged PR as merged, which is inaccurate for the dashboard. Use GitHub’s actual merged signal (merged/merged_at) and keep closed-unmerged separate.

Suggested change
merged = gh.get('merged') or gh.get('state') == 'closed'
merged = bool(gh.get('merged') or gh.get('merged_at'))

Copilot uses AI. Check for mistakes.
Comment on lines +371 to +375
if head_sha.startswith(pr.commit) or pr.commit.startswith(head_sha[:7]):
drift = 0
else:
try:
cmp = _github_get(f"compare/{pr.short_commit}...{head_sha[:7]}")
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drift calculation is using short SHAs and prefix comparisons (head_sha.startswith(pr.commit) / compare endpoint with [:7]). Short SHAs can be ambiguous and prefix checks can produce false “up to date” results. Prefer comparing full SHAs for equality and use full SHAs in the compare call.

Suggested change
if head_sha.startswith(pr.commit) or pr.commit.startswith(head_sha[:7]):
drift = 0
else:
try:
cmp = _github_get(f"compare/{pr.short_commit}...{head_sha[:7]}")
stored_commit = pr.commit.strip()
is_full_sha = len(stored_commit) == 40
if (is_full_sha and head_sha == stored_commit) or (
not is_full_sha and head_sha.startswith(stored_commit)
):
drift = 0
else:
try:
cmp = _github_get(f"compare/{stored_commit}...{head_sha}")

Copilot uses AI. Check for mistakes.
<div class="deploy-banner">
$_("Deploy triggered!")
$if jenkins_job_url:
&mdash; <a href="$jenkins_job_url" target="_blank">$_("View Jenkins progress")</a>
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Jenkins link opens in a new tab (target="_blank") without rel="noopener noreferrer", which allows the new page to access window.opener. Add an appropriate rel attribute to prevent tabnabbing.

Suggested change
&mdash; <a href="$jenkins_job_url" target="_blank">$_("View Jenkins progress")</a>
&mdash; <a href="$jenkins_job_url" target="_blank" rel="noopener noreferrer">$_("View Jenkins progress")</a>

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +76
git merge "$pinned_sha"
if [[ $(git ls-files -u) ]]; then
git merge --abort
echo "Merge conflict for PR #$pr_num (pinned $pinned_sha) — skipping"
fi
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The build phase fetches pull/<pr>/head but then merges pinned_sha. If the PR was force-pushed or the pinned SHA isn’t reachable from the current head, git merge "$pinned_sha" will fail. Consider fetching the pinned SHA explicitly (or checking it exists) and skipping gracefully when it can’t be fetched.

Suggested change
git merge "$pinned_sha"
if [[ $(git ls-files -u) ]]; then
git merge --abort
echo "Merge conflict for PR #$pr_num (pinned $pinned_sha) — skipping"
fi
if ! git cat-file -e "$pinned_sha^{commit}" 2>/dev/null; then
git fetch origin "$pinned_sha" 2>/dev/null || true
fi
if ! git cat-file -e "$pinned_sha^{commit}" 2>/dev/null; then
echo "Pinned commit $pinned_sha for PR #$pr_num is unavailable — skipping"
continue
fi
if ! git merge "$pinned_sha"; then
if [[ $(git ls-files -u) ]]; then
git merge --abort 2>/dev/null || true
echo "Merge conflict for PR #$pr_num (pinned $pinned_sha) — skipping"
else
git merge --abort 2>/dev/null || true
echo "Failed to merge PR #$pr_num (pinned $pinned_sha) — skipping"
fi
fi

Copilot uses AI. Check for mistakes.
else:
prs.append({
'pr': pr_num, 'commit': sha, 'active': True, 'title': title,
'added_at': datetime.datetime.utcnow().isoformat(), 'added_by': 'bookmarklet',
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script writes added_at using datetime.utcnow().isoformat() (naive timestamp) which is inconsistent with the server-side ISO timestamps that include timezone. Use timezone-aware UTC timestamps consistently to avoid downstream comparisons/parsing issues.

Suggested change
'added_at': datetime.datetime.utcnow().isoformat(), 'added_by': 'bookmarklet',
'added_at': datetime.datetime.now(datetime.UTC).isoformat(), 'added_by': 'bookmarklet',

Copilot uses AI. Check for mistakes.
fi

echo "---"
echo "Complete; dev-merged created (SHA: $(git rev-parse --short HEAD))"
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The completion message always says "dev-merged created" even though the output branch name is configurable via $NEW_BRANCH. Consider printing the actual branch name to avoid confusion when running the script with a non-default branch name.

Suggested change
echo "Complete; dev-merged created (SHA: $(git rev-parse --short HEAD))"
echo "Complete; $NEW_BRANCH created (SHA: $(git rev-parse --short HEAD))"

Copilot uses AI. Check for mistakes.
Comment on lines +174 to +175
_trigger_rebuild()
raise web.seeother('/status?deploy_triggered=1')
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

status_deploy always redirects with deploy_triggered=1 even if _trigger_rebuild() is a no-op (no token) or the Jenkins call fails. That makes the UI banner misleading. Consider using the boolean return from _trigger_rebuild() to decide whether to show “Deploy triggered” (or show an error state when it fails).

Suggested change
_trigger_rebuild()
raise web.seeother('/status?deploy_triggered=1')
deploy_triggered = _trigger_rebuild()
redirect_url = '/status?deploy_triggered=1' if deploy_triggered else '/status'
raise web.seeother(redirect_url)

Copilot uses AI. Check for mistakes.
@RayBB
Copy link
Copy Markdown
Collaborator

RayBB commented Apr 2, 2026

@mekarpeles as part of this PR do you plan to remove this section?
image

Additional feedback:

  • I think to make it as clear as possible Active should probably be changed to Enabled
  • Long term, it would be quite helpful if we could enable and disable PRs at the same time.

@RayBB
Copy link
Copy Markdown
Collaborator

RayBB commented Apr 2, 2026

More feedback:

This is a kind of confusing symbol with the arrows
image

This link is broken (403):
image
https://jenkins.openlibrary.org/job/testing-deploy/

Perhaps it should be:
https://jenkins.openlibrary.org/job/ol-dev1-deploy%20(internal)/

Also, will note the status page does seem quite a bit slower to load than usual.

@RayBB
Copy link
Copy Markdown
Collaborator

RayBB commented Apr 2, 2026

What if, when you add a PR it's in a paused state so we don't kick off a deploy right away?

I also think it would be very helpful for me to see who is the assignee on the PR so I can quickly scan it and understand which ones I'm likely working with .

@RayBB
Copy link
Copy Markdown
Collaborator

RayBB commented Apr 2, 2026

I think when you remove a PR it doesn't trigger a deploy right? Worth noting then that there can be PRs not visible in the tool but that are still currently deployed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improving PR Deploy Strategy

4 participants