Added job to backfill ghost_uuid in sites table#1432
Conversation
ref https://linear.app/ghost/issue/BER-2964 Added job that backfills the `ghost_uuid` field recently added to the `sites` table
WalkthroughAdds a new job backfill-ghost-uuid with TypeScript implementation to fetch Ghost site UUIDs and update a MySQL database, plus a comprehensive test suite. Includes Dockerfile, docker-compose, test-run and gcloud push scripts, and README documentation. Also reorders a test import and adjusts an import grouping rule in biome.json to group NODE with BUN. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Tip 📝 Customizable high-level summaries are now available!You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
jobs/backfill-ghost-uuid/index.test.ts (2)
147-172: Consider adding a test for timeout abort behavior.The test verifies that an
AbortSignalis passed but doesn't validate the 30-second timeout or test the abort scenario. Given that the implementation uses a 30s timeout (per index.ts:32), adding a test that mocks a delayed response exceeding the timeout would ensure the abort mechanism works correctly.Example test:
test('should return null on timeout', async () => { global.fetch = mock(() => new Promise(resolve => setTimeout(resolve, 35000)) ); const uuid = await fetchSiteGhostUUID('example.com'); expect(uuid).toBeNull(); });
275-302: Consider whether this test case is necessary.The second test updates an existing (non-null)
ghost_uuidvalue, but the job's purpose is to backfill NULL values. While the underlyingupdateSiteGhostUUIDfunction technically supports overwriting, this scenario shouldn't occur during normal backfill operations sincegetSitesWithoutGhostUUIDonly returns sites with NULLghost_uuid.jobs/backfill-ghost-uuid/run-tests.sh (2)
4-4: Consider usingdocker composeinstead ofdocker-compose.The
docker-composestandalone command is deprecated. Docker Compose V2 is now bundled with Docker CLI asdocker compose.Apply this diff:
-docker-compose up -d +docker compose up -dNote: You'll also need to update line 8 and line 21.
20-21: Ensure cleanup runs even on test failure.With the suggested
set -e, the script will exit on test failure before reaching cleanup. Consider adding a trap handler to ensuredocker compose downalways runs.Add after
set -e:set -e trap 'docker compose down' EXITThen remove the explicit
docker compose downat line 21 since the trap handles it.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
biome.json(1 hunks)jobs/__template__/index.test.ts(1 hunks)jobs/backfill-ghost-uuid/Dockerfile(1 hunks)jobs/backfill-ghost-uuid/README.md(1 hunks)jobs/backfill-ghost-uuid/docker-compose.yml(1 hunks)jobs/backfill-ghost-uuid/gcloud-push.sh(1 hunks)jobs/backfill-ghost-uuid/index.test.ts(1 hunks)jobs/backfill-ghost-uuid/index.ts(1 hunks)jobs/backfill-ghost-uuid/run-tests.sh(1 hunks)jobs/migrate-bluesky-handles/index.test.ts(1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: allouis
Repo: TryGhost/ActivityPub PR: 737
File: migrate/migrations/000054_populate-outboxes-table.up.sql:56-63
Timestamp: 2025-05-27T09:09:22.110Z
Learning: In the outboxes table migration for Ghost ActivityPub, the author_id field always refers to the author of the original post, not the person performing the outbox action. For reposts (outbox_type = 1), author_id should be posts.author_id (original author), while user_id captures who made the repost via the joins through accounts and users tables.
📚 Learning: 2025-09-04T07:49:39.075Z
Learnt from: mike182uk
Repo: TryGhost/ActivityPub PR: 1240
File: jobs/migrate-bluesky-handles/Dockerfile:1-8
Timestamp: 2025-09-04T07:49:39.075Z
Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.
Applied to files:
jobs/migrate-bluesky-handles/index.test.tsjobs/backfill-ghost-uuid/index.tsjobs/__template__/index.test.tsjobs/backfill-ghost-uuid/Dockerfile
📚 Learning: 2025-09-25T12:39:45.755Z
Learnt from: mike182uk
Repo: TryGhost/ActivityPub PR: 1329
File: .github/renovate.json5:31-34
Timestamp: 2025-09-25T12:39:45.755Z
Learning: The jobs/update-external-accounts directory was removed from the TryGhost/ActivityPub repository and no longer exists, so Renovate configuration exclusions referencing this path are no longer needed.
Applied to files:
jobs/backfill-ghost-uuid/index.tsjobs/backfill-ghost-uuid/README.md
🧬 Code graph analysis (1)
jobs/backfill-ghost-uuid/index.test.ts (2)
jobs/backfill-ghost-uuid/index.ts (3)
fetchSiteGhostUUID(28-66)getSitesWithoutGhostUUID(68-76)updateSiteGhostUUID(78-87)src/core/result.ts (1)
Error(9-9)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build / Build Docker Images (linux/amd64, amd64)
🔇 Additional comments (13)
jobs/migrate-bluesky-handles/index.test.ts (1)
1-2: LGTM! Import formatting aligns with updated rules.The blank line addition follows the new import grouping configuration in biome.json, which now pairs NODE and BUN imports together.
biome.json (1)
26-26: LGTM! Grouping NODE and BUN imports together.This change appropriately pairs NODE and BUN imports in the same group, which aligns with the project's use of the Bun runtime.
jobs/__template__/index.test.ts (1)
1-11: LGTM! Template updated to match new import grouping.The import reorganization follows the updated biome.json rules, providing a consistent template for future jobs.
jobs/backfill-ghost-uuid/docker-compose.yml (1)
1-10: LGTM! Well-configured test environment.The MySQL test service uses image digest pinning for reproducibility and tmpfs mounting for faster ephemeral test storage. Port mapping to 3308 avoids conflicts with local MySQL instances.
jobs/backfill-ghost-uuid/gcloud-push.sh (1)
1-39: LGTM! Well-structured deployment script.The script follows best practices with strict error handling (
set -euo pipefail), validates prerequisites, and provides clear progress messages throughout the build and push process.jobs/backfill-ghost-uuid/README.md (1)
1-67: LGTM! Comprehensive job documentation.The README clearly describes the job's purpose, operational characteristics (sequential processing, rate limiting, error handling), and provides complete setup and deployment instructions.
jobs/backfill-ghost-uuid/index.ts (5)
28-66: LGTM! Robust HTTP request handling with timeout.The function properly implements:
- Abort controller with 30s timeout to prevent hanging requests
- HTTP status validation
- Response structure validation
- Error handling with logging
68-76: LGTM! Clean query implementation.The function retrieves sites without ghost_uuid using a straightforward parameterized query with ORDER BY for deterministic processing.
78-87: LGTM! Properly parameterized update query.The function uses parameterized queries to prevent SQL injection vulnerabilities.
89-156: LGTM! Well-structured main orchestration.The main function properly:
- Configures DB connection with socket or host/port fallback
- Processes sites sequentially with rate limiting (500ms delay)
- Tracks and reports success/failure counts
- Ensures connection cleanup in finally block
158-164: LGTM! Proper entry point with error handling.The entry point correctly checks
import.meta.mainand handles unhandled errors with appropriate exit codes.jobs/backfill-ghost-uuid/Dockerfile (1)
1-9: Remove the unnecessarylibdirectory from the Dockerfile—it does not exist and will cause build failure.The
libdirectory referenced in line 6 (COPY lib ./lib) does not exist in the repository. The job only importsmysql2/promise, which is a Bun built-in API. Following the same pattern as similar jobs (e.g., migrate-bluesky-handles), remove this line to allow the build to succeed.FROM oven/bun:1.3.0-alpine@sha256:37e6b1cbe053939bccf6ae4507977ed957eaa6e7f275670b72ad6348e0d2c11f WORKDIR /opt/job COPY index.ts . ENTRYPOINT ["bun", "run", "index.ts"] CMD []⛔ Skipped due to learnings
Learnt from: mike182uk Repo: TryGhost/ActivityPub PR: 1240 File: jobs/migrate-bluesky-handles/Dockerfile:1-8 Timestamp: 2025-09-04T07:49:39.075Z Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.Learnt from: mike182uk Repo: TryGhost/ActivityPub PR: 1240 File: jobs/migrate-bluesky-handles/Dockerfile:1-8 Timestamp: 2025-09-04T07:49:39.075Z Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.Learnt from: mike182uk Repo: TryGhost/ActivityPub PR: 1329 File: .github/renovate.json5:31-34 Timestamp: 2025-09-25T12:39:45.755Z Learning: The jobs/update-external-accounts directory was removed from the TryGhost/ActivityPub repository and no longer exists, so Renovate configuration exclusions referencing this path are no longer needed.jobs/backfill-ghost-uuid/index.test.ts (1)
175-243: LGTM!Comprehensive test coverage for
getSitesWithoutGhostUUID, including edge cases (empty table, no NULL values) and ordering verification.
| #!/usr/bin/env bash | ||
|
|
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add error handling with set -e.
The script lacks set -e, so failures in intermediate commands won't halt execution. This could lead to confusing errors if, for example, docker-compose up fails but the script continues.
Apply this diff:
#!/usr/bin/env bash
+set -e📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #!/usr/bin/env bash | |
| #!/usr/bin/env bash | |
| set -e | |
🤖 Prompt for AI Agents
In jobs/backfill-ghost-uuid/run-tests.sh around lines 1 to 2, the script is
missing a safety flag so failures in intermediate commands won't stop execution;
add a shell option line immediately after the shebang to enable exiting on
errors (e.g., add set -e), optionally including set -o pipefail for safer
pipeline behavior, so the script will abort when any command fails.
| for i in {1..30}; do | ||
| if docker exec "$(docker-compose ps -q mysql-test)" mysqladmin ping -h"localhost" -P"3306" -u"root" -p"root" --silent 2>/dev/null; then | ||
| echo "MySQL is ready!" | ||
| sleep 2 | ||
| break | ||
| fi | ||
| echo "Waiting for MySQL... ($i/30)" | ||
| sleep 1 | ||
| done |
There was a problem hiding this comment.
Add error handling if MySQL never becomes ready.
The script continues executing tests even if the 30-attempt loop exhausts without MySQL becoming ready. This leads to test failures that don't clearly indicate the root cause.
Apply this diff:
echo "Waiting for MySQL to be ready..."
+ready=false
for i in {1..30}; do
if docker exec "$(docker-compose ps -q mysql-test)" mysqladmin ping -h"localhost" -P"3306" -u"root" -p"root" --silent 2>/dev/null; then
echo "MySQL is ready!"
sleep 2
+ ready=true
break
fi
echo "Waiting for MySQL... ($i/30)"
sleep 1
done
+
+if [ "$ready" = false ]; then
+ echo "ERROR: MySQL failed to become ready after 30 attempts"
+ docker-compose down
+ exit 1
+fiCommittable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In jobs/backfill-ghost-uuid/run-tests.sh around lines 7 to 15, the script waits
up to 30 times for MySQL but doesn't handle the case where MySQL never becomes
ready; update the script to detect when the loop exhausts (no successful ping),
print a clear error message, dump relevant container logs (e.g., docker-compose
logs mysql-test) to aid debugging, and exit with a non-zero status so tests do
not continue silently.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
jobs/backfill-ghost-uuid/Dockerfile(1 hunks)jobs/backfill-ghost-uuid/index.test.ts(1 hunks)jobs/backfill-ghost-uuid/index.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- jobs/backfill-ghost-uuid/Dockerfile
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: allouis
Repo: TryGhost/ActivityPub PR: 737
File: migrate/migrations/000054_populate-outboxes-table.up.sql:56-63
Timestamp: 2025-05-27T09:09:22.110Z
Learning: In the outboxes table migration for Ghost ActivityPub, the author_id field always refers to the author of the original post, not the person performing the outbox action. For reposts (outbox_type = 1), author_id should be posts.author_id (original author), while user_id captures who made the repost via the joins through accounts and users tables.
Learnt from: mike182uk
Repo: TryGhost/ActivityPub PR: 1240
File: jobs/migrate-bluesky-handles/Dockerfile:1-8
Timestamp: 2025-09-04T07:49:39.075Z
Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.
Learnt from: mike182uk
Repo: TryGhost/ActivityPub PR: 1240
File: jobs/migrate-bluesky-handles/Dockerfile:1-8
Timestamp: 2025-09-04T07:49:39.075Z
Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.
📚 Learning: 2025-09-04T07:49:39.075Z
Learnt from: mike182uk
Repo: TryGhost/ActivityPub PR: 1240
File: jobs/migrate-bluesky-handles/Dockerfile:1-8
Timestamp: 2025-09-04T07:49:39.075Z
Learning: The jobs/migrate-bluesky-handles project uses only Bun's built-in APIs (SQL helper, fetch) and requires no external dependencies, making a simple COPY index.ts approach sufficient in the Dockerfile.
Applied to files:
jobs/backfill-ghost-uuid/index.ts
🧬 Code graph analysis (1)
jobs/backfill-ghost-uuid/index.test.ts (1)
jobs/backfill-ghost-uuid/index.ts (3)
fetchSiteGhostUUID(28-66)getSitesWithoutGhostUUID(68-76)updateSiteGhostUUID(78-102)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test / Test
🔇 Additional comments (8)
jobs/backfill-ghost-uuid/index.ts (4)
68-76: LGTM!The query correctly selects sites without a
ghost_uuidand orders them by ID. The parameterized query prevents SQL injection.
78-102: LGTM!The function properly handles duplicate UUID constraints by catching
ER_DUP_ENTRYerrors and returning a specific result. The type guard for error checking is thorough and safe.
104-186: LGTM!The orchestration logic is well-structured with proper connection handling, sequential processing with rate limiting (500ms delay), comprehensive logging, and appropriate cleanup in the finally block.
188-194: LGTM!The entry point correctly guards execution and handles unhandled errors with appropriate exit codes.
jobs/backfill-ghost-uuid/index.test.ts (4)
24-58: LGTM!The test setup demonstrates excellent practices: proper connection management, table creation/cleanup, clean state between tests via
DELETE, and fetch mock restoration. The schema correctly usesCHAR(36)for UUID storage and includes appropriate constraints.
60-173: LGTM!Comprehensive test coverage for
fetchSiteGhostUUIDincluding success path, HTTP errors, malformed responses, missing fields, network errors, and URL format verification. Well-structured with appropriate mocking.
175-243: LGTM!Thorough test coverage for
getSitesWithoutGhostUUIDincluding filtering logic, empty results, and ordering verification. The tests properly validate the WHERE clause behavior.
245-361: LGTM!Excellent test coverage for
updateSiteGhostUUIDincluding null-to-UUID updates, existing UUID updates, duplicate constraint handling, and return value verification. The duplicate test correctly verifies the row remains unchanged when a constraint violation occurs.
| } catch (error) { | ||
| console.warn(`Failed to fetch site info for ${host}: ${error.message}`); | ||
|
|
||
| return null; | ||
| } |
There was a problem hiding this comment.
Type-check error before accessing .message property.
The caught error has type unknown. Accessing error.message without verifying it's an Error instance could fail if the error is a string or other type.
Apply this diff to add a type guard:
} catch (error) {
- console.warn(`Failed to fetch site info for ${host}: ${error.message}`);
+ const message = error instanceof Error ? error.message : String(error);
+ console.warn(`Failed to fetch site info for ${host}: ${message}`);
return null;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (error) { | |
| console.warn(`Failed to fetch site info for ${host}: ${error.message}`); | |
| return null; | |
| } | |
| } catch (error) { | |
| const message = error instanceof Error ? error.message : String(error); | |
| console.warn(`Failed to fetch site info for ${host}: ${message}`); | |
| return null; | |
| } |
🤖 Prompt for AI Agents
In jobs/backfill-ghost-uuid/index.ts around lines 61 to 65, the catch block
accesses error.message but the caught value is typed as unknown; add a type
guard to safely extract a message — for example check if (error instanceof
Error) and use error.message, otherwise fall back to String(error) (or a default
like 'Unknown error'), and use that safeMessage in the console.warn before
returning null.
ref https://linear.app/ghost/issue/BER-2964
Added job that backfills the
ghost_uuidfield recently added to thesitestable