-
-
Notifications
You must be signed in to change notification settings - Fork 838
fix(deployments): retry transient depot build init failures #2586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The Depot build init with `depot.build.v1.BuildService.createBuild` fails surprisingly often due to transient error. This PR adds a simple retry mechanism with backoff using `p-retry`.
|
Walkthrough
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
apps/webapp/app/v3/remoteImageBuilder.server.ts (1)
18-36
: Consider adding AbortError for non-retryable errors.The retry logic is sound, but retrying all errors indiscriminately may waste resources on non-transient failures. Authentication failures (401, 403) and validation errors (400, 422) should abort immediately rather than retry.
Additionally, the
onFailedAttempt
callback could logattemptNumber
andretriesLeft
for better observability.Apply this diff to handle non-retryable errors and improve logging:
import pRetry from "p-retry"; +import { AbortError } from "p-retry"; import { logger } from "~/services/logger.server";
const result = await pRetry( - () => + async () => { + try { + return await depot.build.v1.BuildService.createBuild( - depot.build.v1.BuildService.createBuild( { projectId: builderProjectId }, { headers: { Authorization: `Bearer ${env.DEPOT_TOKEN}`, }, } ); + } catch (error: any) { + // Don't retry authentication or validation errors + if (error.code === "UNAUTHENTICATED" || error.code === "PERMISSION_DENIED" || error.code === "INVALID_ARGUMENT") { + throw new AbortError(error); + } + throw error; + } + }, { retries: 3, minTimeout: 200, maxTimeout: 2000, onFailedAttempt: (error) => { - logger.error("Failed attempt to create remote Depot build", { error }); + logger.error("Failed attempt to create remote Depot build", { + error, + attemptNumber: error.attemptNumber, + retriesLeft: error.retriesLeft, + }); }, } );Note: Adjust the error code checks based on the actual error structure returned by the Depot SDK. You may need to inspect
error.response?.status
for HTTP status codes instead.Based on learnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml
is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (3)
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
(1 hunks)apps/webapp/app/v3/remoteImageBuilder.server.ts
(2 hunks)apps/webapp/package.json
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}
: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations
Files:
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
apps/webapp/app/v3/remoteImageBuilder.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
We use zod a lot in packages/core and in the webapp
Files:
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
apps/webapp/app/v3/remoteImageBuilder.server.ts
apps/webapp/**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
When importing from @trigger.dev/core in the webapp, never import the root package path; always use one of the documented subpath exports from @trigger.dev/core’s package.json
Files:
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
apps/webapp/app/v3/remoteImageBuilder.server.ts
{apps/webapp/app/**/*.server.{ts,tsx},apps/webapp/app/routes/**/*.ts}
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Access environment variables only via the env export from app/env.server.ts; do not reference process.env directly
Files:
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
apps/webapp/app/v3/remoteImageBuilder.server.ts
apps/webapp/app/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)
Modules intended for test consumption under apps/webapp/app/**/*.ts must not read environment variables; accept configuration via options instead
Files:
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts
apps/webapp/app/v3/remoteImageBuilder.server.ts
🧬 Code graph analysis (2)
apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts (1)
packages/core/src/v3/apps/http.ts (1)
json
(65-75)
apps/webapp/app/v3/remoteImageBuilder.server.ts (1)
apps/webapp/app/env.server.ts (1)
env
(1203-1203)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
apps/webapp/package.json (1)
169-169
: LGTM! Dependency addition supports retry logic.The
p-retry@^4.6.1
dependency is used inapps/webapp/app/v3/remoteImageBuilder.server.ts
to handle transient Depot build failures. The version is stable and the caret range allows safe updates within v4.x.apps/webapp/app/v3/remoteImageBuilder.server.ts (2)
6-7
: LGTM! Imports correctly support retry and logging.The imports follow the correct pattern:
pRetry
is imported as a default export (as per p-retry v4 API), andlogger
follows the project's logging pattern.
23-25
: LGTM! Authorization header correctly added.The Authorization header follows the same pattern used in
createBuilderProjectIfNotExists
(lines 57-59) and correctly usesenv.DEPOT_TOKEN
as per coding guidelines.apps/webapp/app/routes/api.v1.deployments.$deploymentId.progress.ts (2)
58-61
: LGTM! Logging improves observability for timeout extensions.The warning log with
error.cause
provides useful debugging context while maintaining the 204 response behavior. Adding braces also improves maintainability if additional statements are needed later.
69-72
: LGTM! Logging improves observability for remote build failures.The error log with
error.cause
provides useful debugging context for Depot build creation failures. The 500 response correctly signals a server error while maintaining a generic client-facing message.
The Depot build init with
depot.build.v1.BuildService.createBuild
fails surprisingly often due to transient errors, causing the whole deployment to fail. This PR adds a simple retry mechanism with backoff usingp-retry
. This should improve the failure rate.