fix(design): add --api-timeout flag, raise default 120s to 300s for image-gen #1528

Open
RagavRida wants to merge 2 commits into garrytan:main from RagavRida:fix/design-api-timeout-override

Conversation

@RagavRida

Closes #1519.

Summary

The five image-generation callsites in the design binary all hardcoded a 120_000ms timeout with no CLI override:

  • design/src/generate.ts:40
  • design/src/iterate.ts:85 and :133
  • design/src/variants.ts:61
  • design/src/evolve.ts:55

With default settings (gpt-4o, 1536x1024, quality: high) using the Responses API with the image_generation tool, response times reach the 90-180s range on slower account tiers, which exceeds the 120s ceiling for many users. The reporter in #1519 confirmed via curl that the endpoint itself isn't slow (~26s on direct calls) and that the account and network were fine, so the failure comes entirely from the binary's hardcoded ceiling.

The existing --timeout CLI flag is plumbed only to compare --serve / serve for the HTTP listener (design/src/cli.ts:145, 251), so it does not reach the image-gen path.

Changes

  • design/src/constants.ts (new): exports DEFAULT_IMAGE_GEN_TIMEOUT_MS = 300_000
  • apiTimeoutMs?: number option added to GenerateOptions, VariantsOptions, IterateOptions, EvolveOptions and threaded to the AbortController callsite in each
  • --api-timeout <ms> CLI flag parsed once in cli.ts and passed to all four commands. Distinct from --timeout to avoid colliding with the serve flag.
  • design/test/api-timeout.test.ts (new): pins the default constant at 300_000 and verifies generateVariant honors the override via a stubbed fetch that waits on the abort signal.
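
The option threading and the stubbed-fetch test idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `callImageApi`, `ImageCallOptions`, `FetchLike`, and `neverResolvingFetch` are hypothetical names, and the injected fetch parameter exists only to keep the sketch self-contained. Only `DEFAULT_IMAGE_GEN_TIMEOUT_MS`, its value, and the `apiTimeoutMs` option name come from the change list above.

```typescript
// Value taken from the PR description; the real constant lives in design/src/constants.ts.
const DEFAULT_IMAGE_GEN_TIMEOUT_MS = 300_000;

// Hypothetical option bag standing in for GenerateOptions, VariantsOptions, etc.
interface ImageCallOptions {
  apiTimeoutMs?: number;
}

// Hypothetical, simplified fetch signature so the sketch needs no network access.
type FetchLike = (url: string, init: { signal: AbortSignal }) => Promise<string>;

// Hypothetical helper: aborts the request once the resolved timeout elapses.
async function callImageApi(
  url: string,
  opts: ImageCallOptions,
  fetchImpl: FetchLike,
): Promise<string> {
  // Fall back to the shared default when no override is supplied.
  const timeoutMs = opts.apiTimeoutMs ?? DEFAULT_IMAGE_GEN_TIMEOUT_MS;
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetchImpl(url, { signal: controller.signal });
  } catch (err) {
    // Report the effective timeout so the message stays accurate under an override.
    if (controller.signal.aborted) {
      throw new Error(`Timeout (${timeoutMs}ms)`);
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}

// Stub in the spirit of the regression test: never resolves, rejects only on abort.
const neverResolvingFetch: FetchLike = (_url, init) =>
  new Promise<string>((_resolve, reject) => {
    init.signal.addEventListener("abort", () => reject(new Error("aborted")));
  });
```

Calling `callImageApi(url, { apiTimeoutMs: 20 }, neverResolvingFetch)` in this sketch rejects after roughly 20ms rather than waiting for the 300s default, which is the property the regression test is described as checking.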

Behavior

  • Users who were not timing out at 120s are unaffected: the path resolves the same image, just with more headroom before abort.
  • Users hitting the timeout get a working default (5min headroom) plus a per-invocation override: $D generate --brief "..." --api-timeout 600000 for a 10-min ceiling.
  • The variants timeout error message changed from "Timeout (120s)" to "Timeout (<n>ms)" so it stays accurate when the override is used.
  • evolve's separate vision-analysis call (30s timeout, design/src/evolve.ts:113) is unchanged — that path is fast and doesn't need the headroom.
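
One way the per-invocation override could be read from the command line is sketched below. The function name `parseApiTimeout` and the validation rules (positive integer, error on anything else) are assumptions made for illustration; the PR text only states that `--api-timeout <ms>` is parsed once in cli.ts.

```typescript
// Hypothetical parser; the actual parsing in cli.ts may differ.
function parseApiTimeout(argv: string[]): number | undefined {
  const idx = argv.indexOf("--api-timeout");
  // Flag absent: the caller falls back to the default timeout.
  if (idx === -1) return undefined;

  const raw = argv[idx + 1];
  const ms = Number(raw);
  // Assumed validation: require a positive whole number of milliseconds.
  if (raw === undefined || !Number.isInteger(ms) || ms <= 0) {
    throw new Error(`--api-timeout expects a positive integer in milliseconds, got: ${raw}`);
  }
  return ms;
}
```

With this sketch, `parseApiTimeout(["generate", "--brief", "x", "--api-timeout", "600000"])` yields `600000`, and an invocation without the flag yields `undefined` so the 300s default applies.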

Test results

$ bun test design/test/api-timeout.test.ts
 3 pass
 0 fail
 5 expect() calls

$ bun test design/test/ test/gen-skill-docs.test.ts test/skill-validation.test.ts
 742 pass
 2 fail   # both pre-existing playwright-env failures in feedback-roundtrip.test.ts, reproduced on clean main
 6739 expect() calls

$ bun build --compile design/src/cli.ts --outfile /tmp/design-test-build
 [437ms] compile  /tmp/design-test-build

bun run gen:skill-docs was run after editing commands.ts; no SKILL.md drift (the design flag list isn't surfaced in user-visible templates).

RagavRida added 2 commits May 15, 2026 23:11
…mage-gen

Five image-generation callsites (generate, variants, iterate x2, evolve)
hardcoded a 120_000ms ceiling with no CLI override. With default size
(1536x1024) + quality:high on gpt-4o + image_generation tool, response
time pushes into the 90-180s range on slower account tiers, tipping over
the 120s ceiling for many users (issue garrytan#1519).

- design/src/constants.ts: DEFAULT_IMAGE_GEN_TIMEOUT_MS = 300_000
- apiTimeoutMs?: number option threaded through GenerateOptions,
  VariantsOptions, IterateOptions, EvolveOptions
- --api-timeout <ms> CLI flag (distinct from --timeout, which is
  plumbed only to compare --serve / serve for the HTTP listener)
- Regression test pins the constant + verifies the AbortController
  honors the override via stubbed slow fetch

Closes garrytan#1519.
…outMs budget

callWithThreading + callFresh each received the full apiTimeoutMs, so the
worst-case wait on --api-timeout 300000 was 600s (2×300s). Now a shared
deadline is set at iteration start; callFresh receives the remaining budget
after threading fails or times out, and if the budget is already exhausted
at fallback entry a clear Timeout (Xs) error is thrown immediately.
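
The shared-deadline idea in this commit can be sketched as below. This is a simplified illustration under assumptions, not the PR's implementation: `withSharedBudget`, `Attempt`, and the injectable `now` clock are hypothetical, the two attempts stand in for callWithThreading and callFresh, and the exact wording of the timeout message is assumed.

```typescript
// Hypothetical attempt signature: each attempt is told how long it may run.
type Attempt<T> = (timeoutMs: number) => Promise<T>;

// Hypothetical helper: run a primary attempt, then a fallback, under one shared deadline.
async function withSharedBudget<T>(
  budgetMs: number,
  primary: Attempt<T>,
  fallback: Attempt<T>,
  now: () => number = Date.now, // injectable clock, used here only to make the sketch testable
): Promise<T> {
  // The deadline is fixed once, before the first attempt starts.
  const deadline = now() + budgetMs;
  try {
    return await primary(budgetMs);
  } catch {
    // The fallback only gets whatever time the primary attempt left over.
    const remainingMs = deadline - now();
    if (remainingMs <= 0) {
      // Budget already spent: fail immediately instead of starting a second full wait.
      throw new Error(`Timeout (${Math.round(budgetMs / 1000)}s)`);
    }
    return fallback(remainingMs);
  }
}
```

Under this scheme the worst-case wait is bounded by the single budget: if the first attempt fails after 400ms of a 1000ms budget, the fallback receives 600ms, not a fresh 1000ms.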

Development

Successfully merging this pull request may close these issues.

Issue when creating images via OpenAI
