
Image generation tool #2789

Merged
azizmejri1 merged 9 commits into dyad-sh:main from azizmejri1:image-generation-tool
Feb 27, 2026

Conversation

@azizmejri1
Collaborator

@azizmejri1 azizmejri1 commented Feb 25, 2026

@wwwillchen
Collaborator

@BugBot run

@gemini-code-assist
Contributor

Summary of Changes

Hello @azizmejri1, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the agent's capabilities by introducing an AI-powered image generation tool. This allows the agent to create visual assets directly, enhancing its ability to fulfill user requests for custom images, illustrations, and graphics. The changes include the core tool logic, a dedicated UI component for displaying generation progress, and updated agent instructions to guide its usage, providing a more comprehensive and interactive experience for visual content creation.

Highlights

  • New Image Generation Tool: Introduced a new generate_image tool that allows the agent to create AI-generated images based on detailed text prompts. This tool integrates with an external image generation engine and saves the output to the project's media directory.
  • Enhanced User Interface for Image Generation: Added a new React component, DyadImageGeneration, to display the status and details of image generation tasks within the chat interface. This component shows the prompt, progress, and the path where the generated image is saved.
  • Updated Agent Prompt Guidelines: Incorporated specific guidelines into the local agent's system prompt, instructing it on when and how to use the generate_image tool, including best practices for writing effective prompts and handling generated assets.
  • End-to-End Testing and Mocking: Implemented new end-to-end tests to verify the functionality of the generate_image tool in a local agent environment. A mock endpoint for image generation was also added to the fake LLM server to facilitate testing.
Changelog
  • e2e-tests/fixtures/engine/local-agent/generate-image.ts
    • Added a new fixture to define a test scenario for the generate_image tool, including a sample prompt and expected tool calls.
  • e2e-tests/local_agent_generate_image.spec.ts
    • Added an end-to-end test case to verify the functionality of the generate_image tool within the local agent mode, including setting up the environment and snapshotting messages.
  • src/__tests__/__snapshots__/local_agent_prompt.test.ts.snap
    • Updated the snapshot to reflect the inclusion of new image generation guidelines in the local agent's system prompt.
  • src/components/chat/DyadImageGeneration.tsx
    • Added a new React component, DyadImageGeneration, responsible for rendering the image generation process and results in the chat UI, including prompt, image path, and state indicators.
  • src/components/chat/DyadMarkdownParser.tsx
    • Imported the new DyadImageGeneration component.
    • Registered dyad-image-generation as a custom tag.
    • Added logic to render the DyadImageGeneration component when the dyad-image-generation custom tag is encountered in markdown.
  • src/pro/main/ipc/handlers/local_agent/tool_definitions.ts
    • Imported the generateImageTool.
    • Added generateImageTool to the TOOL_DEFINITIONS array, making it available to the local agent.
  • src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts
    • Added a new tool definition for generate_image, including its input schema, detailed description, consent preview, and XML build logic.
    • Implemented the execute function for the tool, which calls an external image generation API, saves the generated image to the .dyad/media directory, and streams XML updates to the UI.
  • src/prompts/local_agent_prompt.ts
    • Defined a new constant IMAGE_GENERATION_BLOCK containing detailed guidelines for using the generate_image tool.
    • Integrated the IMAGE_GENERATION_BLOCK into the local agent's system prompt.
  • testing/fake-llm-server/index.ts
    • Added a new POST endpoint /engine/v1/images/generations to the fake LLM server, which mocks the image generation API by returning a base64 encoded tiny PNG for testing purposes.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist[bot]

This comment was marked as resolved.

@wwwillchen
Collaborator

@BugBot run

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: 🤔 NOT SURE - Potential issues

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

Severity | File | Issue
🟡 MEDIUM | src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:121 | Missing modifiesState: true - tool available in read-only/plan modes
🟡 MEDIUM | src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:135-153 | No error handling - failed generation leaves UI stuck in "Generating..." state
🟢 Low Priority Notes (4 items)
  • SSRF defense-in-depth - generate_image.ts:103 - fetch(imageData.url) fetches an arbitrary URL from the API response without validation; consider restricting to HTTPS + known domains
  • No finished/success state indicator - DyadImageGeneration.tsx:44-49 — Shows "Generating..." and "Did not finish" states but no visual confirmation on success (other cards show a checkmark)
  • Consent preview may be long - generate_image.ts:127 — Detailed prompts could overflow consent banner UI; consider truncating
  • Redundant null check - generate_image.ts:80 - !data.data is impossible after Zod parse; simplify to data.data.length === 0
🚫 Dropped False Positives (7 items)
  • Unused revised_prompt field - Dropped: Acceptable API contract documentation even if unused in current code
  • Prompt guidelines duplicated - Dropped: Tool description and system prompt serve different purposes (tool selection vs. behavioral guidance); some overlap is expected
  • buildXml returns undefined on isComplete - Dropped: Intentional pattern — execute handles completion via ctx.onXmlComplete, returning from both would cause duplicates
  • File extension hardcoded to .png - Dropped: API model is hardcoded to gpt-image-1.5 which returns PNG; not a real concern
  • No image preview in card - Dropped: Feature enhancement request, not a defect; current pattern matches other tool cards
  • Technical error messages exposed to users - Dropped: Error messages are consumed by the agent model which wraps them in natural language, not displayed directly in UI
  • Long prompts pushing expand icon off-screen - Dropped: truncate CSS class handles this; DyadCardHeader is a shared component handling layout correctly

Generated by Dyadbot multi-agent code review

github-actions[bot]

This comment was marked as resolved.

@github-actions github-actions bot added the needs-human:review-issue label (ai agent flagged an issue that requires human review) Feb 25, 2026
@wwwillchen
Collaborator

@BugBot run

@azizmejri1
Collaborator Author

🤖 Claude Code Review Summary

PR Confidence: 4/5

All review comments have been addressed with code changes; confidence is high but image generation is a new feature that would benefit from manual testing.

Unresolved Threads

Thread Rationale Link

No unresolved threads

Resolved Threads

Issue Rationale Link
Type safety for DyadImageGeneration node prop Replaced any with a specific DyadImageGenerationNode interface, removed type assertion, used ?? instead of || View
Use async fs operations in saveGeneratedImage Changed fs import to node:fs/promises, replaced mkdirSync/writeFileSync with await fs.mkdir/await fs.writeFile to avoid blocking the event loop View
Missing modifiesState: true on generate_image tool Added modifiesState: true for consistency with other file-writing tools (write_file, copy_file, etc.) so it is properly filtered in read-only/plan modes View
No error handling — UI stuck in "Generating..." state Wrapped execute logic in try-catch; on error, calls ctx.onXmlComplete() to close the XML element before re-throwing, per Principle #4: Transparent Over Magical View
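
The "async fs operations" row above describes swapping mkdirSync/writeFileSync for awaited node:fs/promises calls. A minimal sketch of that pattern follows; the helper name saveBase64Image and its parameters are assumptions for illustration, not the PR's actual saveGeneratedImage.

```typescript
import * as fs from "node:fs/promises";
import * as path from "node:path";

// Hypothetical helper (not the PR's saveGeneratedImage): decodes a base64
// payload and writes it to disk without blocking the event loop.
export async function saveBase64Image(
  mediaDir: string,
  fileName: string,
  b64Json: string,
): Promise<string> {
  // recursive: true means mkdir succeeds even if the directory already exists.
  await fs.mkdir(mediaDir, { recursive: true });
  const filePath = path.join(mediaDir, fileName);
  await fs.writeFile(filePath, Buffer.from(b64Json, "base64"));
  return filePath;
}
```

Because both calls are awaited, a failure in either one rejects the returned promise, which is what lets a surrounding try-catch close the XML element before re-throwing.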
Product Principle Suggestions

No suggestions — principles were clear enough for all decisions.


🤖 Generated by Claude Code

@wwwillchen
Collaborator

@BugBot run

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

✅ No confirmed issues found by multi-agent review. The code is well-structured and follows existing codebase patterns.

Highlights:

  • Error handling with try-catch properly closes the XML stream on failure
  • modifiesState: true correctly set for filesystem-writing tool
  • Async fs/promises used throughout
  • Typed interface for the node prop instead of any
  • E2E test with fixture and snapshot coverage
🟢 Low Priority Notes (2 items)
  • Consent preview may show long prompts - generate_image.ts:128 - Detailed prompts could be long in the consent banner; consider truncating like other tools (e.g., .slice(0, 80) + "...")
  • Card lacks explicit aria-label - DyadImageGeneration.tsx:42 - An explicit aria-label on the card would improve screen reader experience for the combined badge + prompt text
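
The truncation suggested for the consent preview could look roughly like the sketch below. The function name truncateForConsent is hypothetical; the 80-character default simply mirrors the `.slice(0, 80) + "..."` example in the note above.

```typescript
// Hypothetical helper: shortens a prompt for display in the consent banner.
// Prompts at or below the limit are returned unchanged.
export function truncateForConsent(prompt: string, maxLength = 80): string {
  if (prompt.length <= maxLength) {
    return prompt;
  }
  return prompt.slice(0, maxLength) + "...";
}
```

Checking the length first avoids appending an ellipsis to prompts that were never cut.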
🚫 Dropped False Positives (11 items)
  • No content type/size validation on URL download — Dropped: URL comes from trusted engine backend API; b64_json is the primary code path
  • path.join backslashes on Windows — Dropped: General codebase pattern with path.join, not specific to this PR
  • No AbortSignal for cancellation — Dropped: Feature enhancement; other tools don't pass abort signals either
  • File extension always .png — Dropped: Hardcoded model returns PNG; consistent behavior
  • revised_prompt field parsed but unused — Dropped: Standard practice to keep Zod schema aligned with API response contract
  • Duplicated guidance across system prompt and tool description — Dropped: System prompt provides behavioral guidance (when to use), tool description aids tool selection (how to use); different purposes justify overlap
  • Hardcoded model name — Dropped: Single usage, clearly visible string literal; extracting to a constant would be over-engineering
  • Node wrapper interface adds indirection — Dropped: Follows established DyadMarkdownParser node.properties pattern for consistency
  • No finished state indicator — Dropped: Consistent with majority of card components (DyadWebSearch, DyadWebCrawl, DyadEdit all skip finished state indicators)
  • No image preview in expanded card — Dropped: Feature enhancement, not a defect; other tool cards don't preview their output content either
  • Error state shows no user-visible details — Dropped: Error is re-thrown to the agent which communicates it to the user in natural language; consistent with other tool error handling patterns

Generated by Dyadbot multi-agent code review

claude and others added 5 commits February 25, 2026 22:33
Add a new `generate_image` tool that calls the Dyad engine's
`/images/generations` endpoint to create AI-generated images.
The tool saves images to `.dyad/media` and instructs the agent
to use `copy_file` to place them in the project codebase.

- New tool implementation with engineFetch, Pro-only gating
- UI card component (DyadImageGeneration) with violet accent
- System prompt guidelines encouraging image generation over placeholders
- E2E test fixture and spec
- Mock endpoint in fake LLM server
- Updated prompt snapshot

https://claude.ai/code/session_01J6tXYHo4RvQguFJF6UkHuk
- Add proper type interface for DyadImageGeneration node prop instead of any
- Convert sync fs operations (mkdirSync, writeFileSync) to async (fs/promises)
- Add modifiesState: true to generate_image tool for plan mode consistency
- Add try-catch error handling to prevent stuck "Generating..." UI state

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@wwwillchen
Collaborator

@BugBot run

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

No new HIGH or MEDIUM issues found beyond what existing reviewers have already flagged. The existing review comments (error handling gap, missing modifiesState: true, async fs operations, type safety) are the substantive items to address.

Issues Summary

No new HIGH or MEDIUM issues to report.

🟢 Low Priority Notes (4 items)
  • Hardcoded .png file extension - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:96 - The output filename always uses .png regardless of the actual image format returned by the API. If the engine ever returns JPEG or WebP, the extension would be mismatched.
  • revised_prompt field parsed but never used - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:28 - The imageDataSchema includes revised_prompt, which is parsed from the API response but never surfaced anywhere (UI, tool result, or logs).
  • Consent preview includes full prompt without truncation - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:128 - getConsentPreview returns the entire prompt verbatim. Long, detailed prompts (which the tool encourages) could produce an unwieldy consent banner.
  • Tool DESCRIPTION overlaps with system prompt - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:36-57 - The tool's DESCRIPTION and the IMAGE_GENERATION_BLOCK in local_agent_prompt.ts both describe when to use the tool and the copy_file workflow, inflating token usage slightly.
🚫 Dropped False Positives (11 items)
  • URL download size limit/timeout/validation — Dropped: URL comes from trusted engine API; defense-in-depth measures are over-engineering for an initial implementation
  • No timeout on image generation API call — Dropped: Speculative without seeing engineFetch internals, which may already have timeout handling
  • Hardcoded model name — Dropped: Single-use constant; extracting it adds no real value until there's a need to change it
  • buildXml/onXmlStream dual-path inconsistency — Dropped: Intentional dual-path — buildXml handles streaming preview phase, execute's onXmlStream handles execution phase state transition
  • Redundant null check after Zod parse — Dropped: Defensive coding after !data.data is harmless and makes intent clear
  • Non-optional properties in DyadImageGenerationNode — Dropped: Overlaps with existing comment about any type on the node prop
  • Unnecessary try/catch in fake server — Dropped: Test infrastructure doesn't need the same rigor
  • No image preview in expanded card — Dropped: Feature request, not a code defect; the card correctly shows prompt and file path
  • Missing finishedLabel on DyadStateIndicator — Dropped: Only DyadCopy uses finishedLabel; all other tool cards (DyadWrite, DyadGrep, DyadEdit, etc.) follow the same no-finished-indicator pattern
  • Error state shows no error details — Dropped: Overlaps with existing comment #4 about missing error handling; error detail formatting is an enhancement on that fix
  • No aria-label on card button — Dropped: Follows the existing pattern used by all other expandable tool cards

Generated by Dyadbot multi-agent code review

@azizmejri1
Collaborator Author

azizmejri1 commented Feb 25, 2026

This PR is ready for review.
Tests are failing because of a regression caused by PR #2713; I addressed the issue in #2803.

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

Severity | File | Issue
🟡 MEDIUM | src/components/chat/DyadImageGeneration.tsx:52 | No success indicator when image generation completes
🟢 Low Priority Notes (2 items)
  • revised_prompt schema field parsed but never used - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:28 - The revised_prompt field is defined in the response schema but never accessed anywhere. Consider removing it or surfacing it to the user.
  • No image preview in expanded card - src/components/chat/DyadImageGeneration.tsx:62 - When expanded, the card only shows the prompt text and file path. Rendering a thumbnail of the generated image would make the feature more useful (could be a follow-up).
🚫 Dropped False Positives (14 items)
  • Empty b64_json would produce corrupt file - Dropped: if (imageData.b64_json) correctly handles falsy values (empty string, null, undefined); non-empty garbage from the API is an API bug, not a code bug
  • SSRF risk on image URL download - Dropped: URL comes from trusted Dyad engine API response, not user input
  • No timeout/size limit on URL download - Dropped: Same pattern as all other tools in the codebase; not unique to this PR
  • Error case emits completed XML tag - Dropped: Already covered by existing comment from chatgpt-codex-connector
  • File extension hardcoded to .png - Dropped: The gpt-image-1.5 model returns PNG; theoretical concern
  • No timeout on engine API call - Dropped: Same pattern as all other tools in the codebase
  • IMAGE_GENERATION_BLOCK always included in Pro prompt - Dropped: Pro prompt and isDyadPro are expected to always align
  • Prompt guidelines duplicated across three locations - Dropped: System prompt adds unique guidance ("don't generate when SVG/icon suffices") not in tool description; overlap is minimal and intentional
  • buildXml and execute both emit XML - Dropped: They serve different phases (streaming preview vs execution lifecycle); not truly redundant
  • Typed node interface inconsistent with peer components - Dropped: The typed approach is actually better practice than any; not a negative
  • Consent preview can be excessively long - Dropped: Minor; consent banner handles text wrapping
  • No tooltip on truncated prompt - Dropped: User can expand the card to read the full prompt
  • Error state not surfaced in UI card - Dropped: Already covered by existing comment from chatgpt-codex-connector
  • File path displayed but not actionable - Dropped: Feature suggestion for v1, not blocking

Generated by Dyadbot multi-agent code review

Contributor

@github-actions github-actions bot left a comment

Multi-agent review: 1 issue found

<DyadStateIndicator state="aborted" abortedLabel="Did not finish" />
)}
<div className="ml-auto">
<DyadExpandIcon isExpanded={isExpanded} />
Contributor

🟡 MEDIUM | missing-state-feedback

No success indicator when image generation completes

When image generation finishes successfully, the card shows no visual confirmation. The spinner disappears with no replacement feedback, making it unclear whether the operation succeeded. This is especially noticeable because image generation is a slow operation where the user is actively watching for completion.

Other similar tools in the codebase (e.g., DyadCopy) display a green checkmark with a completion label via DyadStateIndicator.

💡 Suggestion: Add a finished state indicator:

Suggested change
<DyadExpandIcon isExpanded={isExpanded} />
{aborted && (
<DyadStateIndicator state="aborted" abortedLabel="Did not finish" />
)}
{state === "finished" && (
<DyadStateIndicator state="finished" finishedLabel="Generated" />
)}

Collaborator

@wwwillchen wwwillchen left a comment

Great job! It's creating very nice-looking UIs :)

Main comment is to just enable the tool to "always" (see inline comment).

Once the tests are passing, feel free to merge.

Follow-ups:

  • One nice follow-up item (can do in this PR or a follow-up PR) is to show some kind of preview of the generated image in the chat.
  • Also, I ran into an error while using this feature (but it's not specific to this feature so it can be in a separate PR: #2804)

name: "generate_image",
description: DESCRIPTION,
inputSchema: generateImageSchema,
defaultConsent: "ask",
Collaborator

i think we should just set this to "always" - i can't really think of a risk with image generation (e.g. it can't really cause data exfiltration) besides generating images costs money, but this tool shouldn't be called that often and users can disable tools in the settings if they really want to.

@github-actions github-actions bot added the needs-human:final-check label (ai agent thinks everything looks good - needs final review from human) and removed the needs-human:review-issue label (ai agent flagged an issue that requires human review) Feb 26, 2026
@wwwillchen
Collaborator

@BugBot run

Contributor

@greptile-apps greptile-apps bot left a comment

12 files reviewed, 1 comment


Comment on lines 17 to 18
await po.agentConsent.waitForAgentConsentBanner();
await po.agentConsent.clickAgentConsentAlwaysAllow();
Contributor

test will timeout here because generate_image tool has defaultConsent: "always", so no consent banner appears

Suggested change
await po.agentConsent.waitForAgentConsentBanner();
await po.agentConsent.clickAgentConsentAlwaysAllow();
// No consent banner needed - generate_image has defaultConsent: "always"

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

Severity | File | Issue
🟡 MEDIUM | src/components/chat/DyadImageGeneration.tsx:62 | No image preview in expanded card
🟡 MEDIUM | src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:103 | URL fetch has no validation (SSRF defense-in-depth)
🟢 Low Priority Notes (5 items)
  • Hardcoded model name - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:67 - "gpt-image-1.5" is hardcoded with no comment explaining what it refers to
  • Unused revised_prompt field - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:28 - Parsed from API response but never read or used
  • Consent preview not truncated - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:128 - Long prompts create oversized consent preview text
  • OS-specific path separators - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:98 - path.join produces backslashes on Windows, may confuse the LLM agent
  • File extension hardcoded as .png - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:96 - Always saves as .png regardless of actual image format returned
🚫 Dropped False Positives (4 items)
  • E2E test waits for consent banner that won't appear - Dropped: The waitForAgentConsentBanner() is a global agent consent mechanism, not per-tool consent. The test works correctly despite defaultConsent: "always".
  • No success indicator on completion - Dropped: Already commented on by a previous reviewer (existing comment #7).
  • API error messages are raw/technical - Dropped: Error messages surface to the LLM agent (not directly to users), so the agent will interpret and explain them.
  • After Generation instructions duplicated - Dropped: The three locations (tool description, system prompt, execute return) serve different purposes (LLM understanding, system guidance, immediate instruction).

Generated by Dyadbot multi-agent code review

Contributor

@github-actions github-actions bot left a comment

Multi-agent review: 2 issue(s) found

)}
{children && <div className="mt-0.5 text-foreground">{children}</div>}
</div>
</DyadCardContent>
Contributor

🟡 MEDIUM | missing-image-preview

Expanded card shows only a file path, not the actual generated image

When a user expands the card after image generation completes, they see the prompt text and a raw file path like .dyad/media/generated-1234-abc.png, but never the actual image. For an image generation feature, not showing the generated image is a significant UX miss — the user has to navigate to the file manually to verify the output.

💡 Suggestion: Render an inline image preview in the expanded content when the image path is available and generation is complete. For example:

{imagePath && !inProgress && (
  <img src={resolvedImagePath} alt={prompt} className="rounded-md max-h-48 mt-2" />
)}

throw new Error(`Failed to download generated image: ${response.status}`);
}
const arrayBuffer = await response.arrayBuffer();
await fs.writeFile(filePath, Buffer.from(arrayBuffer));
Contributor

🟡 MEDIUM | security

URL fetch has no validation — SSRF defense-in-depth concern

When the API returns a URL instead of base64, the code fetches it with the global fetch and no URL validation. While the URL comes from a trusted API (Dyad engine), as a defense-in-depth measure, consider either:

  1. Always requesting b64_json format by adding response_format: 'b64_json' to the API request body (eliminating the URL path entirely)
  2. Validating the URL scheme (HTTPS only) and hostname (no private/loopback ranges) before fetching

This also bypasses any proxy settings or request tracking that engineFetch provides — a comment explaining why raw fetch is appropriate here would help future maintainers.
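
Option 2 above (scheme and hostname validation) might be sketched as follows. The function isSafeImageUrl is hypothetical and not part of the PR. It only inspects the URL text and does not resolve DNS, so a public-looking hostname that points at a private address would still pass; treat it as one defense-in-depth layer, not a complete SSRF guard.

```typescript
// Hypothetical guard: accepts only HTTPS URLs whose host is not an obvious
// loopback, link-local, or private IPv4 literal.
export function isSafeImageUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") {
    return false;
  }
  const host = url.hostname.toLowerCase();
  // The WHATWG URL parser keeps brackets around IPv6 literals.
  if (host === "localhost" || host === "[::1]") {
    return false;
  }
  const ipv4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (ipv4) {
    const a = Number(ipv4[1]);
    const b = Number(ipv4[2]);
    const isPrivateOrLocal =
      a === 0 ||
      a === 10 ||
      a === 127 ||
      (a === 169 && b === 254) ||
      (a === 172 && b >= 16 && b <= 31) ||
      (a === 192 && b === 168);
    if (isPrivateOrLocal) {
      return false;
    }
  }
  return true;
}
```

Option 1 (always requesting b64_json) would remove the need for this check entirely, since no second fetch happens.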

@github-actions github-actions bot added the needs-human:review-issue label (ai agent flagged an issue that requires human review) and removed the needs-human:final-check label (ai agent thinks everything looks good - needs final review from human) Feb 26, 2026
@wwwillchen
Collaborator

@BugBot run

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

Severity | File | Issue
🟡 MEDIUM | src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:98 | relativePath uses platform-specific path separator, breaks on Windows
🟡 MEDIUM | src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:155-159 | Error path emits empty card with no error info in UI
🟡 MEDIUM | src/components/chat/DyadImageGeneration.tsx:52-57 | No success state indicator after generation completes
🟢 Low Priority Notes (1 item)
  • Redundant children content - src/components/chat/DyadImageGeneration.tsx:82 - The children prop renders the same file path already shown by the "Saved to" section in the expanded view
🚫 Dropped False Positives (8 items)
  • Hardcoded model name 'gpt-image-1.5' - Dropped: This goes through the Dyad engine at /images/generations, not directly to OpenAI. Likely an internal engine model alias, not a public OpenAI model name.
  • No image preview in the card - Dropped: Appears to be a V1 design decision. The generated image is used in the project and visible in the live preview after the agent copies it. Adding an inline preview would require additional IPC/protocol infrastructure.
  • buildXml inconsistency - Dropped: The pattern is intentional. buildXml cannot know the path attribute (only available after execution), so execute must handle XML streaming directly via onXmlStream/onXmlComplete.
  • Unused revised_prompt in schema - Dropped: Common pattern for API response schemas to include all fields for completeness, even if not all are used.
  • Dead try/catch in fake server - Dropped: Test infrastructure nitpick following the pattern of other endpoints.
  • No URL scheme validation - Dropped: URL comes from the trusted Dyad engine, not user input. Defense-in-depth is nice but not a real risk here.
  • File extension hardcoded as .png - Dropped: The b64_json path (primary) typically returns PNG from image generation APIs. The URL fallback path is unlikely to return a different format in practice.
  • Internal media path displayed verbatim - Dropped: Consistent with how other tools (file writes, copies) display file paths in the chat. The path is informational; the agent handles the copy step.

Generated by Dyadbot multi-agent code review

Contributor

@github-actions github-actions bot left a comment

Multi-agent review: 3 issue(s) found

const timestamp = Date.now();
const fileName = `generated-${timestamp}-${hash}.png`;
const filePath = path.join(mediaDir, fileName);
const relativePath = path.join(DYAD_MEDIA_DIR_NAME, fileName);
Contributor

🟡 MEDIUM | data-integrity

relativePath uses platform-specific path separator, breaking on Windows

path.join(DYAD_MEDIA_DIR_NAME, fileName) uses the OS-native path separator. On Windows this produces .dyad\media\generated-...png (backslashes). This path is:

  1. Returned to the LLM in the tool result string (line 154)
  2. Embedded in the XML path attribute shown in the UI
  3. Used by the LLM with copy_file and ultimately in web asset references like <img src="...">

Web paths require forward slashes. The E2E test explicitly skips Windows (testSkipIfWindows), confirming this path is untested on that platform.

💡 Suggestion: Use forward-slash joining explicitly:

const relativePath = DYAD_MEDIA_DIR_NAME + '/' + fileName;
// or: path.posix.join(DYAD_MEDIA_DIR_NAME, fileName)
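
To make the separator difference concrete, here is a minimal sketch using Node's `path.win32` and `path.posix` namespaces, which behave identically on every host OS. The value of `DYAD_MEDIA_DIR_NAME` is assumed from the example path in this comment, and the file name and the `toWebPath` helper are hypothetical, not part of the PR.

```typescript
import * as path from "node:path";

// Assumed value, inferred from the example path in the review comment.
const DYAD_MEDIA_DIR_NAME = ".dyad/media";
// Hypothetical file name, for illustration only.
const fileName = "generated-1700000000000-abc123.png";

// What path.join yields on Windows (path.win32 reproduces it on any OS).
const windowsStyle = path.win32.join(DYAD_MEDIA_DIR_NAME, fileName);

// path.posix.join always uses forward slashes, regardless of host OS.
const webSafe = path.posix.join(DYAD_MEDIA_DIR_NAME, fileName);

// Hypothetical helper: normalize an already-built Windows path for web use.
function toWebPath(nativePath: string): string {
  return nativePath.split(path.win32.sep).join(path.posix.sep);
}

console.log(windowsStyle); // backslash-separated
console.log(webSafe); // forward-slash-separated
console.log(toWebPath(windowsStyle) === webSafe);
```

Building the relative path with `path.posix.join` keeps the on-disk `filePath` native while the string handed to the LLM and the UI stays web-safe.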

ctx.onXmlComplete(
  `<dyad-image-generation prompt="${escapeXmlAttr(args.prompt)}"></dyad-image-generation>`,
);
throw error;

🟡 MEDIUM | error-handling

Error path emits empty card with no error details in UI

When execute() throws, the catch block calls ctx.onXmlComplete() with an empty <dyad-image-generation> tag (no path, no error info), then re-throws. Two issues:

  1. Dual cards on error: The outer buildAgentToolSet catch also calls ctx.onXmlComplete() with a <dyad-output type="error"> tag. Both are persisted, producing an empty image-generation card AND a separate error card — unlike other tools which show only the error card.

  2. No error info in the image card: The DyadImageGeneration component has no error state rendering. The card will appear as if it finished with no output (no path, no error message), since state will be finished (not aborted) because the tag is closed.

💡 Suggestion: Either remove the try/catch and let errors propagate naturally (other tools' pattern), or pass an error attribute: <dyad-image-generation prompt="..." error="Content policy violation"> and render it in the component.
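
The second option could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: `escapeXmlAttr` here is a simplified stand-in for the project's helper, and `buildImageGenerationXml`, `cardStatus`, and the `error` attribute are hypothetical names for the proposed behavior.

```typescript
// Simplified stand-in for the project's escapeXmlAttr helper (assumption).
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical builder: emits the closed tag with an optional `error` attribute.
function buildImageGenerationXml(opts: {
  prompt: string;
  path?: string;
  error?: string;
}): string {
  const attrs = [`prompt="${escapeXmlAttr(opts.prompt)}"`];
  if (opts.path) attrs.push(`path="${escapeXmlAttr(opts.path)}"`);
  if (opts.error) attrs.push(`error="${escapeXmlAttr(opts.error)}"`);
  return `<dyad-image-generation ${attrs.join(" ")}></dyad-image-generation>`;
}

// Hypothetical mapping from tag attributes to what the card should render.
type CardStatus = "error" | "finished" | "empty";

function cardStatus(attrs: { path?: string; error?: string }): CardStatus {
  if (attrs.error) return "error";
  return attrs.path ? "finished" : "empty";
}

const failed = buildImageGenerationXml({
  prompt: 'a "red" fox',
  error: "Content policy violation",
});
console.log(failed);
console.log(cardStatus({ error: "Content policy violation" }));
```

With an explicit `error` attribute the component can tell a failed run apart from a closed tag that merely lacks a `path`, which is the ambiguity described in point 2.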

)}
{aborted && (
  <DyadStateIndicator state="aborted" abortedLabel="Did not finish" />
)}

🟡 MEDIUM | consistency

No success state indicator after image generation completes

The component shows Generating... (pending) and Did not finish (aborted), but has no visual confirmation when generation succeeds. Other card components like DyadCopy show a green checkmark via DyadStateIndicator with state='finished' and a finishedLabel. After waiting through what could be a long generation, the user gets no visual feedback that it worked.

💡 Suggestion: Add a finished state indicator:

{!inProgress && !aborted && state === 'finished' && (
  <DyadStateIndicator state="finished" finishedLabel="Generated" />
)}

@wwwillchen
Collaborator

@BugBot run

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fdbb29deb2

  const buffer = Buffer.from(imageData.b64_json, "base64");
  await fs.writeFile(filePath, buffer);
} else if (imageData.url) {
  const response = await fetch(imageData.url);

P1: Validate image download URL before fetching

The saveGeneratedImage path downloads imageData.url directly without any protocol or host validation. If the engine response is misconfigured or compromised and returns a URL like http://127.0.0.1/... or another internal address, the Electron main process will perform an SSRF request and persist the response into the workspace. Because local-agent tools can then read or copy that file, this creates a practical internal-data exfiltration path. Enforce strict URL validation (at minimum http/https plus private-network blocking, or an allowlist) before calling fetch.
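
A minimal sketch of such a pre-fetch check follows. The names `assertSafeImageUrl` and `isPrivateIPv4` are hypothetical, and the rules are deliberately conservative assumptions: only http/https, no `localhost`, no private or loopback IPv4 literals, and no IPv6 literals at all. It inspects the URL string only and does not resolve DNS, so a hostname that resolves to an internal address would still pass; an allowlist of expected hosts would be stricter.

```typescript
import { isIP } from "node:net";

// Hypothetical helper: true for loopback, private, and link-local IPv4 literals.
function isPrivateIPv4(host: string): boolean {
  const [a = 0, b = 0] = host.split(".").map(Number);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Hypothetical validator: returns the parsed URL or throws before any fetch.
function assertSafeImageUrl(raw: string): URL {
  const url = new URL(raw); // throws on malformed input
  if (url.protocol !== "https:" && url.protocol !== "http:") {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }
  // URL.hostname keeps the brackets around IPv6 literals; strip them.
  const host = url.hostname.replace(/^\[|\]$/g, "");
  if (host === "localhost" || host.endsWith(".localhost")) {
    throw new Error("Refusing to fetch from localhost");
  }
  const family = isIP(host); // 0 = not an IP literal, 4 or 6 otherwise
  if (family === 4 && isPrivateIPv4(host)) {
    throw new Error(`Refusing to fetch from private address: ${host}`);
  }
  if (family === 6) {
    // Conservative choice for this sketch: reject every IPv6 literal.
    throw new Error(`Refusing to fetch from IPv6 literal: ${host}`);
  }
  return url;
}

console.log(assertSafeImageUrl("https://example.com/image.png").hostname);
```

Calling the validator immediately before `fetch(imageData.url)` keeps the check next to the only place the URL is used.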

@github-actions
Contributor

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

No HIGH or MEDIUM issues found beyond what existing PR comments already cover. This PR is well-structured, follows established codebase patterns (DyadCard primitives, escapeXmlAttr/escapeXmlContent, engineFetch, Zod schemas), and includes E2E tests with a fake server endpoint.

🟢 Low Priority Notes (4 items)
  • Hardcoded model name - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:66 - gpt-image-1.5 is hardcoded with no comment explaining the choice; a named constant or comment would help future maintainers know when to update it.
  • File extension always .png - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:96 - The filename is always .png regardless of actual image format returned by the API. If the API ever returns JPEG/WebP, the extension would be incorrect.
  • No timeout on URL download - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:103 - The fetch(imageData.url) call has no AbortSignal timeout and buffers the entire response in memory via arrayBuffer(). A slow or oversized response could hang or exhaust memory. (Extends the existing SSRF concern at line 109.)
  • Redundant null check - src/pro/main/ipc/handlers/local_agent/tools/generate_image.ts:80 - !data.data is redundant since Zod's z.array() already guarantees data.data is a non-null array after parsing. Could simplify to if (data.data.length === 0).
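
For the download-timeout note above, one possible shape is sketched here, assuming a runtime with global `fetch`, `Response`, and `AbortSignal.timeout` (recent Node versions). The function name `downloadImage`, the injectable `fetchImpl` parameter, and both limits are hypothetical; the PR does not specify any values.

```typescript
// Hypothetical limits; the PR does not define any.
const DOWNLOAD_TIMEOUT_MS = 30_000;
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

// Downloads with a timeout and a streaming size cap instead of arrayBuffer().
async function downloadImage(
  url: string,
  fetchImpl: typeof fetch = fetch,
  maxBytes: number = MAX_IMAGE_BYTES,
): Promise<Uint8Array> {
  const response = await fetchImpl(url, {
    signal: AbortSignal.timeout(DOWNLOAD_TIMEOUT_MS),
  });
  if (!response.ok) {
    throw new Error(`Image download failed with status ${response.status}`);
  }
  if (!response.body) {
    throw new Error("Image download returned no body");
  }

  const reader = response.body.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    total += value.byteLength;
    if (total > maxBytes) {
      await reader.cancel(); // stop reading once the cap is exceeded
      throw new Error(`Image exceeds ${maxBytes} bytes`);
    }
    chunks.push(value);
  }

  // Concatenate the collected chunks into one buffer.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}
```

Reading the body as a stream lets the cap take effect before the whole payload is buffered, which addresses the memory concern as well as the hang.
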
🚫 Dropped False Positives (10 items)
  • Duplicate XML stream from buildXml and execute - Dropped: This is the standard pattern used by other tools in the codebase (e.g., code_search). The second onXmlStream call in execute is intentional.
  • No streaming progress during generation - Dropped: Matches the established pattern of other long-running tools (web_search, web_crawl) which show a single "in progress" state.
  • Card starts collapsed with no result indicator - Dropped: Already covered by existing comments about missing success state indicator (lines 57, 59).
  • Raw file path shown instead of image preview - Dropped: Already covered by existing comment about showing file path instead of image (line 84).
  • No feedback for non-Pro users - Dropped: Standard Pro feature gating pattern used by all Pro-only tools. Not specific to this PR.
  • Accessibility on card expand/collapse - Dropped: This is a DyadCard primitive concern, not specific to this component.
  • Overly verbose tool DESCRIPTION - Dropped: Image generation is a different kind of tool that benefits from detailed prompt-writing examples and guidance.
  • Prompt guidelines duplicate tool description - Dropped: The system prompt block has unique behavioral content ("Do NOT generate images when an existing asset, SVG, or icon library would suffice") not present in the tool description.
  • Fixture text doesn't match tool output - Dropped: Fixtures represent LLM speech, which is distinct from tool return values.
  • IMAGE_GENERATION_BLOCK not in basic agent prompt - Dropped: The isDyadPro check correctly gates tool availability; the prompt block is supplementary guidance, not gating.

Generated by Dyadbot multi-agent code review

@github-actions github-actions bot added the needs-human:final-check label (ai agent thinks everything looks good - needs final review from human) and removed the needs-human:review-issue label (ai agent flagged an issue that requires human review) on Feb 27, 2026
@azizmejri1 azizmejri1 merged commit ce03bea into dyad-sh:main Feb 27, 2026
10 checks passed
@github-actions
Contributor

🎭 Playwright Test Results

✅ All tests passed!

| OS | Passed | Flaky | Skipped |
|---|---|---|---|
| 🍎 macOS | 364 | 4 | 117 |
| 🪟 Windows | 363 | 3 | 117 |

Total: 727 tests passed (7 flaky) (234 skipped)

⚠️ Flaky Tests

🍎 macOS

  • free_agent_quota.spec.ts > free agent quota - full flow: mode availability, quota tracking, exceeded banner, switch to build (passed after 1 retry)
  • setup_flow.spec.ts > Setup Flow > setup banner shows correct state when node.js is installed (passed after 1 retry)
  • setup_flow.spec.ts > Setup Flow > node.js install flow (passed after 2 retries)
  • smart_context_balanced.spec.ts > smart context balanced - simple (passed after 1 retry)

🪟 Windows

  • edit_code.spec.ts > edit code (passed after 2 retries)
  • reject.spec.ts > reject (passed after 1 retry)
  • setup_flow.spec.ts > Setup Flow > setup banner shows correct state when node.js is installed (passed after 1 retry)
