
Conversation

@dkotter
Collaborator

@dkotter dkotter commented Dec 16, 2025

What?

Partially closes #13. This only handles generating featured images; I'm assuming we'll want image generation in other places as well (though that could be tracked in new Issues if we want).

Adds in the UI to trigger Featured Image Generation

Why?

This builds on top of the work done in #134, which added Abilities that could be used to generate an image and/or import an image. This PR adds in the actual UI that triggers those Abilities in the Featured Image section of the edit post screen as well as a new Ability to generate an image generation prompt.

How?

  • Modifies the Image Import Ability to allow passing in additional metadata that will be stored with the image. This allows us to easily track which images are AI generated based on a specific meta key
  • Adds a new Ability that is used to generate an image generation prompt. We pass this the content of the post along with the title and content type. Other context can be passed in if desired, along with optional style instructions
  • Loads in the necessary admin scripts and admin script data
  • Adds in all the needed JavaScript to wire things up, including:
    • a Generate featured image button that shows above the standard Set featured image button
    • When that button is clicked, first make a request to the get-post-details Ability. Then pass those details to the new image-prompt-generation Ability, along with our content and context. Take the prompt this generates and pass it to our existing image-generation Ability. Finally take the base64-encoded image it returns and pass it to our existing image-import Ability to import the image into the Media Library
  • Set this image as the Featured Image and change the text of the generation button from Generate featured image to Generate new featured image
  • Add a label below the Featured Image (if set and if AI generated) that makes it clear the image was AI generated
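The chain of Ability calls described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the `runAbility` helper, the `ai/` prefix on every Ability name (taken from the curl example's URL for `image-prompt-generation` only), and all input/output field names (`post_id`, `data`, `meta`, `ai_generated`, `id`) are assumptions.

```typescript
// Hypothetical runner: executes one Ability by name and resolves with its output.
type RunAbility = (name: string, input: Record<string, unknown>) => Promise<any>;

interface FeaturedImageResult {
  attachmentId: number;
  prompt: string;
}

async function generateFeaturedImage(
  postId: number,
  runAbility: RunAbility,
  style?: string
): Promise<FeaturedImageResult> {
  // 1. Fetch the post details (field names are assumed).
  const details = await runAbility("ai/get-post-details", { post_id: postId });

  // 2. Turn the content and context into an image generation prompt.
  const prompt: string = await runAbility("ai/image-prompt-generation", {
    content: details.content,
    context: `Title: ${details.title}\nType: ${details.type}`,
    ...(style ? { style } : {}),
  });

  // 3. Generate the image; assumed to resolve with base64-encoded data.
  const base64: string = await runAbility("ai/image-generation", { prompt });

  // 4. Import into the Media Library, flagging the image as AI generated
  //    through extra meta (the key name is hypothetical).
  const imported = await runAbility("ai/image-import", {
    data: base64,
    meta: [{ key: "ai_generated", value: "1" }],
  });

  return { attachmentId: imported.id, prompt };
}
```

Injecting `runAbility` keeps the sequence testable without making real (and billable) API requests.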

Testing Instructions

  1. Ensure you have valid AI credentials in place for OpenAI or Google under Settings > AI Credentials
  2. Ensure you turn on the Image Generation Experiment under Settings > AI Experiments
  3. Edit a piece of content that doesn't have a Featured Image yet
  4. Ensure you see the Generate featured image button
  5. Click this button (noting this will make an API request and will cost money)
  6. Ensure a proper loading state is shown and that eventually (this can take around 90s) an image is rendered in the normal featured image section
  7. Ensure this image imported properly into the Media Library and has a proper title and description
  8. Ensure the AI label shows below the image in the featured image section
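The UI state checked in steps 4 and 8 could be modelled with a small pure function like the one below. It is a sketch under assumptions: the meta key name `ai_generated` is hypothetical, and showing the "Generate new featured image" text whenever any featured image is set (rather than only after a generation) is a guess at the behaviour.

```typescript
interface FeaturedImageUiState {
  buttonLabel: string;
  showAiLabel: boolean;
}

function featuredImageUiState(
  featuredImageId: number,
  imageMeta: Record<string, unknown> = {},
  aiMetaKey = "ai_generated" // hypothetical meta key
): FeaturedImageUiState {
  // WordPress reports a featured media ID of 0 when no image is set.
  const hasImage = featuredImageId > 0;
  return {
    buttonLabel: hasImage ? "Generate new featured image" : "Generate featured image",
    // Only label the image when it is set AND carries a truthy AI meta flag.
    showAiLabel: hasImage && Boolean(imageMeta[aiMetaKey]),
  };
}
```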

To directly test the new generate image prompt Ability, you can make an API request like the following:

curl --location 'https://example.com/wp-json/wp-abilities/v1/abilities/ai/image-prompt-generation/run' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic ****' \
--data '{
    "input": {
        "content": "This is the content of an article",
        "context": "Title: Hello World",
        "style": "Moody, dark colors"
    }
}'
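For calling the same endpoint from script code, the request could be assembled roughly like this. The URL path, headers, and `input` wrapper mirror the curl example; the `buildImagePromptRequest` helper and its parameter names are hypothetical, and the response shape is deliberately left unspecified.

```typescript
interface ImagePromptInput {
  content: string;
  context?: string;
  style?: string;
}

function buildImagePromptRequest(
  siteUrl: string,
  authHeader: string,
  input: ImagePromptInput
) {
  // Strip trailing slashes so the path is not doubled up.
  const base = siteUrl.replace(/\/+$/, "");
  return {
    url: `${base}/wp-json/wp-abilities/v1/abilities/ai/image-prompt-generation/run`,
    init: {
      method: "POST", // curl defaults to POST when --data is used
      headers: {
        "Content-Type": "application/json",
        Authorization: authHeader,
      },
      body: JSON.stringify({ input }),
    },
  };
}
```

The result can be handed to `fetch(request.url, request.init)`; building it separately makes the payload easy to inspect before spending an API call.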

Screenshots or screencast

  • New Generate featured image button showing above the default Set featured image button in the block editor sidebar
  • Loading state of the new Generate featured image button
  • AI featured image rendering properly with AI label beneath the image
  • AI generated image shown in the media library

Test using WordPress Playground

The changes in this pull request can be previewed and tested using this WordPress Playground instance:

Click here to test this pull request.


… closely to render the image and set/remove buttons
… an AI label below the image and only show that if the featured image has the AI meta set
…text using the post ID. Then take that context and generate an image prompt. Finally pass that prompt into our image generation function. Also modify our system instructions a bit
@dkotter dkotter added this to the 0.2.0 milestone Dec 16, 2025
@dkotter dkotter self-assigned this Dec 16, 2025
@github-actions

github-actions bot commented Dec 16, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @prabinjha, @kurtrank.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Unlinked contributors: prabinjha, kurtrank.

Co-authored-by: dkotter <[email protected]>
Co-authored-by: jeffpaul <[email protected]>
Co-authored-by: JasonTheAdams <[email protected]>
Co-authored-by: karmatosed <[email protected]>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@codecov

codecov bot commented Dec 16, 2025

Codecov Report

❌ Patch coverage is 63.41463% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.50%. Comparing base (fd0d510) to head (8aeca86).
⚠️ Report is 32 commits behind head on develop.

Files with missing lines                                 Patch %   Lines
includes/Abilities/Image/Generate_Image_Prompt.php       59.34%    37 Missing ⚠️
.../Experiments/Image_Generation/Image_Generation.php    51.35%    18 Missing ⚠️
includes/Abilities/Image/Generate_Image.php              0.00%     3 Missing ⚠️
...bilities/Image/image-prompt-system-instruction.php    0.00%     2 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             develop     #146      +/-   ##
=============================================
+ Coverage      50.33%   59.50%   +9.17%     
- Complexity       366      397      +31     
=============================================
  Files             26       29       +3     
  Lines           1951     2136     +185     
=============================================
+ Hits             982     1271     +289     
+ Misses           969      865     -104     
Flag    Coverage Δ
unit    59.50% <63.41%> (+9.17%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


@dkotter
Collaborator Author

dkotter commented Dec 22, 2025

> One straightforward suggestion, but I'm also humming pretty hard at the flow of providing an AI prompt writing Ability in order to run the subsequent prompt. What problem is this extra roundtrip solving that we couldn't accomplish by compiling a prompt ourselves based on the post data? It feels like we could make do without this.
>
> The only place I could see an Ability like that being genuinely useful is if a user had an input to do something like describe a picture, and we added an "Improve prompt" button. But even that would require more specific instructions based on the context. This feels too generic.

@JasonTheAdams I struggled to come up with an approach to turn an entire article into a useful image generation prompt (useful being key there), hence relying on AI to help us out here. I initially tried sending the entire article with a prompt along the lines of Generate a featured image for the following article, but that didn't work well and led to huge prompts. I'm open to suggestions on other ways to solve this, but given the wide range of content someone may be using, I couldn't think of a better way to do this.

I did go back and forth on if this should be an Ability (the "generate a prompt from this context" piece) so happy to change that to be more specific to this use case instead of a globally exposed Ability if that helps at all.

@dkotter
Collaborator Author

dkotter commented Dec 22, 2025

> @dkotter perhaps a docs page for this experiment like done in https://github.com/WordPress/ai/pull/155/changes#diff-f70665388390657902e73741dea03efcdadc4bd0ceb4432959f8903ae015c042?

@jeffpaul I know James has mentioned these PRs are more WIP at the moment so wondering if that's the best example to copy from? I know our default Experiment did add a README file within the Experiments directory but we haven't followed that approach so far, though I can do that if desired. I just think we need to decide on what our standard documentation structure should be so we can ensure it matches across everything.

@JasonTheAdams JasonTheAdams self-requested a review December 29, 2025 23:15
@JasonTheAdams
Member

I chatted with some other folks and did some research on this, @dkotter, and you're right. In fact, it's rather complex. 😅

Image prompts are strange things and they vary depending on whether the image generation model is using diffusion or a transformer. If diffusion, things like context and history are more confusing, as those models tend to expect the prompt to be rather focused on describing the image. Transformer models are more forgiving, but still work better when the prompt focuses on the image.

In this case we're really describing two bits of work for the models:

  1. Taking a chunk of content and using it as inspiration for an image description
  2. Generating an image

Trying to get the image generation model to do both is, as you found, not particularly successful. In fact, depending on what type of model you used it could be really bad. Hahah!

So it does make sense to have a model and system prompt focused on taking content and deriving an image prompt. It will need to understand that the content is a source of inspiration and to look for things like archetypes and whatnot within the content. This is some prompt engineering in and of itself. We may even take more parameters for things like describing the style of the image and such, which the content itself won't provide. This gets into the territory of content guidelines, as James has explored.

At this point I'm thinking it would be useful to have an "Image prompt from content" ability that takes in:

  1. Content to use as inspiration
  2. Style instructions (could be broken up into granular parameters)

This way we can really focus the system prompt on understanding its singular purpose, which I think will improve the resulting image prompt. We may consider having default style instructions so the generated images are somewhat consistent.
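As a rough illustration of the idea above, the Ability's input could be assembled from the content plus granular style parameters that fall back to defaults. Everything here is hypothetical: the parameter names (`medium`, `mood`, `palette`), the default values, and the `buildPromptInput` helper are placeholders for discussion, not a proposed schema.

```typescript
interface StyleInstructions {
  medium?: string;
  mood?: string;
  palette?: string;
}

// Hypothetical defaults so generated images stay somewhat consistent.
const DEFAULT_STYLE: Required<StyleInstructions> = {
  medium: "Editorial illustration",
  mood: "Calm",
  palette: "Muted colors",
};

function buildPromptInput(
  content: string,
  style: StyleInstructions = {}
): { content: string; style: string } {
  // Caller-supplied values override the defaults key by key.
  const merged = { ...DEFAULT_STYLE, ...style };
  return {
    content: content.trim(),
    style: `Medium: ${merged.medium}. Mood: ${merged.mood}. Palette: ${merged.palette}.`,
  };
}
```

For example, `buildPromptInput(articleText, { mood: "Moody" })` keeps the default medium and palette while overriding only the mood.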

@jeffpaul
Member

jeffpaul commented Jan 7, 2026

> At this point I'm thinking it would be useful to have an "Image prompt from content" ability that takes in:

Given that we'll likely have the same problem should we get into video generation, should we be more generic in naming this ability than tying it directly to "image"?

@JasonTheAdams
Member

> Given that we'll likely have the same problem should we get into video generation, should we be more generic in naming this ability than tying it directly to "image"?

I don't think so. I believe video prompts will want their own tuning and such. I think we should stick to just images for now and tackle videos later.

@dkotter
Collaborator Author

dkotter commented Jan 8, 2026

> So it does make sense to have a model and system prompt focused on taking content and deriving an image prompt. It will need to understand that the content is a source of inspiration and to look for things like archetypes and whatnot within the content. This is some prompt engineering in and of itself. We may even take more parameters for things like describing the style of the image and such, which the content itself won't provide. This gets into the territory of content guidelines, as James has explored.
>
> At this point I'm thinking it would be useful to have an "Image prompt from content" ability that takes in:
>
>   1. Content to use as inspiration
>   2. Style instructions (could be broken up into granular parameters)
>
> This way we can really focus the system prompt on understanding its singular purpose, which I think will improve the resulting image prompt. We may consider having default style instructions so the generated images are somewhat consistent.

@JasonTheAdams Okay, I've updated this PR to take all the above into consideration. There is now a more specific Ability, image-prompt-generation, that allows you to pass in content, optional context (like title, author, content type) and optional style suggestions.

This Ability will then generate an image generation prompt which we then pass directly to an image generation model.

In my testing this is all working as expected but definitely open to suggestions on the system instructions we're using here and any other default instructions we may want in place to guide things.

@jeffpaul jeffpaul modified the milestones: 0.2.0, 0.3.0 Jan 13, 2026


Development

Successfully merging this pull request may close these issues.

Image Generation

3 participants