Skip to content

Conversation

@charley-oai
Copy link
Collaborator

@charley-oai charley-oai commented Jan 9, 2026

Agent wouldn't "see" attached images and would instead try to use the view_file tool:
image

In this PR, we wrap image content items in XML tags with the name of each image (now just a numbered name like [Image #1]), so that the model can understand inline image references (based on name). We also put the image content items above the user message which the model seems to prefer (maybe it's more used to definitions being before references).

We also tweak the view_file tool description which seemed to help a bit

Results on a simple eval set of images:

Before
image

After
image

[
  {
    "id": "single_describe",
    "prompt": "Describe the attached image in one sentence.",
    "images": ["image_a.png"]
  },
  {
    "id": "single_color",
    "prompt": "What is the dominant color in the image? Answer with a single color word.",
    "images": ["image_b.png"]
  },
  {
    "id": "orientation_check",
    "prompt": "Is the image portrait or landscape? Answer in one sentence.",
    "images": ["image_c.png"]
  },
  {
    "id": "detail_request",
    "prompt": "Look closely at the image and call out any small details you notice.",
    "images": ["image_d.png"]
  },
  {
    "id": "two_images_compare",
    "prompt": "I attached two images. Are they the same or different? Briefly explain.",
    "images": ["image_a.png", "image_b.png"]
  },
  {
    "id": "two_images_captions",
    "prompt": "Provide a short caption for each image (Image 1, Image 2).",
    "images": ["image_c.png", "image_d.png"]
  },
  {
    "id": "multi_image_rank",
    "prompt": "Rank the attached images from most colorful to least colorful.",
    "images": ["image_a.png", "image_b.png", "image_c.png"]
  },
  {
    "id": "multi_image_choice",
    "prompt": "Which image looks more vibrant? Answer with 'Image 1' or 'Image 2'.",
    "images": ["image_b.png", "image_d.png"]
  }
]

Fixes issue #8523

Copy link
Collaborator

@aibrahim-oai aibrahim-oai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome. Code feels cleaner. Let's add integration tests to make sure we send the right request shape to model. We persist the right thing to the rollout file. Maybe a UI snapshot as well.

@charley-oai charley-oai merged commit 6709ad8 into main Jan 10, 2026
33 checks passed
@charley-oai charley-oai deleted the dev/ccunningham/improve-pasted-images-prompt branch January 10, 2026 05:33
@github-actions github-actions bot locked and limited conversation to collaborators Jan 10, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants