
Conversation

@ieduer

@ieduer ieduer commented Jun 28, 2025

Context

When using the AI helper with Gemini (or other LLMs) to analyze posts containing images, the content payload is an array combining text and image metadata. This raises a TypeError (no implicit conversion of Array into String) when the prompt is built.

Root Cause

The generate_prompt method constructs context.messages assuming content is always a string. When it's an array, string operations like .gsub or concatenation fail.
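
For illustration, a minimal Ruby sketch of the failure mode (the payload shape matches the one logged later in this thread; the <input> wrapping and variable names are illustrative, not the plugin's actual code):

# Minimal sketch, not the plugin's code: with an upload present, content
# arrives as a text-plus-image array instead of a plain String.
content = ["Please summarise the attached image", { upload_id: 12345 }]

begin
  "<input>" + content + "</input>"  # String concatenation needs a String
rescue TypeError => e
  puts e.message                    # => no implicit conversion of Array into String
end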

Fix

This PR adds a simple guard clause to normalize any array content to a string by joining its elements before passing the prompt to the LLM.
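
Roughly the shape of the guard (a sketch based on the description above and the snippet later in this thread; the method name is illustrative, not the actual diff):

# Illustrative sketch of the guard: join array content into a single string
# before the prompt is handed to the LLM. Names are placeholders.
def normalize_content(content)
  content.is_a?(Array) ? content.map(&:to_s).join("\n") : content
end

normalize_content(["Please summarise …", { upload_id: 12345 }])
# => a single String, with the upload hash rendered via #to_s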

Reproduction

  • Create a post with text and an image attachment
  • Run the AI helper with the default "creative" model
  • Observe the TypeError

After this patch, the prompt is processed correctly.

@SamSaffron
Member

@romanrizzi thoughts?

@SamSaffron
Member

thanks heaps for the PR!

@romanrizzi
Member

Hey @ieduer - First of all, thanks for taking the time to contribute. Much appreciated!

The safeguard you propose will remove the upload metadata in #generate_image_caption. Without that, we won't be able to pass the image we want to caption to the LLM. In the other method, user_input is always a string because we wrap it with <input> tags.
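
To make that concrete, a small sketch using the same payload shape logged later in this thread: the array form keeps the upload reference addressable as data, whereas a join turns it into opaque text.

content = ["Please summarise …", { upload_id: 12345 }]

content.last[:upload_id]        # => 12345 (still usable to look up the image)
content.map(&:to_s).join("\n")  # upload metadata becomes plain characters in a String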

I couldn't repro the error after using different LLMs and personas. Could you please share a backtrace of the error? I want to have a better understanding of what's happening.

@ieduer
Author

ieduer commented Jul 1, 2025

Thanks for the prompt feedback, @romanrizzi!

🔎 Full back-trace

TypeError (no implicit conversion of Array into String)
/plugins/discourse-ai/gems/.../tokenizer.rb:14:in `_encode'
/plugins/discourse-ai/lib/tokenizer/basic_tokenizer.rb:23:in `tokenize'
/plugins/discourse-ai/lib/personas/question_consolidator.rb:37:in `revised_prompt'
/plugins/discourse-ai/lib/personas/question_consolidator.rb:29:in `reverse_each'
/plugins/discourse-ai/lib/personas/bot.rb:58:in `reply'
/plugins/discourse-ai/app/jobs/regular/create_ai_reply.rb:18:in `execute'

Logged msg[:content] right before the tokenizer:

["Please summarise …", { upload_id: 12345 }]

=> Array

QuestionConsolidator.revised_prompt passes this Array directly to Tokenizer.encode, which expects a String, hence the error. Personas that don’t include an image hash avoid this path, so the bug is sporadic.

🌱 Fix direction
• Keep the Array in generate_image_caption (no join there).
• Add a guard only where the Array reaches revised_prompt, e.g.:

prompt = prompt.is_a?(Array) ? prompt.map(&:to_s).join("\n") : prompt
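
A quick behaviour check of that guard (illustrative values): it is a no-op for plain strings, so only the array path that reaches the tokenizer is affected.

guard = ->(prompt) { prompt.is_a?(Array) ? prompt.map(&:to_s).join("\n") : prompt }

guard.call("<input>plain user text</input>")             # unchanged String
guard.call(["Please summarise …", { upload_id: 12345 }]) # flattened only on this path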

@romanrizzi
Member

Perfect. Thanks, @ieduer. I noticed your backtrace is related to the AI Bot, not the AI Helper. Can you confirm?

Coincidentally, I fixed that error in #1475. It was happening when a Persona uses RAG and a post has uploads.

@ieduer
Author

ieduer commented Jul 1, 2025

Confirmed – the trace I captured was indeed on the AI Bot path (RAG + upload), not the AI Helper prompt flow. I’ve just tested with your patch in #1475 applied and the error no longer occurs. Looks like your fix fully covers the issue. 🎉

Given that, I’ll close this PR to avoid redundancy. Thanks for the clarification and the quick fix!

@ieduer ieduer closed this Jul 1, 2025
