
Code samples invalid for llama-3.2-11b-vision-instruct Workers AI Model #19185

@thomas-desmond

Description

Existing documentation URL(s)

https://developers.cloudflare.com/workers-ai/models/llama-3.2-11b-vision-instruct/

What changes are you suggesting?

None of the code samples for the llama-3.2-11b-vision-instruct Workers AI model work. At its core, this is an image-recognition model, yet the sample code never handles an image.

Something similar to the sample for this image-to-text model may be a better starting point: https://developers.cloudflare.com/workers-ai/models/uform-gen2-qwen-500m/.

I was able to get the following JavaScript code sample to execute in a Worker; however, I'd like someone else to confirm that it follows best practices for this model:

// Fetch a sample image and convert it to an array of bytes,
// which is the format the model accepts for image input.
const res = await fetch("https://cataas.com/cat");
const buffer = await res.arrayBuffer();
const encodedImage = [...new Uint8Array(buffer)];

const response = await env.AI.run('@cf/meta/llama-3.2-11b-vision-instruct', {
  image: encodedImage,
  prompt: 'Tell me what is in the image.',
});

I also have concerns about the Parameters section (https://developers.cloudflare.com/workers-ai/models/llama-3.2-11b-vision-instruct/#Parameters). It's not clear whether prompt and messages are both required input parameters for the model. Based on my testing, you can only supply one or the other, not both, but the documentation doesn't state that anywhere. And if you can only use one, why would you choose one over the other?
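For reference, here is a minimal sketch of what the messages-style input might look like, assuming it mirrors the chat format used by other Workers AI text models; the role/content shape is my guess and is not confirmed for this vision model:

```javascript
// ASSUMPTION: the vision model accepts a chat-style `messages` array like
// other Workers AI text models do. Unverified for this specific model.
// Placeholder bytes standing in for a real encoded image:
const encodedImage = [137, 80, 78, 71]; // hypothetical; use real image bytes

// Per my testing, supply EITHER `prompt` OR `messages`, never both.
const input = {
  image: encodedImage,
  messages: [
    { role: "system", content: "You describe images concisely." },
    { role: "user", content: "Tell me what is in the image." },
  ],
};

// const response = await env.AI.run('@cf/meta/llama-3.2-11b-vision-instruct', input);
```

If that shape is correct, the docs could then explain the trade-off: prompt for a single unscaffolded instruction, messages when you need a system role or multi-turn context.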

Additional information

No response

Metadata

Labels

content:edit (Request for content edits), documentation (Documentation edits), product:workers-ai (Workers AI: https://developers.cloudflare.com/workers-ai/)
