Commit 9c080b2

add Image support in Java file for multimodal
1 parent 365cfdb commit 9c080b2

articles/ai-foundry/model-inference/includes/use-chat-multi-modal/java.md

Lines changed: 26 additions & 0 deletions
@@ -16,3 +16,29 @@ zone_pivot_groups: azure-ai-inference-samples

> [!NOTE]
> Audio inputs are supported only with Python, JavaScript, C#, or REST requests.

## Use chat completions with images

Some models can reason across text and images and generate text completions based on both kinds of input. In this section, you explore the vision capabilities of such models in a chat conversation.

> [!IMPORTANT]
> Some models support only one image per turn in the chat conversation, and only the last image is retained in context. Adding multiple images results in an error.

To see this capability, download an image and encode it as a `base64` string. The resulting data should be placed inside a [data URL](https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/Data_URLs):
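
The encoding step can be sketched with the Java standard library alone. This is a minimal illustration, not the sample from the SDK: the class name `ImageDataUrl`, the helper `toDataUrl`, and the three-byte stand-in image are assumptions made for the example. Pass a file path as the first argument to encode a real image instead.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class ImageDataUrl {

    // Hypothetical helper: builds a data URL of the form
    // data:<mime-type>;base64,<payload> from raw image bytes.
    static String toDataUrl(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + encoded;
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes;
        if (args.length > 0) {
            // Read a previously downloaded image from disk.
            bytes = Files.readAllBytes(Paths.get(args[0]));
        } else {
            // Stand-in data: the first three bytes of a JPEG file.
            bytes = new byte[] {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
        }
        // The MIME type is assumed to be JPEG here; adjust it to match the image.
        System.out.println(toDataUrl(bytes, "image/jpeg"));
    }
}
```

The MIME type in the data URL should match the actual image format, for example `image/png` for a PNG file.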

Visualize the image:

:::image type="content" source="../../../../ai-foundry/media/how-to/sdks/small-language-models-chart-example.jpg" alt-text="A chart displaying the relative capabilities between large language models and small language models." lightbox="../../../../ai-foundry/media/how-to/sdks/small-language-models-chart-example.jpg":::

Now, create a chat completion request with the image:
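
As a rough sketch of what such a request can look like at the REST level, the following example builds a JSON body with one user turn that contains a text part and a single image part, then wraps it in a `java.net.http.HttpRequest` (Java 11 or later) without sending it. It does not use the Java inference SDK. The class name `ImageChatRequest`, the helpers `jsonEscape` and `buildBody`, the model name `my-vision-model`, the question text, and the `example.invalid` endpoint are all placeholders, and the `image_url` content shape is assumed to follow the common chat completions convention. Authentication headers are omitted.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ImageChatRequest {

    // Minimal JSON string escaping: quotes, backslashes, and common whitespace escapes only.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Hypothetical helper: one user message with a text part and exactly one image part.
    static String buildBody(String model, String question, String dataUrl) {
        return "{\"model\":\"" + jsonEscape(model) + "\","
            + "\"messages\":[{\"role\":\"user\",\"content\":["
            + "{\"type\":\"text\",\"text\":\"" + jsonEscape(question) + "\"},"
            + "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + jsonEscape(dataUrl) + "\"}}"
            + "]}]}";
    }

    public static void main(String[] args) {
        // Placeholder values; replace them with your model name and encoded image.
        String body = buildBody("my-vision-model", "What does this chart show?",
            "data:image/jpeg;base64,/9j/");

        // Placeholder endpoint; the request is built but never sent.
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://example.invalid/chat/completions"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();

        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Keeping a single image part in the user message is consistent with the one-image-per-turn limit noted earlier in this section.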

The response is as follows, where you can see the model's usage statistics:

```console
ASSISTANT: The chart illustrates that larger models tend to perform better in quality, as indicated by their size in billions of parameters. However, there are exceptions to this trend, such as Phi-3-medium and Phi-3-small, which outperform smaller models in quality. This suggests that while larger models generally have an advantage, there might be other factors at play that influence a model's performance.
Model: mistral-large-2407
Usage:
  Prompt tokens: 2380
  Completion tokens: 126
  Total tokens: 2506
```
