Commit 19ebb02

Passing image guide.
1 parent fafd4b4 commit 19ebb02

File tree: 4 files changed, +186 -1 lines changed
io/quarkiverse/langchain4j/samples/images/Endpoint.java

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@

// tag::head[]
package io.quarkiverse.langchain4j.samples.images;

import java.io.IOException;
import java.nio.file.Files;
import java.util.Base64;

import jakarta.inject.Inject;
import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.QueryParam;

import dev.langchain4j.data.image.Image;

@Path("/endpoint")
public class Endpoint {

    @Inject
    ImageAiService imageAiService;

    // end::head[]
    // tag::url[]
    @GET
    @Path("/extract-menu")
    public String fromUrl(@QueryParam("u") String url) { // <1>
        return imageAiService.extractMenu(url);
    }
    // end::url[]

    // tag::ocr[]
    @GET
    @Path("/ocr-process")
    public String passingImage() throws IOException {
        byte[] bytes = Files.readAllBytes(java.nio.file.Path.of("IMG_3283.jpg"));
        String b64 = Base64.getEncoder().encodeToString(bytes);
        Image img = Image.builder()
                .base64Data(b64)
                .mimeType("image/jpeg")
                .build();

        return imageAiService.extractReceiptData(img);
    }
    // end::ocr[]

    // tag::head[]
}
// end::head[]
io/quarkiverse/langchain4j/samples/images/ImageAiService.java

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@

// tag::head[]
package io.quarkiverse.langchain4j.samples.images;

import dev.langchain4j.data.image.Image;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import io.quarkiverse.langchain4j.ImageUrl;
import io.quarkiverse.langchain4j.RegisterAiService;

@RegisterAiService
@SystemMessage("Extract and summarize text from the provided image.")
public interface ImageAiService {
    // end::head[]

    // tag::url[]
    @UserMessage("""
            Here is a menu image.
            Extract the list of items.
            """)
    String extractMenu(@ImageUrl String imageUrl); // <1>
    // end::url[]

    // tag::ocr[]
    @UserMessage("""
            Extract the content of this receipt.
            Identify the vendor, date, location, and paid amount and currency (euros or USD).
            Make sure the paid amount includes VAT.
            For each piece of information, add the line from the receipt where you found it.
            """)
    String extractReceiptData(Image image); // <1>
    // end::ocr[]
    // tag::head[]
}
// end::head[]

docs/modules/ROOT/nav.adoc

Lines changed: 2 additions & 1 deletion
@@ -17,6 +17,7 @@
 * xref:guide-fault-tolerance.adoc[Fault Tolerance]
 * xref:guide-csv.adoc[Index CSVs in a RAG pipeline]
 * xref:guide-web-search.adoc[Using Tavily Web Search]
+* xref:guide-passing-image.adoc[Passing Images to Models]
 // * xref:guide-agentic-patterns.adoc[Implementing Agentic patterns]
 // * xref:guide-structured-output.adoc[Returning structured data from a model]
 // * xref:guide-streamed-responses.adoc[Using function calling]
@@ -25,7 +26,7 @@

 // * xref:guide-local-models.adoc[Using local models]
 // * xref:guide-in-process-models.adoc[Using in-process models]
-// * xref:guide-passing-images.adoc[Passing Images to Models]
+
 // * xref:guide-generating-images.adoc[Generating Images from Prompts]
 // Add evaluation and guardrails and testing guides
 // Give knowledge to AI models
docs/modules/ROOT/pages/guide-passing-image.adoc

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@

= Extracting data from images

include::./includes/attributes.adoc[]
include::./includes/customization.adoc[]

Some large language models now support vision inputs, letting you automate tasks such as OCR-ing receipts, detecting objects in photos, or generating image captions.
This guide shows you how to build a Quarkus microservice that sends image data, either as a URL or as Base64, to a vision-capable LLM (e.g., GPT-4o) using Quarkus LangChain4j.

== Prerequisites

* A Quarkus project with the `quarkus-langchain4j-openai` extension (or another model provider that supports a model with vision capabilities)
* `quarkus.langchain4j.openai.api-key` set in `application.properties`
* A vision-capable model (for example, `gpt-4.1-mini` or `o3`; the default model has vision capabilities, but you can specify a different one if needed)

[source,properties]
----
quarkus.langchain4j.openai.api-key=${OPENAI_API_KEY}
quarkus.langchain4j.openai.chat-model.model-name=gpt-4.1-mini
----

* Set the temperature to 0.0 for deterministic outputs, especially for tasks like OCR or object detection where precision matters:

[source,properties]
----
quarkus.langchain4j.openai.chat-model.temperature=0
----

== Vision Capability

Vision-capable LLMs can process and understand images alongside text.
Common use cases include:

* **OCR (Optical Character Recognition)** – extract text from receipts, invoices, or documents
* **Object Detection** – identify and classify objects in a photo
* **Image Captioning** – generate descriptive text for an image
* **Visual Question Answering** – answer questions about image content

NOTE: Image payloads count toward the model's context window limits.
Always validate image size and format before sending.
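When budgeting payload size, keep in mind that Base64 inflates the data by roughly a third. A quick sketch of the arithmetic (the 4/3 expansion is a property of Base64 itself, not of any particular model or provider):

```java
import java.util.Base64;

public class PayloadSize {

    // Base64 emits 4 output characters for every 3 input bytes,
    // padding the final group, so the encoded length is 4 * ceil(n / 3).
    static int base64Length(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        byte[] image = new byte[3_000_000]; // a hypothetical 3 MB image
        String encoded = Base64.getEncoder().encodeToString(image);
        System.out.println(encoded.length());           // prints 4000000
        System.out.println(base64Length(image.length)); // matches the real encoder
    }
}
```

So a file that fits a raw-byte limit can still exceed a request-size limit once encoded; compare the *encoded* length against your budget.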

== Step 1. Define the AI service

Declare an AI service interface to encapsulate your vision calls:

[source,java]
----
include::{examples-dir}/io/quarkiverse/langchain4j/samples/images/ImageAiService.java[tags=head]
----

Here, `@RegisterAiService` creates the xref:ai-services.adoc[AI Service], and `@SystemMessage` supplies the global instruction for all methods in the service.

== Step 2. Passing an image by URL

Use `@ImageUrl` to mark a `String` parameter as a remote image URL:

[source,java]
----
include::{examples-dir}/io/quarkiverse/langchain4j/samples/images/ImageAiService.java[tags=head;url]
----
<1> The `@ImageUrl` annotation tells Quarkus LangChain4j to wrap this `String` as an image URL payload.

[source,java]
----
include::{examples-dir}/io/quarkiverse/langchain4j/samples/images/Endpoint.java[tags=head;url]
----
<1> This endpoint accepts `?u=<imageUrl>` and returns the extracted data.
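When calling this endpoint, the image URL must itself be percent-encoded to survive as a query parameter. A minimal client-side sketch of building the request URI (the `localhost` base address and `MenuClient` name are hypothetical):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class MenuClient {

    // Percent-encodes the image URL so it can ride inside the ?u= query parameter.
    static URI requestFor(String base, String imageUrl) {
        return URI.create(base + "/endpoint/extract-menu?u="
                + URLEncoder.encode(imageUrl, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(requestFor("http://localhost:8080",
                "https://example.com/menu.png"));
    }
}
```

Without the encoding step, the `://` and `/` characters of the inner URL would be misparsed as part of the outer request path.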
68+
69+
70+
== Step 3. Passing images as Base64 data
71+
72+
Use the `Image` data type for local or in-memory images:
73+
74+
[source,java]
75+
----
76+
include::{examples-dir}/io/quarkiverse/langchain4j/samples/images/ImageAiService.java[tags=head;ocr]
77+
----
78+
<1> The `Image` parameter carries Base64 data plus a _MIME_ type.
79+
80+
In your application code, read and encode the image:
81+
82+
[source,java]
83+
----
84+
include::{examples-dir}/io/quarkiverse/langchain4j/samples/images/Endpoint.java[tags=head;ocr]
85+
----
86+
87+
== Error-Handling Tips
88+
89+
* **Invalid URL or unreachable host:** makes sure the URL is valid and accessible.
90+
* **Oversized Base64 payload:** validate file size (e.g., `< 4 MB`) before encoding to avoid context-window errors.
91+
* **Unsupported MIME type:** check file extension and only accept `image/jpeg`, `image/png`, etc.
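The last two checks can be combined into a small guard that runs before encoding. A minimal sketch, assuming a 4 MB cap and an extension-based MIME lookup (both illustrative choices, not requirements of any API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

public class ImageGuard {

    static final long MAX_BYTES = 4L * 1024 * 1024; // illustrative 4 MB cap
    static final Set<String> ALLOWED = Set.of("image/jpeg", "image/png");

    // Derives a MIME type from the file extension; returns null when unsupported.
    static String mimeTypeOf(Path path) {
        String name = path.getFileName().toString().toLowerCase();
        if (name.endsWith(".jpg") || name.endsWith(".jpeg")) return "image/jpeg";
        if (name.endsWith(".png")) return "image/png";
        return null;
    }

    // Returns the MIME type if the file passes both checks, otherwise throws.
    static String validate(Path path) throws IOException {
        String mime = mimeTypeOf(path);
        if (mime == null || !ALLOWED.contains(mime)) {
            throw new IllegalArgumentException("Unsupported image type: " + path);
        }
        if (Files.size(path) > MAX_BYTES) {
            throw new IllegalArgumentException("Image larger than " + MAX_BYTES + " bytes: " + path);
        }
        return mime;
    }
}
```

The returned MIME type can then be fed straight into `Image.builder().mimeType(...)`, so the value sent to the model always matches the file that was actually read.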

== Conclusion

In this guide, you learned two ways to pass images to a vision-capable LLM using Quarkus LangChain4j:

* By URL, with `@ImageUrl`
* By Base64 data, with the `Image` type

Next steps:

* Combine text and image inputs in a single prompt for richer multimodal interactions
* Chain image extraction into downstream workflows (e.g., store OCR results in a database)
