Commit 0b6bc03

Jonas-Isr, bot-sdk-js, MatKuhr, and CharlesDuboisSAP authored
feat: Orchestration image support (#294)
* Call Orchestration with image and multiString via executeRequestFromJsonModuleConfig
* Work in Progress
* Align System- and AssistantMessage with UserMessage
* Improve exceptions
* Improve tests
* Prepare draft
* Restrict Multi- and ImageContent to appropriate classes
* Make content() return MessageContent object
* Rename addTextMessages() to addText()
* Remove or hide unneeded constructors, use Message.user() etc. instead
* Add simple e2e tests
* Prepare draft
* Small change
* Refactor newly introduced classes
* WIP
* add test for base64 image
* Small cleanups, no more unnecessary Exceptions
* Small changes: ImageItem.DetailLevel is not mandatory by spec, refactor MessageContent construction
* Delete MessageContent.toString()
* Bit of clean up
* Add/improve javadocs
* Add/improve javadocs some more
* Add annotations
* Add explicit allArgs constructor to ImageItem
* Fix codestyle etc.
* Fix order in add methods
* Improve unit test
* Improve e2e test
* Small fixes after merge
* Add unit test for message construction
* Fix sample app after merge
* Simplify multiMessage unit test
* Formatting
* Add constructor for multiple strings for UserMessage and MessageContent
* Small changes
* Add documentation and release notes
* Improve documentation
* Update orchestration/src/main/java/com/sap/ai/sdk/orchestration/MessageContent.java Co-authored-by: Matthias Kuhr <[email protected]>
* Minor changes
* Make `.content()` @beta
* change method names from `addXYZ()` to `andXYZ()`
* Delete unnecessary @nonnull from ImageItem constructor
* Improve tests
* increase coverage
* We hate Jacoco
* Update docs/release-notes/release_notes.md Co-authored-by: Charles Dubois <[email protected]>
* Simplify logic
* Small fixes
* Reduce and streamline amount of public API
* Rename convenience methods to `withXyz()`
* Simplify code and adapt jacoco coverage
* Small change
* Add release number to javadocs
* Fit jacoco coverage
* Rename MessageContent.contentItemList to MessageContent.items
* Update docs

---------

Co-authored-by: Jonas Israel <[email protected]>
Co-authored-by: SAP Cloud SDK Bot <[email protected]>
Co-authored-by: Matthias Kuhr <[email protected]>
Co-authored-by: I538344 <[email protected]>
Co-authored-by: Charles Dubois <[email protected]>
1 parent 0a9d57a commit 0b6bc03

23 files changed: +1130 -608 lines changed


docs/guides/ORCHESTRATION_CHAT_COMPLETION.md

Lines changed: 41 additions & 0 deletions
@@ -259,6 +259,47 @@ try (Stream<String> stream = client.streamChatCompletion(prompt, config)) {
 Please find [an example in our Spring Boot application](../../sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java).
 It shows the usage of Spring Boot's `ResponseBodyEmitter` to stream the chat completion delta messages to the frontend in real-time.

+
+## Add images and multiple text inputs to a message
+
+It's possible to add images and multiple text inputs to a message.
+
+### Add images to a message
+
+An image can be added to a message as follows.
+
+```java
+var message = Message.user("Describe the following image");
+var newMessage = message.withImage("https://url.to/image.jpg");
+```
+
+You can also construct a message with an image directly, using the `ImageItem` class.
+
+```java
+var message = Message.user(new ImageItem("https://url.to/image.jpg"));
+```
+
+Some AI models, like GPT 4o, support additionally setting the detail level with which the image is read. This can be set via the `DetailLevel` parameter.
+
+```java
+var newMessage = message.withImage("https://url.to/image.jpg", ImageItem.DetailLevel.LOW);
+```
+Note, that currently only user messages are supported for image attachments.
+
+### Add multiple text inputs to a message
+
+It's also possible to add multiple text inputs to a message. This can be useful for providing additional context to the AI model. You can add additional text inputs as follows.
+
+```java
+var message = Message.user("What is chess about?");
+var newMessage = message.withText("Answer in two sentences.");
+```
+
+Note, that only user and system messages are supported for multiple text inputs.
+
+Please find [an example in our Spring Boot application](../../sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java).
+
+
 ## Set model parameters

 Change your LLM configuration to add model parameters:
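
The guide's examples pass an `https` URL to `withImage`. The commit message also lists "add test for base64 image", which suggests that local image bytes can be sent as a `data:` URL. The following is a minimal sketch of building such a URL with the JDK only. `ImageDataUrl` and `toDataUrl` are hypothetical names that are not part of the SDK, and whether a given model accepts the resulting value is an assumption to verify.

```java
import java.util.Base64;

/** Hypothetical helper, not part of the SDK: encodes raw image bytes as a data URL. */
final class ImageDataUrl {
  private ImageDataUrl() {}

  /** Builds "data:<mimeType>;base64,<payload>" from the given bytes. */
  static String toDataUrl(final String mimeType, final byte[] imageBytes) {
    final String encoded = Base64.getEncoder().encodeToString(imageBytes);
    return "data:" + mimeType + ";base64," + encoded;
  }

  public static void main(final String[] args) {
    // Three placeholder bytes stand in for real image data.
    final byte[] bytes = {1, 2, 3};
    System.out.println(toDataUrl("image/png", bytes)); // prints data:image/png;base64,AQID
  }
}
```

Under that assumption, the returned string would take the place of the URL, for example `message.withImage(ImageDataUrl.toDataUrl("image/png", bytes))`.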

docs/release-notes/release_notes.md

Lines changed: 4 additions & 2 deletions
@@ -8,12 +8,14 @@

 ### 🔧 Compatibility Notes

--
+- `Message.content()` returns a `ContentItem` now instead of a `String`. Use `((TextItem) Message.content().items().get(0)).text()` if the corresponding `ContentItem` is a `TextItem` and the string representation is needed.

 ### ✨ New Functionality

 - Upgrade to release 2502a of AI Core.
-- [Add Orchestration `LlamaGuardFilter`](../guides/ORCHESTRATION_CHAT_COMPLETION.md#chat-completion-filter).
+- Orchestration:
+  - [Add `LlamaGuardFilter`](../guides/ORCHESTRATION_CHAT_COMPLETION.md#chat-completion-filter).
+  - [Convenient methods to create messages containing images and multiple text inputs](../guides/ORCHESTRATION_CHAT_COMPLETION.md#add-images-and-multiple-text-inputs-to-a-message)

 ### 📈 Improvements
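
The cast in the compatibility note, `((TextItem) Message.content().items().get(0)).text()`, fails with a `ClassCastException` when the first item is an image and with an `IndexOutOfBoundsException` when the list is empty. Note also that in the code of this commit `content()` is declared to return a `MessageContent`, whose `items()` are `ContentItem`s. A more defensive way to read the text could look like the sketch below. The nested types are simplified stand-ins for the SDK classes so that the example compiles on its own, and `allText` is a hypothetical helper.

```java
import java.util.List;
import java.util.stream.Collectors;

final class ContentTextDemo {
  // Simplified stand-ins for the SDK types of the same names (assumption).
  sealed interface ContentItem permits TextItem, ImageItem {}

  record TextItem(String text) implements ContentItem {}

  record ImageItem(String imageUrl) implements ContentItem {}

  record MessageContent(List<ContentItem> items) {}

  /** Hypothetical helper: joins the text of all TextItems and skips images. */
  static String allText(final MessageContent content) {
    return content.items().stream()
        .filter(TextItem.class::isInstance)
        .map(item -> ((TextItem) item).text())
        .collect(Collectors.joining("\n"));
  }

  public static void main(final String[] args) {
    final var content =
        new MessageContent(
            List.of(new TextItem("What is chess about?"), new ImageItem("https://url.to/image.jpg")));
    System.out.println(allText(content)); // prints What is chess about?
  }
}
```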

orchestration/pom.xml

Lines changed: 4 additions & 4 deletions
@@ -31,11 +31,11 @@
   </developers>
   <properties>
     <project.rootdir>${project.basedir}/../</project.rootdir>
-    <coverage.complexity>77%</coverage.complexity>
+    <coverage.complexity>80%</coverage.complexity>
     <coverage.line>92%</coverage.line>
-    <coverage.instruction>92%</coverage.instruction>
-    <coverage.branch>70%</coverage.branch>
-    <coverage.method>92%</coverage.method>
+    <coverage.instruction>93%</coverage.instruction>
+    <coverage.branch>71%</coverage.branch>
+    <coverage.method>95%</coverage.method>
     <coverage.class>100%</coverage.class>
   </properties>

orchestration/src/main/java/com/sap/ai/sdk/orchestration/AssistantMessage.java

Lines changed: 15 additions & 1 deletion
@@ -1,6 +1,9 @@
 package com.sap.ai.sdk.orchestration;

+import com.google.common.annotations.Beta;
+import java.util.List;
 import javax.annotation.Nonnull;
+import lombok.Getter;
 import lombok.Value;
 import lombok.experimental.Accessors;

@@ -13,5 +16,16 @@ public class AssistantMessage implements Message {
   @Nonnull String role = "assistant";

   /** The content of the message. */
-  @Nonnull String content;
+  @Nonnull
+  @Getter(onMethod_ = @Beta)
+  MessageContent content;
+
+  /**
+   * Creates a new assistant message with the given single message.
+   *
+   * @param singleMessage the single message.
+   */
+  public AssistantMessage(@Nonnull final String singleMessage) {
+    content = new MessageContent(List.of(new TextItem(singleMessage)));
+  }
 }
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+package com.sap.ai.sdk.orchestration;
+
+/**
+ * Represents an item in a {@link MessageContent} object.
+ *
+ * @since 1.3.0
+ */
+public sealed interface ContentItem permits TextItem, ImageItem {}
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+package com.sap.ai.sdk.orchestration;
+
+import java.util.Locale;
+import javax.annotation.Nonnull;
+
+/**
+ * Represents an image item in a {@link MessageContent} object.
+ *
+ * @param imageUrl the URL of the image
+ * @param detailLevel the detail level of the image (optional)
+ * @since 1.3.0
+ */
+public record ImageItem(@Nonnull String imageUrl, @Nonnull DetailLevel detailLevel)
+    implements ContentItem {
+
+  /**
+   * Creates a new image item with the given image URL.
+   *
+   * @param imageUrl the URL of the image
+   * @since 1.3.0
+   */
+  public ImageItem(@Nonnull final String imageUrl) {
+    this(imageUrl, DetailLevel.AUTO);
+  }
+
+  /**
+   * The detail level of the image.
+   *
+   * @since 1.3.0
+   */
+  public enum DetailLevel {
+    /** Low detail level. */
+    LOW("low"),
+    /** High detail level. */
+    HIGH("high"),
+    /** Automatic detail level. */
+    AUTO("auto");
+
+    private final String level;
+
+    /**
+     * Converts a string to a detail level.
+     *
+     * @param str the string to convert
+     * @return the detail level
+     * @since 1.3.0
+     */
+    @Nonnull
+    static DetailLevel fromString(@Nonnull final String str) {
+      return DetailLevel.valueOf(str.toUpperCase(Locale.ENGLISH));
+    }
+
+    /**
+     * Get the string representation of the DetailLevel
+     *
+     * @return the DetailLevel as string
+     * @since 1.3.0
+     */
+    @Nonnull
+    public String toString() {
+      return level;
+    }
+
+    DetailLevel(@Nonnull final String level) {
+      this.level = level;
+    }
+  }
+}
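
`DetailLevel.fromString` delegates to `valueOf`, so an unrecognised string raises an `IllegalArgumentException`, and a `null` argument would fail on `toUpperCase`. Since the commit message notes that the detail level is not mandatory by spec, a lenient parser that falls back to `AUTO` may be worth considering. The sketch below uses a standalone copy of the enum. `parseOrAuto` is a hypothetical name and the fallback policy is an assumption, not the SDK's behaviour.

```java
import java.util.Locale;

final class DetailLevelDemo {
  // Standalone stand-in for ImageItem.DetailLevel (assumption: same three constants).
  enum DetailLevel {
    LOW,
    HIGH,
    AUTO
  }

  /** Hypothetical lenient parser: null, blank, or unknown input maps to AUTO. */
  static DetailLevel parseOrAuto(final String raw) {
    if (raw == null || raw.isBlank()) {
      return DetailLevel.AUTO;
    }
    try {
      return DetailLevel.valueOf(raw.trim().toUpperCase(Locale.ENGLISH));
    } catch (final IllegalArgumentException e) {
      // Unknown value: fall back instead of propagating the exception.
      return DetailLevel.AUTO;
    }
  }

  public static void main(final String[] args) {
    System.out.println(parseOrAuto("low")); // prints LOW
    System.out.println(parseOrAuto(null)); // prints AUTO
  }
}
```

Silently falling back hides malformed input, so a stricter variant that logs or rethrows may suit some callers better.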

orchestration/src/main/java/com/sap/ai/sdk/orchestration/Message.java

Lines changed: 46 additions & 13 deletions
@@ -2,43 +2,62 @@

 import com.google.common.annotations.Beta;
 import com.sap.ai.sdk.orchestration.model.ChatMessage;
+import com.sap.ai.sdk.orchestration.model.ImageContent;
+import com.sap.ai.sdk.orchestration.model.ImageContentImageUrl;
+import com.sap.ai.sdk.orchestration.model.MultiChatMessage;
+import com.sap.ai.sdk.orchestration.model.MultiChatMessageContent;
 import com.sap.ai.sdk.orchestration.model.SingleChatMessage;
+import com.sap.ai.sdk.orchestration.model.TextContent;
+import java.util.LinkedList;
+import java.util.List;
 import javax.annotation.Nonnull;

 /** Interface representing convenience wrappers of chat message to the orchestration service. */
 public sealed interface Message permits UserMessage, AssistantMessage, SystemMessage {

   /**
-   * A convenience method to create a user message.
+   * A convenience method to create a user message from a string.
    *
-   * @param msg the message content.
+   * @param message the message content.
    * @return the user message.
    */
   @Nonnull
-  static UserMessage user(@Nonnull final String msg) {
-    return new UserMessage(msg);
+  static UserMessage user(@Nonnull final String message) {
+    return new UserMessage(message);
+  }
+
+  /**
+   * A convenience method to create a user message containing only an image.
+   *
+   * @param imageItem the message content.
+   * @return the user message.
+   * @since 1.3.0
+   */
+  @Nonnull
+  static UserMessage user(@Nonnull final ImageItem imageItem) {
+    return new UserMessage(new MessageContent(List.of(imageItem)));
   }

   /**
    * A convenience method to create an assistant message.
    *
-   * @param msg the message content.
+   * @param message the message content.
    * @return the assistant message.
    */
   @Nonnull
-  static AssistantMessage assistant(@Nonnull final String msg) {
-    return new AssistantMessage(msg);
+  static AssistantMessage assistant(@Nonnull final String message) {
+    return new AssistantMessage(message);
   }

   /**
-   * A convenience method to create a system message.
+   * A convenience method to create a system message from a string.
    *
-   * @param msg the message content.
+   * @param message the message content.
    * @return the system message.
    */
   @Nonnull
-  static SystemMessage system(@Nonnull final String msg) {
-    return new SystemMessage(msg);
+  static SystemMessage system(@Nonnull final String message) {
+    return new SystemMessage(message);
   }

   /**
@@ -48,7 +67,21 @@ static SystemMessage system(@Nonnull final String msg) {
    */
   @Nonnull
   default ChatMessage createChatMessage() {
-    return SingleChatMessage.create().role(role()).content(content());
+    final var itemList = this.content().items();
+    if (itemList.size() == 1 && itemList.get(0) instanceof TextItem textItem) {
+      return SingleChatMessage.create().role(role()).content(textItem.text());
+    }
+    final var contentList = new LinkedList<MultiChatMessageContent>();
+    for (final ContentItem item : itemList) {
+      if (item instanceof TextItem textItem) {
+        contentList.add(TextContent.create().type(TextContent.TypeEnum.TEXT).text(textItem.text()));
+      } else if (item instanceof ImageItem imageItem) {
+        final var detail = imageItem.detailLevel().toString();
+        final var img = ImageContentImageUrl.create().url(imageItem.imageUrl()).detail(detail);
+        contentList.add(ImageContent.create().type(ImageContent.TypeEnum.IMAGE_URL).imageUrl(img));
+      }
+    }
+    return MultiChatMessage.create().role(role()).content(contentList);
   }

   /**
@@ -66,5 +99,5 @@ default ChatMessage createChatMessage() {
    */
   @Nonnull
   @Beta
-  String content();
+  MessageContent content();
 }
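
`createChatMessage()` keeps the plain-string form when a message holds exactly one `TextItem` and switches to a list of typed parts otherwise. The sketch below mirrors that branching with JDK types only. The nested records are stand-ins for the SDK classes, `toWireContent` is a hypothetical name, and the `"text"` and `"image_url"` strings are assumed serialized values of the `TypeEnum` constants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class WireContentDemo {
  // Simplified stand-ins for the SDK types (assumption).
  sealed interface ContentItem permits TextItem, ImageItem {}

  record TextItem(String text) implements ContentItem {}

  record ImageItem(String imageUrl, String detail) implements ContentItem {}

  /** Returns a String for a single text item, otherwise a List of typed parts. */
  static Object toWireContent(final List<ContentItem> items) {
    if (items.size() == 1 && items.get(0) instanceof TextItem single) {
      return single.text();
    }
    final List<Map<String, Object>> parts = new ArrayList<>();
    for (final ContentItem item : items) {
      if (item instanceof TextItem text) {
        parts.add(Map.of("type", "text", "text", text.text()));
      } else if (item instanceof ImageItem image) {
        parts.add(
            Map.of(
                "type", "image_url",
                "image_url", Map.of("url", image.imageUrl(), "detail", image.detail())));
      }
    }
    return parts;
  }

  public static void main(final String[] args) {
    System.out.println(toWireContent(List.of(new TextItem("Describe the following image"))));
  }
}
```

One consequence of this design is that a message consisting of a single image still takes the list branch, since only a lone text item qualifies for the string form.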
Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+package com.sap.ai.sdk.orchestration;
+
+import com.sap.ai.sdk.orchestration.model.ImageContent;
+import com.sap.ai.sdk.orchestration.model.MultiChatMessageContent;
+import com.sap.ai.sdk.orchestration.model.TextContent;
+import java.util.List;
+import javax.annotation.Nonnull;
+
+/**
+ * Represents the content of a chat message.
+ *
+ * @param items a list of the content items
+ * @since 1.3.0
+ */
+public record MessageContent(@Nonnull List<ContentItem> items) {
+  @Nonnull
+  static MessageContent fromMCMContentList(
+      @Nonnull final List<MultiChatMessageContent> mCMContentList) {
+    final var itemList =
+        mCMContentList.stream()
+            .map(
+                content -> {
+                  if (content instanceof TextContent text) {
+                    return new TextItem(text.getText());
+                  } else {
+                    final var imageUrl = ((ImageContent) content).getImageUrl();
+                    return (ContentItem)
+                        new ImageItem(
+                            imageUrl.getUrl(),
+                            ImageItem.DetailLevel.fromString(imageUrl.getDetail()));
+                  }
+                })
+            .toList();
+    return new MessageContent(itemList);
+  }
+}

orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationChatResponse.java

Lines changed: 30 additions & 15 deletions
@@ -6,6 +6,7 @@
 import com.sap.ai.sdk.orchestration.model.CompletionPostResponse;
 import com.sap.ai.sdk.orchestration.model.LLMChoice;
 import com.sap.ai.sdk.orchestration.model.LLMModuleResultSynchronous;
+import com.sap.ai.sdk.orchestration.model.MultiChatMessage;
 import com.sap.ai.sdk.orchestration.model.SingleChatMessage;
 import com.sap.ai.sdk.orchestration.model.TokenUsage;
 import java.util.ArrayList;
@@ -51,33 +52,47 @@ public TokenUsage getTokenUsage() {
   /**
    * Get all messages. This can be used for subsequent prompts as a message history.
    *
-   * @throws UnsupportedOperationException if the MultiChatMessage type message in chat.
+   * @throws IllegalArgumentException if the MultiChatMessage type message in chat.
    * @return A list of all messages.
    */
   @Nonnull
-  public List<Message> getAllMessages() throws UnsupportedOperationException {
+  public List<Message> getAllMessages() throws IllegalArgumentException {
     final var messages = new ArrayList<Message>();
-
     for (final ChatMessage chatMessage : originalResponse.getModuleResults().getTemplating()) {
       if (chatMessage instanceof SingleChatMessage simpleMsg) {
-        final var message =
-            switch (simpleMsg.getRole()) {
-              case "user" -> new UserMessage(simpleMsg.getContent());
-              case "assistant" -> new AssistantMessage(simpleMsg.getContent());
-              case "system" -> new SystemMessage(simpleMsg.getContent());
-              default -> throw new IllegalStateException("Unexpected role: " + simpleMsg.getRole());
-            };
-        messages.add(message);
+        messages.add(chatMessageIntoMessage(simpleMsg));
+      } else if (chatMessage instanceof MultiChatMessage mCMessage) {
+        messages.add(chatMessageIntoMessage(mCMessage));
       } else {
-        throw new UnsupportedOperationException(
-            "Messages of MultiChatMessage type not supported by convenience API");
+        throw new IllegalArgumentException(
+            "Messages of type " + chatMessage.getClass() + " are not supported by convenience API");
       }
     }
-
-    messages.add(new AssistantMessage(getChoice().getMessage().getContent()));
+    messages.add(Message.assistant(getChoice().getMessage().getContent()));
     return messages;
   }

+  @Nonnull
+  private Message chatMessageIntoMessage(@Nonnull final SingleChatMessage simpleMsg) {
+    return switch (simpleMsg.getRole()) {
+      case "user" -> Message.user(simpleMsg.getContent());
+      case "assistant" -> Message.assistant(simpleMsg.getContent());
+      case "system" -> Message.system(simpleMsg.getContent());
+      default -> throw new IllegalStateException("Unexpected role: " + simpleMsg.getRole());
+    };
+  }
+
+  @Nonnull
+  private Message chatMessageIntoMessage(@Nonnull final MultiChatMessage mCMessage) {
+    return switch (mCMessage.getRole()) {
+      case "user" -> new UserMessage(MessageContent.fromMCMContentList(mCMessage.getContent()));
+      case "system" -> new SystemMessage(MessageContent.fromMCMContentList(mCMessage.getContent()));
+      default ->
+          throw new IllegalStateException(
+              "Unexpected role with complex message: " + mCMessage.getRole());
+    };
+  }
+
   /**
    * Get the LLM response. Useful for accessing the finish reason or further data like logprobs.
    *
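
The `MultiChatMessage` overload of `chatMessageIntoMessage` accepts only the `user` and `system` roles, so a multi-part message with any other role, including `assistant`, ends in an `IllegalStateException`. The sketch below isolates that role dispatch so the behaviour can be seen without the SDK. `RoleDispatchDemo` and `describeMultiPart` are hypothetical names, and the returned strings merely stand in for the constructed message objects.

```java
final class RoleDispatchDemo {
  private RoleDispatchDemo() {}

  /** Mirrors the role switch for multi-part messages: only user and system are accepted. */
  static String describeMultiPart(final String role, final int partCount) {
    return switch (role) {
      case "user" -> "UserMessage with " + partCount + " parts";
      case "system" -> "SystemMessage with " + partCount + " parts";
      default ->
          throw new IllegalStateException("Unexpected role with complex message: " + role);
    };
  }

  public static void main(final String[] args) {
    System.out.println(describeMultiPart("user", 2)); // prints UserMessage with 2 parts
    try {
      describeMultiPart("assistant", 2);
    } catch (final IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Callers that reuse `getAllMessages()` as a message history may therefore want to handle `IllegalStateException` in addition to the documented `IllegalArgumentException`.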
