Commit 76fe48d

feat: [Orchestration] Enable and test setting response_format (#328)
* Add e2e test using low-level classes
* Rename to jsonSchema
* Add simple unit test
* Add unit test for JSON object
* Improve unit test for JSON schema
* Add unit test for response format text
* Add e2e tests for all response formats
* Add docs, release notes, and links in sample app
* Small fixes
* Update release notes
* Rename unit test functions
* Update release_notes.md
* Update docs/guides/ORCHESTRATION_CHAT_COMPLETION.md
  Co-authored-by: Charles Dubois <[email protected]>
* Improve explanations
* Shorten unit tests
* Shorten unit tests more
* Update sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
  Co-authored-by: Charles Dubois <[email protected]>
* Update sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
  Co-authored-by: Charles Dubois <[email protected]>
* Update sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
  Co-authored-by: Charles Dubois <[email protected]>
* Add comment about future convenient layer to docs.

---------

Co-authored-by: Jonas Israel <[email protected]>
Co-authored-by: Charles Dubois <[email protected]>
1 parent f3b2185 commit 76fe48d

14 files changed, +651 -25 lines changed

docs/guides/ORCHESTRATION_CHAT_COMPLETION.md

Lines changed: 69 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -12,6 +12,8 @@
1212
- [Data Masking](#data-masking)
1313
- [Grounding](#grounding)
1414
- [Stream chat completion](#stream-chat-completion)
15+
- [Add images and multiple text inputs to a message](#add-images-and-multiple-text-inputs-to-a-message)
16+
- [Set a Response Format](#set-a-response-format)
1517
- [Set Model Parameters](#set-model-parameters)
1618
- [Using a Configuration from AI Launchpad](#using-a-configuration-from-ai-launchpad)
1719

@@ -300,6 +302,73 @@ Note, that only user and system messages are supported for multiple text inputs.
300302
Please find [an example in our Spring Boot application](../../sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java).
301303

302304

305+
## Set a Response Format
306+
307+
You can set the response format for the chat completion. The available options are `JSON_OBJECT`, `JSON_SCHEMA`, and `TEXT`; `TEXT` is the default.
308+
309+
### JSON_OBJECT
310+
311+
Setting the response format to `JSON_OBJECT` tells the AI to respond with JSON, i.e., the response will be a string containing valid JSON. However, this does not guarantee that the response adheres to a specific structure (other than being valid JSON).
312+
313+
```java
314+
var template = Message.user("What is 'apple' in German?");
315+
var templatingConfig =
316+
Template.create()
317+
.template(List.of(template.createChatMessage()))
318+
.responseFormat(
319+
ResponseFormatJsonObject.create()
320+
.type(ResponseFormatJsonObject.TypeEnum.JSON_OBJECT));
321+
var configWithTemplate = llmWithImageSupportConfig.withTemplateConfig(templatingConfig);
322+
323+
var prompt =
324+
new OrchestrationPrompt(
325+
Message.system(
326+
"You are a language translator. Answer using the following JSON format: {\"language\": ..., \"translation\": ...}"));
327+
var response = client.chatCompletion(prompt, configWithTemplate).getContent();
328+
```
329+
Note that the prompt must explicitly tell the AI model to return a JSON object. The result might not adhere exactly to the JSON format given in the prompt, but it will be a JSON object.
330+
331+
332+
### JSON_SCHEMA
333+
334+
If you want the response to not only consist of valid JSON but also adhere to a specific JSON schema, use `JSON_SCHEMA`. To do so, add a JSON schema to the configuration as shown below, and the response will adhere to the given schema.
335+
336+
```java
337+
var template = Message.user("Whats '%s' in German?".formatted(word));
338+
var schema =
339+
Map.of(
340+
"type",
341+
"object",
342+
"properties",
343+
Map.of(
344+
"language", Map.of("type", "string"),
345+
"translation", Map.of("type", "string")),
346+
"required",
347+
List.of("language", "translation"),
348+
"additionalProperties",
349+
false);
350+
351+
// Note: we plan to add more convenient ways to add a JSON schema in the future.
352+
var templatingConfig =
353+
Template.create()
354+
.template(List.of(template.createChatMessage()))
355+
.responseFormat(
356+
ResponseFormatJsonSchema.create()
357+
.type(ResponseFormatJsonSchema.TypeEnum.JSON_SCHEMA)
358+
.jsonSchema(
359+
ResponseFormatJsonSchemaJsonSchema.create()
360+
.name("translation_response")
361+
.schema(schema)
362+
.strict(true)
363+
.description("Output schema for language translation.")));
364+
var configWithTemplate = llmWithImageSupportConfig.withTemplateConfig(templatingConfig);
365+
366+
var prompt = new OrchestrationPrompt(Message.system("You are a language translator."));
367+
var response = client.chatCompletion(prompt, configWithTemplate).getContent();
368+
```
369+
370+
Please find [an example in our Spring Boot application](../../sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java).
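The comment in the `JSON_SCHEMA` snippet above says that more convenient ways to supply a schema are planned. Until then, the nested `Map.of(...)` calls could be wrapped in a small helper. The following is a minimal sketch under stated assumptions: `JsonSchemaSketch` and `stringObjectSchema` are hypothetical names that are not part of the SDK, and the helper only covers flat objects whose properties are all required strings.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the SDK: builds a flat JSON object schema as a Map. */
final class JsonSchemaSketch {

  private JsonSchemaSketch() {}

  /**
   * Builds a schema for an object whose listed properties are all required strings
   * and which forbids additional properties.
   */
  static Map<String, Object> stringObjectSchema(String... propertyNames) {
    // LinkedHashMap keeps the declared property order if the map is serialized later.
    Map<String, Object> properties = new LinkedHashMap<>();
    for (String name : propertyNames) {
      properties.put(name, Map.of("type", "string"));
    }

    Map<String, Object> schema = new LinkedHashMap<>();
    schema.put("type", "object");
    schema.put("properties", properties);
    schema.put("required", List.of(propertyNames));
    schema.put("additionalProperties", false);
    return schema;
  }

  public static void main(String[] args) {
    // Same content as the hand-written schema in the JSON_SCHEMA example.
    System.out.println(stringObjectSchema("language", "translation"));
  }
}
```

The returned map has the same content as the hand-written `schema` in the example above, so it could presumably be passed to `.schema(...)` in its place, assuming that method accepts such a map as the snippet suggests.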
371+
303372
## Set model parameters
304373

305374
Change your LLM configuration to add model parameters:

docs/release-notes/release_notes.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616
- Orchestration:
1717
- [Add `LlamaGuardFilter`](https://github.com/SAP/ai-sdk-java/tree/main/docs/guides/ORCHESTRATION_CHAT_COMPLETION.md#chat-completion-filter).
1818
- [Convenient methods to create messages containing images and multiple text inputs](https://github.com/SAP/ai-sdk-java/tree/main/docs/guides/ORCHESTRATION_CHAT_COMPLETION.md#add-images-and-multiple-text-inputs-to-a-message)
19+
- [Enable setting the response format](https://github.com/SAP/ai-sdk-java/tree/main/docs/guides/ORCHESTRATION_CHAT_COMPLETION.md#set-a-response-format)
1920

2021
### 📈 Improvements
2122

orchestration/src/main/java/com/sap/ai/sdk/orchestration/ConfigToRequestTransformer.java

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -49,14 +49,15 @@ static TemplatingModuleConfig toTemplateModuleConfig(
4949
* To be fixed with https://github.tools.sap/AI/llm-orchestration/issues/662
5050
*/
5151
val messages = template instanceof Template t ? t.getTemplate() : List.<ChatMessage>of();
52+
val responseFormat = template instanceof Template t ? t.getResponseFormat() : null;
5253
val messagesWithPrompt = new ArrayList<>(messages);
5354
messagesWithPrompt.addAll(
5455
prompt.getMessages().stream().map(Message::createChatMessage).toList());
5556
if (messagesWithPrompt.isEmpty()) {
5657
throw new IllegalStateException(
5758
"A prompt is required. Pass at least one message or configure a template with messages or a template reference.");
5859
}
59-
return Template.create().template(messagesWithPrompt);
60+
return Template.create().template(messagesWithPrompt).responseFormat(responseFormat);
6061
}
6162

6263
@Nonnull
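
The change to `toTemplateModuleConfig` above copies the response format from an inline `Template` into the merged template and falls back to `null` for any other templating configuration. A minimal, self-contained sketch of that pattern follows. `InlineTemplate`, `TemplateReference`, and `merge` are hypothetical stand-ins rather than SDK types, and messages are simplified to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins; none of these types are part of the SDK. */
final class ResponseFormatCarryOverSketch {

  interface TemplatingConfig {}

  /** An inline template: messages plus an optional response format. */
  record InlineTemplate(List<String> messages, Object responseFormat)
      implements TemplatingConfig {}

  /** A reference to a stored template; it carries no response format of its own. */
  record TemplateReference(String id) implements TemplatingConfig {}

  private ResponseFormatCarryOverSketch() {}

  /** Merges template and prompt messages and keeps the template's response format, if any. */
  static InlineTemplate merge(TemplatingConfig config, List<String> promptMessages) {
    // Only an inline template contributes messages and a response format.
    List<String> messages = config instanceof InlineTemplate t ? t.messages() : List.of();
    Object responseFormat = config instanceof InlineTemplate t ? t.responseFormat() : null;

    List<String> combined = new ArrayList<>(messages);
    combined.addAll(promptMessages);
    if (combined.isEmpty()) {
      throw new IllegalStateException("A prompt is required.");
    }
    // Without the second argument the configured response format would be dropped here.
    return new InlineTemplate(List.copyOf(combined), responseFormat);
  }

  public static void main(String[] args) {
    var template = new InlineTemplate(List.of("user: What is 'apple' in German?"), "json_object");
    System.out.println(merge(template, List.of("system: You are a language translator.")));
  }
}
```

Before this commit, the equivalent of the last `return` built the merged template from the messages alone, which is why a response format set on the template never reached the request.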

orchestration/src/test/java/com/sap/ai/sdk/orchestration/OrchestrationUnitTest.java

Lines changed: 123 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -44,9 +44,14 @@
4444
import com.sap.ai.sdk.orchestration.model.KeyValueListPair;
4545
import com.sap.ai.sdk.orchestration.model.LLMModuleResultSynchronous;
4646
import com.sap.ai.sdk.orchestration.model.LlamaGuard38b;
47+
import com.sap.ai.sdk.orchestration.model.ResponseFormatJsonObject;
48+
import com.sap.ai.sdk.orchestration.model.ResponseFormatJsonSchema;
49+
import com.sap.ai.sdk.orchestration.model.ResponseFormatJsonSchemaJsonSchema;
50+
import com.sap.ai.sdk.orchestration.model.ResponseFormatText;
4751
import com.sap.ai.sdk.orchestration.model.SearchDocumentKeyValueListPair;
4852
import com.sap.ai.sdk.orchestration.model.SearchSelectOptionEnum;
4953
import com.sap.ai.sdk.orchestration.model.SingleChatMessage;
54+
import com.sap.ai.sdk.orchestration.model.Template;
5055
import com.sap.cloud.sdk.cloudplatform.connectivity.ApacheHttpClient5Accessor;
5156
import com.sap.cloud.sdk.cloudplatform.connectivity.ApacheHttpClient5Cache;
5257
import com.sap.cloud.sdk.cloudplatform.connectivity.DefaultHttpDestination;
@@ -58,6 +63,7 @@
5863
import java.util.function.Function;
5964
import java.util.stream.Stream;
6065
import javax.annotation.Nonnull;
66+
import lombok.val;
6167
import org.apache.hc.client5.http.classic.HttpClient;
6268
import org.apache.hc.core5.http.ContentType;
6369
import org.apache.hc.core5.http.io.entity.InputStreamEntity;
@@ -734,46 +740,22 @@ void testMultiMessage() throws IOException {
734740
"Well, this image features the logo of SAP, a software company, set against a gradient blue background transitioning from light to dark. The main color in the image is blue.");
735741

736742
assertThat(response).isNotNull();
737-
assertThat(response.getRequestId()).isEqualTo("8d973a0d-c2cf-437b-a765-08d66bf446d8");
738-
assertThat(response.getModuleResults()).isNotNull();
739-
assertThat(response.getModuleResults().getTemplating()).hasSize(2);
740-
741743
var llmResults = (LLMModuleResultSynchronous) response.getModuleResults().getLlm();
742744
assertThat(llmResults).isNotNull();
743-
assertThat(llmResults.getId()).isEqualTo("chatcmpl-AyGx4yLYUH79TK81i21BaABoUpf4v");
744-
assertThat(llmResults.getObject()).isEqualTo("chat.completion");
745-
assertThat(llmResults.getCreated()).isEqualTo(1738928206);
746-
assertThat(llmResults.getModel()).isEqualTo("gpt-4o-mini-2024-07-18");
747-
assertThat(llmResults.getSystemFingerprint()).isEqualTo("fp_f3927aa00d");
748745
assertThat(llmResults.getChoices()).hasSize(1);
749746
assertThat(llmResults.getChoices().get(0).getMessage().getContent())
750747
.isEqualTo(
751748
"Well, this image features the logo of SAP, a software company, set against a gradient blue background transitioning from light to dark. The main color in the image is blue.");
752749
assertThat(llmResults.getChoices().get(0).getFinishReason()).isEqualTo("stop");
753750
assertThat(llmResults.getChoices().get(0).getMessage().getRole()).isEqualTo("assistant");
754-
assertThat(llmResults.getChoices().get(0).getIndex()).isZero();
755-
assertThat(llmResults.getUsage().getCompletionTokens()).isEqualTo(35);
756-
assertThat(llmResults.getUsage().getPromptTokens()).isEqualTo(250);
757-
assertThat(llmResults.getUsage().getTotalTokens()).isEqualTo(285);
758-
759751
var orchestrationResult = (LLMModuleResultSynchronous) response.getOrchestrationResult();
760-
assertThat(orchestrationResult).isNotNull();
761-
assertThat(orchestrationResult.getId()).isEqualTo("chatcmpl-AyGx4yLYUH79TK81i21BaABoUpf4v");
762-
assertThat(orchestrationResult.getObject()).isEqualTo("chat.completion");
763-
assertThat(orchestrationResult.getCreated()).isEqualTo(1738928206);
764-
assertThat(orchestrationResult.getModel()).isEqualTo("gpt-4o-mini-2024-07-18");
765-
assertThat(orchestrationResult.getSystemFingerprint()).isEqualTo("fp_f3927aa00d");
766752
assertThat(orchestrationResult.getChoices()).hasSize(1);
767753
assertThat(orchestrationResult.getChoices().get(0).getMessage().getContent())
768754
.isEqualTo(
769755
"Well, this image features the logo of SAP, a software company, set against a gradient blue background transitioning from light to dark. The main color in the image is blue.");
770756
assertThat(orchestrationResult.getChoices().get(0).getFinishReason()).isEqualTo("stop");
771757
assertThat(orchestrationResult.getChoices().get(0).getMessage().getRole())
772758
.isEqualTo("assistant");
773-
assertThat(orchestrationResult.getChoices().get(0).getIndex()).isZero();
774-
assertThat(orchestrationResult.getUsage().getCompletionTokens()).isEqualTo(35);
775-
assertThat(orchestrationResult.getUsage().getPromptTokens()).isEqualTo(250);
776-
assertThat(orchestrationResult.getUsage().getTotalTokens()).isEqualTo(285);
777759

778760
try (var requestInputStream = fileLoader.apply("multiMessageRequest.json")) {
779761
final String requestBody = new String(requestInputStream.readAllBytes());
@@ -782,4 +764,121 @@ void testMultiMessage() throws IOException {
782764
.withRequestBody(equalToJson(requestBody)));
783765
}
784766
}
767+
768+
@Test
769+
void testResponseObjectJsonSchema() throws IOException {
770+
stubFor(
771+
post(anyUrl())
772+
.willReturn(
773+
aResponse()
774+
.withBodyFile("jsonSchemaResponse.json")
775+
.withHeader("Content-Type", "application/json")));
776+
777+
var llmWithImageSupportConfig = new OrchestrationModuleConfig().withLlmConfig(GPT_4O_MINI);
778+
779+
val template = Message.user("Whats 'apple' in German?");
780+
var schema =
781+
Map.of(
782+
"type",
783+
"object",
784+
"properties",
785+
Map.of(
786+
"language", Map.of("type", "string"),
787+
"translation", Map.of("type", "string")),
788+
"required",
789+
List.of("language", "translation"),
790+
"additionalProperties",
791+
false);
792+
793+
val templatingConfig =
794+
Template.create()
795+
.template(List.of(template.createChatMessage()))
796+
.responseFormat(
797+
ResponseFormatJsonSchema.create()
798+
.type(ResponseFormatJsonSchema.TypeEnum.JSON_SCHEMA)
799+
.jsonSchema(
800+
ResponseFormatJsonSchemaJsonSchema.create()
801+
.name("translation_response")
802+
.schema(schema)
803+
.strict(true)
804+
.description("Output schema for language translation.")));
805+
val configWithTemplate = llmWithImageSupportConfig.withTemplateConfig(templatingConfig);
806+
807+
val prompt = new OrchestrationPrompt(Message.system("You are a language translator."));
808+
809+
final var message = client.chatCompletion(prompt, configWithTemplate).getContent();
810+
assertThat(message).isEqualTo("{\"translation\":\"Apfel\",\"language\":\"German\"}");
811+
812+
try (var requestInputStream = fileLoader.apply("jsonSchemaRequest.json")) {
813+
final String request = new String(requestInputStream.readAllBytes());
814+
verify(postRequestedFor(anyUrl()).withRequestBody(equalToJson(request)));
815+
}
816+
}
817+
818+
@Test
819+
void testResponseObjectJsonObject() throws IOException {
820+
stubFor(
821+
post(anyUrl())
822+
.willReturn(
823+
aResponse()
824+
.withBodyFile("jsonObjectResponse.json")
825+
.withHeader("Content-Type", "application/json")));
826+
827+
val llmWithImageSupportConfig = new OrchestrationModuleConfig().withLlmConfig(GPT_4O_MINI);
828+
829+
val template = Message.user("What is 'apple' in German?");
830+
val templatingConfig =
831+
Template.create()
832+
.template(List.of(template.createChatMessage()))
833+
.responseFormat(
834+
ResponseFormatJsonObject.create()
835+
.type(ResponseFormatJsonObject.TypeEnum.JSON_OBJECT));
836+
val configWithTemplate = llmWithImageSupportConfig.withTemplateConfig(templatingConfig);
837+
838+
val prompt =
839+
new OrchestrationPrompt(
840+
Message.system(
841+
"You are a language translator. Answer using the following JSON format: {\"language\": ..., \"translation\": ...}"));
842+
843+
final var message = client.chatCompletion(prompt, configWithTemplate).getContent();
844+
assertThat(message).isEqualTo("{\"language\": \"German\", \"translation\": \"Apfel\"}");
845+
846+
try (var requestInputStream = fileLoader.apply("jsonObjectRequest.json")) {
847+
final String request = new String(requestInputStream.readAllBytes());
848+
verify(postRequestedFor(anyUrl()).withRequestBody(equalToJson(request)));
849+
}
850+
}
851+
852+
@Test
853+
void testResponseObjectText() throws IOException {
854+
stubFor(
855+
post(anyUrl())
856+
.willReturn(
857+
aResponse()
858+
.withBodyFile("responseFormatTextResponse.json")
859+
.withHeader("Content-Type", "application/json")));
860+
861+
val llmWithImageSupportConfig = new OrchestrationModuleConfig().withLlmConfig(GPT_4O_MINI);
862+
863+
val template = Message.user("What is 'apple' in German?");
864+
val templatingConfig =
865+
Template.create()
866+
.template(List.of(template.createChatMessage()))
867+
.responseFormat(ResponseFormatText.create().type(ResponseFormatText.TypeEnum.TEXT));
868+
val configWithTemplate = llmWithImageSupportConfig.withTemplateConfig(templatingConfig);
869+
870+
val prompt =
871+
new OrchestrationPrompt(
872+
Message.system("You are a language translator. Answer using JSON."));
873+
874+
final var message = client.chatCompletion(prompt, configWithTemplate).getContent();
875+
assertThat(message)
876+
.isEqualTo(
877+
"```json\n{\n \"word\": \"apple\",\n \"translation\": \"Apfel\",\n \"language\": \"German\"\n}\n```");
878+
879+
try (var requestInputStream = fileLoader.apply("responseFormatTextRequest.json")) {
880+
final String request = new String(requestInputStream.readAllBytes());
881+
verify(postRequestedFor(anyUrl()).withRequestBody(equalToJson(request)));
882+
}
883+
}
785884
}
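
As the expected content in `testResponseObjectText` shows, a model asked for JSON under the `TEXT` format may wrap its answer in a Markdown code fence. A caller that wants the raw JSON then has to remove the fence first. The following is a rough sketch of one way to do that. `FenceStripSketch` and `stripCodeFence` are hypothetical names that are not part of the SDK, and the handling only covers a single fence around the whole message.

```java
/** Hypothetical helper, not part of the SDK. */
final class FenceStripSketch {

  // Three backticks, built at runtime so this listing does not contain a literal fence.
  private static final String FENCE = "`".repeat(3);

  private FenceStripSketch() {}

  /**
   * Returns the text between a leading and a trailing Markdown fence,
   * or the stripped input if the content is not fenced.
   */
  static String stripCodeFence(String content) {
    String trimmed = content.strip();
    if (!trimmed.startsWith(FENCE) || !trimmed.endsWith(FENCE)) {
      return trimmed;
    }
    // The opening fence line may carry a language tag such as "json"; skip the whole line.
    int firstNewline = trimmed.indexOf('\n');
    int end = trimmed.length() - FENCE.length();
    if (firstNewline < 0 || end <= firstNewline) {
      return trimmed;
    }
    return trimmed.substring(firstNewline + 1, end).strip();
  }

  public static void main(String[] args) {
    String fenced = FENCE + "json\n{\n \"translation\": \"Apfel\"\n}\n" + FENCE;
    System.out.println(stripCodeFence(fenced));
  }
}
```

With `JSON_OBJECT` or `JSON_SCHEMA`, the unit tests above expect plain JSON content, so a step like this would only be relevant for `TEXT` responses.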
Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
1+
{
2+
"request_id": "f353a729-3391-4cec-bbf9-7ab39d34ebc1",
3+
"module_results": {
4+
"templating": [
5+
{
6+
"role": "user",
7+
"content": "What is 'apple' in German?"
8+
},
9+
{
10+
"role": "system",
11+
"content": "You are a language translator. Answer using the following JSON format: {\"language\": ..., \"translation\": ...}"
12+
}
13+
],
14+
"llm": {
15+
"id": "chatcmpl-Azm2iclgMiLcQHP3cGQANArkxoiGx",
16+
"object": "chat.completion",
17+
"created": 1739286048,
18+
"model": "gpt-4o-mini-2024-07-18",
19+
"system_fingerprint": "fp_f3927aa00d",
20+
"choices": [
21+
{
22+
"index": 0,
23+
"message": {
24+
"role": "assistant",
25+
"content": "{\"language\": \"German\", \"translation\": \"Apfel\"}"
26+
},
27+
"finish_reason": "stop"
28+
}
29+
],
30+
"usage": {
31+
"completion_tokens": 13,
32+
"prompt_tokens": 41,
33+
"total_tokens": 54
34+
}
35+
}
36+
},
37+
"orchestration_result": {
38+
"id": "chatcmpl-Azm2iclgMiLcQHP3cGQANArkxoiGx",
39+
"object": "chat.completion",
40+
"created": 1739286048,
41+
"model": "gpt-4o-mini-2024-07-18",
42+
"system_fingerprint": "fp_f3927aa00d",
43+
"choices": [
44+
{
45+
"index": 0,
46+
"message": {
47+
"role": "assistant",
48+
"content": "{\"language\": \"German\", \"translation\": \"Apfel\"}"
49+
},
50+
"finish_reason": "stop"
51+
}
52+
],
53+
"usage": {
54+
"completion_tokens": 13,
55+
"prompt_tokens": 41,
56+
"total_tokens": 54
57+
}
58+
}
59+
}
