feat: [Orchestration] Embedding Convenience#562
Conversation
- client method for high level objects - Only support 'float' encoding format - Add javadoc - create convenience embedding response class - Add unit/integration tests - Add json payloads for req and res - Add e2e
| final var bigDecimals = (Embedding.InnerBigDecimals) container.getEmbedding(); | ||
| final var values = bigDecimals.values(); | ||
| final float[] arr = new float[values.size()]; | ||
| for (int i = 0; i < values.size(); i++) { | ||
| arr[i] = values.get(i).floatValue(); |
There was a problem hiding this comment.
This is an unfortunate consequence of the openapi generator flag <useFloatArrays>true</useFloatArrays> not being supported for
oneOf: # works without the `oneOf`
- type: array
items:
type: integerThere was a problem hiding this comment.
This is concerning:
See memory footprint comparison
Memory Usage Breakdown
float[] array:
- Each
floatuses exactly 4 bytes - Array overhead: ~12-16 bytes (object header)
- Total for 1000 elements: ~4,012 bytes
List<BigDecimal> (assuming ArrayList):
ArrayListoverhead: ~24 bytes + internal array overhead- Each
BigDecimalobject: ~40-48 bytes (object header +BigInteger+ scale + precision) - Each
BigIntegerinside: ~24-32 bytes + int array for digits - Boxing overhead from list storage
- Total for 1000 elements: ~80,000-100,000 bytes
- Is this limitation originating from the openapi generator or our creators feature?
- Do you think it's possible to fix it in our creators feature?
- Is a follow-up BLI already considered?
There was a problem hiding this comment.
- openapi generator donot support this at all
- our creator feature doesn't account for the combined use of USE_FLOAT_ARRAY feature.
PR on the way for fixing our creator feature. 😄
There was a problem hiding this comment.
Is the above PR + Cloud SDK release a requirement? No right?
There was a problem hiding this comment.
Not a requirement. Just a nice to have.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingResponse.java
Show resolved
Hide resolved
| /** Azure OpenAI Text Embedding 3 Small model */ | ||
| public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_SMALL = | ||
| new OrchestrationEmbeddingModel("text-embedding-3-small"); | ||
|
|
||
| /** Azure OpenAI Text Embedding 3 Large model */ | ||
| public static final OrchestrationEmbeddingModel TEXT_EMBEDDING_3_LARGE = | ||
| new OrchestrationEmbeddingModel("text-embedding-3-large"); | ||
|
|
||
| /** Amazon Titan Embed Text model */ | ||
| public static final OrchestrationEmbeddingModel AMAZON_TITAN_EMBED_TEXT = | ||
| new OrchestrationEmbeddingModel("amazon.titan-embed-text"); | ||
|
|
||
| /** NVIDIA LLaMA 3.2 7B NV EmbedQA model */ | ||
| public static final OrchestrationEmbeddingModel NVIDIA_LLAMA_32_NV_EMBEDQA_1B = | ||
| new OrchestrationEmbeddingModel("nvidia--llama-3.2-nv-embedqa-1b"); |
There was a problem hiding this comment.
(Major/Question)
You are introducing another set of model constants we need to maintain. Can't we use existing class(es)?
What is the process of keeping this up-to-date?
There was a problem hiding this comment.
My motivation of introducing new class was that embedding models do not share the same parameters as models in OrchestrationAiModel and are not valid for the chatCompletion calls. So, I wanted to avoid a user passing the wrong model to some endpoint.
And I am open to removing this class and extending OrchestrationAiModel. What do you think?
There was a problem hiding this comment.
About maintenance, ideally, we can use an e2e (scenario) test to maintain this. But, unfortunately, this is pointless right now because SAP Notes and the api response are inconsistent.
We need the test for maintenance either way. But I would rather create a follow up BLI because this requires sync with orchestration.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Show resolved
Hide resolved
|
|
||
| /** Builder step for specifying text inputs to embed. */ | ||
| @FunctionalInterface | ||
| public interface InputStep { |
There was a problem hiding this comment.
(Minor)
I'm not sure whether we used "Step" as "Builder" substitute already somewhere in the project. If not, please reconsider.
There was a problem hiding this comment.
We have in fact used it once before in TemplateConfig. I actually like the current approach in how clean it is in usage and enforces both required arguments. But again, if you have a strong opinion, I will choose a static factory. Please confirm.
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Outdated
Show resolved
Hide resolved
orchestration/src/main/java/com/sap/ai/sdk/orchestration/OrchestrationEmbeddingRequest.java
Outdated
Show resolved
Hide resolved
# Conflicts: # docs/release_notes.md # orchestration/src/test/java/com/sap/ai/sdk/orchestration/OrchestrationUnitTest.java
# Conflicts: # docs/release_notes.md # sample-code/spring-app/src/test/java/com/sap/ai/sdk/app/controllers/OrchestrationTest.java
# Conflicts: # sample-code/spring-app/src/main/java/com/sap/ai/sdk/app/services/OrchestrationService.java
Context
AI/ai-sdk-java-backlog#313.
We would like to introduce convenience for embedding in orchestration module.
Feature scope:
OrchestrationEmbeddingModel,OrchestrationEmbeddingRequestandOrchestrationEmbeddingResponseOrchestrationClient#embed(OrchestrationEmbeddingRequest)List<float[]>. API parity with OpenAIList<float[]>. Setting generator config<useFloatArrays>true</useFloatArrays>seems ineffective. Needs investigation.Definition of Done
Error handling created / updated & covered by the tests aboveAligned changes with the JavaScript SDK