-
Notifications
You must be signed in to change notification settings - Fork 2k
Abstract API for AI model clients and portable client request options design. #263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Short story is this PR provides
* An abstract API for AI model clients
* Providing portable client request options while still allowing vendor specific options when required. Implemented only for StabilityAI/OpenAI ImageClient
Open TODOs are to
* implement the options design pattern across the code base
The following is the commit history of squashed commits as is:
Step 1 - Implementing abstract generative API
=== Refactoring ===
* Create generative package to place all abstract API components of a Generative API. This is the "core domain" in DDD parlance
* Temp Renamed ChoiceMetadata to GenerationChoiceMetadata to better identify that it is part of the Respoinse.
Note, there is to be a list of Generation instances in the response.
==== Analysis ====
The currently named GeneratioMetadata contains RateLimit, Usage.
What metadata applies to the "top" level response (e.g ChatResponse)
* Usage
* RateLimit (derived from response headers)
* PromptMetadata
What metadata applies to eachh individual generation
* GenerationChoiceMetadata (now GenerationMetadata)
=== Refactoring ===
* ChatResponse should have one single class "ResponseMetadata" Now it has two 'GenerationMetadata' and 'PromptMetadata'.
** Rename GenerationMetadata to ChatResponseMetadata.
** Move ChatRespone field PromptMetadata into ChatRespoinseMetadata.
** Remove getPromptMetadata() and withPrompotMetadata into ChatResponseMetadata
* Add PromptMetadata to AzureOpenAiGenerationMetadata.from() method
* Rename AzureOpenAiGenerationMetadata to AzureOpenChatResponseMetadata
* Rename OpenAiGenerationMetadata to OpenAiChatResponseMetadata
* Add GenerativePrompt<T>
public interface GenerativePrompt<T> {
T getInstructions(); // required input
Options getOptions();
}
* Add GenerativeResponse<T>
=== TODOs ===
* The RateLimit is only implemented for OpenAI, not in Azure OpenAI, is it available? Needs research, ask John
* The PromptMetdata is only implemented for Azure OpenAI, not OpenAI, is it available? NO
Step 2 - Implementing abstract generative api
=== Refactoring ===
* Refactor ChatResponseMetadata extends ResponseMetadata. ResponseMetadata is just a marker interface.
* Refactor Prompt class. ow is defined as
Prompt implements GenerativePrompt<List<Message>> { }
NOTE: This should change in future to either ChatPrompt in the prompt package or better, in the 'chat' package. May consider putting in 'chat' package and keeping generic Prompt in the name for stylistic purposes. Perhaps just takes getting used to.
* Add
public interface GenerativeClient<TReq extends GenerativePrompt<?>, TRes extends GenerativeResponse<?>> {
TReq generate(TReq prompt);
}
* ChatClient refactor, now defined as
public interface ChatClient extends GenerativeClient<Prompt, ChatResponse> {
// omitted.
}
* Rename GenerationChoiceMetadata to GenerationMetadata
Step 3 - Implementing abstract generative api
* Add GenerativeGeneration interface
public interface GenerativeGeneration<T> {
T getOutput();
GenerationMetadata getGenerationMetadata();
}
* Refactor class generation to be
public class Generation implements GenerativeGeneration<AssistantMessage> {
private AssistantMessage assistantMessage;
private GenerationMetadata generationMetadata;
}
* Rename GenerationMeadata to ChatGenerationMetadata
* Add marker interface public interface GenerationMetadata {}
* Move metadata package under chat package for now as it is very specific to chat use cases.
* Add JAvadocs to generative and chat package.
== TODO ==
Need more work on 'portable' ResponseMetadata and GenerationMetadata
Maybe in AbstactMessage
protected String content;
public String getContent() {
return this.content;
}
At the moment changing getContent to getText() would be more legible since it is returning 'text' as a datatype as compared to 'content' which sounds generic and 'output' is already generic. Maybe in the future we will introduce generics into the Message class, in that case getContent is better, the content returned could be an image insteady of text. NEed to see what multi-model responses would be like. Keep getContent for now.
Getting Build error on tests
java.io.FileNotFoundException: class path resource [embedding/embedding-generative-dimensions.properties] cannot be opened because it does not exist [in thread "main"]
Fix tests that broke due to bad refactorings
Step 4 - Implementing abstract generative API
Rename generative methods and classes to more generic model ones
* Change package name from 'generative' to 'model'
* Change class names that used prefix 'Generative' to 'Model'
Multiple generative-related methods and classes have been renamed to more generic 'model' equivalents. This includes changing 'GenerativeClient' to 'ModelCall', and generative methods like 'generate' were renamed to 'call'. This was done in alignment to a broader model function definition - providing flexibility for non-generative model implementations.
add text->image from openai
Rename ModelCall to ModelClient and getResultMetadata() to getMetadata()
Add Text to Image generation for OpenAI and Stability AI
* ImagePrompt is the portable prompt that takes a string as input
* ImageOptions contains the portable options between the two providers
* ImageGenerationMetadata is a marker interface, as no metadata is shared between the two providers
* OpenAiImageOptions and StabilityAiImageOptions are provider specific options
* Implement builders for the two provider ImageOptions implementations
* OpenAiImageGenerationMetadata has revisedPrompt field
* StabilityAiImageGenerationMetadata has finishReason and seed fields
* OpenAiImageClient merges object creating time options with runtime options
* Add provider specific StabilityAiImagePrompt implements ModelRequest<List<StabilityAiImageMessage>>
* StabilityAiImageMessage has text and weight fields
* StabilityAiImageClient implements ImageResponse call(ImagePrompt prompt) and add providers specific ImageResponse call(StabilityAiImagePrompt stabilityAiImagePrompt)
* StabilityAiImageClient has field `StabilityAiImageOptions options;`
* Add basic tests for StabilityAiApi and OpenAiImageApi
TODOs
* Investigate using json to copy options vs. hand coded
* Use OpenAiImageOptions in OpenAiImageClient
* Use of OpenAiImageOptions and StabilityAiImageOptions in @ConfigurationProperties
Implement StabilityAiImageClient that supports portable and StabilityAI specifc prompt
* Add ModelOptionsUtils
* Add Tests. Comment in writeToFile to store image as png file
* StabilityAiProperties contains field `StabilityAiImageOptions options`
* Add StabilityAiImageAutoConfiguration
* StabilityAiImageClient applies runtime prompt options on top of options created at instantiation time.
update autoconfig resource file to include StabilityAI autoconf
Refactor StabilityAI and OpenAI code to use same ImagePrompt
* Simplify StabilityAI and OpenAI implementations to use a common shared ImagePrompt based on a similar 'Message' data type style used in ChatClient implementations
* Change StabilityAiImageOptions `seed` property type from `Integer` to `Long`
and update seed type to Long
TODOs
=====
Image.java should have
* getBytes()
* An enum that indicates the type of payload received, a url or a base64 or byte array. Convert from url/base64 to byte array.
** No support yet in stability AI for returing directly a png. AcceptHeader media/png
** need also an enum to indiate if the byte array or base64 is png,gif,jpg, etc.
add nested config property annotation
…or better readability
|
Pushed a commit(s) to add the missing code license blocks. |
|
|
||
| @SpringBootTest(classes = OpenAiTestConfiguration.class) | ||
| @EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+") | ||
| public class OpenAiImageClientIT extends AbstractIT { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test is failing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It exposed a bug, now fixed. Need to figure out what to do on retry when it is an API (400) error that is fatal. Also, not seeing any logging on retry will make these cases hard to diagnose in the wild.
...ava/org/springframework/ai/openai/image/OpenAiImageClientWithImageResponseMetadataTests.java
Show resolved
Hide resolved
.../spring-ai-openai/src/test/java/org/springframework/ai/openai/image/OpenAiImageClientIT.java
Outdated
Show resolved
Hide resolved
spring-ai-core/src/main/java/org/springframework/ai/image/NewImageClient.java
Outdated
Show resolved
Hide resolved
|
merged in 243cef9 |
This is a squahed commit of the 'options-ftw' comment.
Short story is this PR provides
Open TODOs are to
Look at the commit a0f71f7 for the details of the squashed commits that went into this PR