Skip to content

Conversation

@markpollack
Copy link
Member

This is a squahed commit of the 'options-ftw' comment.

Short story is this PR provides

  • An abstract API for AI model clients
  • Providing portable client request options while still allowing vendor specific options when required.  Implemented only for StabilityAI/OpenAI ImageClient

Open TODOs are to

  • implement the options design pattern across the code base

Look at the commit a0f71f7 for the details of the squashed commits that went into this PR

Short story is this PR provides

* An abstract API for AI model clients
* Providing portable client request options while still allowing vendor specific options when required.  Implemented only for StabilityAI/OpenAI ImageClient

Open TODOs are to

* implement the options design pattern across the code base

The following is the commit history of squashed commits as is:

Step 1 - Implementing abstract generative API

=== Refactoring ===

* Create generative package to place all abstract API components of a Generative API. This is the "core domain" in DDD parlance

* Temp Renamed ChoiceMetadata to GenerationChoiceMetadata to better identify that it is part of the Respoinse.
Note, there is to be a list of Generation instances in the response.

==== Analysis ====
The currently named GeneratioMetadata contains RateLimit, Usage.

What metadata applies to the "top" level response (e.g ChatResponse)

* Usage
* RateLimit (derived from response headers)
* PromptMetadata

What metadata applies to eachh individual generation

* GenerationChoiceMetadata  (now GenerationMetadata)

=== Refactoring ===

* ChatResponse should have one single class "ResponseMetadata"  Now it has two 'GenerationMetadata' and 'PromptMetadata'.
 ** Rename GenerationMetadata to ChatResponseMetadata.
 ** Move ChatRespone field PromptMetadata into ChatRespoinseMetadata.
 ** Remove getPromptMetadata() and withPrompotMetadata into ChatResponseMetadata

* Add PromptMetadata to AzureOpenAiGenerationMetadata.from() method
* Rename AzureOpenAiGenerationMetadata to AzureOpenChatResponseMetadata
* Rename OpenAiGenerationMetadata to OpenAiChatResponseMetadata
* Add GenerativePrompt<T>

public interface GenerativePrompt<T> {
	T getInstructions(); // required input
	Options getOptions();
}
* Add GenerativeResponse<T>

=== TODOs  ===
* The RateLimit is only implemented for OpenAI, not in Azure OpenAI, is it available?  Needs research, ask John
* The PromptMetdata is only implemented for Azure OpenAI, not OpenAI, is it available?  NO

Step 2 - Implementing abstract generative api

=== Refactoring ===
* Refactor ChatResponseMetadata extends ResponseMetadata.  ResponseMetadata is just a marker interface.
* Refactor Prompt class. ow is defined as

    Prompt implements GenerativePrompt<List<Message>> { }

   NOTE: This should change in future to either ChatPrompt in the prompt package or better, in the 'chat' package.  May consider putting in 'chat' package and keeping generic Prompt in the name for stylistic purposes.  Perhaps just takes getting used to.

* Add
public interface GenerativeClient<TReq extends GenerativePrompt<?>, TRes extends GenerativeResponse<?>> {

    TReq generate(TReq prompt);
}

* ChatClient refactor, now defined as

public interface ChatClient extends GenerativeClient<Prompt, ChatResponse> {
   // omitted.
}

* Rename GenerationChoiceMetadata to GenerationMetadata

Step 3 - Implementing abstract generative api

* Add GenerativeGeneration interface

public interface GenerativeGeneration<T> {

    T getOutput();

    GenerationMetadata getGenerationMetadata();
}

* Refactor class generation to be

public class Generation implements GenerativeGeneration<AssistantMessage> {

	private AssistantMessage assistantMessage;

	private GenerationMetadata generationMetadata;

}

* Rename GenerationMeadata to ChatGenerationMetadata

* Add marker interface public interface GenerationMetadata {}

* Move metadata package under chat package for now as it is very specific to chat use cases.

* Add JAvadocs to generative and chat package.

== TODO ==
Need more work on 'portable' ResponseMetadata and GenerationMetadata
Maybe in AbstactMessage

protected String content;

public String getContent() {
		return this.content;
}

At the moment changing getContent to  getText() would be more legible since it is returning 'text' as a datatype as compared to 'content' which sounds generic and 'output' is already generic.  Maybe in the future we will introduce generics into the Message class, in that case getContent is better, the content returned could be an image insteady of text.  NEed to see what multi-model responses would be like.  Keep getContent for now.

Getting Build error on tests

 java.io.FileNotFoundException: class path resource [embedding/embedding-generative-dimensions.properties] cannot be opened because it does not exist [in thread "main"]

Fix tests that broke due to bad refactorings

Step 4 - Implementing abstract generative API

Rename generative methods and classes to more generic model ones

* Change package name from 'generative' to 'model'
* Change class names that used prefix 'Generative' to 'Model'

Multiple generative-related methods and classes have been renamed to more generic 'model' equivalents. This includes changing 'GenerativeClient' to 'ModelCall', and generative methods like 'generate' were renamed to 'call'. This was done in alignment to a broader model function definition - providing flexibility for non-generative model implementations.

add text->image from openai

Rename ModelCall to ModelClient and getResultMetadata() to getMetadata()

Add Text to Image generation for OpenAI and Stability AI

* ImagePrompt is the portable prompt that takes a string as input
* ImageOptions contains the portable options between the two providers
* ImageGenerationMetadata is a marker interface, as no metadata is shared between the two providers
* OpenAiImageOptions and StabilityAiImageOptions are provider specific options
* Implement builders for the two provider ImageOptions implementations
* OpenAiImageGenerationMetadata has revisedPrompt field
* StabilityAiImageGenerationMetadata has finishReason and seed fields
* OpenAiImageClient merges object creating time options with runtime options
* Add provider specific StabilityAiImagePrompt implements ModelRequest<List<StabilityAiImageMessage>>
* StabilityAiImageMessage has text and weight fields
* StabilityAiImageClient implements ImageResponse call(ImagePrompt prompt) and add providers specific ImageResponse call(StabilityAiImagePrompt stabilityAiImagePrompt)
* StabilityAiImageClient has field `StabilityAiImageOptions options;`
* Add basic tests for StabilityAiApi and OpenAiImageApi

TODOs
* Investigate using json to copy options vs. hand coded
* Use OpenAiImageOptions in OpenAiImageClient
* Use of OpenAiImageOptions and StabilityAiImageOptions in @ConfigurationProperties

Implement StabilityAiImageClient that supports portable and StabilityAI specifc prompt

* Add ModelOptionsUtils
* Add Tests.  Comment in writeToFile to store image as png file
* StabilityAiProperties contains field `StabilityAiImageOptions options`
* Add StabilityAiImageAutoConfiguration
* StabilityAiImageClient applies runtime prompt options on top of options created at instantiation time.

update autoconfig resource file to include StabilityAI autoconf

Refactor StabilityAI and OpenAI code to use same ImagePrompt

* Simplify StabilityAI and OpenAI implementations to use a common shared ImagePrompt based on a similar 'Message' data type style used in ChatClient implementations
* Change StabilityAiImageOptions `seed` property type from `Integer` to `Long`
and update seed type to Long

TODOs
=====
Image.java should have
* getBytes()
* An enum that indicates the type of payload received, a url or a base64 or byte array.  Convert from url/base64 to byte array.
** No support yet in stability AI for returing directly a png.  AcceptHeader media/png
** need also an enum to indiate if the byte array or base64 is png,gif,jpg, etc.

add nested config property annotation
@markpollack markpollack added this to the 0.8.0 milestone Jan 24, 2024
@markpollack markpollack changed the title Meta api and options Abstract API for AI model clients and portable client request options design. Jan 24, 2024
@tzolov
Copy link
Contributor

tzolov commented Jan 24, 2024

Pushed a commit(s) to add the missing code license blocks.


@SpringBootTest(classes = OpenAiTestConfiguration.class)
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = ".+")
public class OpenAiImageClientIT extends AbstractIT {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test is failing

Copy link
Member Author

@markpollack markpollack Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exposed a bug, now fixed. Need to figure out what to do on retry when it is an API (400) error that is fatal. Also, not seeing any logging on retry will make these cases hard to diagnose in the wild.

@markpollack
Copy link
Member Author

merged in 243cef9

@markpollack markpollack deleted the meta-api-and-options branch September 26, 2025 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants