Define API to capture metadata from AI responses #98
Note: I made the commits in this PR granular so that 1) the changes are easier to combine or remove as necessary (Spring Boot style) and 2) you can follow the progression of development (thinking, direction) in this new feature.
I also think there is additional room for improvement on this initial implementation. These points can be addressed iteratively.
…sponse metadata. Closes spring-projects#98
308cf66 to f6ec9a6
f6ec9a6 to f418362
The source of information (metadata) pulled from an AI response during an AI request (Prompt) using OpenAI's API comes from:
…ata collection. Closes spring-projects#98
Raising the bar! My PRs are typically a mess.
};

default RateLimit getRateLimit() {
    throw new IllegalStateException("No AI provider rate limit metadata was provided");
I think we could use the 'null object' pattern here instead of throwing an exception, since we want to promote portability and avoid devs having to write code to handle exceptions.
class RateLimit {
public static final RateLimit NULL = new RateLimit();
...
}
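A minimal sketch of the null object suggestion, assuming simplified stand-ins for the interfaces in this PR. The accessor names (`getRequestsLimit`, `getRequestsRemaining`, `getRequestsReset`) and the `AiMetadata` shape shown here are hypothetical and only illustrate the pattern; they are not the PR's actual signatures.

```java
import java.time.Duration;

/** Hypothetical stand-in for the RateLimit interface discussed in this PR. */
interface RateLimit {

    /** Null Object: returns safe defaults instead of null or an exception. */
    RateLimit NULL = new RateLimit() {
        @Override public long getRequestsLimit() { return 0L; }
        @Override public long getRequestsRemaining() { return 0L; }
        @Override public Duration getRequestsReset() { return Duration.ZERO; }
    };

    long getRequestsLimit();

    long getRequestsRemaining();

    Duration getRequestsReset();
}

/** Hypothetical metadata contract; providers override getRateLimit() when they have data. */
interface AiMetadata {

    default RateLimit getRateLimit() {
        return RateLimit.NULL;
    }
}

class NullObjectDemo {

    public static void main(String[] args) {
        // A provider that supplies no rate limit metadata still yields a usable object.
        AiMetadata metadata = new AiMetadata() { };
        RateLimit rateLimit = metadata.getRateLimit();
        System.out.println("requests remaining: " + rateLimit.getRequestsRemaining());
    }
}
```

Callers can then read rate limit values without null checks or try/catch blocks, which is the portability benefit the comment is after. The trade-off is that "no data" and "zero remaining" look the same unless callers compare against `RateLimit.NULL`.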
Done
 * @author John Blum
 * @since 0.7.0
 */
public interface Usage {
I think it is possible to go with a lowest-common-denominator (LCD) approach that is type safe, though it needs more investigation.
Here is some analysis. The Hugging Face one differs the most since it supports so many models.
One thing I thought of was to provide some sort of type-safe, client-specific helper class that, given a HashMap, would return type-safe results for devs who want to access the raw return information almost as if they were using the vendor-specific API.
For example, here is the 'Details' class from the Hugging Face client.
@JsonProperty("best_of_sequences")
private List<BestOfSequence> bestOfSequences = null;
@JsonProperty("finish_reason")
private FinishReason finishReason = null;
@JsonProperty("generated_tokens")
private Integer generatedTokens = null;
@JsonProperty("prefill")
private List<PrefillToken> prefill = new ArrayList<>();
@JsonProperty("seed")
private Long seed = null;
@JsonProperty("tokens")
private List<Token> tokens = new ArrayList<>();
It is interesting that they don't separate prompt tokens from generated tokens, though one might be able to calculate the prompt tokens in our own client and then subtract them from the total to get (an estimate of) the generated token count.
And for the Azure OpenAI client:
@Immutable
public final class CompletionsUsage {
/*
* The number of tokens generated across all completions emissions.
*/
@Generated
@JsonProperty(value = "completion_tokens")
private int completionTokens;
/*
* The number of tokens in the provided prompts for the completions request.
*/
@Generated
@JsonProperty(value = "prompt_tokens")
private int promptTokens;
/*
* The total number of tokens processed for the completions request and response.
*/
@Generated
@JsonProperty(value = "total_tokens")
private int totalTokens;
And from the Theo Kanning OpenAI client:
public class Usage {
@JsonProperty("prompt_tokens")
long promptTokens;
@JsonProperty("completion_tokens")
long completionTokens;
@JsonProperty("total_tokens")
long totalTokens;
some
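The vendor classes quoted above could be mapped onto one LCD contract roughly as follows. This is a sketch under stated assumptions: the `Usage` method names and the two adapter classes (`ReportedUsage`, `DerivedUsage`) are hypothetical, and the subtraction in `DerivedUsage` only yields an estimate because the prompt tokens are counted client-side.

```java
/** Hypothetical lowest-common-denominator (LCD) view of token usage. */
interface Usage {

    long getPromptTokens();

    long getGenerationTokens();

    default long getTotalTokens() {
        return getPromptTokens() + getGenerationTokens();
    }
}

/** For providers that report prompt and completion counts directly. */
final class ReportedUsage implements Usage {

    private final long promptTokens;
    private final long completionTokens;

    ReportedUsage(long promptTokens, long completionTokens) {
        this.promptTokens = promptTokens;
        this.completionTokens = completionTokens;
    }

    @Override public long getPromptTokens() { return promptTokens; }

    @Override public long getGenerationTokens() { return completionTokens; }
}

/** For providers where only a total is known; prompt tokens are estimated client-side. */
final class DerivedUsage implements Usage {

    private final long totalTokens;
    private final long estimatedPromptTokens;

    DerivedUsage(long totalTokens, long estimatedPromptTokens) {
        this.totalTokens = totalTokens;
        this.estimatedPromptTokens = estimatedPromptTokens;
    }

    @Override public long getPromptTokens() { return estimatedPromptTokens; }

    // Clamp at zero in case the client-side estimate overshoots the reported total.
    @Override public long getGenerationTokens() {
        return Math.max(0L, totalTokens - estimatedPromptTokens);
    }

    @Override public long getTotalTokens() { return totalTokens; }
}

class UsageDemo {

    public static void main(String[] args) {
        Usage reported = new ReportedUsage(12, 30);
        Usage derived = new DerivedUsage(50, 20);
        System.out.println(reported.getTotalTokens() + " / " + derived.getGenerationTokens());
    }
}
```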
Agreed, and an excellent point. I will investigate this topic further. Thank you for the feedback and references.
I also like the general idea of a wrapper class (around the [Hash]Map) for AI provider-specific metadata.
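One way the wrapper idea might look, as a rough sketch: a small class that holds the raw key-values and exposes typed accessors. The class name `ProviderMetadata` and its methods are hypothetical. The keys mirror the `generated_tokens` and `finish_reason` JSON properties quoted earlier in this thread, and representing `finish_reason` as a `String` is an assumption.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical type-safe view over raw, provider-specific response metadata. */
final class ProviderMetadata {

    private final Map<String, Object> raw;

    ProviderMetadata(Map<String, Object> raw) {
        // Defensive, immutable copy; note that Map.copyOf rejects null keys and values.
        this.raw = Map.copyOf(raw);
    }

    /** Returns the value only when it is present and of the requested type. */
    <T> Optional<T> get(String key, Class<T> type) {
        Object value = raw.get(key);
        return type.isInstance(value) ? Optional.of(type.cast(value)) : Optional.empty();
    }

    /** Named accessor for the "generated_tokens" property. */
    Optional<Integer> generatedTokens() {
        return get("generated_tokens", Integer.class);
    }

    /** Named accessor for the "finish_reason" property (assumed to be a String here). */
    Optional<String> finishReason() {
        return get("finish_reason", String.class);
    }
}

class ProviderMetadataDemo {

    public static void main(String[] args) {
        ProviderMetadata metadata =
                new ProviderMetadata(Map.of("generated_tokens", 17, "finish_reason", "length"));
        System.out.println(metadata.generatedTokens().orElse(0));
    }
}
```

The generic `get(key, type)` accessor keeps free-form access available, while the named accessors give callers something close to the vendor API without exposing the map itself.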
 * @author John Blum
 * @since 0.7.0
 */
public interface RateLimit {
Similar to the comment on Usage, I'd like to compare what is out there. This information is a bit harder to find from a quick Google search than it was for usage.
ACK
    this(id, usage, null);
}

protected OpenAiMetadata(String id, OpenAiUsage usage, @Nullable OpenAiRateLimit rateLimit) {
Could we avoid the @Nullable if we use the null object pattern?
Done
    Generation generation = new Generation(chatMessage.getContent(), Map.of("role", chatMessage.getRole()));
    generations.add(generation);
}
return new AiResponse(generations, OpenAiMetadata.from(chatCompletionResult));
One area for the future is to have some sort of stats collection that can be sent to a dashboard. Adding Micrometer and a Grafana dashboard could be a relatively easy win to help folks get a handle on costs.
Agreed. My initial reaction was to start minimally with Spring Boot Actuator, particularly as I am most familiar with Actuator. Perhaps we can loop in the Micrometer team for thoughts here as well.
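Independent of whether Actuator or Micrometer ends up doing the reporting, the stats collection itself could start as a small accumulator of token counts per model. The following is a JDK-only sketch. `TokenUsageRecorder` and its methods are hypothetical names, and nothing here uses the Micrometer or Actuator APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical in-memory collector of token counts, keyed by model name. */
final class TokenUsageRecorder {

    private final Map<String, LongAdder> promptTokens = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> generationTokens = new ConcurrentHashMap<>();

    /** Adds the token counts of one AI response to the running totals. */
    void record(String model, long prompt, long generation) {
        promptTokens.computeIfAbsent(model, key -> new LongAdder()).add(prompt);
        generationTokens.computeIfAbsent(model, key -> new LongAdder()).add(generation);
    }

    /** Returns prompt plus generation tokens recorded so far for the model. */
    long totalTokens(String model) {
        return sum(promptTokens, model) + sum(generationTokens, model);
    }

    private static long sum(Map<String, LongAdder> counters, String model) {
        LongAdder adder = counters.get(model);
        return adder == null ? 0L : adder.sum();
    }
}

class TokenUsageRecorderDemo {

    public static void main(String[] args) {
        TokenUsageRecorder recorder = new TokenUsageRecorder();
        recorder.record("model-a", 12, 30);
        recorder.record("model-a", 8, 10);
        System.out.println(recorder.totalTokens("model-a"));
    }
}
```

A metrics library could later read these totals (or replace the `LongAdder` instances with its own counters) to feed a dashboard.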
    this.baseUrl = baseUrl;
}

public String getEmbeddingApiKey() {
How did the need for this come about? Can one have a different API key for embedding vs. generation/inference?
My apologies for the confusion. I rearranged the order of the state variables in the code, and that is why this appears in my commit. Here is the original definition of OpenAiProperties.
It was an organizational thing.
@SpringBootConfiguration
@Profile("spring-ai-openai-mocks")
@SuppressWarnings("unused")
public class OpenAiMockTestConfiguration {
Could this be done at a higher level, based on the AIClient interface rather than vendor-specific interfaces? The end goal is easy mocking that is also portable across model providers.
Agreed on making this (potentially) easier for our developers to use in their testing efforts. I will need to think about this more carefully. In the meantime, I was simply addressing my infrastructure and framework testing needs.
I did call out having a spring-ai-test module in our design document as something to help developers along these lines. I think over time, especially with a few more implementations of this AI metadata model for different AI providers (e.g. Azure OpenAI and Hugging Face) under our belt, we can nail down the reusable testing components.
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder(OpenAiService.defaultClient(apiKey, duration));

if (properties.getMetadata().isRateLimitMetricsEnabled()) {
    clientBuilder.addInterceptor(new OpenAiHttpResponseHeadersInterceptor());
nice
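What an interceptor like this does with the response headers could be sketched as below, using only the JDK so it does not depend on OkHttp. The helper `RateLimitHeaders` is hypothetical, and the header names are assumptions based on OpenAI's `x-ratelimit-*` naming; the PR's own enum of response headers is the authoritative list.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.TreeMap;

/** Hypothetical helper that reads numeric rate limit values from response headers. */
final class RateLimitHeaders {

    // Header names assumed from OpenAI's x-ratelimit-* convention.
    static final String REQUESTS_LIMIT = "x-ratelimit-limit-requests";
    static final String REQUESTS_REMAINING = "x-ratelimit-remaining-requests";
    static final String TOKENS_REMAINING = "x-ratelimit-remaining-tokens";

    private RateLimitHeaders() { }

    /** Returns the header as a long, or empty when it is absent or not a number. */
    static OptionalLong readLong(Map<String, String> headers, String name) {
        String value = headers.get(name);
        if (value == null || value.isBlank()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(value.trim()));
        }
        catch (NumberFormatException ex) {
            return OptionalLong.empty();
        }
    }
}

class RateLimitHeadersDemo {

    public static void main(String[] args) {
        // HTTP header names are case-insensitive, so the demo uses a case-insensitive map.
        Map<String, String> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        headers.put("X-RateLimit-Remaining-Requests", "59");
        System.out.println(RateLimitHeaders.readLong(headers, RateLimitHeaders.REQUESTS_REMAINING));
    }
}
```

Returning an empty `OptionalLong` for missing or malformed headers keeps a bad header from failing the AI request itself, which fits the conditional, best-effort nature of the metrics flag checked above.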
…data in GenerationMetadata. Now, instead of throwing an IllegalStateException, Spring AI returns a Null Object implementation of the RateLimit and Usage metadata from GenerationMetadata. In addition, Spring AI provides abstract base classes to conveniently implement the RateLimit and Usage interfaces for new AI clients. Closes spring-projects#98
99f031f to 7ac2060
Thank you @markpollack. I appreciate your feedback and review on this PR. I addressed most of your concerns and feedback across a few new commits already. Specifically, I did the following:
I am going to continue by building an implementation of the AI metadata API for Azure OpenAI and possibly Hugging Face.
15e9224 to 6dd3159
I completed an initial implementation of the AI metadata API for Microsoft Azure OpenAI Service. Additionally, I rebased this PR on the latest changes from
…sponse metadata. See spring-projects#98
…data in GenerationMetadata. Now, instead of throwing an IllegalStateException, Spring AI returns a Null Object implementation of the RateLimit and Usage metadata from GenerationMetadata. In addition, Spring AI provides abstract base classes to conveniently implement the RateLimit and Usage interfaces for new AI clients. See spring-projects#98
86329df to 9a14aca
9a14aca to 9ea9c42
* Define GenerationMetadata property in AiResponse.
* Add OpenAI implementations of AiMetadata, RateLimit and Usage interfaces.
* Add REST Assured JsonPath dependency to spring-ai-openai module.
* Add OkHttp dependency to spring-ai-openai module.
* Add OkHttp Interceptor to parse OpenAI rate limit metadata from HTTP headers.
* Add OkHttp MockWebServer dependency to spring-ai-openai module, test scope.
* Add Jakarta Servlet API dependency to spring-ai-openai module, test scope.
* Add Spring Web MVC dependency to spring-ai-open-ai module, test scope.
* Define OpenAI API response headers in an Enum.
* Add OpenAI test configuration using mock objects.
* Add integration test to assert successful extraction of OpenAI API response metadata.
* Include Spring Boot auto-configuration for (conditional) OpenAI metadata collection.
* Edit documentation and include information on AI metadata collected by Spring AI.
* Provide AI metadata implementation for Microsoft Azure OpenAI Service.
* Capture optional PromptMetadata in AiResponse.
* Define metadata for an AI generation choice.
* Capture AI choice metadata in Generation.
* Integrate ChoiceMetadata into AiResponse returned by OpenAI.
Fixes spring-projects#98
Removed some of the older fields that were intended to capture metadata about requests. Merged as 37a4884.
This PR defines a new, strongly-typed API in Spring AI for capturing AI metadata and metrics sent in an AI response to a Prompt from an AI provider's (REST) API.
This new API includes both AI model usage metrics, such as Prompt and Generation (completion) token counts, along with AI provider access metrics, such as rate limits for both requests and tokens.
High-level feature additions in this PR include, but are not limited to:
* AiMetadata, RateLimit and Usage interfaces making up the API.
* AiResponse now includes (optional) AiMetadata (AiMetadata.EMPTY by default).
* MockWebServer, Spring MockMvc and test class specific @RestController to mock the AI provider's API for testing purposes.
For example, you can now do something like the following:
To see a complete example, have a look at the test.
In this API, I preferred strongly-typed objects (for example, AiMetadata) over storing key-values in the Map<String, Object> objects present in the AiResponse and Generation classes, since this provides 1) type safety, 2) easier, more descriptive and programmatic access that allows for things like type conversion, encoding/decoding, etc., and 3) more immediately apparent metadata available from an AI provider that is uniformly accessible from Spring AI.
While this API may be more restrictive, or only capable of supporting the lowest-common denominator (LCD), we can always include support for free-form metadata, such as in the following example, which is not uncommon in Spring when you consider the PropertyResolver API, for instance:
Subclassing will also give users the ability to access AI provider-specific metadata.
In short, it really should not matter to the Spring AI developer whether metadata is stored internally in a Map<?, ?>, or by some other means.
TODO:
Upon initial review and discussion with both @markpollack and @tzolov, I recommend this feature be integrated and conditionally enabled based on a Spring property (for example: spring.ai.openai.metadata.capture-enabled). Spring Boot's auto-configuration (by property using @ConditionalOnProperty) can help in this regard. - DONE
Create other implementations of the AI metadata interfaces: Azure OpenAI, Hugging Face, etc.
Further exploration and enhancements could include integrating with Spring Boot Actuator and exposing this AI metadata there.
In addition, there may be clearer integration points directly with Micrometer as well.