
@jxblum
Contributor

@jxblum jxblum commented Nov 14, 2023

This PR defines a new, strongly-typed API in Spring AI for capturing AI metadata and metrics sent in an AI response to a Prompt from an AI provider's (REST) API.

This new API includes both AI model usage metrics, such as Prompt and Generation (completion) token counts, and AI provider access metrics, such as rate limits for both requests and tokens.

High-level feature additions in this PR include, but are not limited to:

  • New AiMetadata, RateLimit and Usage interfaces making up the API.
  • AiResponse now includes (optional) AiMetadata (AiMetadata.EMPTY by default)
  • Implementation of the new API with OpenAI.
  • Includes a new approach to testing AI provider REST API endpoints using OkHttp3 MockWebServer, Spring MockMvc and a test-class-specific @RestController to mock the AI provider's API.

For example, you can now do something like the following:

  AiResponse response = aiClient.generate(prompt);

  // process the AI's response (such as chat completion)

  AiMetadata metadata = response.getMetadata();

  long totalTokenCount = metadata.getUsage().getTotalTokens();

  // do something responsible with this information

To see a complete example, have a look at the test.

In this API, I preferred strongly-typed objects (for example, AiMetadata) over storing key-values in the Map<String, Object> objects present in the AiResponse and Generation classes, since strong typing provides 1) type safety, 2) easier, more descriptive and programmatic access, allowing for things like type conversion, encoding/decoding, etc., and 3) more immediately apparent metadata available from an AI provider that is uniformly accessible from Spring AI.

While this API may be more restrictive, or only capable of supporting the lowest common denominator (LCD), we can always include support for free-form metadata, such as in the following example. This is not uncommon in Spring when you consider the PropertyResolver API, for instance:

aiMetadata.getPropertyAs("propertyName", SomeType.class);

Subclassing will also give users the ability to access AI provider-specific metadata.

In short, it really should not matter to the Spring AI developer whether metadata is stored internally in a Map<?, ?>, or by some other means.
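
To make the free-form option concrete, here is a minimal sketch of a typed accessor over map-backed metadata. It is only an illustration, not code from this PR: the class name MapBackedMetadata is hypothetical, and returning an Optional from getPropertyAs is an assumption of the sketch.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a typed view over free-form metadata stored in a Map.
// Neither the class name nor the Optional return type comes from this PR.
final class MapBackedMetadata {

    private final Map<String, Object> properties;

    MapBackedMetadata(Map<String, Object> properties) {
        // Defensive, immutable copy (Map.copyOf rejects null keys and values).
        this.properties = Map.copyOf(properties);
    }

    // Returns the named property as the requested type, or empty if it is absent.
    // Throws IllegalArgumentException if the value exists but has another type.
    <T> Optional<T> getPropertyAs(String name, Class<T> type) {
        Object value = this.properties.get(name);
        if (value == null) {
            return Optional.empty();
        }
        if (!type.isInstance(value)) {
            throw new IllegalArgumentException("Property '" + name + "' is of type "
                    + value.getClass().getName() + ", not " + type.getName());
        }
        return Optional.of(type.cast(value));
    }
}
```

A caller would write something like metadata.getPropertyAs("seed", Long.class) and never see how the value is stored, which is the point made above about the internal representation not mattering.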

TODO:

  • Upon initial review and discussion with both @markpollack and @tzolov, I recommend this feature be integrated and conditionally enabled based on a Spring property (for example: spring.ai.openai.metadata.capture-enabled). Spring Boot's auto-configuration (by property using @ConditionalOnProperty) can help in this regard. - DONE

  • Create other implementations of the AI metadata interfaces: Azure OpenAI, HuggingFace, etc.

  • Further exploration and enhancements could include integration with and exposing this AI metadata in Spring Boot Actuator.

  • In addition, there may be clearer integration points directly with Micrometer as well.

@jxblum
Contributor Author

jxblum commented Nov 14, 2023

Note, I made the commits granular in this PR so that 1) the changes were easier to combine or remove as necessary (Spring Boot style) and 2) so that you could follow the progression of development (thinking, direction) in this new feature.

@jxblum jxblum changed the title from "Define API for capturing AI metadata from AI responses" to "Define API to capture AI metadata from AI responses" Nov 14, 2023
@jxblum
Contributor Author

jxblum commented Nov 14, 2023

I also think there is additional room for improvement on this initial implementation. For example. These can be addressed iteratively.

jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 14, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
@jxblum
Contributor Author

jxblum commented Nov 15, 2023

The source of information (metadata) pulled from an AI response during an AI request (Prompt) using OpenAI's API comes from:

  1. The Chat Completion object.
  2. OpenAI's documentation on Rate Limits.
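
As a rough illustration of the second source, the sketch below turns OpenAI's x-ratelimit-* response headers into a typed value. It is not the interceptor from this PR: RateLimitSnapshot and its methods are hypothetical names, the header map is assumed to use lower-case keys, and reset values are assumed to look like 1s, 6m0s or 17ms.

```java
import java.time.Duration;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value object for illustration; not the RateLimit type defined in this PR.
record RateLimitSnapshot(long requestsLimit, long requestsRemaining, Duration requestsReset,
        long tokensLimit, long tokensRemaining, Duration tokensReset) {

    // One "<number><unit>" component of a reset value, such as "6m", "0s" or "17ms".
    private static final Pattern COMPONENT = Pattern.compile("(\\d+(?:\\.\\d+)?)(ms|h|m|s)");

    // Builds a snapshot from response headers; assumes lower-case header names as keys.
    static RateLimitSnapshot fromHeaders(Map<String, String> headers) {
        return new RateLimitSnapshot(
                parseCount(headers.get("x-ratelimit-limit-requests")),
                parseCount(headers.get("x-ratelimit-remaining-requests")),
                parseReset(headers.get("x-ratelimit-reset-requests")),
                parseCount(headers.get("x-ratelimit-limit-tokens")),
                parseCount(headers.get("x-ratelimit-remaining-tokens")),
                parseReset(headers.get("x-ratelimit-reset-tokens")));
    }

    // A missing header is reported as 0 rather than as an error (an assumption of this sketch).
    static long parseCount(String value) {
        return (value == null || value.isBlank()) ? 0L : Long.parseLong(value.trim());
    }

    // Parses values such as "1s", "6m0s" or "17ms" into a Duration.
    static Duration parseReset(String value) {
        if (value == null || value.isBlank()) {
            return Duration.ZERO;
        }
        String text = value.trim();
        Matcher matcher = COMPONENT.matcher(text);
        Duration total = Duration.ZERO;
        int consumed = 0;
        // Accept only components that follow each other without gaps.
        while (matcher.find() && matcher.start() == consumed) {
            double amount = Double.parseDouble(matcher.group(1));
            long nanosPerUnit = switch (matcher.group(2)) {
                case "h" -> 3_600_000_000_000L;
                case "m" -> 60_000_000_000L;
                case "s" -> 1_000_000_000L;
                default -> 1_000_000L; // "ms"
            };
            total = total.plusNanos(Math.round(amount * nanosPerUnit));
            consumed = matcher.end();
        }
        if (consumed != text.length()) {
            throw new IllegalArgumentException("Unrecognized reset value: " + value);
        }
        return total;
    }
}
```

Keeping the parsing in a plain static method means it can be unit tested without an HTTP client or a mock server.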

jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
@jxblum jxblum changed the title from "Define API to capture AI metadata from AI responses" to "Define API to capture metadata from AI responses" Nov 15, 2023
@markpollack
Member

Note, I made the commits granular in this PR so that ...

Raising the bar! My PRs are typically a mess.

};

default RateLimit getRateLimit() {
throw new IllegalStateException("No AI provider rate limit metadata was provided");
Member

I think we could use the 'null object' pattern here instead of throwing an exception, since we want to promote portability and avoid devs having to write code to handle exceptions.

    class RateLimit {
        public static final RateLimit NULL = new RateLimit();

        ... 
    }
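
Since RateLimit is declared as an interface in this PR, the same idea could be sketched as follows. This is an assumption-laden illustration rather than the final code: the three accessor methods and the ProviderMetadata type are hypothetical stand-ins.

```java
import java.time.Duration;

// Hypothetical, trimmed-down interface; the RateLimit in this PR may declare different methods.
interface RateLimit {

    // Shared "no data" instance, returned instead of null or an exception.
    RateLimit NULL = new RateLimit() {

        @Override
        public long getRequestsLimit() {
            return 0L;
        }

        @Override
        public long getRequestsRemaining() {
            return 0L;
        }

        @Override
        public Duration getRequestsReset() {
            return Duration.ZERO;
        }

        @Override
        public String toString() {
            return "RateLimit.NULL";
        }
    };

    long getRequestsLimit();

    long getRequestsRemaining();

    Duration getRequestsReset();
}

// Hypothetical stand-in for the metadata type that exposes the rate limit.
interface ProviderMetadata {

    // Providers without rate limit data inherit the null object instead of throwing.
    default RateLimit getRateLimit() {
        return RateLimit.NULL;
    }
}
```

Callers can then read metadata.getRateLimit().getRequestsRemaining() on any provider without a null check or a try/catch, and compare against RateLimit.NULL when they need to know whether real data was present.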

Contributor Author

Done

* @author John Blum
* @since 0.7.0
*/
public interface Usage {
Member

I think it is possible to go with the LCD approach that is type safe, though we need more investigation.

Here is some analysis. The Hugging Face one is the most different since it supports so many models.

One thing I thought of was to provide some sort of type-safe, client-specific helper class that, given a HashMap, would return type-safe results for devs who want to access the raw return information almost as if they were using the vendor-specific API.

For example, here is the 'Details' class from the Hugging Face client.

  @JsonProperty("best_of_sequences")
  private List<BestOfSequence> bestOfSequences = null;

  @JsonProperty("finish_reason")
  private FinishReason finishReason = null;

  @JsonProperty("generated_tokens")
  private Integer generatedTokens = null;

  @JsonProperty("prefill")
  private List<PrefillToken> prefill = new ArrayList<>();

  @JsonProperty("seed")
  private Long seed = null;

  @JsonProperty("tokens")
  private List<Token> tokens = new ArrayList<>();

It is interesting they don't have the separation into prompt tokens and generated tokens, though one might be able to calculate the prompt tokens in our own client and then subtract them from the total to get (an estimate of) the generated token count.

And for the Azure OpenAI client:

@Immutable
public final class CompletionsUsage {

    /*
     * The number of tokens generated across all completions emissions.
     */
    @Generated
    @JsonProperty(value = "completion_tokens")
    private int completionTokens;

    /*
     * The number of tokens in the provided prompts for the completions request.
     */
    @Generated
    @JsonProperty(value = "prompt_tokens")
    private int promptTokens;

    /*
     * The total number of tokens processed for the completions request and response.
     */
    @Generated
    @JsonProperty(value = "total_tokens")
    private int totalTokens;

And from the Theo Kanning OpenAI client:

public class Usage {
    @JsonProperty("prompt_tokens")
    long promptTokens;
    @JsonProperty("completion_tokens")
    long completionTokens;
    @JsonProperty("total_tokens")
    long totalTokens;

some

Contributor Author

@jxblum jxblum Nov 15, 2023

Agreed, and an excellent point. I will investigate on this topic more. Thank you for the feedback and references.

I also like the general idea of a wrapper class (around the [Hash]Map) for AI provider specific metadata.
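
A minimal sketch of what such a wrapper might look like, combined with a lowest-common-denominator usage view. TokenUsage and RawUsageView are hypothetical names, not types from this PR; the map keys are the JSON property names from the client classes quoted above, and treating a missing or non-numeric entry as 0 is an assumption of the sketch.

```java
import java.util.Map;

// Hypothetical lowest-common-denominator view of token usage.
interface TokenUsage {

    long promptTokens();

    long generationTokens();

    // Derived here, so providers that do not report a total still get one.
    default long totalTokens() {
        return promptTokens() + generationTokens();
    }
}

// Hypothetical wrapper giving typed access to a raw, provider-specific metadata map.
final class RawUsageView implements TokenUsage {

    private final Map<String, ?> raw;
    private final String promptKey;
    private final String generationKey;

    RawUsageView(Map<String, ?> raw, String promptKey, String generationKey) {
        this.raw = raw;
        this.promptKey = promptKey;
        this.generationKey = generationKey;
    }

    @Override
    public long promptTokens() {
        return asLong(this.promptKey);
    }

    @Override
    public long generationTokens() {
        return asLong(this.generationKey);
    }

    // Missing or non-numeric entries are reported as 0 (an assumption of this sketch).
    private long asLong(String key) {
        Object value = this.raw.get(key);
        return (value instanceof Number number) ? number.longValue() : 0L;
    }
}
```

With this shape, an OpenAI-style map would be wrapped with the keys "prompt_tokens" and "completion_tokens", while a Hugging Face-style map would use "generated_tokens" for the generation count and report 0 prompt tokens unless the client counts them itself.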

* @author John Blum
* @since 0.7.0
*/
public interface RateLimit {
Member

Similar to the comment on Usage, I'd like to compare what is out there. This information is a bit harder to find from a quick Google search than the usage information.

Contributor Author

ACK

this(id, usage, null);
}

protected OpenAiMetadata(String id, OpenAiUsage usage, @Nullable OpenAiRateLimit rateLimit) {
Member

Could we avoid the @Nullable if we use the null object pattern?

Contributor Author

Done

Generation generation = new Generation(chatMessage.getContent(), Map.of("role", chatMessage.getRole()));
generations.add(generation);
}
return new AiResponse(generations, OpenAiMetadata.from(chatCompletionResult));
Member

One area for the future is to have some sort of stats collection that can be sent to a dashboard. Adding in Micrometer and a Grafana dashboard could be a relatively easy win to help folks get a handle on costs.

Contributor Author

Agreed. My initial reaction was to start minimally with Spring Boot Actuator, particularly as I am most familiar with Actuator. Perhaps we can loop in the Micrometer team for thoughts here as well.

this.baseUrl = baseUrl;
}

public String getEmbeddingApiKey() {
Member

How did the need for this come about? Can one have a different API key for embedding vs generation/inference?

Contributor Author

My apologies for the confusion. I rearranged the order of the state variables in the code, and that is why this appears in my commit. Here is the original definition of OpenAiProperties.

It was an organizational thing.

@SpringBootConfiguration
@Profile("spring-ai-openai-mocks")
@SuppressWarnings("unused")
public class OpenAiMockTestConfiguration {
Member

Could this be done at a higher level, based on the AIClient interface rather than vendor-specific interfaces? The end goal is easy mocking that should also be portable across model providers.

Contributor Author

@jxblum jxblum Nov 15, 2023

Agreed on making this (potentially) easier for our developers to use in their testing efforts. I will need to think on this more carefully. In the meantime, I was simply addressing my infrastructure and framework testing needs.

I did call out having a spring-ai-test module in our design document as something to help developers along these lines. I think over time, especially with a few more implementations of this AI metadata model for different AI providers (e.g. Azure OpenAI and Hugging Face) under our belt, we can nail down the reusable testing components.

OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder(OpenAiService.defaultClient(apiKey, duration));

if (properties.getMetadata().isRateLimitMetricsEnabled()) {
clientBuilder.addInterceptor(new OpenAiHttpResponseHeadersInterceptor());
Member

nice

jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
…data in GenerationMetadata.

Now, instead of throwing an IllegalStateException, Spring AI returns a Null Object implementation of the RateLimit and Usage metadata from GenerationMetadata.

In addition, Spring AI provides abstract base classes to conveniently implement the RateLimit and Usage interfaces for new AI clients.

Closes spring-projects#98
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
@jxblum
Contributor Author

jxblum commented Nov 15, 2023

Note, I made the commits granular in this PR so that ...

Raising the bar! My PRs are typically a mess.

Thank you @markpollack. I appreciate your feedback and review on this PR.

I addressed most of your concerns and feedback across a few new commits already. Specifically, I did the following:

  • Renamed AiMetadata to GenerationMetadata.
  • Repackaged the AI metadata under org.springframework.ai.metadata.
  • Implemented the NULL Object pattern for RateLimit and Usage interfaces and returned the NULL value objects from GenerationMetadata instead of throwing an IllegalStateException.
  • Edited the Javadoc and documentation to match the API changes.

I am going to continue by building an implementation of the AI metadata API for Azure OpenAI and possibly Hugging Face.

jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 15, 2023
@jxblum
Contributor Author

jxblum commented Nov 15, 2023

I completed an initial implementation of the AI metadata API for Microsoft Azure OpenAI Service.

Additionally, I rebased this PR on the latest changes from main so that this PR remains in a buildable and shippable state.

jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 16, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 16, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 16, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 16, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 16, 2023
…data in GenerationMetadata.

Now, instead of throwing an IllegalStateException, Spring AI returns a Null Object implementation of the RateLimit and Usage metadata from GenerationMetadata.

In addition, Spring AI provides abstract base classes to conveniently implement the RateLimit and Usage interfaces for new AI clients.

See spring-projects#98
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 21, 2023
jxblum added a commit to jxblum/spring-ai that referenced this pull request Nov 21, 2023
@markpollack markpollack added this to the 0.8.0 milestone Nov 21, 2023
markpollack pushed a commit to markpollack/spring-ai that referenced this pull request Nov 21, 2023
* Define GenerationMetadata property in AiResponse.
* Add OpenAI implementations of AiMetadata, RateLimit and Usage interfaces.
* Add REST Assured JsonPath dependency to spring-ai-openai module.
* Add OkHttp dependency to spring-ai-openai module.
* Add OkHttp Interceptor to parse OpenAI rate limit metadata from HTTP headers.
* Add OkHttp MockWebServer dependency to spring-ai-openai module, test scope
* Add Jakarta Servlet API dependency to spring-ai-openai module, test scope
* Add Spring Web MVC dependency to spring-ai-openai module, test scope
* Define OpenAI API response headers in an Enum.
* Add OpenAI test configuration using mock objects.
* Add integration test to assert successful extraction of OpenAI API response metadata.
* Include Spring Boot auto-configuration for (conditional) OpenAI metadata collection.
* Edit documentation and include information on AI metadata collected by Spring AI.
* Provide AI metadata implementation for Microsoft Azure OpenAI Service.
* Capture optional PromptMetadata in AiResponse.
* Define metadata for an AI generation choice.
* Capture AI choice metadata in Generation.
* Integrate ChoiceMetadata into AiResponse returned by OpenAI.

Fixes spring-projects#98
@markpollack
Member

Removed some of the older fields that were intended to capture metadata about requests.

Merged as 37a4884.

markpollack pushed a commit that referenced this pull request Nov 21, 2023
habuma pushed a commit to habuma/spring-ai that referenced this pull request Nov 22, 2023