Skip to content

Conversation

@eavanvalkenburg
Copy link
Member

Motivation and Context

We need to decide how to handle clients that can have different backends and make it consistent.

Description

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings November 18, 2025 13:02
@markwallace-microsoft markwallace-microsoft added documentation Improvements or additions to documentation python labels Nov 18, 2025
Copilot finished reviewing on behalf of eavanvalkenburg November 18, 2025 13:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR proposes an Architecture Decision Record (ADR) for standardizing how Python client constructors should handle different backends (e.g., OpenAI vs Azure OpenAI, Anthropic vs Anthropic Bedrock) in the Agent Framework. The document explores five different architectural approaches for handling multi-backend client creation, weighing their tradeoffs in terms of maintainability, discoverability, and ease of use. The PR also includes routine dependency updates in the Python lock file.

Reviewed Changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated no comments.

File Description
docs/decisions/001X-python-client-constructors.md New ADR document proposing design patterns for Python client constructors with multi-backend support
python/uv.lock Dependency version updates for various packages including openai, anthropic, numpy, and other dependencies


## Pros and Cons of the Options

### Separate clients for each backend, such as OpenAI and Azure OpenAI, Anthropic and AnthropicBedrock, etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious: currently, we have AzureOpenAIChatClient as a separate class that inherits from OpenAIBaseChatClient. If we move to a unified client approach, how will this affect the existing Azure-specific code like AzureOpenAIConfigMixin and the specialized authentication handling for Entra ID? Would we maintain the mixin architecture or flatten it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with this option (and option 4), we would maintain the separated clients and use structures like the one we have for OpenAI with a base class and configs, or something similar in most cases. If we go for option 2 or 3 then we would flatten it. And actually a recent change in the openai client has made this much easier, because you can now supply a callable to the api_key which works for Azure OpenAI, so that we don't even need to use the OpenAIAzure SDK client anymore.


We have multiple Chat Client implementations that can be used with different servers, the most important example is OpenAI, where we have a separate client for OpenAI and for Azure OpenAI. The constructors for the underlying OpenAI client has now enabled both, so it might make sense to have a single Chat Client for both, the same also applies to other Chat Clients, such as Anthropic, which has a Anthropic client, but also AnthropicBedrock and AnthropicVertex, currently we don't support creating a AF AnthropicClient with those by default, but if you pass them in as a parameter, it will work. This is not the case for OpenAI, where we have a separate client for Azure OpenAI and OpenAI, the OpenAI clients still accept any OpenAI Client (or a subclass thereof) as a parameter, so it can already be used with different servers, including Azure OpenAI, but it is not the default.

We have a preference of creating clients inside of our code because then we can add a user-agent string to allow us to track usage of our clients with different services. This is most useful for Azure, but could also be a strong signal for other vendors to invest in first party support for Agent Framework. And we also make sure to not alter clients that are passed in, as that is often meant for a specific use case, such as setting up httpx clients with proxies, or other customizations that are not relevant for the Agent Framework.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we adopt a single-client approach where the backend is inferred or specified via parameter, how will we make sure that Azure usage is still separately trackable in telemetry? Would the user-agent string need to dynamically change based on the backend selected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all cases, as long as we create the client we add the user-agent. That might not to much in the short term for anything outside of Azure, but it might send a signal to other vendors about the usage of AF with their service which might make collaboration easier.


## Considered Options

- Separate clients for each backend, such as OpenAI and Azure OpenAI, Anthropic and AnthropicBedrock, etc.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our current design uses inheritance (AzureOpenAIChatClient extends OpenAIBaseChatClient). Would any of the proposed options require shifting to composition patterns instead? How would that impact the @use_function_invocation, @use_observability, and @use_chat_middleware decorators?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are more implementation issues, that we will have to figure out, this is all about what a developer using AF will encounter. I think in most cases a more complex init class should be enough, with a possible exception of Azure OpenAI which has some additional things, but the focus for those things are all shifting to foundry, so that's unlikely to be getting bigger.

azure_client = AzureOpenAIClient(deployment_name="...", ad_token_provider=...)
```

### Separate parameter set per backend with a single client, such as OpenAIClient with parameters, for endpoint/base_url, api_key, and entra auth.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of these options appear to be breaking changes. Would we plan to maintain both old and new? Provide a deprecation warning? Or simply break existing clients?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would indeed be breaking, that's why I want to get this behavior figured out and made consistent moving forward, will most likely be a onetime breaking change for some clients.

)
```

### Single client with a explicit parameter for the backend to use, such as OpenAIClient(backend="azure") or AnthropicClient(backend="vertex").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at Anthropic as an example, they have separate SDK clients for Bedrock and Vertex. If a user wants to add support for a new backend that isn't officially supported, which option makes that easiest?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for very new things, most likely the breaking glass option of supplying your own client to the "regular" AnthropicClient in AF is easiest, break glass should always still be possible, I'll add a note on that.

Comment on lines +29 to +33
1. Separate clients for each backend, such as OpenAI and Azure OpenAI, Anthropic and AnthropicBedrock, etc.
1. Separate parameter set per backend with a single client, such as OpenAIClient with parameters, for endpoint/base_url, api_key, and entra auth.
1. Single client with a explicit parameter for the backend to use, such as OpenAIClient(backend="azure") or AnthropicClient(backend="vertex").
1. Single client with a customized `__new__` method that can create the right client based on the parameters passed in, such as OpenAIClient(api_key="...", backend="azure") which returns a AzureOpenAIClient.
1. Map clients to underlying SDK clients, OpenAI's SDK client allows both OpenAI and Azure OpenAI, so would be a single client, while Anthropic's SDK has explicit clients for Bedrock and Vertex, so would be a separate client for AnthropicBedrock and AnthropicVertex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this list discusses clients based on provider (OpenAI, Azure OpenAI, Anthropic) etc, but we also have another dimension which is API type (e.g. OpenAI Chat Completions, OpenAI Responses API etc).

Is this out of scope? For example, if we go with option 3, does it mean we will have OpenAIChatClient(backend="azure") and OpenAIResponsesClient(backend="azure")?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is out of scope, as the capabilities and other features are often quite different between the api's (it's basically the reason OpenAI created the newer API's), so indeed you would get OpenAIChatClient and OpenAIResponsesClient. Let's discuss if we should bring that in scope

print(type(azure_client)) # AzureOpenAIClient
```

### 5. Map clients to underlying SDK clients, OpenAI's SDK client allows both OpenAI and Azure OpenAI, so would be a single client, while Anthropic's SDK has explicit clients for Bedrock and Vertex, so would be a separate client for AnthropicBedrock and AnthropicVertex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense to me. In general, the purpose of having client class and implementation on our side is to align provider's API to our API, like an adapter. Which means that we probably shouldn't create anything extra there and we should follow the same approach as implemented in underlying SDK when it's possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the one that does make sense, but we run the risk of having to deal with breaking changes that are much more severe then we would otherwise, consider anthropic, moving from 4 different clients (default, foundry, bedrock, vertex) into one, then we would also have to delete all the other clients, based on those variants. While the underlying code is all the same. I think one of the nice things about creating abstractions is that we can smoothen things for customers because they don't have to deal with this breaking change.

Comment on lines +137 to +138
- Bad, because it can lead to inconsistency between clients, some being separate per backend, while others are combined
- Bad, because it can lead to confusion for users if they expect a consistent approach across all clients
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Agent Framework users can operate with ChatAgent and similar abstractions, rather than with client itself, so these points shouldn't be problematic. The only potential problematic place can be client initialization/configuration, but I'm not sure whether this should be fixed/improved on Agent Framework level or underlying SDK level. Keep in mind that while we can design this for clients that exist in this repository, it's still possible for providers to implement and release their own implementations and they can be designed in many different ways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, when you have a chat client then you pass that to/create a agent with it. And then everything else is the same, but understanding that for OpenAI you always use the same client, but with different options, while for anthropic you have to pick another client is inconsistent, and since we always start with a client, making this consistent is important I think.

- Good, because it reduces the number of clients where possible
- Bad, because it can lead to inconsistency between clients, some being separate per backend, while others are combined
- Bad, because it can lead to confusion for users if they expect a consistent approach across all clients
- Bad, because changes to the underlying SDK clients can lead to changes in our clients, which can lead to instability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this point is applicable to any option, isn't that the case? Unless you have some specific examples that can impact this option, and if yes - it's worth to specify it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I say above, there are always changes in SDKs that might be breaking, but following them on this point, means a larger surface area for breaking changes.

)
```

### 3. Single client with a explicit parameter for the backend to use, such as OpenAIClient(backend="azure") or AnthropicClient(backend="vertex").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for option 3, with the optional back for OpenAI as it's the default. I think this would really reduce the cognitive load on devs when figure out what client to use / where to import from.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants