Skip to content
Open
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
167 changes: 167 additions & 0 deletions docs/decisions/001X-python-client-constructors.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
---
status: proposed
contact: eavanvalkenburg
date: 2025-11-18
deciders: markwallace-microsoft, dmytrostruk, sphenry, alliscode
consulted: taochenosu, moonbox3, giles17
---

# Python Client Constructors

## Context and Problem Statement

We have multiple Chat Client implementations that can be used with different servers, the most important example is OpenAI, where we have a separate client for OpenAI and for Azure OpenAI. The constructors for the underlying OpenAI client has now enabled both, so it might make sense to have a single Chat Client for both, the same also applies to other Chat Clients, such as Anthropic, which has a Anthropic client, but also AnthropicBedrock and AnthropicVertex, currently we don't support creating a AF AnthropicClient with those by default, but if you pass them in as a parameter, it will work. This is not the case for OpenAI, where we have a separate client for Azure OpenAI and OpenAI, the OpenAI clients still accept any OpenAI Client (or a subclass thereof) as a parameter, so it can already be used with different servers, including Azure OpenAI, but it is not the default.

We have a preference of creating clients inside of our code because then we can add a user-agent string to allow us to track usage of our clients with different services. This is most useful for Azure, but could also be a strong signal for other vendors to invest in first party support for Agent Framework. And we also make sure to not alter clients that are passed in, as that is often meant for a specific use case, such as setting up httpx clients with proxies, or other customizations that are not relevant for the Agent Framework.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we adopt a single-client approach where the backend is inferred or specified via parameter, how will we make sure that Azure usage is still separately trackable in telemetry? Would the user-agent string need to dynamically change based on the backend selected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In all cases, as long as we create the client we add the user-agent. That might not to much in the short term for anything outside of Azure, but it might send a signal to other vendors about the usage of AF with their service which might make collaboration easier.


There is likely not a single best solution, the goal here is consistency across clients, and with that ease of use for users of Agent Framework.

## Decision Drivers

- Reduce client sprawl and different clients that only have one or more different parameters.
- Make client creation inside our classes the default and cover as many backends as possible.
- Make clients easy to use and discover, so that users can easily find the right client for their use case.
- Allow client creation based on environment variables, so that users can easily configure their clients without having to pass in parameters.
- A breaking glass scenario should always be possible, so that users can pass in their own clients if needed, and it should also be easy to figure out how to do that.

## Considered Options

1. Separate clients for each backend, such as OpenAI and Azure OpenAI, Anthropic and AnthropicBedrock, etc.
1. Separate parameter set per backend with a single client, such as OpenAIClient with parameters, for endpoint/base_url, api_key, and entra auth.
1. Single client with a explicit parameter for the backend to use, such as OpenAIClient(backend="azure") or AnthropicClient(backend="vertex").
1. Single client with a customized `__new__` method that can create the right client based on the parameters passed in, such as OpenAIClient(api_key="...", backend="azure") which returns a AzureOpenAIClient.
1. Map clients to underlying SDK clients, OpenAI's SDK client allows both OpenAI and Azure OpenAI, so would be a single client, while Anthropic's SDK has explicit clients for Bedrock and Vertex, so would be a separate client for AnthropicBedrock and AnthropicVertex.
Comment on lines +75 to +79
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this list discusses clients based on provider (OpenAI, Azure OpenAI, Anthropic) etc, but we also have another dimension which is API type (e.g. OpenAI Chat Completions, OpenAI Responses API etc).

Is this out of scope? For example, if we go with option 3, does it mean we will have OpenAIChatClient(backend="azure") and OpenAIResponsesClient(backend="azure")?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is out of scope, as the capabilities and other features are often quite different between the api's (it's basically the reason OpenAI created the newer API's), so indeed you would get OpenAIChatClient and OpenAIResponsesClient. Let's discuss if we should bring that in scope


## Pros and Cons of the Options

### 1. Separate clients for each backend, such as OpenAI and Azure OpenAI, Anthropic and AnthropicBedrock, etc.
This option would entail potentially a large number of clients, and keeping track of additional backend implementation being created by vendors.
- Good, because it is clear which client is used
- Good, because we can easily have aliases of parameters, that are then mapped internally, such as `deployment_name` for Azure OpenAI mapping to `model_id` internally
- Good, because it is easy to map environment variables to the right client
- Good, because any customization of the behavior can be done in the subclass
- Good, because we can expose the classes in different places, currently the `AzureOpenAIClient` is exposed in the `azure` module, while the `OpenAIClient` is exposed in the `openai` module, the same could be done with Anthropic, exposed from `anthropic`, while `AnthropicBedrock` would be exposed from `agent_framework.amazon`.
- Good, stable clients per backend, as changes to one client do not affect the other clients.
- Bad, because it creates a lot of clients that are very similar (even if they subclass from one base client class)
- Bad, because it is hard to keep track of all the clients and their parameters
- Bad, because it is hard to discover the right client for a specific use case

Example code:
```python
from agent_framework.openai import OpenAIClient # using a fictional OpenAIClient, to illustrate the point
from agent_framework.azure import AzureOpenAIClient

openai_client = OpenAIClient(api_key="...")
azure_client = AzureOpenAIClient(deployment_name="...", ad_token_provider=...)
```

### 2. Separate parameter set per backend with a single client, such as OpenAIClient with parameters, for endpoint/base_url, api_key, and entra auth.
This option would entail a single client that can be used with different backends, but requires the user to pass in the right parameters.
- Good, because it reduces the number of clients and makes it easier to discover the right client with the right parameters
- Good, because it allows for a single client to be used with different backends and additional backends can be added easily
- Good, because the user does not have to worry about which client to use, they can just use the `OpenAIClient` or `AnthropicClient` and pass in the right parameters, and we create the right client for them, if that client changes, then we do that in the code, without any changes to the api.
- Good, because in many cases, the differences between the backends are just a few parameters, such as endpoint/base_url and authentication method.
- Good, because client resolution logic could be encapsulated in a factory method, making it easier to maintain and even extend by users.
- Neutral, this would be a one-time breaking change for users of the existing clients, but would make it easier to use in the long run.
- Bad, because it requires the user to know which parameters to pass in for the specific backend and when using environment variables, it is not always clear which parameters are used for which backend, or what the order of precedence is.
- Bad, because it can lead to confusion if the user passes in the wrong parameters for the specific backend
- Bad, because the name for a parameter that is similar but not the same between backends can be confusing, such as `deployment_name` for Azure OpenAI and `model_id` for OpenAI, would we then only have `model_id` for both, or have both parameters?
- Bad, because it can lead to a lot of parameters that are not used for a specific backend, such as `entra_auth` for Azure OpenAI, but not for OpenAI
- Bad, less stable per client, as changes to the parameter change all clients.
- Bad, because customized behavior per backend is harder to implement, as it requires more conditional logic in the client code.

Example code:
```python
from agent_framework.openai import OpenAIClient
openai_client = OpenAIClient(
model_id="...",
api_key="...",
)
azure_client = OpenAIClient(
deployment_name="...", # or model_id, depending on the decision
ad_token_provider=...,
)
```

### 3. Single client with a explicit parameter for the backend to use, such as OpenAIClient(backend="azure") or AnthropicClient(backend="vertex").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for option 3, with the optional back for OpenAI as it's the default. I think this would really reduce the cognitive load on devs when figure out what client to use / where to import from.

This option would entail a single client that can be used with different backends, but requires the user to pass in the right backend as a parameter.
- Same list as the option above, and:
- Good, because it is explicit about which backend to try and target, including for environment variables
- Bad, because adding a new backend would require a change to the client (the backend param would change from i.e. `Literal["openai", "azure"]` to `Literal["openai", "azure", "newbackend"]`)


Example code:
```python
from agent_framework.openai import OpenAIClient
openai_client = OpenAIClient(
backend="openai", # probably optional, since this would be the default
model_id="...",
api_key="...",
)
azure_client = OpenAIClient(
backend="azure",
deployment_name="...", # could also become `model_id` to make it a bit simpler
ad_token_provider=...,
)
```

### 4. Single client with a customized `__new__` method that can create the right client based on the parameters passed in, such as OpenAIClient(backend="azure") which returns a AzureOpenAIClient.
This option would entail a single client that can be used with different backends, and the right client is created based on the parameters passed in.
- Good, because the entry point for a user is very clear
- Good, because it allows for customization of the client based on the parameters passed in
- Bad, because it still needs all the extra client classes for the different backends
- Bad, because there might be confusion between using the subclasses or the main class with the customized `__new__` method
- Bad, because adding a new backend is still work that is required to be built

Example code:
```python
from agent_framework.openai import OpenAIClient
openai_client = OpenAIClient(
backend="openai", # probably optional, since this would be the default
model_id="...",
api_key="...",
)
azure_client = OpenAIClient(
backend="azure",
model_id="...",
ad_token_provider=...,
)
print(type(openai_client)) # OpenAIClient
print(type(azure_client)) # AzureOpenAIClient
```

### 5. Map clients to underlying SDK clients, OpenAI's SDK client allows both OpenAI and Azure OpenAI, so would be a single client, while Anthropic's SDK has explicit clients for Bedrock and Vertex, so would be a separate client for AnthropicBedrock and AnthropicVertex.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach makes sense to me. In general, the purpose of having client class and implementation on our side is to align provider's API to our API, like an adapter. Which means that we probably shouldn't create anything extra there and we should follow the same approach as implemented in underlying SDK when it's possible.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the one that does make sense, but we run the risk of having to deal with breaking changes that are much more severe then we would otherwise, consider anthropic, moving from 4 different clients (default, foundry, bedrock, vertex) into one, then we would also have to delete all the other clients, based on those variants. While the underlying code is all the same. I think one of the nice things about creating abstractions is that we can smoothen things for customers because they don't have to deal with this breaking change.

This option would entail a mix of the above options, depending on the underlying SDK clients.
- Good, because it aligns with the underlying SDK clients and their capabilities
- Good, because it reduces the number of clients where possible
- Bad, because it can lead to inconsistency between clients, some being separate per backend, while others are combined
- Bad, because it can lead to confusion for users if they expect a consistent approach across all clients
Comment on lines +189 to +190
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Agent Framework users can operate with ChatAgent and similar abstractions, rather than with client itself, so these points shouldn't be problematic. The only potential problematic place can be client initialization/configuration, but I'm not sure whether this should be fixed/improved on Agent Framework level or underlying SDK level. Keep in mind that while we can design this for clients that exist in this repository, it's still possible for providers to implement and release their own implementations and they can be designed in many different ways.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, when you have a chat client then you pass that to/create a agent with it. And then everything else is the same, but understanding that for OpenAI you always use the same client, but with different options, while for anthropic you have to pick another client is inconsistent, and since we always start with a client, making this consistent is important I think.

- Bad, because changes to the underlying SDK clients can lead to changes in our clients, which can lead to instability.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this point is applicable to any option, isn't that the case? Unless you have some specific examples that can impact this option, and if yes - it's worth to specify it here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I say above, there are always changes in SDKs that might be breaking, but following them on this point, means a larger surface area for breaking changes.


Example code:
```python
from agent_framework.anthropic import AnthropicClient, AnthropicBedrockClient
from agent_framework.openai import OpenAIClient
openai_client = OpenAIClient(
model_id="...",
api_key="...",
)
azure_client = OpenAIClient(
model_id="...",
api_key=lambda: get_azure_ad_token(...),
)
anthropic_client = AnthropicClient(
model_id="...",
api_key="...",
)
anthropic_bedrock_client = AnthropicBedrockClient(
model_id="...",
aws_secret_key="...",
aws_access_key="...",
base_url="...",
)
```

## Decision Outcome

TBD