Description
Question
Hey all,
Wasn't sure where to put this since it isn't really a Pydantic AI-specific bug. We first noticed the issue in LiveKit's Agent SDK, and it just so happens that the new GoogleModel/GoogleProvider in Pydantic AI relies on the same underlying SDK. My main ask here is that you don't fully deprecate or remove the older GeminiModel and GoogleVertexProvider implementations until this is actually resolved.
Google GenAI SDK Bug Details
Google's python-genai SDK has a persistent issue that causes increased latency in generation times (particularly bad with multimodal requests):
- Bug: aiohttp did not reuse shared connection (googleapis/python-genai#1206)
- Bug: SDK does not reuse aiohttp session (googleapis/python-genai#1074)
- async generate_content is very slow (googleapis/python-genai#557)
The increased latency averages anywhere from 3-5 seconds across 50 parallel eval runs, even when using aiohttp and overriding the async client settings with HTTP/2. I've tested this under a variety of network conditions and the result is always the same. They claim this is fixed, but even after upgrading to 1.37 and using google-genai[aiohttp], it doesn't appear to be resolved. I know this issue is SDK-specific because plain REST calls to their API perform as expected (e.g. the older GeminiModel implementation, and also how LiteLLM implemented Gemini).
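For anyone wanting to reproduce the measurement, here's a minimal sketch of the kind of harness we used for the parallel eval runs. The function names are hypothetical and the model call is stubbed with a sleep; swap in a real SDK call to compare the two code paths:

```python
import asyncio
import statistics
import time

async def timed_call(delay: float) -> float:
    """Stand-in for one model request; returns the observed latency.
    Replace the sleep with a real generate_content call when benchmarking."""
    start = time.perf_counter()
    await asyncio.sleep(delay)  # simulate network + generation time
    return time.perf_counter() - start

async def run_parallel(n_runs: int, delay: float) -> list[float]:
    """Fire n_runs requests concurrently and collect per-request latency."""
    return await asyncio.gather(*(timed_call(delay) for _ in range(n_runs)))

latencies = asyncio.run(run_parallel(50, 0.01))
print(f"avg latency: {statistics.mean(latencies):.3f}s over {len(latencies)} runs")
```

Running the same harness against both the python-genai path and a plain REST client is what surfaced the 3-5 second gap for us.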
Here's the configuration we're using for the HTTPX client. It's passed to GeminiModel and others via the http_client field, and passed as async_client_args in the HttpOptionsDict of Google's GenAI SDK client, which is then used to create the GoogleProvider:
import ssl
from typing import Any

import httpx


def get_async_client_args() -> dict[str, Any]:
    """
    Get httpx.AsyncClient configuration arguments optimized for LLM providers.

    Returns:
        dict: Configuration arguments to pass to httpx.AsyncClient or as async_client_args
    """
    context = ssl.create_default_context()
    return {
        'http2': True,
        'http1': False,
        'timeout': httpx.Timeout(
            timeout=600,
            connect=5,
        ),
        'limits': httpx.Limits(
            max_connections=100,
            max_keepalive_connections=20,
            keepalive_expiry=30.0,
        ),
        'transport': httpx.AsyncHTTPTransport(http2=True),
        'verify': context,
    }
Again, everything works fine if we just use the older code (that's marked deprecated). The only issue with GeminiModel is that, with include_thinking: True, it mangles the thinking tokens into the text response, which breaks structured outputs. I was going to fix this in the implementation and I'm happy to send a PR.
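The fix I have in mind is roughly the following: keep thinking content separate from response text when assembling the model output, assuming the REST response flags reasoning parts with a "thought" boolean as the Gemini API does. The function name and dict shapes here are illustrative, not the actual GeminiModel internals:

```python
def split_thinking_parts(parts: list[dict]) -> tuple[str, str]:
    """Separate thinking tokens from the final text so structured-output
    parsing only ever sees the real response text."""
    thinking: list[str] = []
    text: list[str] = []
    for part in parts:
        # Parts flagged with "thought": True carry reasoning tokens,
        # not response text, per the Gemini API's thinking support.
        if part.get('thought'):
            thinking.append(part.get('text', ''))
        else:
            text.append(part.get('text', ''))
    return ''.join(thinking), ''.join(text)

parts = [
    {'text': 'Let me reason about this...', 'thought': True},
    {'text': '{"answer": 42}'},
]
thinking, text = split_thinking_parts(parts)
# text == '{"answer": 42}' — safe to feed to structured-output validation
```

With the two streams separated, the thinking content can still be surfaced (e.g. as a thinking part on the response) without corrupting the JSON the output validator parses.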
Additional Context
Pydantic AI v1.0.13
Python 3.12