feat: expose agents as openai compatible endpoint with FastAPI #3320
Conversation
@dutchfarao Nice work! @Kludex Please have a look.
@dutchfarao and I are sitting at a bar here trying to fix the CI. We're frustrated and giving up haha; the CI should use markers instead of this try_import stuff 😆 Would like your support here!
The 20 force pushes were an indication.
Resolved the CI, having a clear mind helps ;)
I'll stop my review.
This is not a good first step. Please never open a 6k-line PR on a repository without discussing it first.
I think the first step would be to implement what my PR (#2041) proposed, and then expand the idea to the Responses API, if needed.
```python
except ImportError as _import_error:  # pragma: no cover
    raise ImportError(
        'Please install the `openai` package to enable the fastapi openai compatible endpoint, '
        'you can use the `openai` and `fastapi` optional group — `pip install "pydantic-ai-slim[openai,fastapi]"`'
    ) from _import_error
```
I don't think the extra should be called `fastapi`. The idea is to expose the chat completions/responses endpoints, so maybe `chat-completions`.
This module shouldn't be called `fastapi`. The underlying package doesn't matter.
We are open to better name suggestions! Internally we called this something different.
```python
from pydantic_ai.fastapi.registry import AgentRegistry
from pydantic_ai.settings import ModelSettings


logger = logging.getLogger(__name__)
```
There's no need for this.
```python
except Exception as e:
    logger.error(f'Error creating completion: {e}')
    raise
```
The error is shown anyway, so there's no need for this handling.
Also, in other code sources you may want `logger.exception('Error when creating completion')`, since that logs the exception (with traceback) automatically.
Suggested change:

```diff
-except Exception as e:
-    logger.error(f'Error creating completion: {e}')
-    raise
```
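For illustration, a minimal sketch of the `logger.exception` form; `create_completion` here is a hypothetical stand-in, not code from this PR:

```python
import logging

logger = logging.getLogger(__name__)


async def create_completion() -> None:  # hypothetical stand-in for the real call
    raise RuntimeError('boom')


async def handle_request() -> None:
    try:
        await create_completion()
    except Exception:
        # logger.exception logs at ERROR level and includes the active
        # traceback automatically; no need to interpolate the exception.
        logger.exception('Error when creating completion')
        raise
```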
```python
*args: tuple[Any],
**kwargs: tuple[Any],
```
This is wrong. The annotation on `*args` should describe each positional element, and the one on `**kwargs` each value, not the whole tuple. That said, this is a properly typed library, so we can't include this as is.
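For context, a short sketch of the conventional annotations (the function itself is hypothetical): the declared type applies per positional element and per keyword value, so the containers come out as `tuple[...]` and `dict[str, ...]`:

```python
from typing import Any


def forward(*args: Any, **kwargs: Any) -> None:
    # The annotation types each element/value; the containers themselves are:
    positional: tuple[Any, ...] = args
    keyword: dict[str, Any] = kwargs
    print(positional, keyword)
```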
```python
except Exception as e:
    logger.error(f'Error in responses: {e}', exc_info=True)
    raise HTTPException(
        status_code=500,
        detail=ErrorResponse(
            error=ErrorObject(
                type='internal_server_error',
                message=str(e),
            ),
        ).model_dump(),
    )
```
No need again.
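For context, one way to drop the per-route try/except entirely is a single application-level handler; a sketch assuming FastAPI's `exception_handler` hook and the response shape used in this PR:

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()


@app.exception_handler(Exception)
async def internal_error_handler(request: Request, exc: Exception) -> JSONResponse:
    # One central place to turn uncaught exceptions into a 500 response,
    # instead of repeating try/except in every route handler.
    return JSONResponse(
        status_code=500,
        content={'error': {'type': 'internal_server_error', 'message': str(exc)}},
    )
```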
```python
@self.get('/v1/models', response_model=ModelsResponse)
async def get_models() -> ModelsResponse:  # type: ignore
    try:
        return await self.models_api.list_models()
    except Exception as e:
        logger.error(f'Error listing models: {e}', exc_info=True)
        raise HTTPException(
            status_code=500,
            detail=ErrorResponse(
                error=ErrorObject(
                    type='internal_server_error',
                    message=f'Error retrieving models: {str(e)}',
                ),
            ).model_dump(),
        )


@self.get('/v1/models' + '/{model_id}', response_model=Model)
async def get_model(model_id: str) -> Model:  # type: ignore
    try:
        return await self.models_api.get_model(model_id)
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f'Error fetching model info: {e}', exc_info=True)
        raise HTTPException(
            status_code=500,
            detail=ErrorResponse(
                error=ErrorObject(
                    type='internal_server_error',
                    message=f'Error retrieving model: {str(e)}',
                ),
            ).model_dump(),
        )
```
Is this necessary?
```python
logger = logging.getLogger(__name__)


class AgentRegistry:
```
I don't think we should work on a registry basis.
I think the Agent object should expose an ASGI application that you can serve.
If you need multiple agents to become apps/endpoints, you can mount them together.
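A minimal sketch of the mounting idea; `to_asgi()` is a hypothetical method name used for illustration, not an existing pydantic-ai API:

```python
from fastapi import FastAPI
from pydantic_ai import Agent

support_agent = Agent('openai:gpt-4o')
billing_agent = Agent('openai:gpt-4o-mini')

app = FastAPI()

# Each agent exposes its own ASGI sub-application; mounting gives each
# one its own subpath. `to_asgi()` is a hypothetical factory name.
app.mount('/support', support_agent.to_asgi())
app.mount('/billing', billing_agent.to_asgi())
```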
Mounting means you create a subpath per model. Doing it on a registry basis allows you to have a single endpoint with N agents (models), with the OpenAI spec used to select the model.
I see your point.
I don't think we want to work with the registry design anyway. A factory that creates an ASGI application would be ideal:
```python
def expose_agents(
    agents: dict[str, Agent],
    *,
    chat_completions_url: str | None = '/v1/chat/completions',
    responses_url: str | None = None,
) -> ASGIApp: ...
```

I'm not sure the above is the best name for the function, or that the parameters are the best design, but the idea is to simplify the logic here.
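A hypothetical usage sketch of that factory, under the assumption that `expose_agents` existed as proposed (it is not part of this PR):

```python
import uvicorn
from pydantic_ai import Agent

app = expose_agents(  # hypothetical factory from the comment above
    {
        'support': Agent('openai:gpt-4o'),
        'research': Agent('openai:gpt-4o-mini'),
    },
    chat_completions_url='/v1/chat/completions',
    responses_url=None,
)

if __name__ == '__main__':
    uvicorn.run(app)
```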
I am not sure it simplifies things. The registry can return the full list of unique models, we also distinguish chat-completions-only from responses-only models, and additionally we make it simple to derive the name from the agent:
```python
agent_registry = (
    AgentRegistry()
    # auto-derives the name from the Agent instance, or you can override it
    .register_completions_agent(completion_agent)
    .register_responses_agent(responses_agent)
)

app = FastAPI(
    title="LLM Agent API",
    description="OpenAI-compatible API with pydantic-ai backend",
    version="1.0.0",
    lifespan=lifespan,
)
app.include_router(AgentAPIRouter(agent_registry=agent_registry))
```

```python
openai_client_for_chat = AsyncOpenAI(
    base_url=fake_openai_base,
    api_key='test-key',
    http_client=DefaultAioHttpClient(),
)
```
Why use `DefaultAioHttpClient`?
It can indeed be removed for the test; we were simply using aiohttp as our default client for normal usage.
Actually, it is needed for the mocking.
We do not need to use aioresponses. We can use VCR.
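For reference, a cassette-based test could look roughly like this, assuming the `pytest-recording` plugin (a pytest wrapper around VCR.py) and a hypothetical registered agent name:

```python
import pytest
from openai import AsyncOpenAI


@pytest.mark.vcr()  # records the HTTP exchange to a cassette once, replays it afterwards
async def test_chat_completion() -> None:
    client = AsyncOpenAI(base_url='http://localhost:8000/v1', api_key='test-key')
    response = await client.chat.completions.create(
        model='completion_agent',  # hypothetical agent name registered in the app
        messages=[{'role': 'user', 'content': 'Hello!'}],
    )
    assert response.choices
```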
| "pip>=25.2", | ||
| "genai-prices>=0.0.28", | ||
| "mcp-run-python>=0.0.20", | ||
| "pytest-asyncio>=1.2.0", |
Why? The whole codebase uses anyio.
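For context, the anyio style already used across the codebase needs no extra dependency; a minimal sketch:

```python
import pytest

# anyio ships its own pytest plugin, so pytest-asyncio isn't needed.
pytestmark = pytest.mark.anyio


async def test_list_models() -> None:
    # hypothetical body; the point is only the marker style
    assert True
```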
@ion-elgreco I think the idea of this PR is good, but I think we really need to strip it down first, as I mentioned in the comment above.
I'm fine with splitting the Responses and Chat Completions APIs and stacking them as separate PRs. But it won't make a huge difference, since it's a contained code change.
This PR adds the ability to register pydantic-ai agents and expose them as an OpenAI-compatible endpoint. The registry allows you to register agents that can be exposed as the `/v1/chat/completions` and/or `/v1/responses` endpoint. The `/v1/models` endpoint derives all the unique models, by name, that exist in the `AgentRegistry`.

Streaming support for the Responses API has not been implemented yet; it can be done in a follow-up PR.
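For illustration, a sketch of how a standard OpenAI client could talk to the exposed app, assuming it is served locally and that an agent was registered under the name `completion_agent` (both assumptions):

```python
import asyncio

from openai import AsyncOpenAI

# Any OpenAI-compatible client works; registered agent names act as model ids.
client = AsyncOpenAI(base_url='http://localhost:8000/v1', api_key='unused')


async def main() -> None:
    models = await client.models.list()  # served by /v1/models
    print([m.id for m in models.data])

    chat = await client.chat.completions.create(  # served by /v1/chat/completions
        model='completion_agent',
        messages=[{'role': 'user', 'content': 'Hi!'}],
    )
    print(chat.choices[0].message.content)


asyncio.run(main())
```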
Related: `to_chat_completions` method (#2041)