[KEP-4] Built-in HTTP & MCP Server for Pipeline Execution #5387
Replies: 6 comments 2 replies
-
Can you clarify how we expect this to be called programmatically (i.e., not through the CLI)? What does the endpoint or function that a user can call directly from Python look like? For example, for the Kedro integration with MLRun, the team said explicitly that they definitely don't want to use the CLI.
-
Exposing pipelines as HTTP endpoints is a common-enough user concern that I think it's very justifiable to provide an out-of-the-box solution.
I think you can never go wrong with implementing as a plugin here—you can always bring it into core, if there's sufficient adoption. The main question I'd have re adding it to core is how confident we feel about the approach being finalized. To this end, I think:
I don't think MCP should be included, especially in core. The ecosystem around this is way too new. For one, is MCP even the right solution? If the goal is to provide local agent access, why not Agent Skills? This is simpler (no server) and cheaper (no tokens for API calls). Again, this could shift to something new given how volatile the ecosystem is, but at this point in time there seems to be a movement from MCP to Skills for many use cases. Furthermore, FastMCP is popular, but it is much less prevalent than FastAPI.
Not answering on MCP, since I think there are questions re whether this is even the right path forward at this time. For the HTTP server:
-
I'm supportive of introducing a simple HTTP API layer; that was the original motivation. At this stage, it would be preferable to keep the API layer intentionally minimal and additive. This is still new territory for us, and we likely don't want Kedro to take on orchestration responsibilities. Concurrency control and avoiding overlapping runs can remain the responsibility of the orchestrator or the user (e.g., versioned datasets, namespacing, max-concurrency limits), at least for Phase 1. We can be very explicit in the documentation about these expectations and provide clear guidance on recommended patterns for safe usage. On MCP, I agree with @deepyaman: while promising, the ecosystem is evolving quickly. Since we already have …, a possible path forward could be:
-
Definitely +1 from me to proceed with this work. More specifically:
-
We should have this feature: +1 on "should we have it?"
Yes.
For the details, and to make it first-class Kedro-native support, I feel we should have it in core. Rationale: I see there are a lot of new features, and we make some experimental while some are introduced to core. As @deepyaman mentioned, it is always easy and quick to make it a plugin, so you can experiment and cannot go wrong. But at the same time, the user experience of installing a new plugin, plus the Kedro team maintaining a new plugin, seems overkill for a
I agree with implementing the HTTP server as part of core for now.
Yes. I could not think of any complex use case, based on my knowledge, that calls for a gRPC server. But it would be worth considering in the future if we see evidence of latency issues with HTTP. Thank you
-
Closing Summary
Thank you, everyone, for the discussion and feedback. Based on the comments in this thread, we are closing this KEP with the following agreed direction:
Decisions
-
Related PR: #5370
KEP shepherd: @DimedS
Q1. What are we trying to do?
Add two built-in server interfaces to Kedro so that pipelines can be triggered over HTTP (by orchestrators, dashboards, CI/CD) and discovered/executed by AI agents (Claude, Copilot, etc.) via the Model Context Protocol (MCP) — without requiring users to write any glue code.
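As a sketch of what the HTTP side could look like: the mid-term exam criteria later in this KEP name a `/run` endpoint, and every `kedro run` parameter is meant to map to a JSON field, so a trigger request might resemble the following. The field names here are illustrative assumptions, not a finalized schema:

```http
POST /run HTTP/1.1
Host: localhost:8000
Content-Type: application/json

{"pipeline": "__default__", "tags": ["training"], "params": {"learning_rate": 0.01}}
```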
Both are thin wrappers around the same `KedroSession.create()` → `session.run()` path that the CLI uses. Full details and implementation are in PR #5370.
Q2. What problem is this proposal NOT designed to solve?
Q3. How is it done today, and what are the limits of current practice?
Today pipelines are triggered via:
- the `kedro run` CLI
- `KedroSession` in Python code

If users need HTTP access, they must build their own wrapper (a Flask/FastAPI app around `KedroSession`). If they want AI agent integration, there is no path at all — they'd need to build a custom MCP server from scratch. Every team reinvents the same boilerplate.
Q4. What is new in your approach and why do you think it will be successful?
- Both servers call `execute_pipeline()` in `runner.py`, which creates a `KedroSession` and calls `session.run()`. No duplication, no divergence from CLI behavior.
- Every `kedro run` parameter is available as a JSON field (HTTP) or tool parameter (MCP).
- `create_http_server()` returns a standard FastAPI app; `create_mcp_server()` returns a standard FastMCP instance. Users add auth, CORS, observability, or custom tools using patterns they already know. No Kedro-specific plugin system to learn.
- Shipped as optional extras (`kedro[http]`, `kedro[mcp]`). No new hard dependencies.
Q5. Who cares? If you are successful, what difference will it make?
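One concrete way to see the difference: today, each team hand-writes roughly the following HTTP glue around its pipelines. This is a standard-library sketch of that boilerplate, not Kedro code; the `run_pipeline` stub stands in for the real `KedroSession.create()` → `session.run()` call:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

def run_pipeline(pipeline_name, params):
    """Stub standing in for KedroSession.create() -> session.run()."""
    # A real wrapper would bootstrap the Kedro project and run it here.
    return {"pipeline": pipeline_name, "params": params, "status": "success"}

class RunHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/run":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        result = run_pipeline(body.get("pipeline", "__default__"),
                              body.get("params", {}))
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

# Serve on an ephemeral port and trigger one run, as an orchestrator would.
server = ThreadingHTTPServer(("127.0.0.1", 0), RunHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

request = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/run",
    data=json.dumps({"pipeline": "__default__"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    result = json.loads(response.read())
server.shutdown()
print(result["status"])  # prints: success
```

A built-in `kedro server` would replace all of this per-team scaffolding with one maintained implementation.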
Q6. What are the risks?
- `execute_pipeline()` is added to core.
- The MCP dependency is pinned to `mcp>=1.0.0,<2.0.0`; a breaking v2 release would require a compatibility update.
Q7. How long will it take?
Phase 1 implementation is complete in PR #5370. Remaining work: unit tests, integration tests, documentation.
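As a side note on packaging: the optional extras from Q4 and the pin from the Q6 risks might be declared along these lines in `pyproject.toml`. Only the `mcp>=1.0.0,<2.0.0` specifier comes from this KEP; the other entries are placeholders:

```toml
[project.optional-dependencies]
# Placeholder specifiers; actual pins would be settled in PR #5370.
http = ["fastapi", "uvicorn"]
mcp = ["mcp>=1.0.0,<2.0.0"]
```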
Q8. What are the mid-term and final "exams" to check for success?
Mid-term
- `/health` and `/run` with full CLI parity ✅
Final
Core Decisions for TSC Vote
The implementation details can be discussed during PR review. This KEP asks for alignment on four high-level decisions:
1. Should we proceed with server functionality at all?
Should Kedro provide built-in HTTP and/or MCP access to pipeline execution, or should this remain a user-side concern?
2. Core or plugin?
Should this live in `kedro` core (as implemented in #5370) or in a separate `kedro-server` plugin?
Arguments for core: stays in sync with `session.run()` automatically, first-class discoverability (`kedro server`), zero cost if unused (optional extras).
Note: extensibility is equivalent in both cases — the factory functions return standard FastAPI/FastMCP objects regardless of where they live.
3. Should both HTTP and MCP be included?
Or should we ship only one? They share the same execution core and CLI namespace, but serve different audiences (programmatic integration vs. AI agents).
4. Are you fine with the overall architecture?
The full architecture, file structure, design decisions, and extensibility model are documented in the PR #5370 description. Please review and flag any concerns.
Please vote +1/−1 in comments, not the poll!
Beta Was this translation helpful? Give feedback.
All reactions