From a205875a8f258fffc78fb93708029651c21db153 Mon Sep 17 00:00:00 2001
From: Shane Utt
Date: Fri, 26 Sep 2025 15:40:13 -0400
Subject: [PATCH 1/3] docs: add initial payload processing proposal

Co-Authored-By: Flynn
Signed-off-by: Shane Utt
---
 proposals/7-payload-processing.md | 113 ++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)
 create mode 100644 proposals/7-payload-processing.md

diff --git a/proposals/7-payload-processing.md b/proposals/7-payload-processing.md
new file mode 100644
index 0000000..8c1fe9a
--- /dev/null
+++ b/proposals/7-payload-processing.md
@@ -0,0 +1,113 @@
+# Payload Processing
+
+* Authors: @shaneutt, @kflynn
+
+# What?
+
+Define standards for declaratively adding processing steps to HTTP requests and
+responses in Kubernetes across the entire payload, including the body.
+
+# Why?
+
+Modern workloads require the ability to process the full payload of an HTTP
+request and response, including both header and body:
+
+* **AI Inference Security**: Guard against bad prompts for inference requests,
+  or misaligned responses.
+* **AI Inference Optimization**: Route requests based on semantics. Enable
+  caching based on semantic similarity to reduce inference costs and enable
+  faster response times for common requests. Enable RAG systems to supplement
+  inference requests with additional context to get better results.
+* **Web Application Security**: Enforce signature-based detection rules, anomaly
+  detection systems, scan uploads, call external auth with payload data, etc.
+
+Payload processing can also encompass various use cases outside of AI, such as
+external authorization or rate limiting. Despite these use cases, though,
+payload processing is not standardized in Kubernetes today.
+
+## Definitions
+
+* **Payload Processors**: Features capable of processing the full payload of
+  requests and/or responses (including headers and body). Payload processors
+  may be implemented natively or as extensions. Many existing API gateways
+  (including Envoy and NGINX) include filter mechanisms which fit this
+  definition, but we are not limiting discussion to only these existing
+  mechanisms.
+
+## User Stories
+
+* As a developer of an application that performs AI inference as part of its
+  function:
+
+  * I want routing decisions for inference requests to be able to be
+    dynamically adapted based on the content of each request, targeting the
+    most suitable models to improve the quality of inference results that my
+    application receives.
+
+  * I want declarative configuration of failure modes for processing steps
+    (fail-open, fail-closed, fallback, etc) to ensure safe and efficient
+    runtime behavior of my application.
+
+  * I want predictable ordering of all payload processing steps to ensure
+    safe and consistent runtime behavior.
+
+* As a security engineer, I want to be able to add a detection engine which
+  scans requests to identify malicious or anomalous request payloads and
+  block, sanitize, and/or report them before they reach backends.
+
+* As a cluster admin, I want to be able to add semantic caching to inference
+  requests in order to detect repeated requests and return cached results,
+  reducing overall inference costs and improving latency for common requests.
+
+* As a compliance officer:
+
+  * I want to be able to add processors that examine inference **requests**
+    for personally identifiable information (PII) so that any PII can result
+    in the request being blocked, sanitized, or reported before sending it to
+    the inference backend.
+
+  * I want to be able to add processors that examine inference **responses**
+    for malicious or misaligned results so that any such results can be
+    dropped, sanitized, or reported before the response is sent to the
+    requester.
+
+## Goals
+
+* Ensure that declarative APIs, standards, and guidance on best practices
+  exist for adding Payload Processors to HTTP requests and responses on
+  Kubernetes.
+* Ensure that there is adequate documentation for developers to be able to
+  easily build implementations of Payload Processors according to the
+  standards.
+* Support composability, pluggability, and ordered processing of Payload
+  Processors.
+* Ensure the APIs can provide clear and easily observable defaulting behavior.
+* Ensure the APIs can provide clear and obvious runtime behavior.
+* Provide failure mode options for Payload Processors.
+
+## Non-Goals
+
+* Requiring every request or response to be processed by a payload processor.
+  The mechanisms described in this proposal are intended to be optional
+  extensions.
+
+# How?
+
+TODO in a later PR.
+
+> This should be left blank until the "What?" and "Why?" are agreed upon,
+> as defining "How?" the goals are accomplished is not important unless we can
+> first even agree on what the problem is, and why we want to solve it.
+>
+> This section is fairly freeform, because (again) these proposals will
+> eventually find their way into any number of different final proposal formats
+> in other projects. However, the general guidance is to break things down into
+> highly focused sections as much as possible to help make things easier to
+> read and review. Long, unbroken walls of code and YAML in this document are
+> not advisable as that may increase the time it takes to review.
+
+# Relevant Links
+
+* [Original Slack Discussion](https://kubernetes.slack.com/archives/C09EJTE0LV9/p1757621006832049)
+* [Document: Extended Body-Based Routing (BBR) in Gateway API Inference Extension](https://docs.google.com/document/d/1So9uRjZrLUHf7Rjv13xy_ip3_5HSI1cn1stS3EsXLWg)
+
From f91763d3e8fce4167807d822283324c27a872cbc Mon Sep 17 00:00:00 2001
From: Shane Utt
Date: Mon, 6 Oct 2025 08:04:54 -0400
Subject: [PATCH 2/3] docs: add notes on the definition of payload processors

---
 proposals/7-payload-processing.md | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/proposals/7-payload-processing.md b/proposals/7-payload-processing.md
index 8c1fe9a..376fe0d 100644
--- a/proposals/7-payload-processing.md
+++ b/proposals/7-payload-processing.md
@@ -28,11 +28,13 @@ payload processing is not standardized in Kubernetes today.
 ## Definitions
 
 * **Payload Processors**: Features capable of processing the full payload of
-  requests and/or responses (including headers and body). Payload processors
-  may be implemented natively or as extensions. Many existing API gateways
-  (including Envoy and NGINX) include filter mechanisms which fit this
-  definition, but we are not limiting discussion to only these existing
-  mechanisms.
+  requests and/or responses (including headers and body).
+
+> **Note**: At a definition level, we do not intend "Payload Processors" to be construed with
+> any existing details or any other implementations that might have similarities. For instance,
+> we are not trying to prescribe that these are done natively, or as extensions. We are also
+> aware that many existing API Gateways include "filter" mechanisms which could be seen as
+> fitting this definition, but we are not limiting discussion to only these existing mechanisms.
 
 ## User Stories
 
From 4e7b9a46ddd85e340cdf27ea24a8b758a2222ab5 Mon Sep 17 00:00:00 2001
From: Shane Utt
Date: Mon, 6 Oct 2025 08:05:27 -0400
Subject: [PATCH 3/3] docs: cleanup references in proposal 7

---
 proposals/7-payload-processing.md | 1 -
 1 file changed, 1 deletion(-)

diff --git a/proposals/7-payload-processing.md b/proposals/7-payload-processing.md
index 376fe0d..e56f469 100644
--- a/proposals/7-payload-processing.md
+++ b/proposals/7-payload-processing.md
@@ -111,5 +111,4 @@ TODO in a later PR.
 # Relevant Links
 
 * [Original Slack Discussion](https://kubernetes.slack.com/archives/C09EJTE0LV9/p1757621006832049)
-* [Document: Extended Body-Based Routing (BBR) in Gateway API Inference Extension](https://docs.google.com/document/d/1So9uRjZrLUHf7Rjv13xy_ip3_5HSI1cn1stS3EsXLWg)