Commit afb6a23

KEP-4222: Add details of multi-protocol RawExtension serialization.
1 parent ed7065f commit afb6a23

keps/sig-api-machinery/4222-cbor-serializer/README.md

Lines changed: 205 additions & 0 deletions
@@ -99,6 +99,14 @@ tags, and then generate with `hack/update-toc.sh`.
- [Encoding Determinism](#encoding-determinism)
- [Unicode](#unicode)
- [Libraries](#libraries)
- [RawExtension](#rawextension)
- [Usage](#usage)
- [Transient External Types](#transient-external-types)
- [Stored External Types](#stored-external-types)
- [Types as Canonical Definition of Custom Resources](#types-as-canonical-definition-of-custom-resources)
- [Scenarios](#scenarios)
- [Compatibility](#compatibility)
- [Migration](#migration)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
@@ -613,7 +621,204 @@ sequence that it will not successfully decode.

[Benchmarks](https://docs.google.com/spreadsheets/d/1yi8cHrnlbmCUY2Vo7Sknrf87WDOuGUswYsyqJfEUwls/edit#gid=0) TODO: inline

### RawExtension

The `RawExtension` type in `k8s.io/apimachinery/pkg/runtime` allows extension types to be handled
opaquely within external versioned types, as long as they are syntactically valid.

The [type declaration](https://github.com/kubernetes/kubernetes/blob/169a952720ebd75fcbcb4f3f5cc64e82fdd3ec45/staging/src/k8s.io/apimachinery/pkg/runtime/types.go#L51-L109)
is:

```go
type RawExtension struct {
	// Raw is the serialized form of the extension object.
	Raw []byte
	// Object is the decoded form of the extension object.
	Object Object
}
```

Using JSON, marshalling and unmarshalling of `RawExtension` is comparable to that of the standard
library's [RawMessage](https://pkg.go.dev/encoding/json#RawMessage) type. For unmarshalling, if the
input serialized JSON value is `null`, the destination `RawExtension` is not modified. Otherwise,
its `Raw` field is set to a verbatim copy of the provided serialized JSON value. The contract of
`json.Unmarshaler` states that implementations can assume that the input is a valid encoding of a
JSON value. Absent a bug in the caller (typically `json.Unmarshal` or `(*json.Decoder).Decode`), a
`RawExtension`'s `Raw` field will contain a valid JSON text after unmarshalling.

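The unmarshalling rule above can be sketched with a local stand-in type. This is a minimal illustration, not the apimachinery source: `rawExtension`, `wrapper`, and `decodeExt` are hypothetical names introduced here, and only the `Raw` field is modelled.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// rawExtension is a local stand-in for runtime.RawExtension, reduced to the
// one field this example needs.
type rawExtension struct {
	Raw []byte
}

// UnmarshalJSON leaves the receiver untouched for a JSON null and otherwise
// stores a verbatim copy of the input bytes in Raw.
func (re *rawExtension) UnmarshalJSON(in []byte) error {
	if re == nil {
		return errors.New("rawExtension: UnmarshalJSON on nil pointer")
	}
	if bytes.Equal(in, []byte("null")) {
		return nil
	}
	re.Raw = append(re.Raw[:0], in...)
	return nil
}

// wrapper is a hypothetical external type that embeds the extension.
type wrapper struct {
	Ext rawExtension `json:"ext"`
}

// decodeExt unmarshals doc into a wrapper whose Ext.Raw starts out as initial
// and returns the Raw bytes afterwards.
func decodeExt(doc string, initial []byte) ([]byte, error) {
	w := wrapper{Ext: rawExtension{Raw: initial}}
	if err := json.Unmarshal([]byte(doc), &w); err != nil {
		return nil, err
	}
	return w.Ext.Raw, nil
}

func main() {
	raw, _ := decodeExt(`{"ext":{"kind":"Example"}}`, nil)
	fmt.Printf("%s\n", raw) // the nested object, byte for byte

	kept, _ := decodeExt(`{"ext":null}`, []byte(`"previous"`))
	fmt.Printf("%s\n", kept) // null leaves the earlier value in place
}
```

The copy into `Raw` matters because `encoding/json` does not promise that the slice passed to `UnmarshalJSON` stays valid after the call returns.
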
In general, for an encoding that supports Unstructured, the encoding of a `RawExtension` value must
always be the same as the overall encoding of the request or response body. This is not the case for
Protobuf. Protobuf can encode `RawExtension` fields with any encoding since both the writer and
reader of a Protobuf message have the type information to know that they are serializing or
deserializing a `RawExtension` message.

There are three cases when marshalling `RawExtension` to JSON:

1. If both `Raw` and `Object` are `nil`, `null` is returned.
1. If `Raw` is not `nil`, return it verbatim.
1. Otherwise (`Raw` is `nil` and `Object` is not `nil`), return the result of marshalling `Object`.

Note that, in the second case, the bytes of the `Raw` field must be a valid JSON text in order to
successfully serialize an object containing a `RawExtension` to JSON.

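The three cases can be sketched as follows. This is an illustrative stand-in rather than the apimachinery implementation: `rawExtension` and `marshalString` are hypothetical names, and `Object` is typed as `any` instead of `runtime.Object` to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawExtension is a local stand-in for runtime.RawExtension.
type rawExtension struct {
	Raw    []byte
	Object any
}

// MarshalJSON follows the three cases: null when both fields are nil, Raw when
// it is set, and the marshalled Object otherwise.
func (re rawExtension) MarshalJSON() ([]byte, error) {
	if re.Raw == nil {
		if re.Object != nil {
			return json.Marshal(re.Object)
		}
		return []byte("null"), nil
	}
	return re.Raw, nil
}

// marshalString is a small helper that returns the encoded value as a string.
func marshalString(re rawExtension) (string, error) {
	b, err := json.Marshal(re)
	return string(b), err
}

func main() {
	cases := []rawExtension{
		{}, // case 1: both nil
		{Raw: []byte(`{"a":1}`), Object: map[string]int{"ignored": 0}}, // case 2: Raw wins
		{Object: map[string]int{"b": 2}},                               // case 3: Object only
		{Raw: []byte(`not json`)},                                      // Raw is not valid JSON
	}
	for _, re := range cases {
		out, err := marshalString(re)
		fmt.Printf("%q failed=%v\n", out, err != nil)
	}
}
```

The last case fails because `json.Marshal` validates whatever a `MarshalJSON` method returns, which is why the `Raw` bytes have to be a valid JSON text.
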
#### Usage

##### Transient External Types

External versioned types may use `RawExtension` to exchange arbitrary objects and plugins without
persisting them to storage. In these cases, only a single object encoding is involved. When
preparing to send or handling a received object containing `RawExtension`, callers can assume that
the `Raw` bytes are in the same encoding as the negotiated request or response encoding.

##### Stored External Types

Storing the verbatim `Raw` bytes of a `RawExtension` received from a client introduces additional
considerations on top of the transient (transmit-only) case. The encoding of the `Raw` bytes is
determined by the encoding of the request that wrote the value of the `RawExtension`, which may or
may not be the same as the object's storage encoding.

##### Types as Canonical Definition of Custom Resources

Throughout the ecosystem, it is common practice to maintain Go structs as the canonical definition
for API extensions. In many cases, `controller-gen` is used to mechanically translate such types
from Go sources to CustomResourceDefinition manifests. Similarly, `client-gen` can produce typed Go
clients that use the canonical Go types directly. These Go struct types can and sometimes do include
fields of type `RawExtension`
([example](https://github.com/openshift/api/blob/944467d2cc3b03225ccc24c4e88b876396202d5a/operator/v1/types.go#L91)).

#### Scenarios

The following tables enumerate API request and response flows that can involve `RawExtension`.

The *Client* and *Server* columns indicate the types the named component uses to process API
objects. If "dynamic", it uses Unstructured (e.g. a custom resource handler or a dynamic client). If
"typed", it uses API-specific Go types that may include `RawExtension` (e.g. clients generated by
`client-gen`, kube-apiserver built-in types, aggregated apiservers). The tables omit cases where
both the client and the server are dynamic (e.g. a dynamic client and a custom resource handler),
since neither side should be dealing with `RawExtension` values. The edge case where a client
program makes a `RawExtension` a child of an Unstructured value's `map[string]interface{}` can be
considered a typed client case for the purposes of this evaluation.

The *Encoding* column is the client's encoding of the request body (for requests) or the server's
encoding of the response body (for responses).

**Marshalled Unstructured**

| N | Client  | Server  | Direction | Encoding |
|---|---------|---------|-----------|----------|
| 1 | dynamic | typed   | request   | json     |
| 2 | dynamic | typed   | request   | cbor     |
| 3 | typed   | dynamic | response  | json     |
| 4 | typed   | dynamic | response  | cbor     |

In these cases, the marshalling side acts on an Unstructured object and is not aware that the
unmarshalling side may decode some of the payload into a `RawExtension`. The bytes stored in the
`RawExtension` by unmarshalling ultimately *depend on the negotiated content type, which can vary*
with the enablement of the CBOR serializer. Existing programs have so far been able to assume that
unmarshalled `RawExtension` values always have either `nil` or a valid JSON text in their `Raw`
field.

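Case 1 of this table can be approximated with the standard library alone. In the sketch below, `json.RawMessage` stands in for `RawExtension`, a plain `map[string]any` stands in for Unstructured, and `typedObject` and `roundTrip` are hypothetical names; none of this is the apimachinery code path.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// typedObject is a hypothetical typed API object. json.RawMessage stands in
// for a RawExtension field.
type typedObject struct {
	Spec struct {
		Config json.RawMessage `json:"config"`
	} `json:"spec"`
}

// roundTrip marshals an Unstructured-style map the way a dynamic client would,
// decodes the body into the typed object, and returns the opaque field's bytes.
func roundTrip(u map[string]any) (string, error) {
	body, err := json.Marshal(u)
	if err != nil {
		return "", err
	}
	var obj typedObject
	if err := json.Unmarshal(body, &obj); err != nil {
		return "", err
	}
	return string(obj.Spec.Config), nil
}

func main() {
	u := map[string]any{
		"spec": map[string]any{
			"config": map[string]any{"replicas": 3},
		},
	}
	raw, err := roundTrip(u)
	fmt.Println(raw, err)
}
```

The sender never mentions an opaque field; the receiver's type alone decides that `config` is kept as raw bytes, and those bytes are in whatever encoding the body used. With a CBOR body, the same field would hold CBOR.
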
**Marshalled RawExtension**

| N  | Client  | Server  | Direction | Encoding |
|----|---------|---------|-----------|----------|
| 1  | typed   | typed   | request   | json     |
| 2  | typed   | typed   | request   | cbor     |
| 3  | typed   | typed   | response  | json     |
| 4  | typed   | typed   | response  | cbor     |
| 5  | dynamic | typed   | response  | json     |
| 6  | dynamic | typed   | response  | cbor     |
| 7  | typed   | dynamic | request   | json     |
| 8  | typed   | dynamic | request   | cbor     |
| 9  | typed   | typed   | request   | protobuf |
| 10 | typed   | typed   | response  | protobuf |

In these cases, if the marshalling side populates `Raw` with a non-nil slice, it is responsible for
ensuring that the encoding of the slice contents matches the encoding that will be used to
serialize the object containing the `RawExtension`. This is trivially ensured in cases 9 and 10
because Protobuf is capable of representing `RawExtension` values containing arbitrary
bytes. Protobuf is not a supported encoding for Unstructured objects. Existing programs have in
practice stored JSON in the `Raw` field of `RawExtension`.

#### Compatibility

If the `RawExtension` marshalling and unmarshalling behavior for CBOR were to be implemented in
exactly the same way as the existing JSON behaviors, the assumptions in many existing programs that
the `Raw` field can be assigned to a slice of JSON bytes, or that the `Raw` bytes of an unmarshalled
`RawExtension` are valid JSON, would be broken.

The simple approach of automatically transcoding JSON to CBOR during CBOR marshalling, and
transcoding CBOR to JSON during CBOR unmarshalling, would avoid breaking existing programs. However,
the expense of transcoding to or from JSON would negate any performance advantage of a binary
encoding. This expense would not be limited to a few API types: significant examples include the use
of a `RawExtension` field in `metav1.WatchEvent` to represent each watch event's object state, or
the arbitrary objects embedded in `admissionv1.AdmissionRequest`.

A new `ContentType string` field will be added to `RawExtension` to indicate the IANA media type of
the `Raw` bytes. If empty, the assumed content type is "application/json". In existing usage, if a
`RawExtension`'s `Raw` field does not contain valid JSON, the `RawExtension` itself cannot be
marshalled to JSON.

`ContentType` will not be serialized to JSON or CBOR, but it will be serialized to Protobuf. When
unmarshalling either JSON or CBOR into a `RawExtension`, the content type is implicitly the same as
that of the input. This is not true for Protobuf, which is capable of embedding `RawExtension`
values using any encoding, since in all cases both the writer and reader of a Protobuf message are
aware that they are handling an extension.

The proposed behavior for both `MarshalJSON` and `MarshalCBOR` is:

1. If both `Raw` and `Object` are `nil`, `null` is returned.
1. If `Object` is not `nil`, return the result of marshalling `Object` to the target encoding.
1. If the `ContentType` matches the media type of the target encoding (or if `ContentType` is the
   empty string and the target encoding is JSON), return the `Raw` bytes verbatim.
1. Otherwise, return the result of transcoding the `Raw` bytes from the encoding indicated by
   `ContentType` to the target encoding.

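The four steps can be sketched as a single decision function. Everything named below (`rawExtension`, `encoding`, `transcodeFunc`, `marshalTo`, `fakeTranscode`) is hypothetical and exists only for illustration; the transcoder is a placeholder that reports which conversion was requested, and no real CBOR encoder is involved. The only CBOR fact relied on is that `null` encodes as the single byte `0xf6`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

const (
	mediaJSON = "application/json"
	mediaCBOR = "application/cbor"
)

// rawExtension is a local stand-in for RawExtension with the proposed
// ContentType field.
type rawExtension struct {
	Raw         []byte
	Object      any
	ContentType string
}

// encoding describes a target encoding for this sketch only.
type encoding struct {
	mediaType string
	null      []byte
	marshal   func(any) ([]byte, error)
}

// transcodeFunc is a hypothetical hook that converts raw bytes between media types.
type transcodeFunc func(raw []byte, from, to string) ([]byte, error)

// marshalTo applies the four proposed steps in order.
func marshalTo(re rawExtension, enc encoding, transcode transcodeFunc) ([]byte, error) {
	if re.Raw == nil && re.Object == nil {
		return enc.null, nil // step 1
	}
	if re.Object != nil {
		return enc.marshal(re.Object) // step 2
	}
	contentType := re.ContentType
	if contentType == "" {
		contentType = mediaJSON // an empty ContentType is assumed to mean JSON
	}
	if contentType == enc.mediaType {
		return re.Raw, nil // step 3
	}
	return transcode(re.Raw, contentType, enc.mediaType) // step 4
}

var jsonEncoding = encoding{mediaType: mediaJSON, null: []byte("null"), marshal: json.Marshal}

// cborEncoding has no real encoder in this sketch; 0xf6 is CBOR null.
var cborEncoding = encoding{
	mediaType: mediaCBOR,
	null:      []byte{0xf6},
	marshal: func(any) ([]byte, error) {
		return nil, errors.New("no CBOR encoder in this sketch")
	},
}

// fakeTranscode only reports which conversion was requested.
func fakeTranscode(raw []byte, from, to string) ([]byte, error) {
	return []byte(fmt.Sprintf("<%s -> %s>", from, to)), nil
}

func main() {
	legacy := rawExtension{Raw: []byte(`{"a":1}`)} // written by a program unaware of ContentType
	out, _ := marshalTo(legacy, jsonEncoding, fakeTranscode)
	fmt.Printf("%s\n", out)
	out, _ = marshalTo(legacy, cborEncoding, fakeTranscode)
	fmt.Printf("%s\n", out)
}
```

Treating the empty string as JSON is what lets existing programs that assign JSON bytes to `Raw` keep working: the value passes through untouched for a JSON target and is transcoded only when the target is CBOR.
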
Unmarshalling will behave the same for CBOR as it currently does for JSON and the input bytes will
be copied verbatim to the `Raw` field. The `ContentType` will be set to "application/json" by a
successful call to `UnmarshalJSON` and to "application/cbor" by a successful call to
`UnmarshalCBOR`.

Additionally, by default, the `Raw` bytes of a decoded `RawExtension` will be automatically
transcoded to JSON to preserve compatibility with programs that assume an unmarshalled
`RawExtension` contains valid JSON. The CBOR serializer available through `serializer.CodecFactory`
will be wired to use this behavior, allowing existing programs to continue to assume that
unmarshalled `Raw` bytes contain JSON. The stream serializer will not. In practice, the watch
decoder assumes that the non-stream serializer can directly decode the `Raw` bytes of a
`metav1.WatchEvent` decoded by the stream serializer.

There will be a migration period during which it will remain possible to disable automatic
transcoding of `RawExtension` via a feature gate.

##### Migration

**GA**

*Naive Clients*

1. Client assumes received `RawExtension` is JSON.
1. Client receives CBOR response body. The response bytes that represent the `RawExtension` are CBOR.
1. During decoding, the `RawExtension`'s `Raw` field is transcoded from CBOR to JSON.
1. Client continues processing `RawExtension` bytes as JSON.

*Advanced Clients*

1. Client tolerates `RawExtension` values containing either JSON or CBOR.
1. Client receives CBOR response body. The response bytes that represent the `RawExtension` are CBOR.
1. No transcoding is performed during decoding.
1. Client detects the format of the `RawExtension` bytes and processes them accordingly.
   `RawExtension` will implement `UnstructuredConverter`, providing a one-liner to get an
   Unstructured from a `RawExtension`.

**Post-GA, CBOR as Default Preferred Request/Response Encoding for One Year**

The automatic transcoding client feature gate becomes disabled by default. The feature gate is
unlocked and transcoding can be re-enabled without code changes using the existing client feature
gate environment variable mechanism.

**Post-GA, CBOR as Default Preferred Request/Response Encoding for Two Years**

The automatic transcoding client feature gate is removed; enabling transcoding now requires code
changes.

All existing clusters will support CBOR. Existing programs continue to work unmodified. Updating
client libraries in existing programs may cause them to break if they have not changed how they are
handling `RawExtension` values.

### Test Plan
