Commit 2832ec5

Add design for codecs and client protocols
1 parent 1a83ad2 commit 2832ec5

1 file changed: 162 additions, 0 deletions

designs/serialization.md

@@ -536,3 +536,165 @@ client call to a service. A service could, for example, model its event
structures and include them in their client. A customer could then use the
generated `DeserializeableShape`s to deserialize those events into Python types
when they're received without having to do so manually.

## Codecs

Serializers and deserializers are never truly disconnected: where there's one,
there's always the other. They need to be tied together in a way that makes
sense, is portable, and provides extra utility for common use cases.

One such use case is serializing to and deserializing from the bytes of a
common format identified by a media type, such as `application/json`. This use
case is covered by the `Codec` interface:

```python
@runtime_checkable
class Codec(Protocol):

    def create_serializer(self, sink: BytesWriter) -> ShapeSerializer:
        ...

    def create_deserializer(self, source: bytes | BytesReader) -> ShapeDeserializer:
        ...

    def serialize(self, shape: SerializeableShape) -> bytes:
        ...  # A default implementation will be provided

    def deserialize[S: DeserializeableShape](
        self,
        source: bytes | BytesReader,
        shape: type[S],
    ) -> S:
        ...  # A default implementation will be provided
```

This interface provides a layer on top of serializers and deserializers that lets
them be interacted with in a bytes-in, bytes-out way. This allows them to be used
generically in places like HTTP message bodies. The following shows how one could
use a JSON codec:

```python
>>> codec = JSONCodec()
>>> deserialized = codec.deserialize(b'{"member":9}', ExampleStructure)
>>> print(deserialized)
ExampleStructure(member=9)
>>> print(codec.serialize(deserialized))
b'{"member":9}'
```

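The default `serialize` and `deserialize` implementations could plausibly be
built entirely on the two factory methods. The following is a minimal sketch
of that idea, not the actual implementation: `ToySerializer`,
`ToyDeserializer`, `ToyCodec`, `Point`, and the `write_struct`/`read_struct`
methods are hypothetical stand-ins, and it assumes shapes expose `serialize`
and `deserialize` methods that accept a serializer or deserializer.

```python
import io
import json
from dataclasses import dataclass
from typing import Any


class ToySerializer:
    """Hypothetical stand-in for a ShapeSerializer that writes JSON to a sink."""

    def __init__(self, sink: io.BytesIO) -> None:
        self._sink = sink

    def write_struct(self, members: dict[str, Any]) -> None:
        encoded = json.dumps(members, separators=(",", ":"))
        self._sink.write(encoded.encode("utf-8"))


class ToyDeserializer:
    """Hypothetical stand-in for a ShapeDeserializer that reads JSON bytes."""

    def __init__(self, source: bytes) -> None:
        self._source = source

    def read_struct(self) -> dict[str, Any]:
        return json.loads(self._source)


@dataclass
class Point:
    """Hypothetical shape that knows how to (de)serialize itself."""

    x: int
    y: int

    def serialize(self, serializer: ToySerializer) -> None:
        serializer.write_struct({"x": self.x, "y": self.y})

    @classmethod
    def deserialize(cls, deserializer: ToyDeserializer) -> "Point":
        return cls(**deserializer.read_struct())


class ToyCodec:
    def create_serializer(self, sink: io.BytesIO) -> ToySerializer:
        return ToySerializer(sink)

    def create_deserializer(self, source: bytes) -> ToyDeserializer:
        return ToyDeserializer(source)

    # Possible defaults: both rely only on the two factory methods above.
    def serialize(self, shape: Point) -> bytes:
        sink = io.BytesIO()
        shape.serialize(self.create_serializer(sink))
        return sink.getvalue()

    def deserialize(self, source: bytes, shape: type[Point]) -> Point:
        return shape.deserialize(self.create_deserializer(source))
```

Under these assumptions, a codec author would only need to write the two
factory methods to get the bytes-in, bytes-out behavior.
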
Combining them this way also allows for sharing configuration. In JSON, for
example, there could be a configuration option to represent number types that
can't fit in an IEEE 754 double as a string, since many JSON implementations
(including JavaScript's) treat all numbers as doubles.

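As a rough illustration of shared configuration, the sketch below uses a
single settings object for both directions. `ToyJSONSettings`, `encode_int`,
and `decode_int` are hypothetical names chosen for this example, and the
threshold of `2**53 - 1` is an assumption about where such an option might
draw the line.

```python
import json
from dataclasses import dataclass

# Integers beyond this magnitude are not all exactly representable as
# IEEE 754 doubles.
MAX_SAFE_INTEGER = 2**53 - 1


@dataclass(frozen=True)
class ToyJSONSettings:
    """Hypothetical settings shared by the serializer and deserializer."""

    big_numbers_as_strings: bool = True


def encode_int(value: int, settings: ToyJSONSettings) -> str:
    """Encode an integer, quoting it when it may lose precision as a double."""
    if settings.big_numbers_as_strings and abs(value) > MAX_SAFE_INTEGER:
        return json.dumps(str(value))
    return json.dumps(value)


def decode_int(token: str, settings: ToyJSONSettings) -> int:
    """Decode an integer that may have been written as a JSON string."""
    parsed = json.loads(token)
    if isinstance(parsed, str):
        if not settings.big_numbers_as_strings:
            raise ValueError("expected a JSON number, got a string")
        return int(parsed)
    return parsed
```

Because both functions read the same settings object, the writer and the
reader cannot disagree about how large numbers are represented.
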
`Codec`s also provide opportunities for minor optimizations, such as caching
serializers and deserializers where possible.

## Client Protocols

`Codec`s aren't sufficient to fully represent a protocol, however, as there is
also a transport layer that must be created and must support data binding. An
HTTP request or response, for example, can have operation members bound to
headers, the query string, the response code, etc. Such transports generally
operate by interacting with `Request` and `Response` objects rather than raw
bytes, so the bytes-based interfaces of `Codec` aren't sufficient by themselves.

```python
class ClientProtocol[Request, Response](Protocol):

    @property
    def id(self) -> ShapeID:
        ...

    def serialize_request[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        input: I,
        endpoint: URI,
        context: dict[str, Any],
    ) -> Request:
        ...

    def set_service_endpoint(
        self,
        request: Request,
        endpoint: Endpoint,
    ) -> Request:
        ...

    async def deserialize_response[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        error_registry: TypeRegistry,
        request: Request,
        response: Response,
        context: dict[str, Any],
    ) -> O:
        ...
```

The `ClientProtocol` incorporates much more context than a `Codec` does.
Serialization takes the operation's schema via `ApiOperation`, the endpoint to
send the request to, and a general context bag that is passed through the
request pipeline. Deserialization takes much of the same as well as a
`TypeRegistry` that allows it to map errors it encounters to the generated
exception classes.

In most cases these `ClientProtocol`s will be constructed with a `Codec` used to
(de)serialize part of the request, such as the HTTP message body. Since that
aspect is separate, it allows for flexibility through composition. Two Smithy
protocols that support HTTP bindings but use a different body media type could
share most of a `ClientProtocol` implementation with the `Codec` being swapped
out to support the appropriate media type.

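The composition described above can be sketched with a toy protocol that
handles the HTTP parts itself and delegates only the body to a codec.
Everything here is hypothetical and heavily simplified: `ToyRequest`,
`ToyCodec`, `ToyJSONCodec`, `ToyFormCodec`, and `ToyHTTPBindingProtocol` are
illustrative names, shapes are plain dictionaries, and the `media_type`
attribute is an assumption rather than part of the `Codec` interface.

```python
import json
from dataclasses import dataclass, field
from typing import Protocol
from urllib.parse import urlencode


@dataclass
class ToyRequest:
    """Hypothetical request type; a real one would carry much more."""

    method: str
    url: str
    headers: dict[str, str] = field(default_factory=dict)
    body: bytes = b""


class ToyCodec(Protocol):
    media_type: str

    def serialize(self, shape: dict[str, object]) -> bytes: ...


class ToyJSONCodec:
    media_type = "application/json"

    def serialize(self, shape: dict[str, object]) -> bytes:
        return json.dumps(shape, separators=(",", ":")).encode("utf-8")


class ToyFormCodec:
    media_type = "application/x-www-form-urlencoded"

    def serialize(self, shape: dict[str, object]) -> bytes:
        return urlencode(shape).encode("utf-8")


class ToyHTTPBindingProtocol:
    """Builds the HTTP request itself and delegates the body to a codec."""

    def __init__(self, codec: ToyCodec) -> None:
        self._codec = codec

    def serialize_request(
        self, endpoint: str, path: str, payload: dict[str, object]
    ) -> ToyRequest:
        return ToyRequest(
            method="POST",
            url=endpoint.rstrip("/") + path,
            headers={"content-type": self._codec.media_type},
            body=self._codec.serialize(payload),
        )
```

Constructing `ToyHTTPBindingProtocol(ToyJSONCodec())` or
`ToyHTTPBindingProtocol(ToyFormCodec())` changes only the body encoding and
the `content-type` header; the URL and method handling are shared.
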
A `ClientProtocol` will need to be used alongside a `ClientTransport` that takes
the same request and response types to handle sending the request.

```python
class ClientTransport[Request, Response](Protocol):
    async def send(self, request: Request) -> Response:
        ...
```

Below is an example of what a very simplistic use of a `ClientProtocol` could
look like. (The actual request pipeline in generated clients will be more
robust, including things like automated retries, endpoint resolution, and so
on.)

```python
class ExampleClient:
    def __init__(
        self,
        protocol: ClientProtocol,
        transport: ClientTransport,
    ):
        self.protocol = protocol
        self.transport = transport

    async def example_operation(
        self, input: ExampleOperationInput
    ) -> ExampleOperationOutput:
        context = {}
        transport_request = self.protocol.serialize_request(
            operation=EXAMPLE_OPERATION_SCHEMA,
            input=input,
            endpoint=BASE_ENDPOINT,
            context=context,
        )
        transport_response = await self.transport.send(transport_request)
        # deserialize_response is a coroutine, so it must be awaited.
        return await self.protocol.deserialize_response(
            operation=EXAMPLE_OPERATION_SCHEMA,
            error_registry=EXAMPLE_OPERATION_REGISTRY,
            request=transport_request,
            response=transport_response,
            context=context,
        )
```

As you can see, this makes the protocol and transport configurable at runtime.
This will make it significantly easier for services to support multiple
protocols and for customers to use whichever they please. It isn't even
necessary to update the client version to make use of a new protocol: a
customer could simply take a dependency on the implementation and use it.

Similarly, since the protocol is decoupled from the transport, customers can
freely switch between implementations without also having to switch protocols.
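
To make that decoupling concrete, here is a small sketch in which one
protocol is paired with two interchangeable transports. All names
(`ToyTransport`, `EchoTransport`, `CannedTransport`, `UpperCaseProtocol`,
`call`) are hypothetical, and the request and response types are assumed to
be plain `bytes` to keep the example self-contained.

```python
import asyncio
from typing import Protocol


class ToyTransport(Protocol):
    async def send(self, request: bytes) -> bytes: ...


class EchoTransport:
    """Hypothetical transport that returns the request unchanged."""

    async def send(self, request: bytes) -> bytes:
        return request


class CannedTransport:
    """Hypothetical transport that always returns a fixed response."""

    def __init__(self, response: bytes) -> None:
        self._response = response

    async def send(self, request: bytes) -> bytes:
        return self._response


class UpperCaseProtocol:
    """Hypothetical protocol whose request and response types are bytes."""

    def serialize_request(self, input: str) -> bytes:
        return input.upper().encode("utf-8")

    async def deserialize_response(self, response: bytes) -> str:
        return response.decode("utf-8")


async def call(
    protocol: UpperCaseProtocol, transport: ToyTransport, input: str
) -> str:
    request = protocol.serialize_request(input)
    response = await transport.send(request)
    return await protocol.deserialize_response(response)


# The same protocol instance works with either transport.
protocol = UpperCaseProtocol()
echoed = asyncio.run(call(protocol, EchoTransport(), "ping"))
canned = asyncio.run(call(protocol, CannedTransport(b"pong"), "ping"))
```

A transport like `CannedTransport` is also a convenient way to test a
protocol implementation without any network access.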
