Commit 5bdcfc5

Add design for codecs and client protocols
1 parent 94ed292 commit 5bdcfc5

1 file changed

+162
-0
lines changed

designs/serialization.md

Lines changed: 162 additions & 0 deletions
@@ -523,3 +523,165 @@ client call to a service. A service could, for example, model its event
structures and include them in their client. A customer could then use the
generated `DeserializeableShape`s to deserialize those events into Python types
when they're received without having to do so manually.

## Codecs

Serializers and deserializers are never truly disconnected - where there's one,
there's always the other. They need to be tied together in a way that makes
sense, is portable, and which provides extra utility for common use cases.

One such use case is the serialization and deserialization to and from discrete
bytes of a common format represented by a media type such as `application/json`.
These will be represented by the `Codec` interface:

```python
@runtime_checkable
class Codec(Protocol):

    def create_serializer(self, sink: BytesWriter) -> ShapeSerializer:
        ...

    def create_deserializer(self, source: bytes | BytesReader) -> ShapeDeserializer:
        ...

    def serialize(self, shape: SerializeableShape) -> bytes:
        ...  # A default implementation will be provided

    def deserialize[S: DeserializeableShape](
        self, source: bytes | BytesReader,
        shape: type[S],
    ) -> S:
        ...  # A default implementation will be provided
```

This interface provides a layer on top of serializers and deserializers that lets
them be interacted with in a bytes-in, bytes-out way. This allows them to be used
generically in places like HTTP message bodies. The following shows how one could
use a JSON codec:

```python
>>> codec = JSONCodec()
>>> deserialized = codec.deserialize(b'{"member":9}', ExampleStructure)
>>> print(deserialized)
ExampleStructure(member=9)
>>> print(codec.serialize(deserialized))
b'{"member":9}'
```

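One way the default `serialize` and `deserialize` helpers might be layered on
the two factory methods is sketched below. This is only an illustration: the
`ToySerializer`, `ToyDeserializer`, and `ToyJSONCodec` classes, along with the
`write_document` and `read_document` methods, are hypothetical stand-ins, and
the real serializer interfaces are considerably richer.

```python
import json
from dataclasses import dataclass
from io import BytesIO
from typing import Any


class ToySerializer:
    """Hypothetical stand-in for a ShapeSerializer that writes JSON to a sink."""

    def __init__(self, sink: BytesIO) -> None:
        self._sink = sink

    def write_document(self, value: dict[str, Any]) -> None:
        self._sink.write(json.dumps(value, separators=(",", ":")).encode("utf-8"))


class ToyDeserializer:
    """Hypothetical stand-in for a ShapeDeserializer that reads JSON bytes."""

    def __init__(self, source: bytes) -> None:
        self._source = source

    def read_document(self) -> dict[str, Any]:
        return json.loads(self._source)


@dataclass
class ExampleStructure:
    member: int

    def serialize(self, serializer: ToySerializer) -> None:
        serializer.write_document({"member": self.member})

    @classmethod
    def deserialize(cls, deserializer: ToyDeserializer) -> "ExampleStructure":
        return cls(**deserializer.read_document())


class ToyJSONCodec:
    def create_serializer(self, sink: BytesIO) -> ToySerializer:
        return ToySerializer(sink)

    def create_deserializer(self, source: bytes) -> ToyDeserializer:
        return ToyDeserializer(source)

    # The two helpers below only use the factory methods above, which is
    # what would let them be shared as default implementations.
    def serialize(self, shape: ExampleStructure) -> bytes:
        sink = BytesIO()
        shape.serialize(self.create_serializer(sink))
        return sink.getvalue()

    def deserialize(
        self, source: bytes, shape: type[ExampleStructure]
    ) -> ExampleStructure:
        return shape.deserialize(self.create_deserializer(source))
```

Because the helpers never touch JSON directly, a codec for another media type
would presumably only need to supply its own pair of factory methods.
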
Combining them this way also allows for sharing configuration. In JSON, for
example, there could be a configuration option to represent numbers that can't
fit in an IEEE 754 double as strings, since many JSON implementations
(including JavaScript's) treat every number as a double.

`Codec`s also provide opportunities for minor optimizations, such as caching
serializers and deserializers where possible.

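As a rough illustration of such a shared option, the sketch below writes
integers that a double cannot hold exactly as JSON strings. The `JSONSettings`
class, its `large_integers_as_strings` flag, and both helper functions are
hypothetical names, and only integers are handled here, not every number type.

```python
import json
from dataclasses import dataclass
from typing import Any

# Same value as JavaScript's Number.MAX_SAFE_INTEGER. Integers beyond this
# magnitude may lose precision when parsed as IEEE 754 doubles.
MAX_SAFE_INTEGER = 2**53 - 1


@dataclass(frozen=True)
class JSONSettings:
    """Hypothetical settings shared by a codec's serializers and deserializers."""

    large_integers_as_strings: bool = False


def encode_integer(value: int, settings: JSONSettings) -> Any:
    # Only rewrite the value when the option is on and precision is at risk.
    if settings.large_integers_as_strings and abs(value) > MAX_SAFE_INTEGER:
        return str(value)
    return value


def dump_member(name: str, value: int, settings: JSONSettings) -> bytes:
    payload = {name: encode_integer(value, settings)}
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")
```

A deserializer created by the same codec could consult the same settings
object to turn such strings back into integers.
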
## Client Protocols

`Codec`s aren't sufficient to fully represent a protocol, however, as there is
also a transport layer that must be created and support data binding. An HTTP
request, for example, can have operation members bound to headers, the query
string, the response code, etc. Such transports generally operate by interacting
with `Request` and `Response` objects rather than raw bytes, so the bytes-based
interfaces of `Codec` aren't sufficient by themselves.

```python
class ClientProtocol[Request, Response](Protocol):

    @property
    def id(self) -> ShapeID:
        ...

    def serialize_request[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        input: I,
        endpoint: URI,
        context: dict[str, Any],
    ) -> Request:
        ...

    def set_service_endpoint(
        self,
        request: Request,
        endpoint: Endpoint,
    ) -> Request:
        ...

    async def deserialize_response[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        error_registry: TypeRegistry,
        request: Request,
        response: Response,
        context: dict[str, Any],
    ) -> O:
        ...
```

The `ClientProtocol` incorporates much more context than a `Codec` does.
Serialization takes the operation's schema via `ApiOperation`, the endpoint to
send the request to, and a general context bag that is passed through the
request pipeline. Deserialization takes much of the same as well as a
`TypeRegistry` that allows it to map errors it encounters to the generated
exception classes.

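To make the role of the registry concrete, here is a minimal sketch of how an
error identifier found in a response might be mapped to an exception class.
The `ToyTypeRegistry` class, its `get` method, the string shape IDs, and the
exception names are all hypothetical; the actual `TypeRegistry` interface is
not defined in this section.

```python
class ServiceError(Exception):
    """Hypothetical base class for generated, modeled exceptions."""


class ThrottlingError(ServiceError):
    """Hypothetical generated exception for one modeled error shape."""


class ToyTypeRegistry:
    """Hypothetical stand-in that maps shape ID strings to exception classes."""

    def __init__(self, types: dict[str, type[ServiceError]]) -> None:
        self._types = dict(types)

    def get(self, shape_id: str) -> type[ServiceError]:
        # Fall back to the generic error when the ID is not registered.
        return self._types.get(shape_id, ServiceError)


def error_from_response(
    registry: ToyTypeRegistry, shape_id: str, message: str
) -> ServiceError:
    # A protocol would extract shape_id and message from the response first.
    return registry.get(shape_id)(message)
```
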
In most cases these `ClientProtocol`s will be constructed with a `Codec` used to
(de)serialize part of the request, such as the HTTP message body. Since that
aspect is separate, it allows for flexibility through composition. Two Smithy
protocols that support HTTP bindings but use a different body media type could
share most of a `ClientProtocol` implementation with the `Codec` being swapped
out to support the appropriate media type.

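The composition described above could look roughly like the following sketch,
in which one request-building class is reused with two different body formats.
All of the names here (`BodyCodec`, `ToyJSONCodec`, `ToyFormCodec`,
`ToyHTTPRequest`, `ToyHTTPBindingProtocol`) are hypothetical, and the codecs
take plain dictionaries rather than shapes to keep the example short.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Protocol
from urllib.parse import urlencode


class BodyCodec(Protocol):
    """Hypothetical, reduced codec interface: just enough to write a body."""

    media_type: str

    def serialize(self, members: dict[str, Any]) -> bytes: ...


class ToyJSONCodec:
    media_type = "application/json"

    def serialize(self, members: dict[str, Any]) -> bytes:
        return json.dumps(members, separators=(",", ":")).encode("utf-8")


class ToyFormCodec:
    media_type = "application/x-www-form-urlencoded"

    def serialize(self, members: dict[str, Any]) -> bytes:
        return urlencode(members).encode("utf-8")


@dataclass
class ToyHTTPRequest:
    url: str
    headers: dict[str, str] = field(default_factory=dict)
    body: bytes = b""


class ToyHTTPBindingProtocol:
    """Shared request-building logic; only the injected codec differs."""

    def __init__(self, codec: BodyCodec) -> None:
        self._codec = codec

    def serialize_request(
        self, members: dict[str, Any], endpoint: str
    ) -> ToyHTTPRequest:
        return ToyHTTPRequest(
            url=endpoint,
            headers={"content-type": self._codec.media_type},
            body=self._codec.serialize(members),
        )
```

Constructing `ToyHTTPBindingProtocol(ToyJSONCodec())` or
`ToyHTTPBindingProtocol(ToyFormCodec())` changes the body and content type
while the URL and header handling stay in one place.
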
A `ClientProtocol` will need to be used alongside a `ClientTransport` that takes
the same request and response types to handle sending the request.

```python
class ClientTransport[Request, Response](Protocol):
    async def send(self, request: Request) -> Response:
        ...
```

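Since the transport interface is a single `send` coroutine, a stand-in that
never touches the network is easy to write, which may be useful in tests. The
sketch below assumes hypothetical `ToyRequest` and `ToyResponse` types and a
hypothetical `EchoTransport`; none of them are part of the design.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToyRequest:
    body: bytes


@dataclass
class ToyResponse:
    status: int
    body: bytes


class EchoTransport:
    """Hypothetical transport that echoes the request body back."""

    def __init__(self) -> None:
        self.sent: list[ToyRequest] = []

    async def send(self, request: ToyRequest) -> ToyResponse:
        self.sent.append(request)  # Record the request so tests can inspect it
        await asyncio.sleep(0)  # Yield to the event loop once, as real I/O would
        return ToyResponse(status=200, body=request.body)


async def main() -> ToyResponse:
    transport = EchoTransport()
    return await transport.send(ToyRequest(body=b'{"member":9}'))


response = asyncio.run(main())
```
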
Below is an example of what a very simplistic use of a `ClientProtocol` could
look like. (The actual request pipeline in generated clients will be more
robust, including things like automated retries, endpoint resolution, and so
on.)

```python
class ExampleClient:
    def __init__(
        self,
        protocol: ClientProtocol,
        transport: ClientTransport,
    ):
        self.protocol = protocol
        self.transport = transport

    async def example_operation(
        self, input: ExampleOperationInput
    ) -> ExampleOperationOutput:
        context = {}
        transport_request = self.protocol.serialize_request(
            operation=EXAMPLE_OPERATION_SCHEMA,
            input=input,
            endpoint=BASE_ENDPOINT,
            context=context,
        )
        transport_response = await self.transport.send(transport_request)
        # deserialize_response is a coroutine, so it must be awaited
        return await self.protocol.deserialize_response(
            operation=EXAMPLE_OPERATION_SCHEMA,
            error_registry=EXAMPLE_OPERATION_REGISTRY,
            request=transport_request,
            response=transport_response,
            context=context,
        )
```

As you can see, this makes the protocol and transport configurable at runtime.
This will make it significantly easier for services to support multiple
protocols and for customers to use whichever they please. It isn't even
necessary to update the client version to make use of a new protocol - a
customer could simply take a dependency on the implementation and use it.

Similarly, since the protocol is decoupled from the transport, customers can
freely switch between implementations without also having to switch protocols.
