Commit 668627c

Add design for codecs and client protocols
1 parent 94ed292 commit 668627c

1 file changed (+166, -0)

designs/serialization.md

Lines changed: 166 additions & 0 deletions
@@ -523,3 +523,169 @@ client call to a service. A service could, for example, model its event
structures and include them in their client. A customer could then use the
generated `DeserializeableShape`s to deserialize those events into Python types
when they're received without having to do so manually.

## Codecs

Serializers and deserializers are never truly disconnected - where there's one,
there's always the other. They need to be tied together in a way that makes
sense, is portable, and provides extra utility for common use cases.

One such use case is serialization and deserialization to and from discrete
bytes of a common format represented by a media type such as `application/json`.
This will be represented by the `Codec` interface:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Codec(Protocol):

    @property
    def media_type(self) -> str:
        ...

    def create_serializer(self, sink: BytesWriter) -> ShapeSerializer:
        ...

    def create_deserializer(self, source: bytes | BytesReader) -> ShapeDeserializer:
        ...

    def serialize(self, shape: SerializeableShape) -> bytes:
        ...  # A default implementation will be provided

    def deserialize[S: DeserializeableShape](
        self,
        source: bytes | BytesReader,
        shape: type[S],
    ) -> S:
        ...  # A default implementation will be provided
```

This interface provides a layer on top of serializers and deserializers that lets
callers work with them in a bytes-in, bytes-out way. This allows them to be used
generically in places like HTTP message bodies. The following shows how one could
use a JSON codec:

```python
>>> codec = JSONCodec()
>>> deserialized = codec.deserialize(b'{"member":9}', ExampleStructure)
>>> print(deserialized)
ExampleStructure(member=9)
>>> print(codec.serialize(deserialized))
b'{"member":9}'
```
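The usage above relies on the default `serialize` and `deserialize` implementations that the interface promises. As a rough sketch of how such defaults could be layered on top of `create_serializer` and `create_deserializer`, the toy codec below buffers its output in a `BytesIO`. Everything in it is a stand-in invented for illustration: `ToyJSONCodec`, `ToySerializer`, `ToyDeserializer`, their `write_struct` and `read_struct` methods, and the assumption that shapes expose `serialize(serializer)` and `deserialize(deserializer)` hooks. It is not the real `JSONCodec`.

```python
import json
from dataclasses import dataclass
from io import BytesIO


class ToySerializer:
    """Hypothetical stand-in for a ShapeSerializer that writes JSON to a sink."""

    def __init__(self, sink):
        self._sink = sink

    def write_struct(self, members: dict) -> None:
        # Compact separators so the bytes match the output shown above.
        self._sink.write(json.dumps(members, separators=(",", ":")).encode("utf-8"))


class ToyDeserializer:
    """Hypothetical stand-in for a ShapeDeserializer that reads JSON bytes."""

    def __init__(self, source: bytes):
        self._source = source

    def read_struct(self) -> dict:
        return json.loads(self._source)


@dataclass
class ExampleStructure:
    member: int

    # Assumed hooks: the shape drives whichever (de)serializer it is handed.
    def serialize(self, serializer: ToySerializer) -> None:
        serializer.write_struct({"member": self.member})

    @classmethod
    def deserialize(cls, deserializer: ToyDeserializer) -> "ExampleStructure":
        return cls(**deserializer.read_struct())


class ToyJSONCodec:
    media_type = "application/json"

    def create_serializer(self, sink) -> ToySerializer:
        return ToySerializer(sink)

    def create_deserializer(self, source) -> ToyDeserializer:
        # Accept either raw bytes or a file-like reader.
        data = source if isinstance(source, bytes) else source.read()
        return ToyDeserializer(data)

    def serialize(self, shape) -> bytes:
        # Possible default: serialize into an in-memory buffer, return its bytes.
        sink = BytesIO()
        shape.serialize(self.create_serializer(sink))
        return sink.getvalue()

    def deserialize(self, source, shape):
        # Possible default: hand a fresh deserializer to the shape class.
        return shape.deserialize(self.create_deserializer(source))


codec = ToyJSONCodec()
round_tripped = codec.serialize(codec.deserialize(b'{"member":9}', ExampleStructure))
```

The point of the sketch is that both defaults need nothing beyond the two factory methods, which is why a codec implementer would only have to supply those and a media type.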

Combining them this way also allows for sharing configuration. In JSON, for
example, there could be a configuration option to represent numbers that can't
fit in an IEEE 754 double as strings, since many JSON implementations
(including JavaScript's) parse every number as a double.
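To make the motivation concrete, here is a small sketch of what such an option might do for integers. The option name `big_numbers_as_strings` and the helper `encode_integer` are hypothetical, and the sketch ignores large decimals, which a real codec would also have to consider.

```python
import json

# Integers with magnitude up to 2**53 - 1 survive a round trip through a double
# (JavaScript exposes this value as Number.MAX_SAFE_INTEGER).
MAX_SAFE_INTEGER = 2**53 - 1


def encode_integer(value: int, *, big_numbers_as_strings: bool):
    """Return the JSON-ready form of an integer under the hypothetical option."""
    if big_numbers_as_strings and abs(value) > MAX_SAFE_INTEGER:
        return str(value)
    return value


big = 2**53 + 1

# A double cannot hold this value: the conversion rounds to a neighbouring integer.
lossy = int(float(big))

as_string = json.dumps({"member": encode_integer(big, big_numbers_as_strings=True)})
as_number = json.dumps({"member": encode_integer(big, big_numbers_as_strings=False)})
small = json.dumps({"member": encode_integer(9, big_numbers_as_strings=True)})
```

Because the serializer and deserializer come from the same codec, one setting like this could govern both directions, so a value written as a string would also be read back as a number.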

`Codec`s also provide opportunities for minor optimizations, such as caching
serializers and deserializers where possible.

## Client Protocols

`Codec`s aren't sufficient to fully represent a protocol, however, as there is
also a transport layer that must be created and that must support data binding.
In HTTP, for example, operation members can be bound to headers, the query
string, the response code, etc. Such transports generally operate by interacting
with `Request` and `Response` objects rather than raw bytes, so the bytes-based
interfaces of `Codec` aren't sufficient by themselves.

```python
from typing import Any, Protocol


class ClientProtocol[Request, Response](Protocol):

    @property
    def id(self) -> ShapeID:
        ...

    def serialize_request[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        input: I,
        endpoint: URI,
        context: dict[str, Any],
    ) -> Request:
        ...

    def set_service_endpoint(
        self,
        request: Request,
        endpoint: Endpoint,
    ) -> Request:
        ...

    async def deserialize_response[I: SerializeableShape, O: DeserializeableShape](
        self,
        operation: ApiOperation[I, O],
        error_registry: TypeRegistry,
        request: Request,
        response: Response,
        context: dict[str, Any],
    ) -> O:
        ...
```

The `ClientProtocol` incorporates much more context than a `Codec` does.
Serialization takes the operation's schema via `ApiOperation`, the endpoint to
send the request to, and a general context bag that is passed through the
request pipeline. Deserialization takes much of the same context, as well as a
`TypeRegistry` that allows it to map errors it encounters to the generated
exception classes.

In most cases these `ClientProtocol`s will be constructed with a `Codec` used to
(de)serialize part of the request, such as the HTTP message body. Since that
aspect is separate, it allows for flexibility through composition. Two Smithy
protocols that support HTTP bindings but use a different body media type could
share most of a `ClientProtocol` implementation with the `Codec` being swapped
out to support the appropriate media type.
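The composition described here might look roughly like the following. This is a deliberately reduced sketch: `ToyRequest`, `ToyHTTPBindingProtocol`, `ToyJSONCodec`, and `ToyFormCodec` are all hypothetical, shapes are plain dicts, the "members starting with `x-` bind to headers" rule is invented for the example, and `serialize_request` drops the `operation` and `context` parameters of the real interface.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass
class ToyRequest:
    """Hypothetical transport request; a real one carries much more."""

    url: str
    headers: dict
    body: bytes


class ToyJSONCodec:
    media_type = "application/json"

    def serialize(self, shape: dict) -> bytes:
        return json.dumps(shape, separators=(",", ":")).encode("utf-8")


class ToyFormCodec:
    media_type = "application/x-www-form-urlencoded"

    def serialize(self, shape: dict) -> bytes:
        return urlencode(shape).encode("utf-8")


class ToyHTTPBindingProtocol:
    """Shared binding logic; only the body encoding is delegated to the codec."""

    def __init__(self, codec):
        self._codec = codec

    def serialize_request(self, input: dict, endpoint: str) -> ToyRequest:
        # Toy binding rule: members prefixed with "x-" become headers,
        # everything else goes into the codec-encoded body.
        headers = {k: str(v) for k, v in input.items() if k.startswith("x-")}
        body = {k: v for k, v in input.items() if not k.startswith("x-")}
        headers["content-type"] = self._codec.media_type
        return ToyRequest(url=endpoint, headers=headers, body=self._codec.serialize(body))


payload = {"x-trace": "abc", "member": 9}
json_request = ToyHTTPBindingProtocol(ToyJSONCodec()).serialize_request(
    payload, "https://example.com"
)
form_request = ToyHTTPBindingProtocol(ToyFormCodec()).serialize_request(
    payload, "https://example.com"
)
```

Both requests carry the same header bindings and differ only in `content-type` and body bytes, which is the part the codec owns.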

A `ClientProtocol` will need to be used alongside a `ClientTransport` that takes
the same request and response types to handle sending the request.

```python
class ClientTransport[Request, Response](Protocol):
    async def send(self, request: Request) -> Response:
        ...
```

Below is an example of what a very simplistic use of a `ClientProtocol` could
look like. (The actual request pipeline in generated clients will be more
robust, including things like automated retries, endpoint resolution, and so
on.)

```python
class ExampleClient:
    def __init__(
        self,
        protocol: ClientProtocol,
        transport: ClientTransport,
    ):
        self.protocol = protocol
        self.transport = transport

    async def example_operation(
        self, input: ExampleOperationInput
    ) -> ExampleOperationOutput:
        context: dict[str, Any] = {}
        transport_request = self.protocol.serialize_request(
            operation=EXAMPLE_OPERATION_SCHEMA,
            input=input,
            endpoint=BASE_ENDPOINT,
            context=context,
        )
        transport_response = await self.transport.send(transport_request)
        # deserialize_response is a coroutine function, so it must be awaited.
        return await self.protocol.deserialize_response(
            operation=EXAMPLE_OPERATION_SCHEMA,
            error_registry=EXAMPLE_OPERATION_REGISTRY,
            request=transport_request,
            response=transport_response,
            context=context,
        )
```

As you can see, this makes the protocol and transport configurable at runtime.
This will make it significantly easier for services to support multiple
protocols and for customers to use whichever they please. It isn't even
necessary to update the client version to make use of a new protocol - a
customer could simply take a dependency on the implementation and use it.

Similarly, since the protocol is decoupled from the transport, customers can
freely switch between transport implementations without also having to switch
protocols.
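A minimal runnable sketch of that decoupling follows. All names are hypothetical: `ToyProtocol` uses plain `bytes` as both its request and response type and omits the `operation`, `endpoint`, `error_registry`, and `context` parameters, `EchoTransport` stands in for a real network call, and `RecordingTransport` is an invented wrapper that shows a transport being decorated without the protocol knowing.

```python
import asyncio


class ToyProtocol:
    """Hypothetical protocol whose request and response types are plain bytes."""

    def serialize_request(self, input: str) -> bytes:
        return input.encode("utf-8")

    async def deserialize_response(self, response: bytes) -> str:
        return response.decode("utf-8")


class EchoTransport:
    """Returns the request unchanged, standing in for a real network call."""

    async def send(self, request: bytes) -> bytes:
        return request


class RecordingTransport:
    """Wraps another transport and records every request it forwards."""

    def __init__(self, inner):
        self._inner = inner
        self.sent = []

    async def send(self, request: bytes) -> bytes:
        self.sent.append(request)
        return await self._inner.send(request)


class ToyClient:
    def __init__(self, protocol, transport):
        self.protocol = protocol
        self.transport = transport

    async def call(self, input: str) -> str:
        request = self.protocol.serialize_request(input)
        response = await self.transport.send(request)
        return await self.protocol.deserialize_response(response)


protocol = ToyProtocol()
recorder = RecordingTransport(EchoTransport())

# Same protocol object, two different transports.
plain_result = asyncio.run(ToyClient(protocol, EchoTransport()).call("ping"))
recorded_result = asyncio.run(ToyClient(protocol, recorder).call("ping"))
```

The only contract between the two halves is the shared request and response type, so any transport that accepts and returns that type can be dropped in.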
