|
| 1 | +--- |
| 2 | +id: skyhash |
| 3 | +title: Skyhash Protocol 1.0 |
| 4 | +--- |
| 5 | +:::note About this document |
| 6 | +Copyright (c) 2021 Sayan Nandan <[email protected]> |
| 7 | +**In effect since:** v0.6.0 |
| 8 | +**Date:** 11<sup>th</sup> May, 2021 |
| 9 | +::: |
| 10 | + |
| 11 | +## Introduction |
| 12 | + |
| 13 | +Skyhash, or the Skytable Serialization Protocol (SSP), is a serialization protocol built on top of TCP that is |
| 14 | +used by Skytable for client/server communication. Any client that communicates with Skytable needs to implement this protocol. |
| 15 | + |
| 16 | +## Concepts |
| 17 | + |
| 18 | +Skyhash uses a query/response action just like HTTP's request/response action: |
| 19 | +clients send queries while the server sends responses. All the bytes sent by a client to a server are called a _Query Packet_, while all the bytes sent by the server in response are called the _Response Packet_. |
| 20 | + |
| 21 | +Irrespective of the action type, all these packets are made of a metaframe and a dataframe. |
| 22 | + |
| 23 | +### The Metaframe |
| 24 | +The metaframe is the first part of the packet, separated from the rest of the packet by a line feed (`\n`) character. It looks like |
| 25 | +this: |
| 26 | +``` |
| 27 | +*<c>\n |
| 28 | +``` |
| 29 | +where `<c>` tells us the number of actions this packet corresponds to. For simple queries, which run one action, this will be `1`, while for batch queries it can be any integer greater than 1. |
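
As a rough illustration of the metaframe layout described above, the following Python sketch builds and splits the `*<c>\n` header. The function names `build_metaframe` and `parse_metaframe` are hypothetical and not part of any Skytable client; the sketch also assumes that `<c>` is written as ASCII decimal digits, which this document implies but does not state outright.

```python
from typing import Tuple


def build_metaframe(action_count: int) -> bytes:
    # The metaframe is `*`, the action count, and a line feed.
    if action_count < 1:
        raise ValueError("a packet must carry at least one action")
    return b"*" + str(action_count).encode("ascii") + b"\n"


def parse_metaframe(packet: bytes) -> Tuple[int, bytes]:
    # Everything up to the first line feed is the metaframe;
    # the remaining bytes are the dataframe.
    header, sep, dataframe = packet.partition(b"\n")
    if not sep or not header.startswith(b"*"):
        raise ValueError("malformed metaframe")
    # Assumption: the count is ASCII decimal.
    return int(header[1:].decode("ascii")), dataframe
```

A simple query would start with `build_metaframe(1)`, while a batch query of three actions would start with `build_metaframe(3)`.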
| 30 | + |
| 31 | +### The Dataframe |
| 32 | +The dataframe is made up of elements. Each element corresponds to |
| 33 | +a single action and hence corresponds to a single query. Simple queries will run one action and hence will have one element while batch queries will run a number of actions and hence will have a number of elements. |
| 34 | + |
| 35 | +Every element is of a certain [data type](#common-data-types) and this type determines how the element is serialized with Skyhash. Responses receive some extra data types which are |
| 36 | +highlighted in [response specific data types](#response-specific-data-types). |
| 37 | + |
| 38 | +## Common Data Types |
| 39 | + |
| 40 | +Usually serialized data types look like: |
| 41 | +``` |
| 42 | +<tsymbol><len>\n |
| 43 | +-----DATA------- |
| 44 | +``` |
| 45 | +where the `<tsymbol>` corresponds to the Type Symbol and the `<len>` corresponds to the length of |
| 46 | +this element. Below is a list of data types and their `<tsymbol>`s. |
| 47 | + |
| 48 | +### Strings (+) |
| 49 | +String elements are serialized like: |
| 50 | +``` |
| 51 | ++<c>\n |
| 52 | +<mystring>\n |
| 53 | +``` |
| 54 | +Where `<c>` is the number of bytes in the string '`<mystring>`'. |
| 55 | +So a string 'Sayan' will be serialized into: |
| 56 | +``` |
| 57 | ++5\n |
| 58 | +Sayan\n |
| 59 | +``` |
| 60 | + |
| 61 | +Strings are binary safe because their length is prefixed: a reader consumes exactly `<c>` bytes instead of scanning for a delimiter. |
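
A minimal sketch of string serialization, assuming the payload is already available as raw bytes. The helper name `serialize_string` is hypothetical. Because `<c>` counts bytes rather than characters, a payload that itself contains a line feed or multi-byte UTF-8 text still round-trips.

```python
def serialize_string(value: bytes) -> bytes:
    # Layout: `+<c>\n<mystring>\n`, where <c> is the byte length of the payload.
    return b"+" + str(len(value)).encode("ascii") + b"\n" + value + b"\n"


# The example from the text above.
sayan = serialize_string(b"Sayan")

# A payload containing a line feed is still unambiguous,
# because the reader trusts the length prefix, not the delimiter.
with_newline = serialize_string(b"a\nb")
```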
| 62 | + |
| 63 | +### Unsigned integers (:) |
| 64 | + |
| 65 | +64-bit unsigned integers are serialized into: |
| 66 | +``` |
| 67 | +:<c>\n |
| 68 | +<myint>\n |
| 69 | +``` |
| 70 | +Where `<c>` is the number of digits in the integer and `<myint>` is the integer itself. |
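
The integer layout can be sketched in the same way. The name `serialize_u64` is hypothetical, and the sketch assumes the digits are ASCII decimal, which matches the `:1\n0\n` form used in the array examples in this document.

```python
U64_MAX = 2**64 - 1


def serialize_u64(value: int) -> bytes:
    # Reject anything that does not fit in an unsigned 64-bit integer.
    if not 0 <= value <= U64_MAX:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    digits = str(value).encode("ascii")
    # Layout: `:<c>\n<myint>\n`, where <c> is the number of digits.
    return b":" + str(len(digits)).encode("ascii") + b"\n" + digits + b"\n"
```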
| 71 | + |
| 72 | +### Arrays (&) |
| 73 | + |
| 74 | +Arrays are recursive data types; that is, an array can contain another array, which in turn can contain another array, and so on. An array is essentially a collection of elements of any data type, including other arrays. Arrays can also be multi-type. |
| 75 | + |
| 76 | +Skyhash serializes arrays into: |
| 77 | +``` |
| 78 | +&<c>\n |
| 79 | +<elements> |
| 80 | +``` |
| 81 | +Where `<c>` is the number of elements in this array and `<elements>` are the elements present in the array. Take a look at the following examples: |
| 82 | + |
| 83 | +1. An array containing two strings: |
| 84 | +``` |
| 85 | +&2\n |
| 86 | ++5\n |
| 87 | +Hello\n |
| 88 | ++5\n |
| 89 | +World\n |
| 90 | +``` |
| 91 | +This can be represented as: |
| 92 | +```js |
| 93 | +Array([String("Hello"), String("World")]) |
| 94 | +``` |
| 95 | +2. An array containing a string and two integers: |
| 96 | +``` |
| 97 | +&3\n |
| 98 | ++5\n |
| 99 | +Hello\n |
| 100 | +:1\n |
| 101 | +0\n |
| 102 | +:1\n |
| 103 | +1\n |
| 104 | +``` |
| 105 | +Which can be represented as: |
| 106 | +```js |
| 107 | +Array([String("Hello"), UnsignedInt64(0), UnsignedInt64(1)]) |
| 108 | +``` |
| 109 | +3. An array containing two arrays: |
| 110 | +Pipe symbols (|) and underscores (_) have been added to mark the logical parts of the array: |
| 111 | + |
| 112 | +``` |
| 113 | + ___________________________ |
| 114 | +&2\n |_____________| | |
| 115 | +&2\n | | | |
| 116 | ++5\n | | | |
| 117 | +Hello\n | Array 1 | | |
| 118 | ++5\n | | | |
| 119 | +World\n |_____________| | |
| 120 | +&3\n | | Nested | |
| 121 | ++5\n | | Array | |
| 122 | +Hello\n | | | |
| 123 | ++5\n | Array 2 | | |
| 124 | +World\n | | | |
| 125 | ++5\n | | | |
| 126 | +Again\n |_____________|_____________| |
| 127 | +``` |
| 128 | + |
| 129 | +This can be represented as: |
| 130 | +```js |
| 131 | +Array([ |
| 132 | + Array([String("Hello"), String("World")]), |
| 133 | + Array([String("Hello"), String("World"), String("Again")]) |
| 134 | +]) |
| 135 | +``` |
| 136 | +This can be nested even more! |
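
Because arrays are recursive, a serializer for them falls out naturally as a recursive function. The sketch below is one possible way to do it, not the reference implementation: the name `serialize` is hypothetical, and mapping Python `bytes`, `int`, and `list` to strings, unsigned integers, and arrays is an assumption made for illustration.

```python
from typing import Any


def serialize(element: Any) -> bytes:
    if isinstance(element, bytes):
        # String: `+<byte length>\n<payload>\n`
        return b"+%d\n%s\n" % (len(element), element)
    if isinstance(element, int) and not isinstance(element, bool):
        if not 0 <= element < 2**64:
            raise ValueError("value does not fit in an unsigned 64-bit integer")
        digits = b"%d" % element
        # Unsigned integer: `:<digit count>\n<digits>\n`
        return b":%d\n%s\n" % (len(digits), digits)
    if isinstance(element, list):
        # Array: `&<element count>\n` followed by each serialized element.
        return b"&%d\n" % len(element) + b"".join(serialize(e) for e in element)
    raise TypeError(f"cannot serialize {type(element).__name__}")


# The nested array from example 3.
nested = serialize([
    [b"Hello", b"World"],
    [b"Hello", b"World", b"Again"],
])
```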
| 137 | + |
| 138 | + |
| 139 | +### Important notes |
| 140 | + |
| 141 | +These data types and `<tsymbol>`s are non-exhaustive. Whenever you deserialize a packet and encounter a `<tsymbol>` that your client does not recognize, always throw some kind of `UnimplementedError` to indicate that your client cannot yet deserialize this specific type. See all current data types and their tsymbols [in this table](data-types). |
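
The advice above can be sketched as a small recursive parser that handles the three common types and raises on anything else. This is only an illustration under stated assumptions: `parse_element` and the `UnimplementedError` class are hypothetical names, lengths are assumed to be ASCII decimal, and a real client would read from a socket rather than from one complete buffer.

```python
from typing import Any, Tuple


class UnimplementedError(Exception):
    """Raised for a type symbol this client cannot deserialize yet."""


def parse_element(buf: bytes, pos: int = 0) -> Tuple[Any, int]:
    # Header line: one tsymbol byte, a decimal length, then a line feed.
    end = buf.index(b"\n", pos)
    tsymbol = buf[pos:pos + 1]
    length = int(buf[pos + 1:end].decode("ascii"))
    pos = end + 1

    if tsymbol == b"&":
        # For arrays, the length counts elements; parse each one recursively.
        items = []
        for _ in range(length):
            item, pos = parse_element(buf, pos)
            items.append(item)
        return items, pos

    if tsymbol in (b"+", b":"):
        # For strings and integers, the length counts payload bytes.
        payload = buf[pos:pos + length]
        if len(payload) != length or buf[pos + length:pos + length + 1] != b"\n":
            raise ValueError("truncated or malformed element")
        value = payload if tsymbol == b"+" else int(payload.decode("ascii"))
        return value, pos + length + 1

    # Unknown type symbol: fail loudly instead of guessing at the layout.
    raise UnimplementedError(f"cannot deserialize type symbol {tsymbol!r}")
```

The function returns the parsed value together with the position of the next unread byte, so a caller can parse consecutive elements of a batch response by feeding the returned position back in.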
| 142 | + |
| 143 | +## Response Specific Data Types |
| 144 | + |
| 145 | +Responses will return some additional data types. This is a _non-exhaustive_ list of such types. |
| 146 | + |
| 147 | +### Response Codes (!) |
| 148 | + |
| 149 | +Response codes are often returned by the server when no |
| 150 | +'producible' data can be returned, i.e., an action like FLUSHDB can only return 'Okay' or an error. This distinction |
| 151 | +is made to reduce errors while matching responses. Skyhash will serialize a response code like: |
| 152 | +``` |
| 153 | +!<c>\n |
| 154 | +<code>\n |
| 155 | +``` |
| 156 | +Where `<c>` is the number of characters in the code and `<code>` is the code itself. So code `0`, which corresponds to `OKAY`, will be serialized into: |
| 157 | +``` |
| 158 | +!1\n |
| 159 | +0\n |
| 160 | +``` |
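
A client might read a response code element roughly as follows. This is a sketch, not the official client logic: `parse_response_code` and `KNOWN_CODES` are hypothetical names, the length is assumed to be ASCII decimal and to count bytes, and the lookup table contains only the single code that this document confirms.

```python
# Only code `0` (OKAY) is confirmed by this document; the full list
# lives in the response codes table.
KNOWN_CODES = {b"0": "OKAY"}


def parse_response_code(element: bytes) -> bytes:
    # Layout: `!<c>\n<code>\n`
    header, sep, rest = element.partition(b"\n")
    if not sep or not header.startswith(b"!"):
        raise ValueError("not a response code element")
    length = int(header[1:].decode("ascii"))
    code = rest[:length]
    if len(code) != length or rest[length:length + 1] != b"\n":
        raise ValueError("truncated or malformed response code")
    return code


code = parse_response_code(b"!1\n0\n")
meaning = KNOWN_CODES.get(code, "unknown code")
```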
| 161 | + |
| 162 | +You can find a full list of response codes [in this table](response-codes). |