
Commit f4c393c

Fixed #43
1 parent 820e58c commit f4c393c


1 file changed: +179 -1 lines changed

README.md

Lines changed: 179 additions & 1 deletion
@@ -46,7 +46,7 @@ A lot of data, especially data designed to be used in many different languages,
- `BooleanArray` (a variable-length array of `Boolean`s)
- `Char` (a single UTF-8 character)
- `String` (an array of UTF-8 characters that also stores its total byte length)
- - `Octets` (a `Buffer` (raw binary data))
+ - `Octets` (an `ArrayBuffer` (raw binary data))
- Recursive types
- `Tuple<Type>` (a constant-length array of `Type`s)
- `Struct` (a fixed collection of up to 255 fields, each with a name (up to 255 bytes long) and a type)
@@ -265,6 +265,184 @@ Client-side:
</script>
````

## Binary formats
In the following definitions, `uint8_t` means an 8-bit unsigned integer, `uint16_t` means a 16-bit unsigned integer, and `uint32_t` means a 32-bit unsigned integer.
All numbers are stored in big-endian format.
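
As a rough illustration of big-endian storage, the sketch below writes a `uint16_t` followed by a `uint32_t` using the standard `DataView` API, whose setters are big-endian unless told otherwise. The helper name `writeBigEndian` is made up for this example and is not part of the library.

```javascript
// Hypothetical helper (not part of this library): writes a uint16_t
// followed by a uint32_t, both big-endian, and returns the raw bytes.
function writeBigEndian(u16, u32) {
  const buffer = new ArrayBuffer(6)
  const view = new DataView(buffer)
  view.setUint16(0, u16) // DataView setters default to big-endian
  view.setUint32(2, u32)
  return Array.from(new Uint8Array(buffer))
}

// Most significant byte first: bytes 1, 2, 0, 0, 0, 3
const bigEndianBytes = writeBigEndian(0x0102, 3)
```
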

### Type

The binary format of a type contains a byte identifying the class of the type followed by additional information to describe the specific instance of the type, if necessary.
For example, `new sb.UnsignedIntType` translates into `[0x13]`, and `new sb.StructType({abc: new sb.ByteType, def: new sb.StringType})` translates into:
````javascript
[
  0x51 /*StructType*/,
  2 /*2 fields*/,
  3 /*3 characters in first field's name*/, 0x61 /*a*/, 0x62 /*b*/, 0x63 /*c*/, 0x01 /*ByteType*/,
  3 /*3 characters in second field's name*/, 0x64 /*d*/, 0x65 /*e*/, 0x66 /*f*/, 0x41 /*StringType*/
]
````
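
The byte layout above can be reproduced by hand. The following is a minimal sketch, assuming every field has a type with no payload; `structTypeBytes` is a hypothetical name and this is not the library's own serializer.

```javascript
// Hypothetical sketch, not the library's implementation: builds the type
// bytes for a struct whose fields all have payload-free types.
const STRUCT_TYPE = 0x51

function structTypeBytes(fields) {
  // fields: array of [name, typeIdentifier] pairs in declaration order
  if (fields.length > 255) throw new RangeError('too many fields')
  const encoder = new TextEncoder() // encodes to UTF-8
  const bytes = [STRUCT_TYPE, fields.length]
  for (const [name, typeIdentifier] of fields) {
    const nameBytes = encoder.encode(name)
    if (nameBytes.length > 255) throw new RangeError('field name too long')
    // nameLength, then the name's bytes, then the field's type
    bytes.push(nameBytes.length, ...nameBytes, typeIdentifier)
  }
  return bytes
}

// Same struct as the example above: abc is a ByteType, def is a StringType
const structBytes = structTypeBytes([['abc', 0x01], ['def', 0x41]])
```
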

If the type has already been written to the buffer, it is also valid to serialize the type as:

- `0xFF`
- `offset` ([position of first byte of `offset` in buffer] - [position of type in buffer]) - `uint16_t`

For example:
````javascript
const someType = new sb.TupleType({
  type: new sb.FloatType,
  length: 3
})
const type = new sb.StructType({
  one: someType,
  two: someType
})
/*type translates into
[
  0x51, // StructType
  2, // 2 fields
  3, 0x6f, 0x6e, 0x65, // 3 characters in first field's name: o, n, e
  0x50, // TupleType
  0x20, // FloatType
  0, 0, 0, 3, // 3 floats in the tuple
  3, 0x74, 0x77, 0x6f, // 3 characters in second field's name: t, w, o
  0xff, // type is defined previously
  0, 11 // type is defined 11 bytes before the 0 on this line
]
*/
````
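
One way to produce such back-references is to remember where each type was first written. The sketch below is an assumption about how a writer could work, not the library's actual code; `appendType` and its `seen` map are hypothetical. It falls back to writing the full type when the offset would not fit in a `uint16_t`.

```javascript
// Hypothetical sketch: appends a type's bytes to `out`, or a 0xFF
// back-reference if the same byte sequence was already written.
function appendType(out, seen, typeBytes) {
  const key = typeBytes.join(',')
  const previousPosition = seen.get(key)
  // Position the offset field would occupy, right after the 0xFF marker
  const offsetPosition = out.length + 1
  if (previousPosition !== undefined && offsetPosition - previousPosition <= 0xffff) {
    const offset = offsetPosition - previousPosition
    out.push(0xff, offset >>> 8, offset & 0xff) // big-endian uint16_t
    return
  }
  if (previousPosition === undefined) seen.set(key, out.length)
  out.push(...typeBytes)
}

// Rebuilds the struct from the example above
const typeBytes = [0x51, 2] // StructType with 2 fields
const seenTypes = new Map()
const floatTuple = [0x50, 0x20, 0, 0, 0, 3] // TupleType of 3 FloatTypes
typeBytes.push(3, 0x6f, 0x6e, 0x65) // field name "one"
appendType(typeBytes, seenTypes, floatTuple) // written in full
typeBytes.push(3, 0x74, 0x77, 0x6f) // field name "two"
appendType(typeBytes, seenTypes, floatTuple) // written as 0xff, 0, 11
```
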

In the following definitions, `type` means the binary type format.

- `ByteType`: identifier `0x01`
- `ShortType`: identifier `0x02`
- `IntType`: identifier `0x03`
- `LongType`: identifier `0x04`
- `BigIntType`: identifier `0x05`
- `UnsignedByteType`: identifier `0x11`
- `UnsignedShortType`: identifier `0x12`
- `UnsignedIntType`: identifier `0x13`
- `UnsignedLongType`: identifier `0x14`
- `BigUnsignedIntType`: identifier `0x15`
- `DateType`: identifier `0x1A`
- `DayType`: identifier `0x1B`
- `TimeType`: identifier `0x1C`
- `FloatType`: identifier `0x20`
- `DoubleType`: identifier `0x21`
- `BooleanType`: identifier `0x30`
- `BooleanTupleType`: identifier `0x31`, payload:
  - `length` - `uint32_t`
- `BooleanArrayType`: identifier `0x32`
- `CharType`: identifier `0x40`
- `StringType`: identifier `0x41`
- `OctetsType`: identifier `0x42`
- `TupleType`: identifier `0x50`, payload:
  - `elementType` - `type`
  - `length` - `uint32_t`
339+
- `StructType`: identifier `0x51`, payload:
340+
- `fieldCount` - `uint8_t`
341+
- `fieldCount` instances of `field`:
342+
- `nameLength` - `uint8_t`
343+
- `name` - a UTF-8 string containing `nameLength` bytes
344+
- `fieldType` - `type`
345+
- `ArrayType`: identifier `0x52`, payload:
346+
- `elementType` - `type`
347+
- `SetType`: identifier `0x53`, payload identical to `ArrayType`:
348+
- `elementType` - `type`
349+
- `MapType`: identifier `0x54`, payload:
350+
- `keyType` - `type`
351+
- `valueType` - `type`
352+
- `EnumType`: identifier `0x55`, payload:
353+
- `valueType` - `type`
354+
- `valueCount` - `uint8_t`
355+
- `valueCount` instances of `value`:
356+
- `value` - a value that conforms to `valueType`
357+
- `ChoiceType`: identifier `0x56`, payload:
358+
- `typeCount` - `uint8_t`
359+
- `typeCount` instances of `possibleType`:
360+
- `possibleType` - `type`
361+
- `NamedChoiceType`: identifier `0x58`, payload:
362+
- `typeCount` - `uint8_t`
363+
- `typeCount` instances of `possibleType`:
364+
- `typeNameLength` - `uint8_t`
365+
- `typeName` - a UTF-8 string containing `typeNameLength` bytes
366+
- `typeType` - `type`
367+
- `RecursiveType`: identifier `0x57`, payload:
368+
- `recursiveID` (an identifier unique to this recursive type in this type buffer) - `uint16_t`
369+
- If this is the first instance of this recursive type in this buffer:
370+
- `recursiveType` (the type definition of this type) - `type`
371+
- `OptionalType`: identifier `0x60`, payload:
372+
- `typeIfNonNull` - `type`
373+
- `PointerType`: identifier `0x70`, payload:
374+
- `targetType` - `type`
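
To show how the identifiers and payloads above fit together when reading, here is a minimal sketch of a parser for a small subset of types (three payload-free types, `TupleType`, `StructType`, and the `0xFF` back-reference). The function `readType` and the shape of the objects it returns are assumptions made for this example, not the library's API.

```javascript
// Hypothetical sketch: reads one type starting at `offset` in `bytes`
// (a Uint8Array) and returns the parsed type plus the position after it.
const PAYLOAD_FREE = new Map([[0x01, 'ByteType'], [0x20, 'FloatType'], [0x41, 'StringType']])

function readType(bytes, offset = 0) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  const id = bytes[offset]
  let position = offset + 1
  if (PAYLOAD_FREE.has(id)) return {type: {name: PAYLOAD_FREE.get(id)}, end: position}
  switch (id) {
    case 0x50: { // TupleType: elementType, then uint32_t length
      const element = readType(bytes, position)
      const length = view.getUint32(element.end) // big-endian by default
      return {type: {name: 'TupleType', elementType: element.type, length}, end: element.end + 4}
    }
    case 0x51: { // StructType: fieldCount, then each field
      const fieldCount = bytes[position++]
      const decoder = new TextDecoder() // UTF-8
      const fields = []
      for (let i = 0; i < fieldCount; i++) {
        const nameLength = bytes[position++]
        const name = decoder.decode(bytes.subarray(position, position + nameLength))
        position += nameLength
        const field = readType(bytes, position)
        fields.push({name, type: field.type})
        position = field.end
      }
      return {type: {name: 'StructType', fields}, end: position}
    }
    case 0xff: { // back-reference: offset counts back from the offset field itself
      const target = position - view.getUint16(position)
      return {type: readType(bytes, target).type, end: position + 2}
    }
    default:
      throw new Error('unsupported type identifier 0x' + id.toString(16))
  }
}

// Parses the struct with fields "one" and "two" from the earlier example
const parsedType = readType(Uint8Array.from(
  [0x51, 2, 3, 0x6f, 0x6e, 0x65, 0x50, 0x20, 0, 0, 0, 3, 3, 0x74, 0x77, 0x6f, 0xff, 0, 11]
))
```
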

### Value

- `ByteType`: 1-byte integer
- `ShortType`: 2-byte integer
- `IntType`: 4-byte integer
- `LongType`: 8-byte integer
- `BigIntType`:
  - `byteCount` - `uint16_t`
  - `number` - `byteCount`-byte integer
- `UnsignedByteType`: 1-byte unsigned integer
- `UnsignedShortType`: 2-byte unsigned integer
- `UnsignedIntType`: 4-byte unsigned integer
- `UnsignedLongType`: 8-byte unsigned integer
- `BigUnsignedIntType`:
  - `byteCount` - `uint16_t`
  - `number` - `byteCount`-byte unsigned integer
- `DateType`: 8-byte unsigned integer storing milliseconds in [Unix time](https://en.wikipedia.org/wiki/Unix_time)
- `DayType`: 3-byte unsigned integer storing days since the [Unix time](https://en.wikipedia.org/wiki/Unix_time) epoch
- `TimeType`: 4-byte unsigned integer storing milliseconds since the start of the day
- `FloatType`: single precision (4-byte) [IEEE floating point](https://en.wikipedia.org/wiki/IEEE_floating_point)
- `DoubleType`: double precision (8-byte) [IEEE floating point](https://en.wikipedia.org/wiki/IEEE_floating_point)
- `BooleanType`: 1-byte value, either `0x00` for `false` or `0xFF` for `true`
- `BooleanTupleType`: `ceil(length / 8)` bytes, where the `n`th boolean is stored at the `(n % 8)`th MSB (`0`-indexed) of the `floor(n / 8)`th byte (`0`-indexed)
- `BooleanArrayType`:
  - `length` - `uint32_t`
  - `booleans` - `ceil(length / 8)` bytes, where the `n`th boolean is stored at the `(n % 8)`th MSB (`0`-indexed) of the `floor(n / 8)`th byte (`0`-indexed)
- `CharType`: UTF-8 codepoint (somewhere between 1 and 4 bytes long)
- `StringType`:
  - `string` - a UTF-8 string of any length not containing `'\0'`
  - `0x00` to mark the end of the string
- `OctetsType`:
  - `length` - `uint32_t`
  - `octets` - `length` bytes
409+
- `TupleType`:
410+
- `length` values serialized by `elementType`
411+
- `StructType`:
412+
- For each field in order of declaration in the type format:
413+
- The field's value serialized by `fieldType`
414+
- `ArrayType`:
415+
- `length` - `uint32_t`
416+
- `length` values serialized by `elementType`
417+
- `SetType`:
418+
- `size` - `uint32_t`
419+
- `size` values serialized by `elementType`
420+
- `MapType`:
421+
- `size` - `uint32_t`
422+
- `size` instances of `keyValuePair`:
423+
- `key` - value serialized by `keyType`
424+
- `value` - value serialized by `valueType`
425+
- `EnumType`:
426+
- `index` of value in values array - `uint8_t`
427+
- `ChoiceType`:
428+
- `index` of type in possible types array - `uint8_t`
429+
- `value` - value serialized by specified type
430+
- `NamedChoiceType`:
431+
- `index` of type in possible types array - `uint8_t`
432+
- `value` - value serialized by specified type
433+
- `RecursiveType`:
434+
- `valueNotYetWrittenInBuffer` - byte containing either `0x00` or `0xFF`
435+
- If `valueNotYetWrittenInBuffer`
436+
- `value` - value serialized by `recursiveType`
437+
- Else
438+
- `offset` ([position of first byte of `offset` in buffer] - [position of `value` in buffer]) - `uint32_t`
439+
- `OptionalType`:
440+
- `valueIsNonNull` - byte containing either `0x00` or `0xFF`
441+
- If `valueIsNonNull`
442+
- `value` - value serialized by `typeIfNonNull`
443+
- `PointerType`:
444+
- `index` of value in buffer (note: if buffer contains both a type and a value, this index is relative to the start of the value data) - `uint32_t`
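
The bit-packing rule for `BooleanArrayType` values is the least obvious entry in the list above, so here is a minimal sketch of it. The function names `writeBooleanArray` and `readBooleanArray` are hypothetical and this is not the library's implementation; it only follows the layout as described (a big-endian `uint32_t` length, then bits filled from the most significant bit of each byte).

```javascript
// Hypothetical sketch of the BooleanArrayType value layout:
// uint32_t length, then ceil(length / 8) bytes of packed bits.
function writeBooleanArray(booleans) {
  const bytes = new Uint8Array(4 + Math.ceil(booleans.length / 8))
  new DataView(bytes.buffer).setUint32(0, booleans.length) // big-endian length
  booleans.forEach((value, n) => {
    // Boolean n lives at bit (n % 8), counted from the MSB, of byte floor(n / 8)
    if (value) bytes[4 + Math.floor(n / 8)] |= 0x80 >> (n % 8)
  })
  return bytes
}

function readBooleanArray(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  const length = view.getUint32(0)
  const booleans = []
  for (let n = 0; n < length; n++) {
    booleans.push((bytes[4 + Math.floor(n / 8)] & (0x80 >> (n % 8))) !== 0)
  }
  return booleans
}

// 9 booleans need 2 data bytes: 0b10110000 and 0b10000000
const packedBooleans = writeBooleanArray([true, false, true, true, false, false, false, false, true])
```

Unused trailing bits in the last byte are left as `0` here; the text above does not say what they must contain, so that choice is an assumption.
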

## Versioning
Versions will be of the form `x.y.z`.
`x` is the major release; changes to it represent significant or breaking changes to the API. Before the full release, it was `0`.
