Propose an RFC for machine-readable protocol struct tables
Signed-off-by: Miguel Young de la Sota <[email protected]>
1 parent 4672852 commit 7416585

Lines changed: 252 additions & 0 deletions
@@ -0,0 +1,252 @@
* Name: `Command_Definition_Syntax`
* Date: 2021-09-20
* Pull Request: [#24](https://github.com/opencomputeproject/Security/pull/24)

# Objective

Currently, the byte layouts of Cerberus protocol commands (hereafter "protocol
structs") are defined ad-hoc by way of tables containing offsets. These tables
present a number of problems:
- They are painful to edit, because they are offset-based.
- They are unnecessarily verbose in some cases.
- They are not machine readable.

This RFC proposes a formal syntax for the tables that is machine readable and
concise in the common case.

# Proposal

Each table looks like the following in Markdown:
```
`message <Name>`
| Type | Name | Description |
| ---- | ---- | ----------- |
...
```
Each table defines a message type. Each message type consists of a name and
some number of fields. Each field consists of a type, a name, and an optional
description.

Message names must be in `CamelCase`, with optional periods to separate
components; field names must be in `snake_case`. All table cells, save the
descriptions, are to be wrapped in backticks.

The special field name `_` can be used for reserved/unused fields.
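
The naming rules above lend themselves to a mechanical check. The following is a minimal sketch, assuming one particular reading of `CamelCase` and `snake_case` (digits allowed after the first character); the patterns and the helper names `is_message_name` and `is_field_name` are illustrative and not part of the proposal.

```python
import re

# Assumed reading: one or more CamelCase components separated by periods.
MESSAGE_NAME = re.compile(r"[A-Z][A-Za-z0-9]*(\.[A-Z][A-Za-z0-9]*)*")
# Assumed reading: snake_case, or the special reserved/unused name "_".
FIELD_NAME = re.compile(r"_|[a-z][a-z0-9]*(_[a-z0-9]+)*")


def is_message_name(name: str) -> bool:
    """Return True if `name` looks like a valid message (or enum) name."""
    return MESSAGE_NAME.fullmatch(name) is not None


def is_field_name(name: str) -> bool:
    """Return True if `name` looks like a valid field name."""
    return FIELD_NAME.fullmatch(name) is not None
```

Under these assumptions, `Challenge.Response` and `AttestationLogFormat` are accepted as message names, while `slot_mask`, `pmr0`, and `_` are accepted as field names.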

Descriptions should be complete sentences. Although ideally descriptions should
fit in one line, this isn't always possible; if a description would surpass the
80 column limit, it should go on as far as possible, ending in ` |` after the
final sentence's period.

When editing tables, a formatter like http://markdowntable.com/ is recommended.

## Common Types

The vast majority of Cerberus messages' fields are just buffers of bytes, which
are sometimes interpreted as integers. Some buffers are variable length (with
a length prefix) and some are variable length to the end of the message.

The following types are intended to capture these use-cases:
- Fixed-width bit strings, specified as `b` followed by a literal integer.
  For example, `b1` is a single bit, `b32` is four bytes, and `b256` is 32
  bytes, enough for a SHA2-256 digest. The length of `bN` is `N` bits.
  `b0` is well-formed, and specifies that the field is not encoded at all.
  Its primary use is in conjunction with `align`, discussed below.

- Literal bit strings, specified as C-style hex or binary literals, such as
  `0xabcd` or `0b101010`. Binary literals' width is the number of binary digits:
  `0b0001` is four bits; leading zeros are significant. Hex literals' width is
  four times the number of hex digits: `0x0abcd` is 20 bits. Again, leading
  zeros are significant.

- Other message types, specified by that type's name. The type `MyMessage`
  indicates the entire representation of `MyMessage` inline. When referring
  to another message in a document, if the message contains a common
  dot-separated prefix, it may be omitted. For example, `Foo.Bar.Baz` may be
  referred to as `Baz` inside of `Foo.Bar`, and `Foo.Bar.Baz` may refer to
  `Foo.Bar` as `Bar`.

- An enum type mapping, specified as `Map(field)`, where `field` is a previous
  field of enum type and `Map` is a type mapping (see below) for that field.
  The field is encoded as the type specified by the mapping.

- Fixed-length arrays: any other type followed by `[N]`, where `N` is a literal
  integer value, consisting of encodings of that type concatenated. For
  example, `b8[32]` is equivalent to `b256`, and `MyMessage[2]` is two
  `MyMessage`s back to back, which may be variable-length if `MyMessage` is
  variable-length.

  If the type is `b8`, it may be omitted: `[32]` is 32 bytes.

- Variable-length arrays: any other type followed by `[field]`, or
  `[Map(field)]` where `field` is a previous field and `Map` is an enum
  value mapping (see below). `field`'s contents, as bits, are interpreted as a
  little-endian integer, which specifies the number of copies of the type
  to follow. For example, if `foo` is a `b16`, then `MyMessage[foo]` is as
  many `MyMessage`s as `foo` specifies when viewed as an unsigned, 16-bit
  integer.

  If the type is `b8`, it may be omitted: `[field]` is `b8[field]`.

- Variable-length prefixed arrays: any other type followed by `[bN]`, where
  `N` is a literal integer value. `T[bN]` is equivalent to encoding a
  `bN` called `length` followed by a `T[length]`. For example, `b8[b16]` is
  a sixteen-bit prefix followed by that many bytes.

  If the type is `b8`, it may be omitted: `[b16]` is `b8[b16]`.

- Variable-length unprefixed arrays: any other type followed by `...`. This
  is encoded as copies of that type repeated until the end of the message.
  For example, `b8...` is all bytes to the end of the message, and
  `MyMessage...` means to parse `MyMessage`s until you run out of bytes.
  A `...` field must be the last field in the message, and such messages
  may only occur as the last field in other messages, and cannot be used to
  construct array types.

  If the type is `b8`, it may be omitted: `...` is `b8...`.
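
To make the width rules for the fixed-size types concrete, here is a small sketch that computes the width in bits of a `bN` type, a hex or binary literal, or a fixed-length array of those. The function name `fixed_width_bits` is hypothetical, and the sketch deliberately rejects message names and variable-length forms, whose size cannot be read off the type expression alone.

```python
import re


def fixed_width_bits(type_expr: str) -> int:
    """Width in bits of a fixed-size type expression (illustrative helper)."""
    # Fixed-length array: element type followed by [N]; an omitted element
    # type stands for b8, so "[32]" is treated as "b8[32]".
    m = re.fullmatch(r"(.*)\[(\d+)\]", type_expr)
    if m:
        elem = m.group(1) or "b8"
        return fixed_width_bits(elem) * int(m.group(2))
    # Fixed-width bit string: bN is N bits wide (b0 is zero bits).
    m = re.fullmatch(r"b(\d+)", type_expr)
    if m:
        return int(m.group(1))
    # Hex literal: four bits per digit, leading zeros included.
    m = re.fullmatch(r"0x([0-9a-fA-F]+)", type_expr)
    if m:
        return 4 * len(m.group(1))
    # Binary literal: one bit per digit, leading zeros included.
    m = re.fullmatch(r"0b([01]+)", type_expr)
    if m:
        return len(m.group(1))
    raise ValueError(f"not a fixed-size type expression: {type_expr!r}")
```

With this helper, `b8[32]`, `[32]`, and `b256` all report the same width, matching the equivalences stated in the list above.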

Unless stated otherwise, bytes form integers in little-endian order, and bits
are ordered within bytes according to the underlying transport; messages' sizes
are rounded up to a byte. For example, a message consisting of two `b1` fields
is one byte long, and the fields represent the least- and second-least-significant
bits of that byte.
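
The two-`b1` example can be reproduced with a short packing routine. This is only a sketch: it assumes least-significant-bit-first ordering within bytes, as in the example above (the transport may dictate otherwise), and `pack_fields` is a hypothetical name.

```python
def pack_fields(fields):
    """Pack (value, width_in_bits) pairs into bytes.

    Assumes fields fill bits starting from the least significant bit and
    that the total size is rounded up to a whole number of bytes.
    """
    acc = 0
    nbits = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError(f"value {value} does not fit in {width} bits")
        acc |= value << nbits
        nbits += width
    return acc.to_bytes((nbits + 7) // 8, "little")
```

For instance, `pack_fields([(1, 1), (1, 1)])` yields the single byte `0x03`, and a lone `b16` holding `0x1234` is emitted low byte first.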

The following are representative examples taken from the current Challenge
Protocol spec:

`message Challenge.Response`
| Type     | Name          | Description                                       |
|----------|---------------|---------------------------------------------------|
| `b8`     | `slot`        | Slot number of the Certificate Chain.             |
| `b8`     | `slot_mask`   | Certificate slot mask.                            |
| `b8`     | `min_version` | Minimum protocol version supported by device.     |
| `b8`     | `max_version` | Maximum protocol version supported by device.     |
| `0x0000` | `_`           | Reserved.                                         |
| `b256`   | `nonce`       | Random 256-bit nonce.                             |
| `[b8]`   | `pmr0`        | The contents of PMR0 (aggregated firmware digest).|
| `...`    | `signature`   | Signature over concatenated request and response payloads. |
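
As an illustration of how such a table could drive a decoder, the sketch below parses `Challenge.Response` by hand, field by field. It is an assumption-laden reading of the table rather than generated code: `ChallengeResponse` and `parse_challenge_response` are hypothetical names, `[b8]` is read as a one-byte length prefix followed by that many bytes, and `...` as all remaining bytes.

```python
from typing import NamedTuple


class ChallengeResponse(NamedTuple):
    slot: int
    slot_mask: int
    min_version: int
    max_version: int
    nonce: bytes
    pmr0: bytes
    signature: bytes


def parse_challenge_response(buf: bytes) -> ChallengeResponse:
    """Decode a Challenge.Response following the table row by row."""
    # Four b8 fields, a 0x0000 literal, a b256 nonce, and the b8 length prefix.
    if len(buf) < 4 + 2 + 32 + 1:
        raise ValueError("message too short")
    slot, slot_mask, min_version, max_version = buf[0:4]
    if buf[4:6] != b"\x00\x00":
        raise ValueError("reserved field must be 0x0000")
    nonce = buf[6:38]
    pmr0_len = buf[38]  # `[b8]`: one-byte length prefix
    end = 39 + pmr0_len
    if len(buf) < end:
        raise ValueError("pmr0 is truncated")
    pmr0 = buf[39:end]
    signature = buf[end:]  # `...`: everything up to the end of the message
    return ChallengeResponse(
        slot, slot_mask, min_version, max_version, nonce, pmr0, signature
    )
```

A machine-readable table would let a tool emit this kind of function automatically instead of requiring it to be written by hand from offsets.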

`message KeyExchange.Request.PairedKeyHmac`
| Type   | Name               | Description                                |
|--------|--------------------|--------------------------------------------|
| `0x01` | `key_type`         | The type of key data being sent.           |
| `b16`  | `pairing_key_len`  | Length of the pairing key, in bytes.       |
| `...`  | `pairing_key_hmac` | HMAC of the pairing key: `HMAC(K_M, K_P)`. |

`message AttestationLogFormat`
| Type     | Name                | Description                                 |
|----------|---------------------|---------------------------------------------|
| `0x0b`   | `header_format`     | Header format version.                      |
| `b16`    | `entry_length`      | Total length of the entry.                  |
| `b32`    | `unique_id`         | A unique identifier for the entry.          |
| `b32`    | `tcg_type`          | The associated TCG event type.              |
| `b8`     | `measurement_index` | Index of the measurement within the PMR.    |
| `b8`     | `pmr_index`         | Index of the PMR being extended.            |
| `0x0000` | `_`                 | Reserved.                                   |
| `b8`     | `digest_count`      | Number of digests.                          |
| `0x0000` | `_`                 | Reserved.                                   |
| `0x0b`   | `digest_algo_id`    | Digest algorithm ID, fixed to SHA-256.      |
| `b256`   | `digest`            | SHA-256 digest used to extend the measurement. |
| `[b32]`  | `measurement`       | The measurement value.                      |

### Alignment

Field types may be followed by `align(n)`, where `n` is a literal integer. This
specifies that the field must be aligned to an `n`-byte boundary relative to the
start of the message. The padding must be all-zeroes, and may be variable-length
depending on the length of fields that came before. For example:

`message AlignedBuf`
| Type             | Name   | Description                        |
|------------------|--------|------------------------------------|
| `b16`            | `len`  | The buffer length.                 |
| `[len]`          | `buf`  | The buffer.                        |
| `[len] align(4)` | `buf2` | Another buffer but 4-byte-aligned. |

If `len` were `3`, then `buf` would take up bytes `2` through `4`, ending at
offset `5`. The next multiple of `4` is `8`, so there would be three bytes of
zero-padding before the first byte of `buf2`.
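
The padding arithmetic in this example can be written down directly. The sketch below is illustrative only; `padding_before` and `encode_aligned_buf` are hypothetical helpers, and the encoder assumes both buffers share the single `len` field, as the table specifies.

```python
def padding_before(offset: int, align: int) -> int:
    """Number of zero bytes needed to move `offset` up to a multiple of `align`."""
    return -offset % align


def encode_aligned_buf(buf: bytes, buf2: bytes) -> bytes:
    """Encode AlignedBuf: a b16 length, `buf`, zero padding, then `buf2`."""
    if len(buf) != len(buf2):
        raise ValueError("buf and buf2 are both sized by `len`")
    out = len(buf).to_bytes(2, "little") + buf
    # buf2 must start on a 4-byte boundary relative to the message start.
    out += bytes(padding_before(len(out), 4))
    return out + buf2
```

With two three-byte buffers, `buf` ends at offset 5, `padding_before(5, 4)` is 3, and `buf2` starts at offset 8.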

`b0 align(n)` may be used as the last field to indicate tail padding.

Alignment is most useful for types that are stored in-memory rather than
deserialized from a byte-stream.

## Enums

Some fields are of a fixed size and take on a small set of values. Just like
`message`s, `enum` names are `CamelCase` with periods, and the names of their
values must be `snake_case`. Values may be in hex or binary, and must all be of
the same width.

`enum GetCertState.CertState`
| Value  | Name                | Description                                   |
|--------|---------------------|-----------------------------------------------|
| `0x00` | `chain_provisioned` | A valid chain has been provisioned.           |
| `0x01` | `chain_missing`     | A valid chain has yet to be provisioned.      |
| `0x02` | `validating`        | The stored chain is currently being validated.|

This can then be used directly in the "Type" column:

`message GetCertState`
| Type        | Name            | Description                                  |
|-------------|-----------------|----------------------------------------------|
| `CertState` | `cert_state`    | The current certificate state.               |
| `b32`       | `error_details` | Details of an error in certificate validation, if one has occurred. |

An `enum` may be *mapped* to provide an alternative encoding of its values.
If we have an enum like

`enum HashType`
| Value  | Name       | Description |
|--------|------------|-------------|
| `0b00` | `sha2_256` | SHA2-256.   |
| `0b01` | `sha2_384` | SHA2-384.   |
| `0b10` | `sha2_512` | SHA2-512.   |

we can *map* it like so:

`enum HashLength(HashType)`
| Value | Name       |
|-------|------------|
| `32`  | `sha2_256` |
| `48`  | `sha2_384` |
| `64`  | `sha2_512` |

Note the name of the mapped enum in parentheses. It satisfies the same namespace
rules as a field of a message.

Because the width is not important, decimal integers may be used in addition to
hex or binary. This can then be used to specify a variable-length array:
`b8[HashLength(hash_type)]`.

Maps may also produce types:

`enum Digest(HashType)`
| Type   | Name       |
|--------|------------|
| `b256` | `sha2_256` |
| `b384` | `sha2_384` |
| `b512` | `sha2_512` |

These may be used as fields directly: `Digest(hash_type)`.
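
One way to picture a mapped enum is as a lookup table keyed by the original enum. The sketch below models `HashType` and `HashLength(HashType)` in Python and uses them to read a `b8[HashLength(hash_type)]` field; the class, dictionary, and function names are assumptions made for illustration.

```python
from enum import Enum


class HashType(Enum):
    SHA2_256 = 0b00
    SHA2_384 = 0b01
    SHA2_512 = 0b10


# HashLength(HashType): maps each enum value to a length in bytes.
HASH_LENGTH = {
    HashType.SHA2_256: 32,
    HashType.SHA2_384: 48,
    HashType.SHA2_512: 64,
}


def read_digest(hash_type_bits: int, payload: bytes) -> bytes:
    """Read a `b8[HashLength(hash_type)]` field from the start of `payload`."""
    hash_type = HashType(hash_type_bits)  # raises ValueError for unlisted values
    length = HASH_LENGTH[hash_type]
    if len(payload) < length:
        raise ValueError("payload shorter than the mapped length")
    return payload[:length]
```

The type-producing map `Digest(HashType)` describes the same sizes in bits (`b256`, `b384`, `b512`), so in this model it would be eight times the `HASH_LENGTH` entry for each value.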

# Specification Changelist

In addition to updating all tables to use the new syntax, a new specification,
the "Cerberus Schema Table Specification", would be added, giving a description
of the above formal language.

# Open Questions

There are a handful of messages this scheme cannot handle, and we need to decide
whether to make those messages conform to it or introduce new options into
the scheme:
- "Get Configuration Ids" needs to sum two length prefixes together.
  We could resolve this by having two back-to-back `T[field]`s with the value of
  each respective field. Multiplication of length prefixes can be achieved by
  `T[field1][field2]`.
- The "Get $Manifest Id" messages have an optional field, as does "Get Recovery
  Image Id".
  We could resolve this by using `T...` for them, with prose to specify the
  default value. We could also add syntax for "until the end but with a maximum
  length", e.g. `T[...n]`, but that seems overkill to me.
