Commit 1fd6483

Split specification.md into three separate files.
Previously, the specification did not cleanly separate the (required) protocol from the (optional) encoding, and it also contained a lot of supplemental background material. Now the protocol, encoding (envelope), and background are three separate files. This should hopefully make the spec easier to use.
1 parent 55b006c commit 1fd6483

5 files changed: +391 -344 lines changed

README.md

Lines changed: 22 additions & 5 deletions
@@ -2,20 +2,37 @@
22

33
Simple, foolproof standard for signing arbitrary data.
44

5-
* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
6-
easy to mess up.
7-
* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.
5+
## Features
6+
7+
* Supports arbitrary message encodings, not just JSON.
8+
* Authenticates the message *and* the type to avoid confusion attacks.
9+
* Avoids canonicalization to reduce attack surface.
10+
* Allows any desired crypto primitives or libraries.
11+
12+
See [Background](background.md) for more information, including design
13+
considerations and rationale.
814

915
## What is it?
1016

11-
* [Signature protocol](specification.md)
12-
* [Data structure](specification.md) for storing the message and signatures
17+
Specifications for:
18+
19+
* [Protocol](protocol.md) (*required*)
20+
* [Data structure](envelope.md), a.k.a. "Envelope" (*recommended*)
1321
* (pending #9) Suggested crypto primitives
1422

1523
Out of scope (for now at least):
1624

1725
* Key management / PKI
1826

27+
## Why not...?
28+
29+
* Why not raw signatures? Too fragile.
30+
* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
31+
easy to mess up.
32+
* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.
33+
34+
See [Background](background.md) for further motivation.
35+
1936
## Who uses it?
2037

2138
* [in-toto](https://in-toto.io) (pending [ITE-5](https://github.com/in-toto/ITE/pull/13))

background.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
1+
# Background
2+
3+
## What is the intended use case?
4+
5+
This can be used anywhere digital signatures are needed.
6+
7+
The initial application is for signing software supply chain metadata in [TUF]
8+
and [in-toto].
9+
10+
## Why do we need this?
11+
12+
There is no other simple, foolproof signature scheme that we are aware of.
13+
14+
* Raw signatures are too fragile. Every public key must be used for exactly one
15+
purpose over exactly one message type, lest the system be vulnerable to
16+
[confusion attacks](#motivation). In many cases, this results in a difficult
17+
key management problem.
18+
19+
* [TUF] and [in-toto] currently use a scheme that avoids these problems but is
20+
JSON-specific and relies on [canonicalization](#motivation), which is an
21+
unnecessarily large attack surface.
22+
23+
* [JWS] is JSON-specific, complicated, and error-prone.
24+
25+
* [PASETO] is JSON-specific and too opinionated. For example, it mandates
26+
ed25519 signatures, which may not be useful in all cases.
27+
28+
The intent of this project is to define a minimal signature scheme that avoids
29+
these issues.
30+
31+
## Design requirements
32+
33+
The [protocol](protocol.md):
34+
35+
* MUST reduce the possibility of a client misinterpreting the payload (e.g.
36+
interpreting a JSON message as protobuf)
37+
* MUST support arbitrary payload types (e.g. not just JSON)
38+
* MUST support arbitrary crypto primitives, libraries, and key management
39+
systems (e.g. Tink vs openssl, Google KMS vs Amazon KMS)
40+
* SHOULD avoid depending on canonicalization for security
41+
* SHOULD NOT require unnecessary encoding (e.g. base64)
42+
* SHOULD NOT require the verifier to parse the payload before verifying
43+
44+
The [data structure](envelope.md):
45+
46+
* MUST include both message and signature(s)
47+
* NOTE: Detached signatures are supported by having the included message
48+
contain a cryptographic hash of the external data.
49+
* MUST support multiple signatures in one structure / file
50+
* SHOULD discourage users from reading the payload without verifying the
51+
signatures
52+
* SHOULD be easy to parse using common libraries (e.g. JSON)
53+
* SHOULD support a hint indicating what signing key was used
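
The note on detached signatures above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and a small JSON object as the signed message; the `sha256` field name and both function names are hypothetical and not defined by this specification.

```python
import hashlib
import json


def detached_payload(external_data: bytes) -> bytes:
    """Build a small message that refers to external data by its digest."""
    digest = hashlib.sha256(external_data).hexdigest()
    # Hypothetical layout: the spec does not prescribe any payload schema.
    return json.dumps({"sha256": digest}, sort_keys=True).encode("utf-8")


def matches_external(verified_payload: bytes, external_data: bytes) -> bool:
    """Compare external data against the digest in an already-verified payload."""
    expected = json.loads(verified_payload)["sha256"]
    return hashlib.sha256(external_data).hexdigest() == expected
```

The message returned by `detached_payload` is what would be signed and stored in the data structure; `matches_external` should only be called after the signature on that message has been verified.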
54+
55+
## Motivation
56+
57+
There are two concerns with the current [in-toto]/[TUF] signature envelope.
58+
59+
First, the signature scheme depends on [Canonical JSON], which has one practical
60+
problem and two theoretical ones:
61+
62+
1. Practical problem: It requires the payload to be JSON or convertible to
63+
JSON. While this happens to be true of in-toto and TUF today, a generic
64+
signature layer should be able to handle arbitrary payloads.
65+
1. Theoretical problem 1: Two semantically different payloads could have the
66+
same canonical encoding. Although there are currently no known attacks on
67+
Canonical JSON, there have been attacks in the past on other
68+
canonicalization schemes
69+
([example](https://latacora.micro.blog/2019/07/24/how-not-to.html#canonicalization)).
70+
It is safer to avoid canonicalization altogether.
71+
1. Theoretical problem 2: It requires the verifier to parse the payload before
72+
verifying, which is both error-prone—too easy to forget to verify—and an
73+
unnecessarily increased attack surface.
74+
75+
The preferred solution is to transmit the encoded byte stream exactly as it was
76+
signed, which the verifier verifies before parsing. This is what is done in
77+
[JWS] and [PASETO], for example.
78+
79+
Second, the scheme does not include an authenticated "context" indicator to
80+
ensure that the signer and verifier interpret the payload in the same exact way.
81+
For example, if in-toto were extended to support CBOR and Protobuf encoding, the
82+
signer could get a CI/CD system to produce a CBOR message saying X and then a
83+
verifier to interpret it as a protobuf message saying Y. While we don't know of
84+
an exploitable attack on in-toto or TUF today, potential changes could introduce
85+
such a vulnerability. The signature scheme should be resilient against these
86+
classes of attacks. See [example attack](hypothetical_signature_attack.ipynb)
87+
for more details.
88+
89+
## Reasoning
90+
91+
Our goal was to create a signature envelope that is as simple and foolproof as
92+
possible. Alternatives such as [JWS] are extremely complex and error-prone,
93+
while others such as [PASETO] are overly specific. (Both are also
94+
JSON-specific.) We believe our proposal strikes the right balance of simplicity,
95+
usefulness, and security.
96+
97+
Rationales for specific decisions:
98+
99+
- Why use base64 for payload and sig?
100+
101+
- Because JSON strings do not allow binary data, so we need to either
102+
encode the data or escape it. Base64 is a standard, reasonably
103+
space-efficient way of doing so. Protocols that have a first-class
104+
concept of "bytes", such as protobuf or CBOR, do not need to use base64.
105+
106+
- Why sign raw bytes rather than base64 encoded bytes (as per JWS)?
107+
108+
- Because it's simpler. Base64 is only needed for putting binary data in a
109+
text field, such as JSON. In other formats, such as protobuf or CBOR,
110+
base64 isn't needed at all.
111+
112+
- Why does payloadType need to be signed?
113+
114+
- See [Motivation](#motivation).
115+
116+
- Why use PAE?
117+
118+
- Because we need an unambiguous way of serializing two fields,
119+
payloadType and payload. PAE is already documented and good enough. No
120+
need to reinvent the wheel.
121+
122+
- Why use a URI for payloadType rather than
123+
[Media Type](https://www.iana.org/assignments/media-types/media-types.xhtml)
124+
(a.k.a. MIME type)?
125+
126+
- Because Media Type only indicates how to parse but does not indicate
127+
purpose, schema, or versioning. If it were just "application/json", for
128+
example, then every application would need to impose some "type" field
129+
within the payload, lest we have similar vulnerabilities as if
130+
payloadType were not signed.
131+
- Also, URIs don't need to be registered while Media Types do.
132+
133+
- Why not stay backwards compatible by requiring the payload to always be JSON
134+
with a "_type" field? Then if you want a non-JSON payload, you could simply
135+
have a field that contains the real payload, e.g. `{"_type":"my-thing",
136+
"value":"base64…"}`.
137+
138+
1. It encourages users to add a "_type" field to their payload, which in
139+
turn:
140+
- (a) Ties the payload type to the authentication type. Ideally the
141+
two would be independent.
142+
- (b) May conflict with other uses of that same field.
143+
- (c) May require the user to specify type multiple times with
144+
different field names, e.g. with "@context" for
145+
[JSON-LD](https://json-ld.org/).
146+
2. It would incur double base64 encoding overhead for non-JSON payloads.
147+
3. It is more complex than PAE.
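
To make the "Why use PAE?" rationale above concrete, here is a minimal sketch. It assumes PAE refers to the Pre-Authentication Encoding documented by PASETO (a piece count followed by length-prefixed pieces, with 64-bit little-endian lengths); the authoritative definition is in [Protocol](protocol.md), and the function names here are illustrative only.

```python
import struct
from typing import Sequence


def le64(n: int) -> bytes:
    """64-bit little-endian encoding of n, restricted to 0 <= n < 2**63."""
    if not 0 <= n < 2**63:
        raise ValueError("length out of range")
    return struct.pack("<Q", n)


def pae(pieces: Sequence[bytes]) -> bytes:
    """Encode the number of pieces, then each piece prefixed by its length."""
    out = le64(len(pieces))
    for piece in pieces:
        out += le64(len(piece)) + piece
    return out


# Plain concatenation cannot tell where one field ends and the next begins:
assert b"ab" + b"c" == b"a" + b"bc"
# With length prefixes, the same two splits produce different byte strings:
assert pae([b"ab", b"c"]) != pae([b"a", b"bc"])
```

Because the boundary between `payloadType` and `payload` is encoded explicitly, bytes cannot be moved from one field to the other without changing the signed input.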
148+
149+
## Backwards Compatibility
150+
151+
Backwards compatibility with the old [in-toto]/[TUF] format will be handled by
152+
the application and explained in the corresponding application-specific change
153+
proposal, namely [ITE-5](https://github.com/in-toto/ITE/pull/13) for in-toto and
154+
via the principles laid out in
155+
[TAP-14](https://github.com/theupdateframework/taps/blob/master/tap14.md) for
156+
TUF.
157+
158+
Verifiers can differentiate between the
159+
[old](https://github.com/in-toto/docs/blob/master/in-toto-spec.md#42-file-formats-general-principles)
160+
and new envelope format by detecting the presence of the `payload` field (new
161+
format) vs `signed` field (old format).
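
The detection rule above can be sketched as follows. The function name is hypothetical, and rejecting documents that contain both fields or neither is an assumption of this sketch rather than something the text specifies.

```python
import json


def envelope_format(raw: str) -> str:
    """Return "new" or "old" depending on which top-level field is present."""
    doc = json.loads(raw)
    has_payload = "payload" in doc  # new envelope format
    has_signed = "signed" in doc    # old in-toto/TUF format
    if has_payload and not has_signed:
        return "new"
    if has_signed and not has_payload:
        return "old"
    # Assumption: ambiguous or unrecognized documents are rejected.
    raise ValueError("cannot determine envelope format")
```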
162+
163+
[Canonical JSON]: http://wiki.laptop.org/go/Canonical_JSON
164+
[in-toto]: https://in-toto.io
165+
[JWS]: https://tools.ietf.org/html/rfc7515
166+
[PASETO]: https://github.com/paragonie/paseto/blob/master/docs/01-Protocol-Versions/Version2.md#sig
167+
[TUF]: https://theupdateframework.io

envelope.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
# signing-spec Envelope
2+
3+
March 03, 2021
4+
5+
Version 0.1.0
6+
7+
This document describes the recommended data structure for storing signing-spec
8+
signatures, which we call the "JSON Envelope". For the protocol/algorithm, see
9+
[Protocol](protocol.md).
10+
11+
## Standard JSON envelope
12+
13+
The standard data structure for storing a signed message is a JSON message of
14+
the following form, called the "JSON envelope":
15+
16+
```json
17+
{
18+
"payload": "<Base64(SERIALIZED_BODY)>",
19+
"payloadType": "<PAYLOAD_TYPE>",
20+
"signatures": [{
21+
"keyid": "<KEYID>",
22+
"sig": "<Base64(SIGNATURE)>"
23+
}]
24+
}
25+
```
26+
27+
See [Protocol](protocol.md) for a definition of parameters and functions.
28+
29+
Empty fields may be omitted. [Multiple signatures](#multiple-signatures) are
30+
allowed.
31+
32+
Base64() is [Base64 encoding](https://tools.ietf.org/html/rfc4648), transforming
33+
a byte sequence to a Unicode string. Either standard or URL-safe encoding is
34+
allowed.
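
As a rough illustration of the envelope and the Base64 rule above, the following sketch builds a JSON envelope from an already-computed signature and decodes fields written in either Base64 alphabet. The function names are hypothetical; the signature bytes are assumed to come from the signing step defined in [Protocol](protocol.md).

```python
import base64
import json


def make_envelope(body: bytes, payload_type: str, keyid: str, signature: bytes) -> str:
    """Serialize one signed message as a JSON envelope (standard Base64)."""
    return json.dumps({
        "payload": base64.b64encode(body).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{
            "keyid": keyid,
            "sig": base64.b64encode(signature).decode("ascii"),
        }],
    })


def b64decode_either(text: str) -> bytes:
    """Decode Base64 written in either the standard or the URL-safe alphabet."""
    # Map the URL-safe characters onto the standard alphabet.
    normalized = text.replace("-", "+").replace("_", "/")
    # Restore padding that URL-safe encoders sometimes drop.
    normalized += "=" * (-len(normalized) % 4)
    return base64.b64decode(normalized, validate=True)
```

A reader of the envelope would pass the `payload` and `sig` strings through `b64decode_either` to recover the raw bytes before verification.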
35+
36+
### Multiple signatures
37+
38+
An envelope may have more than one signature, which is equivalent to separate
39+
envelopes with individual signatures.
40+
41+
```json
42+
{
43+
"payload": "<Base64(SERIALIZED_BODY)>",
44+
"payloadType": "<PAYLOAD_TYPE>",
45+
"signatures": [{
46+
"keyid": "<KEYID_1>",
47+
"sig": "<SIG_1>"
48+
}, {
49+
"keyid": "<KEYID_2>",
50+
"sig": "<SIG_2>"
51+
}]
52+
}
53+
```
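
The equivalence stated above can be sketched as a function that splits one envelope into single-signature envelopes. The function name is hypothetical, and the envelope is assumed to be an already-parsed JSON object.

```python
def split_envelope(envelope: dict) -> list:
    """Return one single-signature envelope per entry in "signatures"."""
    # Copy every field except the signature list, so omitted fields stay omitted.
    shared = {k: v for k, v in envelope.items() if k != "signatures"}
    return [
        {**shared, "signatures": [sig]}
        for sig in envelope.get("signatures", [])
    ]
```

Each resulting envelope carries the same `payload` and `payloadType`, so every signature can be verified independently of the others.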
54+
55+
## Other data structures
56+
57+
The standard envelope is a JSON message with an explicit `payloadType`.
58+
Optionally, applications may encode the signed message in other ways without
59+
invalidating the signature:
60+
61+
- Use an encoding other than JSON, such as CBOR or Protobuf.
62+
- Use a default `payloadType` if omitted and/or encode `payloadType` as a
63+
shorter string or enum.
64+
65+
At this point we do not standardize any other encoding. If a need arises, we may
66+
do so in the future.
