
Commit d7a5afd

Merge pull request #23 from MarkLodato/split
Split specification.md into three separate files.
2 parents 55b006c + 01870d6 commit d7a5afd

5 files changed, 396 additions and 344 deletions


README.md

Lines changed: 28 additions & 5 deletions
@@ -2,21 +2,44 @@
 
 Simple, foolproof standard for signing arbitrary data.
 
-* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
-  easy to mess up.
-* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.
+## Features
+
+* Supports arbitrary message encodings, not just JSON.
+* Authenticates the message *and* the type to avoid confusion attacks.
+* Avoids canonicalization to reduce attack surface.
+* Allows any desired crypto primitives or libraries.
+
+See [Background](background.md) for more information, including design
+considerations and rationale.
 
 ## What is it?
 
-* [Signature protocol](specification.md)
-* [Data structure](specification.md) for storing the message and signatures
+Specifications for:
+
+* [Protocol](protocol.md) (*required*)
+* [Data structure](envelope.md), a.k.a. "Envelope" (*recommended*)
 * (pending #9) Suggested crypto primitives
 
 Out of scope (for now at least):
 
 * Key management / PKI
 
+## Why not...?
+
+* Why not raw signatures? Too fragile.
+* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
+  easy to mess up.
+* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.
+* Why not the legacy TUF/in-toto signature scheme? JSON-specific, relies on
+  canonicalization.
+
+See [Background](background.md) for further motivation.
+
 ## Who uses it?
 
+<!-- Reminder: once in-toto and TUF switch to this new format, update the rest
+of the docs that currently reference the old format as "current", "existing",
+etc. -->
+
 * [in-toto](https://in-toto.io) (pending [ITE-5](https://github.com/in-toto/ITE/pull/13))
 * [TUF](https://theupdateframework.io) (pending)

background.md

Lines changed: 167 additions & 0 deletions
@@ -0,0 +1,167 @@
# Background

## What is the intended use case?

This can be used anywhere digital signatures are needed.

The initial application is for signing software supply chain metadata in [TUF]
and [in-toto].

## Why do we need this?

There is no other simple, foolproof signature scheme that we are aware of.

* Raw signatures are too fragile. Every public key must be used for exactly
  one purpose over exactly one message type, lest the system be vulnerable to
  [confusion attacks](#motivation). In many cases, this results in a difficult
  key management problem.

* [TUF] and [in-toto] currently use a scheme that avoids these problems but is
  JSON-specific and relies on [canonicalization](#motivation), which is an
  unnecessarily large attack surface.

* [JWS] is JSON-specific, complicated, and error-prone.

* [PASETO] is JSON-specific and too opinionated. For example, it mandates
  ed25519 signatures, which may not be useful in all cases.

The intent of this project is to define a minimal signature scheme that avoids
these issues.

## Design requirements

The [protocol](protocol.md):

* MUST reduce the possibility of a client misinterpreting the payload (e.g.
  interpreting a JSON message as protobuf)
* MUST support arbitrary payload types (e.g. not just JSON)
* MUST support arbitrary crypto primitives, libraries, and key management
  systems (e.g. Tink vs openssl, Google KMS vs Amazon KMS)
* SHOULD avoid depending on canonicalization for security
* SHOULD NOT require unnecessary encoding (e.g. base64)
* SHOULD NOT require the verifier to parse the payload before verifying

The [data structure](envelope.md):

* MUST include both message and signature(s)
  * NOTE: Detached signatures are supported by having the included message
    contain a cryptographic hash of the external data.
* MUST support multiple signatures in one structure / file
* SHOULD discourage users from reading the payload without verifying the
  signatures
* SHOULD be easy to parse using common libraries (e.g. JSON)
* SHOULD support a hint indicating what signing key was used
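
The NOTE on detached signatures above can be illustrated with a minimal Python sketch. It assumes SHA-256 and a small JSON payload whose field names (`name`, `sha256`) are hypothetical; the document does not prescribe a payload layout for this case.

```python
import hashlib
import json


def make_detached_payload(external_data: bytes, name: str) -> bytes:
    """Build a payload that refers to external data by its hash.

    The field names ("name", "sha256") are hypothetical, chosen only for
    this illustration.
    """
    digest = hashlib.sha256(external_data).hexdigest()
    return json.dumps({"name": name, "sha256": digest}).encode("utf-8")


def matches_external_data(payload: bytes, external_data: bytes) -> bool:
    """Compare external data with the hash in an already-verified payload."""
    expected = json.loads(payload)["sha256"]
    return hashlib.sha256(external_data).hexdigest() == expected
```

The payload returned by `make_detached_payload` is what would be signed and stored in the data structure; the external data travels separately and is only checked after the signature over the payload has been verified.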

## Motivation

There are two concerns with the current [in-toto]/[TUF] signature envelope.

First, the signature scheme depends on [Canonical JSON], which has one practical
problem and two theoretical ones:

1. Practical problem: It requires the payload to be JSON or convertible to
   JSON. While this happens to be true of in-toto and TUF today, a generic
   signature layer should be able to handle arbitrary payloads.
1. Theoretical problem 1: Two semantically different payloads could have the
   same canonical encoding. Although there are currently no known attacks on
   Canonical JSON, there have been attacks in the past on other
   canonicalization schemes
   ([example](https://latacora.micro.blog/2019/07/24/how-not-to.html#canonicalization)).
   It is safer to avoid canonicalization altogether.
1. Theoretical problem 2: It requires the verifier to parse the payload before
   verifying, which is both error-prone (it is too easy to forget to verify)
   and an unnecessarily increased attack surface.

The preferred solution is to transmit the encoded byte stream exactly as it was
signed, which the verifier verifies before parsing. This is what is done in
[JWS] and [PASETO], for example.

Second, the scheme does not include an authenticated "context" indicator to
ensure that the signer and verifier interpret the payload in exactly the same
way. For example, if in-toto were extended to support CBOR and protobuf
encoding, the signer could get a CI/CD system to produce a CBOR message saying X
and then a verifier to interpret it as a protobuf message saying Y. While we
don't know of an exploitable attack on in-toto or TUF today, potential changes
could introduce such a vulnerability. The signature scheme should be resilient
against these classes of attacks. See
[example attack](hypothetical_signature_attack.ipynb) for more details.

## Reasoning

Our goal was to create a signature envelope that is as simple and foolproof as
possible. Alternatives such as [JWS] are extremely complex and error-prone,
while others such as [PASETO] are overly specific. (Both are also
JSON-specific.) We believe our proposal strikes the right balance of simplicity,
usefulness, and security.

Rationales for specific decisions:

- Why use base64 for payload and sig?

  - Because JSON strings do not allow binary data, so we need to either
    encode the data or escape it. Base64 is a standard, reasonably
    space-efficient way of doing so. Protocols that have a first-class
    concept of "bytes", such as protobuf or CBOR, do not need to use base64.

- Why sign raw bytes rather than base64 encoded bytes (as per JWS)?

  - Because it's simpler. Base64 is only needed for putting binary data in a
    text field, such as JSON. In other formats, such as protobuf or CBOR,
    base64 isn't needed at all.

- Why does payloadType need to be signed?

  - See [Motivation](#motivation).

- Why use PAE?

  - Because we need an unambiguous way of serializing two fields,
    payloadType and payload. PAE is already documented and good enough. No
    need to reinvent the wheel.

- Why use a URI for payloadType rather than
  [Media Type](https://www.iana.org/assignments/media-types/media-types.xhtml)
  (a.k.a. MIME type)?

  - Because Media Type only indicates how to parse but does not indicate
    purpose, schema, or versioning. If it were just "application/json", for
    example, then every application would need to impose some "type" field
    within the payload, lest we have similar vulnerabilities as if
    payloadType were not signed.
  - Also, URIs don't need to be registered while Media Types do.

- Why not stay backwards compatible by requiring the payload to always be JSON
  with a "_type" field? Then if you want a non-JSON payload, you could simply
  have a field that contains the real payload, e.g. `{"_type":"my-thing",
  "value":"base64…"}`.

  1. It encourages users to add a "_type" field to their payload, which in
     turn:
     - (a) Ties the payload type to the authentication type. Ideally the
       two would be independent.
     - (b) May conflict with other uses of that same field.
     - (c) May require the user to specify type multiple times with
       different field names, e.g. with "@context" for
       [JSON-LD](https://json-ld.org/).
  2. It would incur double base64 encoding overhead for non-JSON payloads.
  3. It is more complex than PAE.
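
To make the "Why use PAE?" rationale concrete, here is a minimal Python sketch of Pre-Authentication Encoding as published by PASETO: a little-endian 64-bit piece count, followed by each piece prefixed with its little-endian 64-bit length. The helper names `le64` and `pae` are illustrative, and the exact encoding this project signs over is defined in [Protocol](protocol.md), which is not reproduced here, so treat this as an illustration of the idea and not as the normative algorithm.

```python
import struct


def le64(n: int) -> bytes:
    """Encode n as an unsigned little-endian 64-bit integer.

    PASETO keeps the most significant bit clear, so values must be < 2**63.
    """
    if not 0 <= n < 2**63:
        raise ValueError("length out of range")
    return struct.pack("<Q", n)


def pae(*pieces: bytes) -> bytes:
    """Serialize byte strings so that the field boundaries are unambiguous."""
    out = le64(len(pieces))
    for piece in pieces:
        out += le64(len(piece)) + piece
    return out


# Plain concatenation cannot tell where payloadType ends and payload begins:
assert b"ab" + b"c" == b"a" + b"bc"
# With length prefixes, the two splits produce different byte strings:
assert pae(b"ab", b"c") != pae(b"a", b"bc")
```

Because each field's length is part of the signed bytes, a signature over `pae(payload_type, payload)` cannot be reinterpreted with a different split between the two fields.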

## Backwards Compatibility

Backwards compatibility with the old [in-toto]/[TUF] format will be handled by
the application and explained in the corresponding application-specific change
proposal, namely [ITE-5](https://github.com/in-toto/ITE/pull/13) for in-toto and
via the principles laid out in
[TAP-14](https://github.com/theupdateframework/taps/blob/master/tap14.md) for
TUF.

Verifiers can differentiate between the
[old](https://github.com/in-toto/docs/blob/master/in-toto-spec.md#42-file-formats-general-principles)
and new envelope format by detecting the presence of the `payload` field (new
format) vs `signed` field (old format).
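
The detection rule above could be sketched in Python as follows. The function name `envelope_format` and the choice to reject documents that contain both fields or neither are assumptions made for this illustration; the text only states which field marks which format.

```python
import json


def envelope_format(raw: str) -> str:
    """Return "new" or "old" depending on which top-level field is present."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("envelope must be a JSON object")
    has_payload = "payload" in doc
    has_signed = "signed" in doc
    # Assumption: a document with both fields (or neither) is rejected,
    # since the rule above does not say how to classify it.
    if has_payload == has_signed:
        raise ValueError("cannot determine envelope format")
    return "new" if has_payload else "old"
```

For example, `envelope_format('{"signed": {}, "signatures": []}')` returns `"old"`, while a document with a top-level `payload` field returns `"new"`.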

[Canonical JSON]: http://wiki.laptop.org/go/Canonical_JSON
[in-toto]: https://in-toto.io
[JWS]: https://tools.ietf.org/html/rfc7515
[PASETO]: https://github.com/paragonie/paseto/blob/master/docs/01-Protocol-Versions/Version2.md#sig
[TUF]: https://theupdateframework.io
envelope.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
# signing-spec Envelope

March 03, 2021

Version 0.1.0

This document describes the recommended data structure for storing signing-spec
signatures, which we call the "JSON envelope". For the protocol/algorithm, see
[Protocol](protocol.md).

## Standard JSON envelope

The standard data structure for storing a signed message is a JSON message of
the following form, called the "JSON envelope":

```json
{
  "payload": "<Base64(SERIALIZED_BODY)>",
  "payloadType": "<PAYLOAD_TYPE>",
  "signatures": [{
    "keyid": "<KEYID>",
    "sig": "<Base64(SIGNATURE)>"
  }]
}
```
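
As an illustration of the structure above, the following Python sketch assembles a JSON envelope from values the caller already has. The function name `make_envelope` is hypothetical, and the signature bytes are assumed to have been produced beforehand according to [Protocol](protocol.md); no signing happens in this sketch.

```python
import base64
import json


def make_envelope(payload: bytes, payload_type: str, keyid: str, signature: bytes) -> str:
    """Wrap an already-computed signature and its payload in a JSON envelope."""
    envelope = {
        # Binary fields are Base64-encoded so they fit in JSON strings.
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": [{
            "keyid": keyid,
            "sig": base64.b64encode(signature).decode("ascii"),
        }],
    }
    return json.dumps(envelope)
```

A caller would pass the exact serialized bytes that were signed as `payload`, so that the verifier can check the signature before parsing them.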

See [Protocol](protocol.md) for a definition of parameters and functions.

Empty fields may be omitted. [Multiple signatures](#multiple-signatures) are
allowed.

Base64() is [Base64 encoding](https://tools.ietf.org/html/rfc4648), transforming
a byte sequence to a Unicode string. Either standard or URL-safe encoding is
allowed.
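
Since either Base64 alphabet is allowed, a reader of the envelope has to accept both. One possible way to do that in Python is sketched below; the helper name `decode_base64` is hypothetical, and the sketch assumes padding is present, because the text above does not say whether padding may be omitted.

```python
import base64
import binascii


def decode_base64(text: str) -> bytes:
    """Decode Base64 text written in either the standard or URL-safe alphabet."""
    # Map the URL-safe characters onto their standard equivalents first.
    standard = text.translate(str.maketrans("-_", "+/"))
    # validate=True rejects characters outside the alphabet instead of
    # silently skipping them; failures raise binascii.Error.
    return base64.b64decode(standard, validate=True)
```

Both `decode_base64("+/8=")` and `decode_base64("-_8=")` decode to the same two bytes, while input containing other characters raises `binascii.Error`.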

### Multiple signatures

An envelope may have more than one signature, which is equivalent to separate
envelopes with individual signatures.

```json
{
  "payload": "<Base64(SERIALIZED_BODY)>",
  "payloadType": "<PAYLOAD_TYPE>",
  "signatures": [{
      "keyid": "<KEYID_1>",
      "sig": "<SIG_1>"
    }, {
      "keyid": "<KEYID_2>",
      "sig": "<SIG_2>"
  }]
}
```
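
The stated equivalence can be shown with a short Python sketch that splits one multi-signature envelope into separate single-signature envelopes. The function name `split_signatures` is hypothetical, and the envelope is assumed to be already parsed into a dictionary.

```python
def split_signatures(envelope: dict) -> list:
    """Return one single-signature envelope per entry in "signatures"."""
    # Everything except the signature list is shared by all the results.
    shared = {key: value for key, value in envelope.items() if key != "signatures"}
    return [
        {**shared, "signatures": [dict(sig)]}
        for sig in envelope.get("signatures", [])
    ]
```

Each resulting envelope carries the same `payload` and `payloadType`, so every signature still covers exactly the same bytes as in the combined envelope.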
54+
55+
## Other data structures
56+
57+
The standard envelope is JSON message with an explicit `payloadType`.
58+
Optionally, applications may encode the signed message in other methods without
59+
invalidating the signature:
60+
61+
- An encoding other than JSON, such as CBOR or protobuf.
62+
- Use a default `payloadType` if omitted and/or code `payloadType` as a
63+
shorter string or enum.
64+
65+
At this point we do not standardize any other encoding. If a need arises, we may
66+
do so in the future.
