Commit f17fdaf

Merge pull request #608 from jonboulle/master
config: minor cleanup
2 parents 52d4836 + d081d47 commit f17fdaf

1 file changed, +8 -6 lines changed

config.md

Lines changed: 8 additions & 6 deletions
@@ -26,7 +26,7 @@ Changing it means creating a new derived image, instead of changing the existing
 ### Layer DiffID

 A layer DiffID is the digest over the layer's uncompressed tar archive and serialized in the descriptor digest format, e.g., `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
-Layers must be packed and unpacked reproducibly to avoid changing the layer DiffID, for example by using tar-split to save the tar headers.
+Layers must be packed and unpacked reproducibly to avoid changing the layer DiffID, for example by using [tar-split][] to save the tar headers.

 NOTE: Do not confuse DiffIDs with [layer digests](manifest.md#image-manifest-property-descriptions), often referenced in the manifest, which are digests over compressed or uncompressed content.

@@ -50,19 +50,20 @@ For this, we define the binary `|` operation to be the result of applying the ri
 For example, given base layer `A` and a changeset `B`, we refer to the result of applying `B` to `A` as `A|B`.

 Above, we define the `ChainID` for a single layer (`L₀`) as equivalent to the `DiffID` for that layer.
-Otherwise, the `ChainID` for `L₀|...|Lₙ₋₁|Lₙ` is defined as recursion `Digest(ChainID(L₀|...|Lₙ₋₁) + " " + DiffID(Lₙ))`.
+Otherwise, the `ChainID` for a set of applied layers (`L₀|...|Lₙ₋₁|Lₙ`) is defined as the recursion `Digest(ChainID(L₀|...|Lₙ₋₁) + " " + DiffID(Lₙ))`.

 #### Explanation

 Let's say we have layers A, B, C, ordered from bottom to top, where A is the base and C is the top.
 Defining `|` as a binary application operator, the root filesystem may be `A|B|C`.
 While it is implied that `C` is only useful when applied to `A|B`, the identifier `C` is insufficient to identify this result, as we'd have the equality `C = A|B|C`, which isn't true.

-The main issue is when we have two definitions of `C`, `C = C` and `C = A|B|C`. If this is true (with some handwaving), `C = x|C` where `x = any application` must be true.
+The main issue is when we have two definitions of `C`, `C = C` and `C = A|B|C`.
+If this is true (with some handwaving), `C = x|C` where `x = any application` must be true.
 This means that if an attacker can define `x`, relying on `C` provides no guarantee that the layers were applied in any order.

 The `ChainID` addresses this problem by being defined as a compound hash.
-__We differentiate the changeset `C`, from the order dependent application `A|B|C` by saying that the resulting rootfs is identified by ChainID(A|B|C), which can be calculated by `ImageConfig.rootfs`.__
+__We differentiate the changeset `C`, from the order-dependent application `A|B|C` by saying that the resulting rootfs is identified by ChainID(A|B|C), which can be calculated by `ImageConfig.rootfs`.__

 Let's expand the definition of `ChainID(A|B|C)` to explore its internal structure:

@@ -72,7 +73,7 @@ ChainID(A|B) = Digest(ChainID(A) + " " + DiffID(B))
 ChainID(A|B|C) = Digest(ChainID(A|B) + " " + DiffID(C))
 ```

-We can replace the each definition and reduce to a single equality:
+We can replace each definition and reduce to a single equality:

 ```
 ChainID(A|B|C) = Digest(Digest(DiffID(A) + " " + DiffID(B)) + " " + DiffID(C))
@@ -85,7 +86,7 @@ Most importantly, we can easily see that `ChainID(C) != ChainID(A|B|C)`, otherwi

 Each image's ID is given by the SHA256 hash of its [configuration JSON](#image-json).
 It is represented as a hexadecimal encoding of 256 bits, e.g., `sha256:a9561eb1b190625c9adb5a9513e72c4dedafc1cb2d4c5236c9a6957ec7dfd5a9`.
-Since the [configuration JSON](#image-json) that gets hashed references hashes of each layer in the image, this formulation of the ImageID makes images content-addresable.
+Since the [configuration JSON](#image-json) that gets hashed references hashes of each layer in the image, this formulation of the ImageID makes images content-addressable.

 ## Properties

@@ -276,3 +277,4 @@ Here is an example image configuration JSON document:

 [rfc3339-s5.6]: https://tools.ietf.org/html/rfc3339#section-5.6
 [runtime-platform]: https://github.com/opencontainers/runtime-spec/blob/v1.0.0-rc3/config.md#platform
+[tar-split]: https://github.com/vbatts/tar-split
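
For readers of the first hunk, the distinction between a DiffID and a layer digest can be illustrated with a small Python sketch. This is not part of the commit: the helper names (`digest`, `make_layer_tar`), the tiny in-memory archive, and the choice of gzip as the compression are all assumptions made for illustration. It shows that the digest over the uncompressed tar survives a compress/decompress round trip, while the digest over the compressed bytes is a different value.

```python
import gzip
import hashlib
import io
import tarfile


def digest(data: bytes) -> str:
    """Format a SHA-256 hash as 'sha256:<hex>' (assumed descriptor digest format)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def make_layer_tar() -> bytes:
    """Build a tiny in-memory tar archive standing in for a layer (hypothetical content)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        payload = b"hello\n"
        info = tarfile.TarInfo(name="etc/motd")  # TarInfo defaults mtime to 0
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


uncompressed = make_layer_tar()
compressed = gzip.compress(uncompressed, mtime=0)

diff_id = digest(uncompressed)                    # DiffID: over the uncompressed tar
layer_digest = digest(compressed)                 # digest over the compressed blob
round_trip = digest(gzip.decompress(compressed))  # recomputed after decompression

print("DiffID:      ", diff_id)
print("layer digest:", layer_digest)
```

The sketch only round-trips the compression step. Unpacking to a filesystem and re-packing is where tar headers can change, which is the case the diff's tar-split reference addresses.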

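The ChainID recursion touched by the second and third hunks can be sketched in Python as follows. This is an illustrative reading of the formula in the diff, not code from the repository: it assumes `Digest` is SHA-256 rendered as `sha256:<hex>` and applied to the ASCII text of the joined identifiers, and the three DiffIDs are placeholder values derived from made-up strings.

```python
import hashlib


def digest(data: bytes) -> str:
    """Assumed Digest(): SHA-256 rendered as 'sha256:<hex>'."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def chain_id(diff_ids: list[str]) -> str:
    """Fold DiffIDs, ordered bottom to top, into a ChainID.

    ChainID(L0) = DiffID(L0)
    ChainID(L0|...|Ln) = Digest(ChainID(L0|...|Ln-1) + " " + DiffID(Ln))
    """
    if not diff_ids:
        raise ValueError("at least one layer is required")
    chain = diff_ids[0]
    for diff_id in diff_ids[1:]:
        chain = digest(f"{chain} {diff_id}".encode("ascii"))
    return chain


# Placeholder DiffIDs for layers A, B, C (hypothetical, not real layers).
a = digest(b"layer A")
b = digest(b"layer B")
c = digest(b"layer C")

print("ChainID(A)     =", chain_id([a]))
print("ChainID(A|B)   =", chain_id([a, b]))
print("ChainID(A|B|C) =", chain_id([a, b, c]))
```

Because each step hashes the previous ChainID together with the next DiffID, reordering or dropping a lower layer changes the result, which is the property the Explanation section relies on when it states that `ChainID(C) != ChainID(A|B|C)`.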