From 37ec118098a864340b45cba3105901cc0a17c9a3 Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Fri, 23 Sep 2016 14:34:26 -0700
Subject: [PATCH] layer: Require ustar (IEEE Std 1003.1-2013)

The idea with a spec like this is to define behavior so that different
implementations can interoperate reliably. When there is an interop
problem between implementations A and B, it should be clear from the
spec whether implementation A is broken, implementation B is broken,
or the spec is insufficiently clear.

For example, pax defines 'g' and 'x' typeflags [1] that aren't part of
the older ustar (originally defined in IEEE Std 1003.1-1988) [2].
Before this commit, if implementation A produced a layer with a 'g' or
'x' typeflag and implementation B died unpacking it, it was unclear
whose fault it was. With this commit, it is clearly A's fault
(because it is using features not defined for ustar). If
implementation A had produced a layer with a '2' typeflag (which ustar
specifies for symlinks) and implementation B died unpacking it, it is
B's fault. If implementation A had produced a layer with an 'S'
typeflag (which GNU uses for sparse files [3]) and implementation B
died unpacking it, it is neither party's fault, because ustar
explicitly makes those values implementation-defined. Interop around
them is up to out-of-band communication between the layer author and
layer consumer, and is not covered by this spec.

The previous "File Types" section listed sockets, but the ustar spec
has:

  Attempts to archive a socket using ustar interchange format shall
  produce a diagnostic message.

And I see no socket entry in Go's set of typeflag constants [4], so
I'm not sure how they were supported before.

Go has supported pax since v1.1 [5,6], and pax lets you do things
(like having symlink targets longer than 100 characters). But we're
avoiding requiring support for PAX because of name-recognition issues
[7].
[1]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02
[2]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#POSIX_ustar_Archives
[3]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#gnu-tar-archives
[4]: https://golang.org/pkg/archive/tar/#pkg-constants
[5]: https://codereview.appspot.com/6700047
[6]: https://github.com/golang/go/commit/106827990466b9246f9395233882b1e816df398a
[7]: https://github.com/opencontainers/image-spec/pull/342#issuecomment-249518142

Signed-off-by: W. Trevor King
---
 layer.md | 87 ++++++++++-----------------------------------------------
 1 file changed, 15 insertions(+), 72 deletions(-)

diff --git a/layer.md b/layer.md
index 97c6bf3c6..df03b050e 100644
--- a/layer.md
+++ b/layer.md
@@ -6,8 +6,14 @@ This document will use a concrete example to illustrate how to create and consum
 
 ## Distributable Format
 
-Layer Changesets for the [mediatype](./media-types.md) `application/vnd.oci.image.layer.tar+gzip` MUST be packaged in a [tar archive][tar-archive] compressed with [gzip][gzip].
-Layer Changesets for the [mediatype](./media-types.md) `application/vnd.oci.image.layer.tar+gzip` MUST NOT include duplicate entries for file paths in the resulting [tar archive][tar-archive].
+Layer changesets have the [media type](media-types.md) `application/vnd.oci.image.layer.tar+gzip`.
+Layer changesets SHOULD be packaged in the ustar interchange format, as [specified by IEEE Std 1003.1-2013][ustar], and MUST be compressed using gzip, as [specified by RFC 1952][rfc1952].
+Layer changesets MUST NOT include duplicate entries for target paths (the value computed from `prefix` and `name` in the ustar header).
+
+Implementations consuming layer changesets MUST be able to unpack both gzip and ustar.
+Portable layers SHOULD NOT use features that [ustar][] leaves unspecified, undefined, or implementation-defined.
+For example, pax [extends ustar by specifying `typeflag` values `g` and `x`][pax-header], so support for unpacking such entries may be mixed.
+[Sparse files](https://en.wikipedia.org/wiki/Sparse_file) SHOULD NOT be used because they [are a GNU extension][tar.5-gnu].
 
 ## Change Types
 
@@ -21,69 +27,6 @@ Additions and Modifications are represented the same in the changeset tar archiv
 
 Removals are represented using "[whiteout](#whiteouts)" file entries (See [Representing Changes](#representing-changes)).
 
-### File Types
-
-Throughout this document section, the use of word "files" or "entries" includes:
-
-* regular files
-* directories
-* sockets
-* symbolic links
-* block devices
-* character devices
-* FIFOs
-
-### File Attributes
-
-Where supported, MUST include file attributes for Additions and Modifications include:
-
-* Modification Time (`mtime`)
-* User ID (`uid`)
-  * User Name (`uname`) *secondary to `uid`*
-* Group ID (`gid `)
-  * Group Name (`gname`) *secondary to `gid`*
-* Mode (`mode`)
-* Extended Attributes (`xattrs`)
-* Symlink reference (`linkname` + symbolic link type)
-* [Hardlink](#hardlinks) reference (`linkname`)
-
-[Sparse files](https://en.wikipedia.org/wiki/Sparse_file) SHOULD NOT be used because they lack consistent support across tar implementations.
-
-#### Hardlinks
-
-Hardlinks are a [POSIX concept](http://pubs.opengroup.org/onlinepubs/9699919799/functions/link.html) for having one or more directory entries for the same file on the same device.
-Not all filesystems support hardlinks (e.g. [FAT](https://en.wikipedia.org/wiki/File_Allocation_Table)).
-
-Hardlinks are possible with all [file types](#file-types) except `directories`.
-Non-directory files are considered "hardlinked" when their link count is greater than 1.
-Hardlinked files are on a same device (i.e. comparing Major:Minor pair) and have the same inode.
-The corresponding files that share the link with the > 1 linkcount may be outside the directory that the changeset is being produced from, in which case the `linkname` is not recorded in the changeset.
-
-Hardlinks are stored in a tar archive with type of a `1` char, per the [GNU Basic Tar Format][gnu-tar-standard] and [libarchive tar(5)][libarchive-tar].
-
-While approaches to deriving new or changed hardlinks may vary, a possible approach is:
-
-```
-SET LinkMap to map[< Major:Minor String >]map[< inode integer >]< path string >
-SET LinkNames to map[< src path string >]< dest path string >
-FOR each path in root path
-    IF path type is directory
-        CONTINUE
-    ENDIF
-    SET filestat to stat(path)
-    IF filestat num of links == 1
-        CONTINUE
-    ENDIF
-    IF LinkMap[filestat device][filestat inode] is not empty
-        SET LinkNames[path] to LinkMap[filestat device][filestat inode]
-    ELSE
-        SET LinkMap[filestat device][filestat inode] to path
-    ENDIF
-END FOR
-```
-
-With this approach, the link map and links names of a directory could be compared against that of another directory to derive additions and changes to hardlinks.
-
 ## Creating
 
 ### Initial Root Filesystem
@@ -112,7 +55,7 @@ rootfs-c9d-v1/
         my-app-tools
 ```
 
-The `rootfs-c9d-v1` directory is then created as a plain [tar archive][tar-archive] with relative path to `rootfs-c9d-v1`.
+The `rootfs-c9d-v1` directory is then created as a plain [tar archive][ustar] with paths relative to `rootfs-c9d-v1`.
 Entries for the following files:
 
 ```
@@ -127,7 +70,7 @@ Entries for the following files:
 ### Populate a Comparison Filesystem
 
 Create a new directory and initialize it with a copy or snapshot of the prior root filesystem.
-Example commands that can preserve [file attributes](#file-attributes) to make this copy are:
+Example commands that can preserve [file attributes][ustar] to make this copy are:
 * [cp(1)](http://linux.die.net/man/1/cp): `cp -a rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
 * [rsync(1)](http://linux.die.net/man/1/rsync): `rsync -aHAX rootfs-c9d-v1/ rootfs-c9d-v1.s1/`
 * [tar(1)](http://linux.die.net/man/1/tar): `mkdir rootfs-c9d-v1.s1 && tar --acls --xattrs -C rootfs-c9d-v1/ -c . | tar -C rootfs-c9d-v1.s1/ --acls --xattrs -x` (including `--selinux` where supported)
@@ -188,7 +131,7 @@ This reflects the removal of `/etc/my-app-config` and creation of a file and dir
 
 ### Representing Changes
 
-A [tar archive][tar-archive] is then created which contains *only* this changeset:
+A [tar archive][ustar] is then created which contains *only* this changeset:
 
 - Added and modified files and directories in their entirety
 - Deleted files or directories marked with a [whiteout file](#whiteouts)
@@ -315,7 +258,7 @@ Layers that have these restrictions SHOULD be tagged with an alternative mediaty
 [Descriptors](descriptor.md) referencing these layers MAY include `urls` for downloading these layers.
 It is implementation-defined whether or not implementations upload layers tagged with this media type.
 
-[libarchive-tar]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#POSIX_ustar_Archives
-[gnu-tar-standard]: http://www.gnu.org/software/tar/manual/html_node/Standard.html
-[tar-archive]: https://en.wikipedia.org/wiki/Tar_(computing)
-[gzip]: http://www.zlib.org/rfc-gzip.html
+[ustar]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06
+[pax-header]: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_02
+[rfc1952]: https://tools.ietf.org/html/rfc1952
+[tar.5-gnu]: https://github.com/libarchive/libarchive/wiki/ManPageTar5#gnu-tar-archives
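
As a rough illustration of the interop rules this patch describes, the Python sketch below walks the raw 512-byte headers of a gzipped layer. It reports entries whose `typeflag` ustar does not define (such as pax's `g` and `x`, or GNU's `S`), plus duplicate target paths computed from `prefix` and `name`. It is not part of this patch or of any OCI tooling: `audit_layer` and `make_layer` are hypothetical names, the field offsets are the ustar header layout, and treating NUL and `0` through `7` as the ustar-defined typeflags is an assumption based on my reading of the standard. It also assumes octal size fields, so GNU base-256 sizes are out of scope.

```python
import gzip
import io
import tarfile

BLOCK = 512
# Assumption: ustar gives a meaning to NUL and '0' through '7'.
# 'A'-'Z' are left to implementations; 'g' and 'x' are pax additions.
USTAR_TYPEFLAGS = set(b"01234567") | {0}


def _field(block, start, length):
    """Return a NUL-terminated header field decoded as text."""
    raw = block[start:start + length].split(b"\0", 1)[0]
    return raw.decode("utf-8", "surrogateescape")


def audit_layer(gzipped):
    """Return (non_ustar_entries, duplicate_paths) for a gzipped tar blob."""
    data = gzip.decompress(gzipped)
    odd_flags, seen, duplicates = [], set(), []
    offset = 0
    while offset + BLOCK <= len(data):
        block = data[offset:offset + BLOCK]
        if block == bytes(BLOCK):  # an all-zero block marks end of archive
            break
        name = _field(block, 0, 100)
        size = int(_field(block, 124, 12).strip() or "0", 8)
        typeflag = block[156]
        prefix = _field(block, 345, 155)
        # The target path is computed from the prefix and name fields.
        path = prefix + "/" + name if prefix else name
        if typeflag not in USTAR_TYPEFLAGS:
            odd_flags.append((path, chr(typeflag)))
        elif path in seen:
            duplicates.append(path)
        else:
            seen.add(path)
        # Skip the header block plus the data, padded to a block boundary.
        offset += BLOCK + (size + BLOCK - 1) // BLOCK * BLOCK
    return odd_flags, duplicates


def make_layer(names, fmt):
    """Build a small gzipped tar in memory (demo helper only)."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=fmt) as tar:
        for name in names:
            payload = b"hello"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return gzip.compress(raw.getvalue())


if __name__ == "__main__":
    layer = make_layer(["etc/a", "etc/b", "etc/a"], tarfile.USTAR_FORMAT)
    print(audit_layer(layer))
```

A layer producer could run a check like this before publishing. A consumer that hits a reported entry would know the layer relies on behavior outside what ustar defines, rather than having to guess which side is at fault.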