Add a test that important files come early in extracted images #1964
In a few places, we make some assumptions that calling ggcr's mutate.Extract and scanning for specific files will not end up reading the entire image contents.
For a refresher, mutate.Extract returns a ReadCloser that can be passed to tar.NewReader and iterated over to get file contents in a layered image. It reads the top-most layer first, in order, collects .wh. files to possibly whiteout lower-layered file paths, then proceeds down the layers until the end.
It's relatively common to Extract, iterate through files until you find one you like, and rc.Close(), which exits the goroutine. If you were reading a remote.Image over the wire from the registry, or a local tarball.Image or daemon.Image, you'll only read layers until you find the one you care about, and can save quite a lot of time and resources if the file you care about is "early" in the extracted tar.
Apko purposefully puts its "first-party" files (those not from packages) into the top-most layer, upfront in the tar, to make these operations as cheap as possible.
This has historically just been a convention, but it's a nice convention, and I thought it might be useful to have a test that ensures we don't accidentally undo that in the future, e.g., if we refactor how layering works, or add other layering strategies.
The test as-is checks that these files are found in the first ~4KB of the image, rather than at the end of a possibly gigantic tar stream.