[nydus] Add empty layer in OCI image when merging platforms #352

Merged
imeoer merged 2 commits into goharbor:main from Fricounet:fricounet/empty-layer-oci on Sep 22, 2025

@Fricounet
Contributor

Add an empty tar.gz layer at the beginning of the OCI image that's referenced in the nydus merged manifest, compared to the original OCI one. Because it is the first layer, this empty layer forces the OCI image to have completely different chainIDs from its original counterpart. This is done so that containerd and other runtimes don't try to reuse the OCI layers for other images that share layers with the original OCI variant. The drawback is that nydus-merged images can't share layers with pure OCI images when they are pulled on non-nydus clients.
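
To illustrate why a new first layer changes every chainID, here is a minimal Go sketch of the chainID recurrence from the OCI image spec. The three layer diffIDs are made-up placeholders, and `digestOf` and `chainIDs` are illustrative helper names, not functions from nydusify or containerd.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// digestOf returns the sha256 digest of s in "sha256:<hex>" form.
func digestOf(s string) string {
	return fmt.Sprintf("sha256:%x", sha256.Sum256([]byte(s)))
}

// chainIDs applies the OCI image-spec recurrence:
//
//	ChainID(L0)     = DiffID(L0)
//	ChainID(L0..Ln) = Digest(ChainID(L0..Ln-1) + " " + DiffID(Ln))
func chainIDs(diffIDs []string) []string {
	out := make([]string, len(diffIDs))
	for i, d := range diffIDs {
		if i == 0 {
			out[i] = d
			continue
		}
		out[i] = digestOf(out[i-1] + " " + d)
	}
	return out
}

func main() {
	// Hypothetical diffIDs standing in for the layers of the original OCI image.
	original := []string{digestOf("layer-0"), digestOf("layer-1"), digestOf("layer-2")}

	// DiffID of an empty tar archive (the value quoted in the test output of this PR).
	emptyDiffID := "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
	merged := append([]string{emptyDiffID}, original...)

	fmt.Println("original chain:")
	for _, c := range chainIDs(original) {
		fmt.Println("  ", c)
	}
	fmt.Println("chain with empty first layer:")
	for _, c := range chainIDs(merged) {
		fmt.Println("  ", c)
	}
}
```

Since each chainID hashes the previous one, the two chains have no entry in common even though every original diffID is still present in the second image.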

An empty tar.gz layer was chosen as the new first layer because:

  • this is a format understood by every runtime (on the other hand, docker doesn't understand layers that use application/vnd.oci.empty.v1+json)
  • an empty layer won't change the unpacked digest of the final image
  • it shouldn't be possible for a regular image (not built using nydus merge-manifest) to start with an empty tar.gz layer. Such an image would need to be based on scratch, and any layer added on top of scratch necessarily changes something on the filesystem, so it can't be an empty change
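
As a rough sketch of what such an empty layer looks like, the Go program below builds a tar archive with no entries, gzips it, and computes the diffID (the sha256 of the uncompressed bytes). It uses only the standard library. Whether nydusify builds its empty layer exactly this way is an assumption, and `emptyTar`, `gzipBytes` and `diffID` are illustrative names.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"io"
)

// emptyTar returns the bytes of a tar archive that contains no entries.
// Closing the writer emits only the end-of-archive footer.
func emptyTar() ([]byte, error) {
	var buf bytes.Buffer
	if err := tar.NewWriter(&buf).Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gzipBytes compresses data with gzip at the default compression level.
func gzipBytes(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// diffID returns the sha256 digest of the uncompressed content of a gzip blob.
func diffID(gz []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(gz))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	h := sha256.New()
	if _, err := io.Copy(h, zr); err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil)), nil
}

func main() {
	raw, err := emptyTar()
	if err != nil {
		panic(err)
	}
	gz, err := gzipBytes(raw)
	if err != nil {
		panic(err)
	}
	id, err := diffID(gz)
	if err != nil {
		panic(err)
	}
	fmt.Println("uncompressed size:", len(raw))
	fmt.Println("diffID:", id)
}
```

The diffID depends only on the uncompressed bytes, so it stays the same whatever gzip settings are used, while the compressed blob digest may differ between gzip implementations.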

Testing

Testing this change should show that the final nydus image doesn't cause layer reuse issues:

  • I built 2 images with shared layers:
    image-a:
FROM ubuntu:20.04

RUN echo "This is a shared base layer" > /shared-content.txt && \
    echo "Common configuration" > /etc/shared-config.conf && \
    mkdir -p /shared-data

RUN echo "#!/bin/bash" > /usr/local/bin/shared-script.sh && \
    echo "echo 'This script is in the shared layers'" >> /usr/local/bin/shared-script.sh && \
    chmod +x /usr/local/bin/shared-script.sh

WORKDIR /app

docker buildx build --push . -f a.dockerfile -t localhost:5000/image-a

image-b:

FROM ubuntu:20.04

RUN echo "This is a shared base layer" > /shared-content.txt && \
    echo "Common configuration" > /etc/shared-config.conf && \
    mkdir -p /shared-data

RUN echo "#!/bin/bash" > /usr/local/bin/shared-script.sh && \
    echo "echo 'This script is in the shared layers'" >> /usr/local/bin/shared-script.sh && \
    chmod +x /usr/local/bin/shared-script.sh

WORKDIR /app

# NYDUS-SPECIFIC content (should NOT appear in OCI image)
RUN echo "This content belongs ONLY to the nydus image" > /nydus-specific.txt && \
    echo "NYDUS_APP=true" > /etc/nydus-env.conf && \
    mkdir -p /nydus-only-data && \
    echo "nydus-secret-data" > /nydus-only-data/secret.txt

# More NYDUS-SPECIFIC content
RUN echo "#!/bin/bash" > /usr/local/bin/nydus-only-script.sh && \
    echo "echo 'This script should ONLY be in nydus image'" >> /usr/local/bin/nydus-only-script.sh && \
    echo "echo 'If you see this in OCI image, there is a bug!'" >> /usr/local/bin/nydus-only-script.sh && \
    chmod +x /usr/local/bin/nydus-only-script.sh

CMD ["/usr/local/bin/nydus-only-script.sh"]

docker buildx build --push . -f b.dockerfile -t localhost:5000/image-b

Converting image-b to nydus with the current release of nydusify:

nydusify convert --source localhost:5000/image-b:latest --target-suffix -nydus-broken --merge-platform

Then pulling both images with containerd:

$ crictl pull localhost:5000/image-b:latest-nydus-broken
Image is up to date for sha256:2a4a57339629c3a53d5d044da23ab0979d166f849ac28520eb3af1e7c2832855

$ crictl pull localhost:5000/image-a:latest
Image is up to date for sha256:d05f40f709a10b870d375bae75fd943a703bd887a455578a3e3bb29c304f0fa8

$ ctr snapshots tree
 sha256:470b66ea5123c93b0d5606e4213bf9e47d3d426b640d32472e4ac213186c4bb6
  \_ sha256:034b1b090fefd084928e43e3a32d024041ffa59f8307a7513a0b46c53a3a31c3
    \_ sha256:c65e7e76437bad248b44d8ea4196ab440cc5b231b7c38809e00fce0ac37f8cd7
      \_ sha256:9a9712ef67bac530a8428adb7e4e0ae4e0fa23602f0f14e19bd087961614e97f
        \_ sha256:c930a734441e91855717e2b7c1bdf9fc2ec13296334f7fa0ea86063342941122
          \_ sha256:06cfbab4be39d591fd0542df0a3e026c888f8d720d95ccb7a80ad014b685da70

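The result is that both images share layers: there is a single snapshot chain, and image-a reuses the first four snapshots created for the nydus image's OCI layers.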

On the other hand, if we use a nydusify built with these changes:

./cmd/nydusify convert --source localhost:5000/image-b:latest --target-suffix -nydus --merge-platform

When we pull both images:

$ crictl pull localhost:5000/image-b:latest-nydus
Image is up to date for sha256:a2972175f46d9015bdbc6d92fee2f2f090b8efecaaf418c6b0a5ab577d862c5f

$ crictl pull localhost:5000/image-a:latest
Image is up to date for sha256:d05f40f709a10b870d375bae75fd943a703bd887a455578a3e3bb29c304f0fa8

$ ctr snapshots tree
 sha256:470b66ea5123c93b0d5606e4213bf9e47d3d426b640d32472e4ac213186c4bb6
  \_ sha256:034b1b090fefd084928e43e3a32d024041ffa59f8307a7513a0b46c53a3a31c3
    \_ sha256:c65e7e76437bad248b44d8ea4196ab440cc5b231b7c38809e00fce0ac37f8cd7
      \_ sha256:9a9712ef67bac530a8428adb7e4e0ae4e0fa23602f0f14e19bd087961614e97f
 sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef
  \_ sha256:d278c8223e446d9d48226816284703bc9f5bc740596f13cfaee8b0f1926d475f
    \_ sha256:0d2b93084b15a90b4094675332e316d1f03ed7e347db966ba440c97ec7ec39d2
      \_ sha256:1c7c78e8f561e7aeb383e2ac5498946db43a0042ab1d74ad12b4fe94308b378c
        \_ sha256:ab6d7b03ec1e9b7a565027ad657af410183f9a1a10be8e24fe30b430f11d10d1
          \_ sha256:c9d67654a4c9f3e9e9c2f35be0cda03b206fa45cdf7085766d648c00ba4d3d24
            \_ sha256:f5d2915eb668300ef855de26612e74001575510fbd99017d6b7cec8d0504a4fa

The first tree corresponds to image-a (the digests match those above).
The second tree is the nydus image, which now has different digests and 1 additional layer whose chainID 5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef matches the value we expect from the empty tar.gz.

Add an empty tar.gz layer at the beginning of the OCI image that's referenced in the nydus merged manifest, compared to the original OCI one. Because it is the first layer, this empty layer forces the OCI image to have completely different chainIDs from its original counterpart. This is done so that containerd and other runtimes don't try to reuse the OCI layers for other images that share layers with the original OCI variant.
The drawback is that nydus-merged images can't share layers with pure OCI images when they are pulled on non-nydus clients.

An empty tar.gz layer was chosen as the new first layer because:
- this is a format understood by every runtime (on the other hand, docker doesn't understand layers that use `application/vnd.oci.empty.v1+json`)
- an empty layer won't change the unpacked digest of the final image
- it shouldn't be possible for a regular image (not built using nydus merge-manifest) to start with an empty tar.gz layer. Such an image would need to be based on `scratch`, and any layer added on top of `scratch` necessarily changes something on the filesystem, so it can't be an empty change

Signed-off-by: Baptiste Girard-Carrabin <baptiste.girardcarrabin@datadoghq.com>
@Fricounet Fricounet force-pushed the fricounet/empty-layer-oci branch from 9ad6d04 to 7413411 Compare September 17, 2025 11:23
@Fricounet
Contributor Author

Maybe I should hide this behind an additional flag?

emptyDescriptor := ocispec.Descriptor{
	MediaType: emptyLayerMediaType,
	Digest:    digest.FromBytes(emptyDescriptorBytes),
	Size:      int64(len(emptyDescriptorBytes)),
Collaborator

Maybe we should add an annotation to indicate that the layer is a special empty layer? (not block this PR :))

Contributor Author

@imeoer good point, I've added it in 7e7bbfd as it was easy to do

Collaborator

@imeoer imeoer left a comment

Thanks, LGTM!

@imeoer
Collaborator

imeoer commented Sep 19, 2025

> Maybe I should hide this behind an additional flag?

It's okay if we only apply this on mergeManifest.

Set the annotation `containerd.io/snapshot/nydus-empty-layer: "true"` on the empty layer that's added to the OCI image, so that it can be found easily.

Signed-off-by: Baptiste Girard-Carrabin <baptiste.girardcarrabin@datadoghq.com>
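
For anyone who wants to locate that layer in a manifest, a minimal sketch in Go follows. It only assumes the standard OCI manifest JSON layout (`layers[].annotations`) and the annotation key from the commit message above. The sample manifest, its digests and sizes, and the names `descriptor`, `manifest` and `findEmptyLayer` are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Annotation key taken from the commit message in this PR.
const emptyLayerAnnotation = "containerd.io/snapshot/nydus-empty-layer"

// descriptor and manifest model only the manifest fields needed here.
type descriptor struct {
	MediaType   string            `json:"mediaType"`
	Digest      string            `json:"digest"`
	Size        int64             `json:"size"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

type manifest struct {
	Layers []descriptor `json:"layers"`
}

// findEmptyLayer returns the index of the first layer whose annotation is
// set to "true", or -1 if no layer carries it.
func findEmptyLayer(manifestJSON []byte) (int, error) {
	var m manifest
	if err := json.Unmarshal(manifestJSON, &m); err != nil {
		return -1, err
	}
	for i, l := range m.Layers {
		if l.Annotations[emptyLayerAnnotation] == "true" {
			return i, nil
		}
	}
	return -1, nil
}

// sampleManifest is a hypothetical manifest; digests and sizes are placeholders.
const sampleManifest = `{
  "schemaVersion": 2,
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:0000000000000000000000000000000000000000000000000000000000000000",
      "size": 32,
      "annotations": {"containerd.io/snapshot/nydus-empty-layer": "true"}
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:1111111111111111111111111111111111111111111111111111111111111111",
      "size": 1234
    }
  ]
}`

func main() {
	idx, err := findEmptyLayer([]byte(sampleManifest))
	if err != nil {
		panic(err)
	}
	fmt.Println("empty layer index:", idx)
}
```

With the change in this PR the annotated layer is expected to be the first one, so an index other than 0 on a merged image would be worth investigating.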
@Fricounet Fricounet force-pushed the fricounet/empty-layer-oci branch from 287a45b to 7e7bbfd Compare September 19, 2025 11:44
@imeoer imeoer merged commit e62d1e0 into goharbor:main Sep 22, 2025
9 checks passed
@imeoer
Collaborator

imeoer commented Sep 22, 2025

@Fricounet Thanks! Tagged github.com/goharbor/acceleration-service v0.2.21.
