|
| 1 | +# Blockfile Snapshotter |
| 2 | + |
| 3 | +The blockfile snapshotter uses raw block files for each snapshot. Block files are |
| 4 | +copied from a parent or base empty block file. Mounting requires a virtual machine |
| 5 | +or support for loopback mounts. |
| 6 | + |
| 7 | +## Use Case |
| 8 | + |
| 9 | +A snapshotter extracts an image from the OCI image store and creates a snapshot |
| 10 | +that containers can use. It sets up the underlying infrastructure, such as |
| 11 | +preparing a directory or other filesystem, applies the layers to create a single |
| 12 | +mountable directory that serves as the container base, and mounts it into the |
| 13 | +container on start. |
| 14 | + |
| 15 | +The most commonly used snapshotter is the overlayfs snapshotter, which is the default |
| 16 | +in containerd. The overlayfs snapshotter provides a directory on the host filesystem, |
| 17 | +which is then bind-mounted into the container. |
| 18 | + |
| 19 | +The blockfile snapshotter targets a use case where the container will run inside a |
| 20 | +VM. Specifically, the OCI image will be the filesystem for the container, like with |
| 21 | +a normal container, but the container itself will be run inside a VM. |
| 22 | +Since the VM cannot bind-mount directories from the host, the blockfile snapshotter |
| 23 | +creates a block device for the snapshot, which can be attached to the VM as a block |
| 24 | +device to facilitate getting the contents into the guest. |
| 25 | + |
| 26 | +## Alternatives |
| 27 | + |
| 28 | +There are alternatives to the blockfile snapshotter for mounting directories into a |
| 29 | +VM. One alternative is a [virtiofs](https://virtio-fs.gitlab.io) driver, |
| 30 | +assuming your VMM supports it. Similarly, you can use |
| 31 | +[9p](https://www.kernel.org/doc/Documentation/filesystems/9p.txt) to mount a local |
| 32 | +directory into the VM, assuming your VMM supports it. |
| 33 | + |
| 34 | +Additionally, the [devicemapper snapshotter](./devmapper.md) can be used to create |
| 35 | +snapshots on filesystem images in a devicemapper thin-pool. |
| 36 | + |
| 37 | +## Usage |
| 38 | + |
| 39 | +### Checking if the blockfile snapshotter is available |
| 40 | + |
| 41 | +To check if the blockfile snapshotter is available, run the following command: |
| 42 | + |
| 43 | +```bash |
| 44 | +$ ctr plugins ls | grep blockfile |
| 45 | +``` |
| 46 | + |
| 47 | +### Configuration |
| 48 | + |
| 49 | +To configure the snapshotter, you can use the following configuration options |
| 50 | +in your containerd `config.toml`. Restart containerd after changing the |
| 51 | +configuration. |
| 52 | + |
| 53 | +```toml |
| 54 | + [plugins.'io.containerd.snapshotter.v1.blockfile'] |
| 55 | + scratch_file = "/opt/containerd/blockfile" |
| 56 | + root_path = "/somewhere/on/disk" |
| 57 | + fs_type = 'ext4' |
| 58 | + mount_options = [] |
| 59 | + recreate_scratch = true |
| 60 | +``` |
| 61 | + |
| 62 | +- `root_path`: The directory where the block files are stored. This directory must be writable by the containerd process. |
| 63 | +- `scratch_file`: The path to the empty file that will be used as the base for the block files. This file should exist before first using the snapshotter. |
| 64 | +- `fs_type`: The filesystem type to use for the block files. Currently supported are `ext4` and `xfs`. |
| 65 | +- `mount_options`: Additional mount options to use when mounting the block files. |
| 66 | +- `recreate_scratch`: If set to `true`, the snapshotter will recreate the scratch file if it is missing. If set to `false`, the snapshotter will fail if the scratch file is missing. |
| 67 | + |
| 68 | +### Creating the scratch file |
| 69 | + |
| 70 | +You can create a scratch file as follows. This example uses a 500MB scratch file. |
| 71 | + |
| 72 | +```bash |
| 73 | +$ # make a 500M file |
| 74 | +$ dd if=/dev/zero of=/opt/containerd/blockfile bs=1M count=500 |
| 75 | +500+0 records in |
| 76 | +500+0 records out |
| 77 | +524288000 bytes (524 MB, 500 MiB) copied, 1.76253 s, 297 MB/s |
| 78 | + |
| 79 | +$ # format the file with ext4 |
| 80 | +$ sudo mkfs.ext4 /opt/containerd/blockfile |
| 81 | +mke2fs 1.47.0 (5-Feb-2023) |
| 82 | +Discarding device blocks: done |
| 83 | +Creating filesystem with 512000 1k blocks and 128016 inodes |
| 84 | +Filesystem UUID: d9947ecc-722d-4627-9cf9-fa2a3b622106 |
| 85 | +Superblock backups stored on blocks: |
| 86 | + 8193, 24577, 40961, 57345, 73729, 204801, 221185, 401409 |
| 87 | + |
| 88 | +Allocating group tables: done |
| 89 | +Writing inode tables: done |
| 90 | +Creating journal (8192 blocks): done |
| 91 | +Writing superblocks and filesystem accounting information: done |
| 92 | +``` |
| 93 | + |
| 94 | +### Running a container |
| 95 | + |
| 96 | +To run a container using the blockfile snapshotter, you need to specify the |
| 97 | +snapshotter: |
| 98 | + |
| 99 | +```bash |
| 100 | +$ # ensure that the image we are using exists; it is a regular OCI image |
| 101 | +$ ctr image pull docker.io/library/busybox:latest |
| 102 | +$ # run the container with the blockfile snapshotter |
| 103 | +$ ctr run --rm -t --snapshotter blockfile docker.io/library/busybox:latest hello sh |
| 104 | +``` |
| 105 | + |
| 106 | +Using it via the Go client API is identical to using any other snapshotter: |
| 107 | + |
| 108 | +```go |
| 109 | +import ( |
| 110 | + "context" |
| 111 | + "github.com/containerd/containerd" |
| 112 | + "github.com/containerd/containerd/snapshots" |
| 113 | +) |
| 114 | + |
| 115 | +// create a new client |
| 116 | +client, err := containerd.New("/run/containerd/containerd.sock") |
| 117 | +snapshotter := "blockfile" |
| 118 | +cOpts := []containerd.NewContainerOpts{ |
| 119 | + containerd.WithImage(image), |
| 120 | + containerd.WithImageConfigLabels(image), |
| 121 | + containerd.WithAdditionalContainerLabels(labels), |
| 122 | + containerd.WithSnapshotter(snapshotter), |
| 123 | +} |
| 124 | +container, err := client.NewContainer(ctx, containerID, cOpts...) |
| 125 | +``` |
| 126 | + |
| 127 | +## How It Works |
| 128 | + |
| 129 | +The blockfile snapshotter functions similarly to other snapshotters. |
| 130 | +It unpacks each individual layer from a container image, with each layer unpack |
| 131 | +building on the content from its parent(s). |
| 132 | + |
| 133 | +The blockfile snapshotter is unique in two ways: |
| 134 | + |
| 135 | +1. It applies layers inside a disk image file, rather than on the host filesystem. |
| 136 | +1. It creates a block image file for each layer, starting from a copy of the previous layer's file and applying the new layer on top of it. |
| 137 | + |
| 138 | +Rather than a single directory with the contents, the end result of the blockfile |
| 139 | +snapshotter's process is a single file that holds the contents of the full |
| 140 | +filesystem image. That image file can be loopback-mounted or attached to a virtual |
| 141 | +machine. |
| 142 | + |
| 143 | +For every layer the snapshotter creates a new blockfile, starting with a copy of the |
| 144 | +blockfile from the previous layer. If there is no previous layer, i.e. for the first |
| 145 | +layer, it copies the scratch file. |
| 146 | + |
| 147 | +For example, for an image with 3 layers - called A, B, C - the process is as follows: |
| 148 | + |
| 149 | +1. Layer A: |
| 150 | + 1. Copy the scratch file to a new blockfile for layer A. |
| 151 | + 1. Loopback-mount the blockfile for layer A. |
| 152 | + 1. Apply layer A to the mount. |
| 153 | + 1. Unmount the blockfile for layer A. |
| 154 | +1. Layer B: |
| 155 | + 1. Copy the blockfile for layer A to a new blockfile for layer B. |
| 156 | + 1. Loopback-mount the blockfile for layer B. |
| 157 | + 1. Apply layer B to the mount. |
| 158 | + 1. Unmount the blockfile for layer B. |
| 159 | +1. Layer C: |
| 160 | + 1. Copy the blockfile for layer B to a new blockfile for layer C. |
| 161 | + 1. Loopback-mount the blockfile for layer C. |
| 162 | + 1. Apply layer C to the mount. |
| 163 | + 1. Unmount the blockfile for layer C. |
| 164 | + |
| 165 | +Each unpack of a layer builds upon the contents of the previous layers into a new |
| 166 | +blockfile. This completes with the final blockfile containing the full filesystem |
| 167 | +image. |
| 168 | + |
| 169 | +As a result of the process, each layer leads to another blockfile in the system: |
| 170 | + |
| 171 | +1. Layer A blockfile: contents of layer A |
| 172 | +1. Layer B blockfile: contents of layer A + layer B |
| 173 | +1. Layer C blockfile: contents of layer A + layer B + layer C |
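The copy-then-apply chain described above can be sketched in a few lines of Go. This is only an in-memory model, not the snapshotter's implementation: the `blockfile` type and the `prepare` and `unpack` functions are hypothetical names introduced for illustration, and the loopback mount and unmount steps are left out.

```go
package main

import "fmt"

// blockfile is a hypothetical stand-in for a disk image file. It only
// records which layers have been applied to it.
type blockfile struct {
	name   string
	layers []string
}

// prepare models one unpack step: copy the parent blockfile, then apply a
// single layer to the copy. The parent is left untouched.
func prepare(parent blockfile, layer string) blockfile {
	applied := make([]string, len(parent.layers), len(parent.layers)+1)
	copy(applied, parent.layers)
	return blockfile{name: "blockfile-" + layer, layers: append(applied, layer)}
}

// unpack walks the layers in order. The first layer starts from the scratch
// file, and every later layer starts from the previous layer's blockfile.
func unpack(scratch blockfile, layers []string) []blockfile {
	out := make([]blockfile, 0, len(layers))
	parent := scratch
	for _, l := range layers {
		parent = prepare(parent, l)
		out = append(out, parent)
	}
	return out
}

func main() {
	for _, bf := range unpack(blockfile{name: "scratch"}, []string{"A", "B", "C"}) {
		fmt.Println(bf.name, bf.layers)
	}
}
```

Each returned entry corresponds to one blockfile in the list above, with the last one carrying all three layers.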
| 174 | + |
| 175 | +If the underlying filesystem and the host OS support sparse files, the process |
| 176 | +uses them. This means that the blockfiles only take up the space required for |
| 177 | +the actual content. |
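To see what sparse file support means in practice, the following Go sketch creates an empty file of a given logical size and compares it with the space actually allocated on disk. It is an illustration for Unix-like hosts only and is not taken from the snapshotter: `sparseDemo` is a hypothetical helper, and the allocated size is read from `syscall.Stat_t`, whose `Blocks` field counts 512-byte units.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// sparseDemo creates a temporary file, extends it to size bytes without
// writing any data, and reports its logical size and the bytes allocated
// on disk. The temporary file is removed before returning.
func sparseDemo(size int64) (logical, allocated int64, err error) {
	f, err := os.CreateTemp("", "blockfile-demo-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	// Truncate grows the file; on filesystems with sparse file support
	// no data blocks are allocated for the new range.
	if err := f.Truncate(size); err != nil {
		return 0, 0, err
	}
	fi, err := f.Stat()
	if err != nil {
		return 0, 0, err
	}
	logical = fi.Size()
	// Assumption: a Unix-like host where Sys() returns *syscall.Stat_t.
	if st, ok := fi.Sys().(*syscall.Stat_t); ok {
		allocated = int64(st.Blocks) * 512
	}
	return logical, allocated, nil
}

func main() {
	logical, allocated, err := sparseDemo(500 << 20) // 500 MiB
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("logical: %d bytes, allocated: %d bytes\n", logical, allocated)
}
```

On a filesystem with sparse file support, the allocated figure should be far smaller than the logical one; the exact value depends on the filesystem.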
| 178 | + |
| 179 | +For example, if the scratch image is 500MB, and each layer adds 25MB, then the |
| 180 | +file sizes will be: |
| 181 | + |
| 182 | +1. Layer A blockfile: 25MB from layer A |
| 183 | +1. Layer B blockfile: 50MB from layer A and B |
| 184 | +1. Layer C blockfile: 75MB from layer A, B, and C |
| 185 | + |
| 186 | +Total space usage is thus 25+50+75=150MB. This is a fraction of the amount |
| 187 | +required if each layer's blockfile used the full 500MB, i.e. 1500MB in total. |
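The arithmetic above can be reproduced with a short Go sketch. It assumes, as the example does, that sparse files are in use and that filesystem metadata overhead is ignored; `cumulativeSizes` and `sum` are hypothetical helper names, not part of containerd.

```go
package main

import "fmt"

// cumulativeSizes returns the approximate on-disk size of each layer's
// blockfile, assuming every blockfile holds the content of all layers up
// to and including its own.
func cumulativeSizes(layerSizesMB []int) []int {
	sizes := make([]int, len(layerSizesMB))
	total := 0
	for i, s := range layerSizesMB {
		total += s
		sizes[i] = total
	}
	return sizes
}

// sum adds up a slice of sizes.
func sum(xs []int) int {
	total := 0
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	const scratchMB = 500
	layers := []int{25, 25, 25} // layers A, B, and C

	sizes := cumulativeSizes(layers)
	fmt.Println("per-blockfile MB:", sizes)                  // [25 50 75]
	fmt.Println("total with sparse files:", sum(sizes))      // 150
	fmt.Println("total without:", scratchMB*len(layers))     // 1500
}
```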