## Bug Description
On MicroK8s 1.35 with containerd 2.1.3, pulling images from certain OCI-compliant registries hangs indefinitely with no error output. The TCP connection is established and TLS completes successfully, but no data flows — containerd simply stalls waiting for a response that never arrives.
## Environment
- MicroK8s: 1.35
- containerd: 2.1.3
- Nodes: multi-node cluster (control plane + worker nodes)
- Issue appears on worker nodes pulling from external registries
## Symptoms

- `crictl pull` or `microk8s ctr images pull` hangs indefinitely with no error
- The OCI index resolves successfully ("already exists"), but platform manifest fetches stay stuck at "waiting"
- `ss` confirms the TCP connection to the registry is established
- `curl` against the same registry endpoint works correctly, returning proper HTTP responses
- No timeout, no error: just silence in the containerd logs
- Kubernetes Pods using an image from the affected registry stay Pending forever. Jobs hit the same problem: their init containers never complete, leading to `context deadline exceeded` in Helm pre-install hooks
## Root Cause

containerd 2.1 introduced a multipart layer fetch feature that sends `Range: bytes=0-N` HTTP headers to enable parallel downloads. Some registries respond with HTTP 200 (full content) rather than 206 Partial Content when they do not support, or choose to ignore, range requests.

containerd 2.1.3 does not handle this case: the fetch goroutines hang indefinitely waiting for a partial-content response that will never come. This is tracked upstream as containerd/containerd#11864.
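The failure mode can be sketched in a few lines of Python (this is illustrative only, not containerd's actual code): a client sends a `Range` header, the server ignores it and replies 200 with the full body, and the client must detect the ignored range rather than wait for partial-content semantics that will never apply.

```python
# Toy demonstration of a registry that ignores Range headers.
# All names here are illustrative; nothing is taken from containerd.
import http.server
import threading
import urllib.request

class IgnoresRange(http.server.BaseHTTPRequestHandler):
    """Server that always returns the full content with HTTP 200."""
    BODY = b"0123456789" * 10  # a 100-byte "layer"

    def do_GET(self):
        # A compliant server would answer 206 with a Content-Range
        # header; this one ignores the Range header entirely.
        self.send_response(200)
        self.send_header("Content-Length", str(len(self.BODY)))
        self.end_headers()
        self.wfile.write(self.BODY)

    def log_message(self, *args):  # silence request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), IgnoresRange)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

req = urllib.request.Request(
    f"http://127.0.0.1:{port}/blob", headers={"Range": "bytes=0-9"}
)
with urllib.request.urlopen(req) as resp:
    data = resp.read()
    # Check whether the range was honored; if not, fall back to
    # consuming the full body instead of stalling.
    range_honored = resp.status == 206

print(range_honored, len(data))  # -> False 100

server.shutdown()
```

This is essentially what the third upstream fix adds: an explicit check for a 200-where-206-was-expected response, so the client can fall back instead of hanging.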
Three fixes in the upstream release/2.1 branch are relevant:
| Upstream commit | Description | First included in |
|---|---|---|
| 34a1cb1dd | Deadlock: semaphore not released on error in `dockerFetcher.open()` | v2.1.4 |
| add2dcf86 | Fetcher doesn't always close response body and call `Release()` | v2.1.4 |
| ca3de4fe7 | Range-get request ignored by registry not surfaced as `errContentRangeIgnored` | v2.1.6 |
The third fix (`ca3de4fe7`) is the most directly relevant: it ensures that when a registry ignores the `Range` header and returns a full 200 response, containerd detects this and falls back gracefully rather than hanging.
## Workaround (confirmed working)

Create a per-host config for the affected registry under `$SNAP_DATA/args/certs.d/`:

```toml
# /var/snap/microk8s/current/args/certs.d/<registry-hostname>/hosts.toml
server = "https://<registry-hostname>"

[host."https://<registry-hostname>"]
  capabilities = ["pull", "resolve"]
  dial_timeout = "30s"
```

Then restart containerd:

```shell
sudo snap restart microk8s.daemon-containerd
```
## Suggested Fix

Bump the containerd version in `build-scripts/components/containerd/version.sh` from v2.1.3 to v2.1.6 (released 2025-12-17):

```diff
-echo "v2.1.3"
+echo "v2.1.6"
```

No patch changes are needed. The existing `patches/v2.1.3/` directory is automatically selected by the version selector in `build-scripts/print-patches-for.py` for any target version ≥ v2.1.3, and the sideload patch applies cleanly to v2.1.6 (it only adds new files, with no conflicts).
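The patch-selection behavior described above can be sketched as follows (this is a simplified, assumed model of what `build-scripts/print-patches-for.py` does, not its actual source): pick the newest `patches/<version>/` directory whose version does not exceed the target.

```python
# Simplified, assumed model of the patch-directory selector:
# choose the newest patches/<version> directory <= the target version.

def parse(version):
    """Turn 'v2.1.6' into a comparable tuple (2, 1, 6)."""
    return tuple(int(part) for part in version.lstrip("v").split("."))

def select_patch_dir(target, patch_dirs):
    candidates = [d for d in patch_dirs if parse(d) <= parse(target)]
    return max(candidates, key=parse, default=None)

# Hypothetical layout where only patches/v2.1.3 exists:
print(select_patch_dir("v2.1.6", ["v2.1.3"]))  # -> v2.1.3
```

Under this model, bumping the target to v2.1.6 keeps resolving to the `patches/v2.1.3/` directory, which is why no patch changes are required.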
v2.1.6 also includes an update to the vendored `golang.org/x/net/http2` transport (196 lines changed), which may further improve HTTP/2 reliability with various registries.
## References

- Upstream issue: image pull hang in containerd 2.1.0 (containerd/containerd#11864)
- Fix PR (semaphore deadlock): fix(dockerFetcher): resolve deadlock issue in dockerFetcher open (containerd/containerd#12126)
- Fix (range header ignored): commit `ca3de4fe7` on `release/2.1`
- containerd v2.1.6 release: https://github.com/containerd/containerd/releases/tag/v2.1.6