
"shim disconnected" error from dind container causing long initialization #620

@jgrabenstein


We have the Codefresh runtime installed via Helm (chart cf-runtime-8.3.1, from artifacthub.io) on a GKE cluster (1.32.6-gke.1060000).

Sometimes our Codefresh pipeline runs spend an extremely long time in the "initializing process" (validating connection to docker daemon) step. The step seems to retry automatically after 10+ minutes, but it fails again and retries once more. Other times, runs start normally. This is causing a lot of frustration and delay.

From the logs, I see the dind container throwing some messages that seem related.

time="2025-09-15T15:39:08.919776421Z" level=info msg="shim disconnected" id=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 namespace=moby
time="2025-09-15T15:39:08.919838401Z" level=warning msg="cleaning up after shim disconnected" id=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 namespace=moby
time="2025-09-15T15:39:08.919851111Z" level=info msg="cleaning up dead shim" namespace=moby
time="2025-09-15T15:39:08.919968911Z" level=info msg="ignoring event" container=a3e103d17bb7421d6178970f350e8d703cbc505297c4e50d63bceca1bc4ae471 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete"

There is also this message from the engine container:

fn: inspect, attempt: 1 failed with: Error: (HTTP code 404) no such container - No such container: bcea459935492295070c1c12933ac385f6d17a6838cbdba2a7d6dc9cd169dcd3  (transaction: 82932591-72b8-4a1e-a920-0085922e2828). throwing error
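
To check whether slow initializations line up with these events, something like the following could count "shim disconnected" messages per container ID in saved dind logs. This is only a rough sketch: it assumes the logs were saved as plain text in the key=value format shown above, and the helper names `parse_line` and `count_shim_disconnects` are made up for illustration, not part of Codefresh or Docker.

```python
import re
from collections import Counter

# Matches key=value and key="quoted value" pairs, as in the dind log lines above.
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Return a dict of the key=value fields found in one log line."""
    fields = {}
    for key, value in _PAIR.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def count_shim_disconnects(lines):
    """Count lines whose msg is exactly 'shim disconnected', per container id."""
    counts = Counter()
    for line in lines:
        fields = parse_line(line)
        if fields.get("msg") == "shim disconnected":
            counts[fields.get("id", "unknown")] += 1
    return counts
```

For example, after saving the dind container's output from `kubectl logs` to a file, `count_shim_disconnects(open(path))` would give a per-container tally. The follow-up "cleaning up after shim disconnected" warnings are deliberately not counted, since they only repeat the same event.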
