
Exec plugin from container image #1162

@danwinship

Description


One of the arguments for daemonization (#821) is that copying binaries out of container images onto the local disk is tricky and insecure. So here's an alternative. (Which... is not actually fully an alternative to #821... maybe it would make sense to do both.)

  • The runtime creates a directory /var/run/cni at startup.
  • The user uses kubelet/CRI/whatever to start up a "network plugin" container. The definition of this container/pod includes mounting a single file out of the container image into /var/run/cni on the host. Something like /var/run/cni/mynetwork.conf.
  • The elements of plugins in this CNI conf file may have a containerPath element (in addition to type), which refers to a binary in the container's filesystem.
  • The runtime chooses a default CNI plugin via whatever means it likes (just like today). (This may involve combining data from /var/run/cni and /etc/cni/net.d, for instance, so that everything still works just like it used to for systems where there is no network plugin using this new system.)
  • Everything else works mostly just like before, except that for CNI conf files that use containerPath, the runtime executes the CNI binary out of the container's mount namespace, rather than in the host mount namespace. (Since the runtime is the one that mounted the CNI conf files out of the container(s) into /var/run/cni in the first place, it knows which mount namespaces are associated with which conf files.)
  • Runtimes that support this feature are required to also support STATUS, and must consider the network plugin ready only when it responds successfully to a STATUS call. (It can't consider the network ready as soon as the conf file appears, because the conf file is guaranteed to appear before the plugin is even running.)
  • When the container exits, the mount for its conf file in /var/run/cni is automatically destroyed... which is nice in some ways, but it will screw up the network status during container restarts, so we need to figure out something there...
  • A network plugin that wants to remain backward compatible with runtimes that don't support this new feature can just check whether it has received any STATUS calls yet at the point when it has finished starting up, and if the answer is "no", then it can assume the runtime is ignoring its /var/run/cni conf file, and instead copy a traditional conf file and binary out into the host filesystem at that point just like it would have before.
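Concretely, a conf file mounted into /var/run/cni might look something like this (a hypothetical sketch: only the type and containerPath fields come from the proposal above; the plugin name and version are made up, and the surrounding structure just follows the existing CNI conflist format, assuming a spec version that includes STATUS):

```json
{
  "cniVersion": "1.1.0",
  "name": "mynetwork",
  "plugins": [
    {
      "type": "mynetwork",
      "containerPath": "/usr/bin/mynetwork-cni"
    }
  ]
}
```

When the runtime sees containerPath, it would resolve that path inside the mount namespace of the container that provided this file, instead of looking up type in the host's CNI bin directories.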

The combination of using /var/run for the conf file and using the container image for the binary means the network plugin doesn't need to modify anything actually on the host filesystem, which may help people who want immutable host filesystems.
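To illustrate the "execute the CNI binary out of the container's mount namespace" step, here is a minimal sketch of how a runtime might build that invocation. This is not prescribed by the proposal: it assumes the runtime shells out to nsenter(1) and has tracked the PID of the plugin container, whereas a real runtime might call setns(2) directly instead. The PID, binary path, and environment values are placeholders.

```go
package main

import (
	"fmt"
	"os/exec"
)

// nsenterCmd builds a command that runs a CNI binary inside the mount
// namespace of the container that provided it, by entering
// /proc/<pid>/ns/mnt before exec'ing the binary at containerPath.
// The CNI protocol environment (CNI_COMMAND etc.) is passed through,
// and the conf JSON would be written to the command's stdin as usual.
func nsenterCmd(containerPID int, containerPath string, env []string) *exec.Cmd {
	cmd := exec.Command(
		"nsenter",
		fmt.Sprintf("--mount=/proc/%d/ns/mnt", containerPID),
		"--", containerPath,
	)
	cmd.Env = env
	return cmd
}

func main() {
	cmd := nsenterCmd(1234, "/usr/bin/mynetwork-cni",
		[]string{"CNI_COMMAND=ADD", "CNI_CONTAINERID=abc123", "CNI_IFNAME=eth0"})
	fmt.Println(cmd.Args)
	// → [nsenter --mount=/proc/1234/ns/mnt -- /usr/bin/mynetwork-cni]
}
```

Because the binary is resolved and executed entirely inside the container's own filesystem, nothing ever has to be copied onto the host disk, which is the point of the proposal.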
