Commit 9927297

Add Seccomp Notify support
This adds the specification for Seccomp Userspace Notification and the Golang bindings. This contains:

- A new OCI hook "sendSeccompFd" used to pass the seccompfd to an external seccomp agent via the hook.
- Additional SeccompState struct containing the container state and file descriptors passed for seccomp.

This was discussed in the OCI Weekly Discussion on September 16th, 2020, see:

- https://hackmd.io/El8Dd2xrTlCaCG59ns5cwg#September-16-2020
- https://docs.google.com/document/d/1xHw5GQjMj6ZKR-40aKmTWZRkvlPuzMGQRu-YpOFQc30/edit

Documentation for this feature:

- https://www.kernel.org/doc/html/v5.0/userspace-api/seccomp_filter.html#userspace-notification
- man pages: seccomp_user_notif.2 at https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/log/?h=seccomp_user_notif
- brauner's blog: https://brauner.github.io/2020/07/23/seccomp-notify.html

This PR is an alternative proposal to PR 1038.

Signed-off-by: Alban Crequy <[email protected]>
1 parent 1af9934 commit 9927297

7 files changed, with 120 additions and 8 deletions.

config-linux.md

Lines changed: 2 additions & 1 deletion
@@ -633,7 +633,7 @@ The following parameters can be specified to set up seccomp:
     * **`names`** *(array of strings, REQUIRED)* - the names of the syscalls.
         `names` MUST contain at least one entry.
     * **`action`** *(string, REQUIRED)* - the action for seccomp rules.
-        A valid list of constants as of libseccomp v2.4.0 is shown below.
+        A valid list of constants as of libseccomp v2.5.0 is shown below.

         * `SCMP_ACT_KILL`
         * `SCMP_ACT_KILL_PROCESS`
@@ -642,6 +642,7 @@ The following parameters can be specified to set up seccomp:
         * `SCMP_ACT_TRACE`
         * `SCMP_ACT_ALLOW`
         * `SCMP_ACT_LOG`
+        * `SCMP_ACT_NOTIFY`

     * **`errnoRet`** *(uint, OPTIONAL)* - the errno return code to use.
         Some actions like `SCMP_ACT_ERRNO` and `SCMP_ACT_TRACE` allow to specify the errno

config.md

Lines changed: 66 additions & 1 deletion
@@ -400,6 +400,11 @@ For POSIX platforms, the configuration structure supports `hooks` for configurin
         * Entries in the array have the same schema as `createRuntime` entries.
         * The value of `path` MUST resolve in the [runtime namespace](glossary.md#runtime-namespace).
         * The `createContainer` hooks MUST be executed in the [container namespace](glossary.md#container-namespace).
+    * **`sendSeccompFd`** (array of objects, OPTIONAL) is an array of [`sendSeccompFd` hooks](#sendseccompfd).
+        * Entries in the array have the same schema as `createRuntime` entries.
+        * The value of `path` MUST resolve in the [runtime namespace](glossary.md#runtime-namespace).
+        * The `sendSeccompFd` hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+        * The data passed over stdin is the [seccomp state](#seccompstate).
     * **`startContainer`** (array of objects, OPTIONAL) is an array of [`startContainer` hooks](#startContainer-hooks).
         * Entries in the array have the same schema as `createRuntime` entries.
         * The value of `path` MUST resolve in the [container namespace](glossary.md#container-namespace).
@@ -415,7 +420,8 @@ For POSIX platforms, the configuration structure supports `hooks` for configurin

 Hooks allow users to specify programs to run before or after various lifecycle events.
 Hooks MUST be called in the listed order.
-The [state](runtime.md#state) of the container MUST be passed to hooks over stdin so that they may do work appropriate to the current state of the container.
+All hooks MUST be passed a data structure over stdin so that they may do work appropriately.
+Except when specified otherwise above, the data structure is the [state](runtime.md#state) of the container.

 ### <a name="configHooksPrestart" />Prestart

@@ -452,6 +458,57 @@ For example, on Linux this would happen before the `pivot_root` operation is exe

 The definition of `createContainer` hooks is currently underspecified and hooks authors, should only expect from the runtime that the mount namespace and different mounts will be setup. Other operations such as cgroups and SELinux/AppArmor labels might not have been performed by the runtime.

+### <a name="configHooksSendSeccompFd" />SendSeccompFd Hooks
+
+The `sendSeccompFd` hooks MUST only be called if the seccomp policy contains `SCMP_ACT_NOTIFY`.
+
+The `sendSeccompFd` hooks MUST be called after the [`start`](runtime.md#start) operation is called and after the seccomp policy is installed but [before the user-specified program command is executed](runtime.md#lifecycle).
+The `sendSeccompFd` hooks MAY additionally be called while the container is in the [`running` state](runtime.md#runtimeState), for example during an `exec` operation.
+The goal of this hook is to pass the seccomp file descriptor to a seccomp agent.
+
+The `sendSeccompFd` hooks' path MUST resolve in the [runtime namespace](glossary.md#runtime-namespace).
+The `sendSeccompFd` hooks MUST be executed in the [runtime namespace](glossary.md#runtime-namespace).
+
+### <a name="seccompstate" />The Seccomp State
+
+The seccomp state is a data structure passed via stdin to the SendSeccompFd hooks.
+It includes the following properties:
+
+* **`ociVersion`** (string, REQUIRED) is the version of the Open Container Initiative Runtime Specification with which the seccomp state complies.
+* **`phase`** (string, REQUIRED) is the phase at which the seccomp file descriptor is created.
+    The value MAY be one of:
+
+    * `start`: the seccomp filter is created following the [`start`](runtime.md#start) command
+    * `exec`: the seccomp filter is created following an `exec` command
+
+    Additional values MAY be defined by the runtime; however, they MUST be used to represent new values not defined above.
+* **`seccompFd`** (int, REQUIRED) is the file descriptor for Seccomp User Notification passed via process inheritance to the SendSeccompFd hooks.
+* **`pid`** (int, REQUIRED) is the process ID on which the seccomp filter is applied. In the `start` phase, this is the same as `state.pid`. In the `exec` phase, this is a different pid than `state.pid`.
+* **`pidFd`** (int, OPTIONAL) is a pidfd for the process on which the seccomp filter is applied. This file descriptor is also passed via process inheritance to the SendSeccompFd hooks.
+* **`state`** (map, REQUIRED) is the [state](runtime.md#state) of the container.
+
+When serialized in JSON, the format MUST adhere to the following pattern:
+
+```json
+{
+    "ociVersion": "0.2.0",
+    "phase": "start",
+    "seccompFd": 3,
+    "pid": 4422,
+    "pidFd": 4,
+    "state": {
+        "ociVersion": "0.2.0",
+        "id": "oci-container1",
+        "status": "running",
+        "pid": 4422,
+        "bundle": "/containers/redis",
+        "annotations": {
+            "myKey": "myValue"
+        }
+    }
+}
+```
+
 ### <a name="configHooksStartContainer" />StartContainer Hooks

 The `startContainer` hooks MUST be called [before the user-specified process is executed](runtime.md#lifecycle) as part of the [`start`](runtime.md#start) operation.
@@ -485,6 +542,7 @@ See the below table for a summary of hooks and when they are called:
 | `prestart` (Deprecated) | runtime | After the start operation is called but before the user-specified program command is executed. |
 | `createRuntime` | runtime | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation. |
 | `createContainer` | container | During the create operation, after the runtime environment has been created and before the pivot root or any equivalent operation. |
+| `sendSeccompFd` | runtime | After the start operation is called but before the user-specified program command is executed. |
 | `startContainer` | container | After the start operation is called but before the user-specified program command is executed. |
 | `poststart` | runtime | After the user-specified process is executed but before the start operation returns. |
 | `poststop` | runtime | After the container is deleted but before the delete operation returns. |
@@ -520,6 +578,13 @@ See the below table for a summary of hooks and when they are called:
                 "env": [ "key1=value1"]
             }
         ],
+        "sendSeccompFd": [
+            {
+                "path": "/usr/bin/seccomp-agent",
+                "args": ["seccomp-agent", "--allow-mknods=/dev/null,/dev/net/tun"],
+                "env": [ "key1=value1"]
+            }
+        ],
         "startContainer": [
             {
                 "path": "/usr/bin/refresh-ldcache"

schema/config-schema.json

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@
         "createContainer": {
             "$ref": "defs.json#/definitions/ArrayOfHooks"
         },
+        "sendSeccompFd": {
+            "$ref": "defs.json#/definitions/ArrayOfHooks"
+        },
         "startContainer": {
             "$ref": "defs.json#/definitions/ArrayOfHooks"
         },

schema/defs-linux.json

Lines changed: 2 additions & 1 deletion
@@ -60,7 +60,8 @@
         "SCMP_ACT_ERRNO",
         "SCMP_ACT_TRACE",
         "SCMP_ACT_ALLOW",
-        "SCMP_ACT_LOG"
+        "SCMP_ACT_LOG",
+        "SCMP_ACT_NOTIFY"
     ]
 },
 "SeccompFlag": {

schema/test/config/good/spec-example.json

Lines changed: 7 additions & 0 deletions
@@ -172,6 +172,13 @@
                 "env": [ "key1=value1"]
             }
         ],
+        "sendSeccompFd": [
+            {
+                "path": "/usr/bin/seccomp-agent",
+                "args": ["seccomp-agent", "--allow-mknods=/dev/null,/dev/net/tun"],
+                "env": [ "key1=value1"]
+            }
+        ],
         "startContainer": [
             {
                 "path": "/usr/bin/refresh-ldcache"

specs-go/config.go

Lines changed: 4 additions & 0 deletions
@@ -137,6 +137,9 @@ type Hooks struct {
 	// CreateContainer is a list of hooks to be run after the container has been created but before pivot_root or any equivalent operation has been called
 	// It is called in the Container Namespace
 	CreateContainer []Hook `json:"createContainer,omitempty"`
+	// SendSeccompFd is a list of hooks to be run after a new seccomp fd is created
+	// It is called in the Runtime Namespace
+	SendSeccompFd []Hook `json:"sendSeccompFd,omitempty"`
 	// StartContainer is a list of hooks to be run after the start operation is called but before the container process is started
 	// It is called in the Container Namespace
 	StartContainer []Hook `json:"startContainer,omitempty"`
@@ -646,6 +649,7 @@ const (
 	ActTrace LinuxSeccompAction = "SCMP_ACT_TRACE"
 	ActAllow LinuxSeccompAction = "SCMP_ACT_ALLOW"
 	ActLog LinuxSeccompAction = "SCMP_ACT_LOG"
+	ActNotify LinuxSeccompAction = "SCMP_ACT_NOTIFY"
 )

 // LinuxSeccompOperator used to match syscall arguments in Seccomp
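
The config.md change in this commit states that `sendSeccompFd` hooks MUST only be called if the seccomp policy contains `SCMP_ACT_NOTIFY`. A minimal sketch of how a runtime might make that check is shown below. The `seccompPolicy` and `syscallRule` types and the `usesNotify` helper are hypothetical, trimmed-down stand-ins for the `specs-go` types, and treating a `defaultAction` of `SCMP_ACT_NOTIFY` as "contains" is one possible reading of the text, not something the commit spells out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// actNotify is the action string added by this commit.
const actNotify = "SCMP_ACT_NOTIFY"

// syscallRule is a hypothetical, reduced view of one entry in "syscalls".
type syscallRule struct {
	Names  []string `json:"names"`
	Action string   `json:"action"`
}

// seccompPolicy is a hypothetical, reduced view of the "seccomp" object.
type seccompPolicy struct {
	DefaultAction string        `json:"defaultAction"`
	Syscalls      []syscallRule `json:"syscalls,omitempty"`
}

// usesNotify reports whether any part of the policy asks for user
// notification. Assumption: the default action counts as well.
func usesNotify(p *seccompPolicy) bool {
	if p == nil {
		return false
	}
	if p.DefaultAction == actNotify {
		return true
	}
	for _, rule := range p.Syscalls {
		if rule.Action == actNotify {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative policy: notify the agent only for mknod-style calls.
	raw := []byte(`{
		"defaultAction": "SCMP_ACT_ALLOW",
		"syscalls": [
			{"names": ["mknod", "mknodat"], "action": "SCMP_ACT_NOTIFY"}
		]
	}`)

	var policy seccompPolicy
	if err := json.Unmarshal(raw, &policy); err != nil {
		panic(err)
	}
	fmt.Println("call sendSeccompFd hooks:", usesNotify(&policy))
}
```

A runtime following this sketch would skip the `sendSeccompFd` hooks entirely when `usesNotify` returns false, since no notification file descriptor exists to hand over in that case.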

specs-go/state.go

Lines changed: 36 additions & 5 deletions
@@ -5,20 +5,22 @@ type ContainerState string

 const (
 	// StateCreating indicates that the container is being created
-	StateCreating ContainerState = "creating"
+	StateCreating ContainerState = "creating"

 	// StateCreated indicates that the runtime has finished the create operation
-	StateCreated ContainerState = "created"
+	StateCreated ContainerState = "created"

 	// StateRunning indicates that the container process has executed the
 	// user-specified program but has not exited
-	StateRunning ContainerState = "running"
+	StateRunning ContainerState = "running"

 	// StateStopped indicates that the container process has exited
-	StateStopped ContainerState = "stopped"
+	StateStopped ContainerState = "stopped"
 )

-// State holds information about the runtime state of the container.
+// State holds information about the runtime state of the container. The State
+// can be displayed when requested (query state operation); it is also passed
+// via stdin to many hooks.
 type State struct {
 	// Version is the version of the specification that is supported.
 	Version string `json:"ociVersion"`
@@ -33,3 +35,32 @@ type State struct {
 	// Annotations are key values associated with the container.
 	Annotations map[string]string `json:"annotations,omitempty"`
 }
+
+type SeccompPhase string
+
+const (
+	// SeccompPhaseStart indicates that the seccomp filter is applied to
+	// the main process of the container during container start
+	SeccompPhaseStart SeccompPhase = "start"
+
+	// SeccompPhaseExec indicates that the seccomp filter is applied to a
+	// new process that entered the container while it's running
+	SeccompPhaseExec SeccompPhase = "exec"
+)
+
+type SeccompState struct {
+	// Version is the version of the specification that is supported.
+	Version string `json:"ociVersion"`
+	// Phase indicates whether this seccomp filter is applied during
+	// container start or on a process that enters the container later on
+	Phase SeccompPhase `json:"seccompPhase"`
+	// SeccompFd is the file descriptor for Seccomp User Notification
+	SeccompFd int `json:"seccompFd"`
+	// Pid is the process ID on which the seccomp filter is applied
+	Pid int `json:"pid,omitempty"`
+	// PidFd is a pidfd for the process on which the seccomp filter is
+	// applied
+	PidFd int `json:"pidFd,omitempty"`
+	// State of the container
+	State State `json:"state"`
+}
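
To illustrate how a `sendSeccompFd` hook might consume the seccomp state, here is a minimal sketch of a hook program that reads the JSON document from stdin and wraps the inherited file descriptor. The lowercase types and the `decodeSeccompState` helper are hypothetical local copies, not the `specs-go` API. The sketch assumes the JSON key is `phase`, as in the config.md text and example; the Go struct tag in this commit reads `seccompPhase`, so a hook built directly on `SeccompState` would look for that key instead.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"os"
)

// containerState is a local copy of the container state fields used here.
type containerState struct {
	Version     string            `json:"ociVersion"`
	ID          string            `json:"id"`
	Status      string            `json:"status"`
	Pid         int               `json:"pid,omitempty"`
	Bundle      string            `json:"bundle"`
	Annotations map[string]string `json:"annotations,omitempty"`
}

// seccompState follows the property names listed in config.md.
// Assumption: the phase is serialized under the key "phase".
type seccompState struct {
	Version   string         `json:"ociVersion"`
	Phase     string         `json:"phase"`
	SeccompFd int            `json:"seccompFd"`
	Pid       int            `json:"pid"`
	PidFd     int            `json:"pidFd,omitempty"`
	State     containerState `json:"state"`
}

// decodeSeccompState parses the document a runtime writes to the hook's
// stdin and rejects input that lacks the REQUIRED string properties.
func decodeSeccompState(data []byte) (*seccompState, error) {
	var s seccompState
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, fmt.Errorf("decoding seccomp state: %w", err)
	}
	if s.Version == "" {
		return nil, errors.New("seccomp state: missing ociVersion")
	}
	if s.Phase == "" {
		return nil, errors.New("seccomp state: missing phase")
	}
	return &s, nil
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading stdin:", err)
		os.Exit(1)
	}
	st, err := decodeSeccompState(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	// The number refers to a descriptor inherited from the runtime;
	// os.NewFile only wraps it and does not open anything new.
	notifyFile := os.NewFile(uintptr(st.SeccompFd), "seccomp-notify")
	if notifyFile == nil {
		fmt.Fprintln(os.Stderr, "invalid seccomp fd:", st.SeccompFd)
		os.Exit(1)
	}

	// A real agent would now hand notifyFile to its notification loop.
	fmt.Printf("container %s: phase=%s pid=%d seccompFd=%d\n",
		st.State.ID, st.Phase, st.Pid, st.SeccompFd)
}
```

The sketch deliberately accepts any non-empty `phase`, since the specification text allows runtimes to define additional values beyond `start` and `exec`.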
