Commit ce14e9e

libcontainer/SPEC.md: add documentation for Intel RDT/CAT
Signed-off-by: Xiaochen Shen <[email protected]>
1 parent ad228cf commit ce14e9e

1 file changed

+84
-0
lines changed

libcontainer/SPEC.md

Lines changed: 84 additions & 0 deletions
@@ -154,6 +154,90 @@ that no processes or threads escape the cgroups. This sync is
done via a pipe (specified in the runtime section below) that the container's
init process will block waiting for the parent to finish setup.

**intelRdt**:
Intel platforms with newer Xeon CPUs support Intel Resource Director Technology
(RDT). Cache Allocation Technology (CAT) is a sub-feature of RDT that
currently supports L3 cache resource allocation.

This feature lets software restrict cache allocation to a defined 'subset' of
the L3 cache, which may overlap with other 'subsets'. Each subset is identified
by a class of service (CLOS), and each CLOS has a capacity bitmask (CBM).

It can be used to handle L3 cache resource allocation for containers if the
hardware and kernel support Intel RDT/CAT.

In the Linux kernel, it is exposed via the "resource control" filesystem, which
is a "cgroup-like" interface.

Compared with cgroups, it has a similar process management lifecycle and
similar interfaces in a container. Unlike the cgroups hierarchy, however, it
has a single-level filesystem layout.

Intel RDT "resource control" filesystem hierarchy:
```
mount -t resctrl resctrl /sys/fs/resctrl
tree /sys/fs/resctrl
/sys/fs/resctrl/
|-- info
|   |-- L3
|       |-- cbm_mask
|       |-- min_cbm_bits
|       |-- num_closids
|-- cpus
|-- schemata
|-- tasks
|-- <container_id>
    |-- cpus
    |-- schemata
    |-- tasks
```

For runc, we can use the `tasks` and `schemata` files to configure L3 cache
resource constraints.

The file `tasks` lists the tasks that belong to this group (e.g., the
"<container_id>" group). Tasks can be added to a group by writing the task ID
to the `tasks` file, which automatically removes them from the group to which
they previously belonged. New tasks created by fork(2) and clone(2) are added
to the same group as their parent. A pid that is not in any sub-group is in the
root group.

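The write to `tasks` described above can be sketched in Go as follows. This is a minimal illustration, not runc's actual implementation: the helper name `joinGroup` and the group path used in `main` are hypothetical, and the sketch assumes the group directory has already been created under the resctrl mount point.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// joinGroup writes pid to the "tasks" file of the resctrl group at groupDir.
// groupDir is a hypothetical parameter, e.g. /sys/fs/resctrl/<container_id>.
func joinGroup(groupDir string, pid int) error {
	tasks := filepath.Join(groupDir, "tasks")
	// The kernel creates "tasks" together with the group directory, so the
	// file is opened for writing without O_CREATE.
	f, err := os.OpenFile(tasks, os.O_WRONLY, 0)
	if err != nil {
		return fmt.Errorf("open %s: %w", tasks, err)
	}
	defer f.Close()
	if _, err := f.WriteString(strconv.Itoa(pid)); err != nil {
		return fmt.Errorf("write pid %d to %s: %w", pid, tasks, err)
	}
	return nil
}

func main() {
	// Hypothetical group name, used for illustration only.
	if err := joinGroup("/sys/fs/resctrl/example_container", os.Getpid()); err != nil {
		fmt.Println("could not join group:", err)
	}
}
```

On a machine without resctrl mounted, or without the example group, the call simply reports an error instead of moving the process.
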
The file `schemata` holds the allocation masks/values for the L3 cache on each
socket, given as an L3 cache id and a capacity bitmask (CBM).
```
Format: "L3:<cache_id0>=<cbm0>;<cache_id1>=<cbm1>;..."
```
For example, on a two-socket machine, the L3 schema line could be
`L3:0=ff;1=c0`, which means that the CBM of L3 cache id 0 is 0xff and the CBM
of L3 cache id 1 is 0xc0.

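To make the schema format concrete, here is a small Go sketch that parses an L3 schema line into a map from cache id to CBM. The function name `parseL3Schema` is hypothetical and the sketch handles only the `L3:` line format shown above; it is not taken from runc.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseL3Schema turns a schema line such as "L3:0=ff;1=c0" into a map from
// L3 cache id to capacity bitmask (CBM). The name is hypothetical.
func parseL3Schema(line string) (map[int]uint64, error) {
	line = strings.TrimSpace(line)
	if !strings.HasPrefix(line, "L3:") {
		return nil, fmt.Errorf("schema %q does not start with \"L3:\"", line)
	}
	masks := make(map[int]uint64)
	for _, entry := range strings.Split(strings.TrimPrefix(line, "L3:"), ";") {
		parts := strings.SplitN(entry, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed entry %q", entry)
		}
		id, err := strconv.Atoi(parts[0])
		if err != nil {
			return nil, fmt.Errorf("bad cache id in %q: %w", entry, err)
		}
		// CBMs are written in hexadecimal without a "0x" prefix.
		cbm, err := strconv.ParseUint(parts[1], 16, 64)
		if err != nil {
			return nil, fmt.Errorf("bad CBM in %q: %w", entry, err)
		}
		masks[id] = cbm
	}
	return masks, nil
}

func main() {
	masks, err := parseL3Schema("L3:0=ff;1=c0")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("cache id 0: %#x, cache id 1: %#x\n", masks[0], masks[1])
}
```
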
A valid L3 cache CBM is a *contiguous set of bits*, and the number of bits set
cannot exceed the maximum CBM length. The maximum length varies among supported
Intel Xeon platforms. In the Intel RDT "resource control" filesystem layout,
the CBM of a group should be a subset of the CBM in the root group. The kernel
checks that the value is valid when it is written. For example, 0xfffff in the
root group indicates a maximum CBM length of 20 bits, which maps to the entire
L3 cache capacity. Some valid CBM values to set in a group are 0xf, 0xf0,
0x3ff, and 0x1f00.

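The two rules above (contiguous bits, subset of the root CBM) can be checked with a few bit operations. The Go sketch below is an illustration only; the names `isContiguous` and `validCBM` are hypothetical, and treating an empty mask as invalid is an assumption of this sketch. The kernel remains the authority on what it accepts.

```go
package main

import (
	"fmt"
	"math/bits"
)

// isContiguous reports whether the set bits of cbm form one unbroken run.
// An empty mask is treated as invalid (an assumption of this sketch).
func isContiguous(cbm uint64) bool {
	if cbm == 0 {
		return false
	}
	// Drop trailing zeros. A contiguous run then has the form 2^n - 1,
	// so adding 1 leaves no bit in common with the run itself.
	run := cbm >> uint(bits.TrailingZeros64(cbm))
	return run&(run+1) == 0
}

// validCBM checks that cbm is contiguous and sets no bit outside rootMask.
func validCBM(cbm, rootMask uint64) bool {
	return isContiguous(cbm) && cbm&^rootMask == 0
}

func main() {
	const rootMask = 0xfffff // 20-bit root CBM from the example above
	for _, cbm := range []uint64{0xf, 0xf0, 0x3ff, 0x1f00, 0x5, 0x1fffff} {
		fmt.Printf("%#x valid: %v\n", cbm, validCBM(cbm, rootMask))
	}
}
```

The first four values are the valid examples from the text; 0x5 fails the contiguity check and 0x1fffff sets a bit outside the 20-bit root mask.
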
For more information about the Intel RDT/CAT kernel interface, see:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/x86/intel_rdt_ui.txt

An example for runc:

There are two L3 caches in the two-socket machine, the default CBM is 0xfffff,
and the maximum CBM length is 20 bits. This configuration assigns 4/5 of L3
cache id 0 and the whole of L3 cache id 1 to the container:
```
"linux": {
    "resources": {
        "intelRdt": {
            "l3CacheSchema": "L3:0=ffff0;1=fffff"
        }
    }
}
```

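The 4/5 figure follows from counting set bits: 0xffff0 has 16 of the 20 bits in the default CBM 0xfffff set. A short Go sketch of that arithmetic, with the hypothetical helper name `cacheShare`:

```go
package main

import (
	"fmt"
	"math/bits"
)

// cacheShare returns how many bits are set in cbm and how many bits are set
// in rootMask. The helper name is hypothetical.
func cacheShare(cbm, rootMask uint64) (set, total int) {
	return bits.OnesCount64(cbm), bits.OnesCount64(rootMask)
}

func main() {
	const rootMask = 0xfffff // default CBM from the example above
	for id, cbm := range []uint64{0xffff0, 0xfffff} {
		set, total := cacheShare(cbm, rootMask)
		fmt.Printf("L3 cache id %d: %d of %d bits\n", id, set, total)
	}
}
```

For cache id 0 this gives 16 of 20 bits, which is the 4/5 share; for cache id 1 it gives all 20 bits.
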
### Security

The standard set of Linux capabilities that are set in a container
