Commit ee9707e

Waiman-Long authored and htejun committed
cgroup/cpuset: Enable memory migration for cpuset v2
When a user changes cpuset.cpus, each task in a v2 cpuset will be moved to one of the new cpus if it is not there already. The tasks' memory, however, is not migrated to the new nodes when cpuset.mems changes. This is an inconsistency in behavior.

In cpuset v1, there is a memory_migrate control file to enable such behavior by setting the CS_MEMORY_MIGRATE flag. Make it the default for cpuset v2 so that we have a consistent set of behavior for both cpus and memory.

There is certainly a cost to making memory migration the default, but it is a one-time cost that shouldn't really matter as long as cpuset.mems isn't changed frequently. Update the cgroup-v2.rst file to document the new behavior and recommend against changing cpuset.mems frequently.

Since there won't be any concurrent access to the newly allocated cpuset structure in cpuset_css_alloc(), we can use the cheaper non-atomic __set_bit() instead of the more expensive atomic set_bit().

Signed-off-by: Waiman Long <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
1 parent e7cc988 commit ee9707e

2 files changed, +16 -1 lines changed

Documentation/admin-guide/cgroup-v2.rst

Lines changed: 11 additions & 0 deletions
@@ -2056,6 +2056,17 @@ Cpuset Interface Files
 	The value of "cpuset.mems" stays constant until the next update
 	and won't be affected by any memory nodes hotplug events.
 
+	Setting a non-empty value to "cpuset.mems" causes memory of
+	tasks within the cgroup to be migrated to the designated nodes if
+	they are currently using memory outside of the designated nodes.
+
+	There is a cost for this memory migration.  The migration
+	may not be complete and some memory pages may be left behind.
+	So it is recommended that "cpuset.mems" should be set properly
+	before spawning new tasks into the cpuset.  Even if there is
+	a need to change "cpuset.mems" with active tasks, it shouldn't
+	be done frequently.
+
   cpuset.mems.effective
 	A read-only multiple values file which exists on all
 	cpuset-enabled cgroups.
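
The documentation added above recommends setting "cpuset.mems" before any task enters the cpuset, so that no pages need to be migrated afterwards. A minimal user-space sketch of that ordering follows. The helper names `cg_write()` and `cg_place_task()` are hypothetical and not part of this commit. The caller is assumed to pass a cgroup v2 directory (for example, one under /sys/fs/cgroup) whose parent has the cpuset controller enabled.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write `value` to the file `name` inside the
 * cgroup directory `dir`. Returns 0 on success, -1 on failure.
 */
static int cg_write(const char *dir, const char *name, const char *value)
{
	char path[4096];
	FILE *f;
	int n, ok;

	assert(dir && name && value);

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(value, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: configure the memory nodes first, and only then
 * move the task in, so the task starts out on the designated nodes
 * instead of having its pages migrated later.
 */
static int cg_place_task(const char *dir, const char *mems, long pid)
{
	char buf[32];

	if (cg_write(dir, "cpuset.mems", mems) != 0)
		return -1;

	snprintf(buf, sizeof(buf), "%ld", pid);
	return cg_write(dir, "cgroup.procs", buf);
}
```

Writing "cgroup.procs" last is the point of the sketch: if the order were reversed, the later write to "cpuset.mems" would trigger the migration cost described above.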

kernel/cgroup/cpuset.c

Lines changed: 5 additions & 1 deletion
@@ -2761,12 +2761,16 @@ cpuset_css_alloc(struct cgroup_subsys_state *parent_css)
 		return ERR_PTR(-ENOMEM);
 	}
 
-	set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
+	__set_bit(CS_SCHED_LOAD_BALANCE, &cs->flags);
 	nodes_clear(cs->mems_allowed);
 	nodes_clear(cs->effective_mems);
 	fmeter_init(&cs->fmeter);
 	cs->relax_domain_level = -1;
 
+	/* Set CS_MEMORY_MIGRATE for default hierarchy */
+	if (cgroup_subsys_on_dfl(cpuset_cgrp_subsys))
+		__set_bit(CS_MEMORY_MIGRATE, &cs->flags);
+
 	return &cs->css;
 }
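
The switch from set_bit() to __set_bit() relies on the new structure not yet being visible to any other thread. The following is a user-space analogy using C11 atomics, not the kernel implementation. All names in it (`struct obj`, `flag_set_unlocked()`, `flag_set_atomic()`, `obj_init()`) are hypothetical. The unlocked variant does a separate load and store, which is only safe while the object is private; the atomic variant is a single read-modify-write.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Hypothetical flag indices, for illustration only. */
enum { FLAG_LOAD_BALANCE = 0, FLAG_MEMORY_MIGRATE = 1 };

struct obj {
	atomic_ulong flags;
};

/*
 * Analogue of a non-atomic bit set: separate load and store.
 * Cheap, but a concurrent writer could be lost between the two
 * steps, so use it only while the object is still private.
 */
static void flag_set_unlocked(struct obj *o, unsigned int nr)
{
	unsigned long v;

	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	v = atomic_load_explicit(&o->flags, memory_order_relaxed);
	atomic_store_explicit(&o->flags, v | (1UL << nr),
			      memory_order_relaxed);
}

/* Analogue of an atomic bit set: one indivisible read-modify-write. */
static void flag_set_atomic(struct obj *o, unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(&o->flags, 1UL << nr);
}

static int flag_test(struct obj *o, unsigned int nr)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	return (int)((atomic_load(&o->flags) >> nr) & 1UL);
}

/*
 * Initialization mirrors the shape of the hunk above: the object is
 * not shared yet, so the cheaper unlocked variant is sufficient.
 */
static void obj_init(struct obj *o, int on_default_hierarchy)
{
	atomic_init(&o->flags, 0);
	flag_set_unlocked(o, FLAG_LOAD_BALANCE);
	if (on_default_hierarchy)
		flag_set_unlocked(o, FLAG_MEMORY_MIGRATE);
}
```

Once such an object is published to other threads, later flag updates would need the atomic variant or an external lock.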
