Skip to content

Commit c6a8093

Browse files
Hannes Reineckeakpm00
authored andcommitted
drivers/base/memory: add node id parameter to add_memory_block()
Patch series "mm/memory_hotplug: fixup crash during uevent handling", v4. we have some udev rules trying to read the sysfs attribute 'valid_zones' during an memory 'add' event, causing a crash in zone_for_pfn_range(). Debugging found that mem->nid was set to NUMA_NO_NODE, which crashed in NODE_DATA(nid). Further analysis revealed that we're running into a race with udev event processing: add_memory_resource() has this function calls: 1) __try_online_node() 2) arch_add_memory() 3) create_memory_block_devices() -> calls device_register() -> memory 'add' event 4) node_set_online()/__register_one_node() -> calls device_register() -> node 'add' event 5) register_memory_blocks_under_node() -> sets mem->nid Which, to the uninitated, is ... weird ... Why do we try to online the node in 1), but only register the node in 4) _after_ we have created the memory blocks in 3) ? And why do we set the 'nid' value in 5), when the uevent (which might need to see the correct 'nid' value) is sent out in 3) ? There must be a reason, I'm sure ... So here's a small patchset to fixup uevent ordering. The first patch adds a 'nid' parameter to add_memory_blocks() (to avoid mem->nid being initialized with NUMA_NO_NODE), and the second patch reshuffles the code in add_memory_resource() to fully initialize the node prior to calling create_memory_block_devices() so that the node is valid at that time and uevent processing will see correct values in sysfs. This patch (of 3): We have some udev rules trying to read the sysfs attribute 'valid_zones' during an memory 'add' event, causing a crash in zone_for_pfn_range(). Debugging found that mem->nid was set to NUMA_NO_NODE, which crashed in NODE_DATA(nid). Further analysis revealed that we're running into a race with udev event processing: add_memory_resource() has this function calls: 1) __try_online_node() 2) arch_add_memory() 3) create_memory_block_devices() -> calls device_register() -> memory 'add' event 4) node_set_online()/__register_one_node() -> calls device_register() -> node 'add' event 5) register_memory_blocks_under_node() -> sets mem->nid Which, to the uninitated, is ... weird ... Why do we try to online the node in 1), but only register the node in 4) _after_ we have created the memory blocks in 3) ? And why do we set the 'nid' value in 5), when the uevent (which might need to see the correct 'nid' value) is sent out in 3) ? There must be a reason, I'm sure ... So here's a small patchset to fixup uevent ordering. The first patch adds a 'nid' parameter to add_memory_blocks() (to avoid mem->nid being initialized with NUMA_NO_NODE), and the second patch reshuffles the code in add_memory_resource() to fully initialize the node prior to calling create_memory_block_devices() so that the node is valid at that time and uevent processing will see correct values in sysfs. This patch (of 3): Add a 'nid' parameter to add_memory_block() to initialize the memory block with the correct node id. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Hannes Reinecke <[email protected]> Acked-by: David Hildenbrand <[email protected]> Acked-by: Oscar Salvador <[email protected]> Reviewed-by: Donet Tom <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
1 parent 1367da7 commit c6a8093

File tree

1 file changed

+4
-11
lines changed

1 file changed

+4
-11
lines changed

drivers/base/memory.c

Lines changed: 4 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -809,7 +809,7 @@ void memory_block_add_nid(struct memory_block *mem, int nid,
809809
}
810810
#endif
811811

812-
static int add_memory_block(unsigned long block_id, unsigned long state,
812+
static int add_memory_block(unsigned long block_id, int nid, unsigned long state,
813813
struct vmem_altmap *altmap,
814814
struct memory_group *group)
815815
{
@@ -827,7 +827,7 @@ static int add_memory_block(unsigned long block_id, unsigned long state,
827827

828828
mem->start_section_nr = block_id * sections_per_block;
829829
mem->state = state;
830-
mem->nid = NUMA_NO_NODE;
830+
mem->nid = nid;
831831
mem->altmap = altmap;
832832
INIT_LIST_HEAD(&mem->group_next);
833833

@@ -854,13 +854,6 @@ static int add_memory_block(unsigned long block_id, unsigned long state,
854854
return 0;
855855
}
856856

857-
static int add_hotplug_memory_block(unsigned long block_id,
858-
struct vmem_altmap *altmap,
859-
struct memory_group *group)
860-
{
861-
return add_memory_block(block_id, MEM_OFFLINE, altmap, group);
862-
}
863-
864857
static void remove_memory_block(struct memory_block *memory)
865858
{
866859
if (WARN_ON_ONCE(memory->dev.bus != &memory_subsys))
@@ -900,7 +893,7 @@ int create_memory_block_devices(unsigned long start, unsigned long size,
900893
return -EINVAL;
901894

902895
for (block_id = start_block_id; block_id != end_block_id; block_id++) {
903-
ret = add_hotplug_memory_block(block_id, altmap, group);
896+
ret = add_memory_block(block_id, NUMA_NO_NODE, MEM_OFFLINE, altmap, group);
904897
if (ret)
905898
break;
906899
}
@@ -1005,7 +998,7 @@ void __init memory_dev_init(void)
1005998
continue;
1006999

10071000
block_id = memory_block_id(nr);
1008-
ret = add_memory_block(block_id, MEM_ONLINE, NULL, NULL);
1001+
ret = add_memory_block(block_id, NUMA_NO_NODE, MEM_ONLINE, NULL, NULL);
10091002
if (ret) {
10101003
panic("%s() failed to add memory block: %d\n",
10111004
__func__, ret);

0 commit comments

Comments
 (0)