You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
drivers/base/memory: add node id parameter to add_memory_block()
Patch series "mm/memory_hotplug: fixup crash during uevent handling", v4.
we have some udev rules trying to read the sysfs attribute 'valid_zones'
during an memory 'add' event, causing a crash in zone_for_pfn_range().
Debugging found that mem->nid was set to NUMA_NO_NODE, which crashed in
NODE_DATA(nid). Further analysis revealed that we're running into a race
with udev event processing: add_memory_resource() has this function calls:
1) __try_online_node()
2) arch_add_memory()
3) create_memory_block_devices()
-> calls device_register() -> memory 'add' event
4) node_set_online()/__register_one_node()
-> calls device_register() -> node 'add' event
5) register_memory_blocks_under_node()
-> sets mem->nid
Which, to the uninitated, is ... weird ...
Why do we try to online the node in 1), but only register the node in 4)
_after_ we have created the memory blocks in 3) ? And why do we set the
'nid' value in 5), when the uevent (which might need to see the correct
'nid' value) is sent out in 3) ? There must be a reason, I'm sure ...
So here's a small patchset to fixup uevent ordering. The first patch adds
a 'nid' parameter to add_memory_blocks() (to avoid mem->nid being
initialized with NUMA_NO_NODE), and the second patch reshuffles the code
in add_memory_resource() to fully initialize the node prior to calling
create_memory_block_devices() so that the node is valid at that time and
uevent processing will see correct values in sysfs.
This patch (of 3):
We have some udev rules trying to read the sysfs attribute 'valid_zones'
during an memory 'add' event, causing a crash in zone_for_pfn_range().
Debugging found that mem->nid was set to NUMA_NO_NODE, which crashed in
NODE_DATA(nid). Further analysis revealed that we're running into a race
with udev event processing: add_memory_resource() has this function calls:
1) __try_online_node()
2) arch_add_memory()
3) create_memory_block_devices()
-> calls device_register() -> memory 'add' event
4) node_set_online()/__register_one_node()
-> calls device_register() -> node 'add' event
5) register_memory_blocks_under_node()
-> sets mem->nid
Which, to the uninitated, is ... weird ...
Why do we try to online the node in 1), but only register the node in 4)
_after_ we have created the memory blocks in 3) ? And why do we set the
'nid' value in 5), when the uevent (which might need to see the correct
'nid' value) is sent out in 3) ? There must be a reason, I'm sure ...
So here's a small patchset to fixup uevent ordering. The first patch adds
a 'nid' parameter to add_memory_blocks() (to avoid mem->nid being
initialized with NUMA_NO_NODE), and the second patch reshuffles the code
in add_memory_resource() to fully initialize the node prior to calling
create_memory_block_devices() so that the node is valid at that time and
uevent processing will see correct values in sysfs.
This patch (of 3):
Add a 'nid' parameter to add_memory_block() to initialize the memory block
with the correct node id.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Hannes Reinecke <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Acked-by: Oscar Salvador <[email protected]>
Reviewed-by: Donet Tom <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
0 commit comments