powerpc: Set nr_cpus=16 for kdump kernel #97

sourabhjains · 2025-07-16T09:55:05Z

Configure the kdump kernel with nr_cpus=16 to enable multi-threading in the makedumpfile core collector, allowing faster dump collection.

Commit d428557 ("Use all available CPUs to collect dump") enabled the use of multi-threading for makedumpfile core collector.

There are two reasons for choosing nr_cpus=16:

- Multiple experiments show that optimal performance is achieved when nr_cpus is between 16 and 30. The graph in commit d428557 supports this.
- nr_cpus=16 is already used for the fadump kernel, so using the same value for kdump maintains consistency.

gemini-code-assist

Summary of Changes

Hello @sourabhjains, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the kdump process on PowerPC systems by adjusting the kdump kernel's CPU configuration. By increasing the number of available CPUs, the makedumpfile utility can operate in a multi-threaded manner, leading to faster and more efficient crash dump collection. This change also ensures consistency with existing fadump configurations.

Highlights

- Performance Enhancement: The primary objective of this change is to significantly improve the performance of the makedumpfile core collector during kdump operations by enabling it to leverage multi-threading more effectively.
- Kernel Parameter Update: The nr_cpus kernel boot parameter for the kdump kernel on ppc64le systems has been increased from 1 to 16. This allows the kdump kernel to utilize more CPU cores for dump collection.
- Configuration Consistency: This update also brings the nr_cpus setting for the kdump kernel in line with the existing configuration for the fadump kernel, both now set to 16 for consistency.

gemini-code-assist

Code Review

This pull request updates the nr_cpus parameter for the kdump kernel on ppc64le from 1 to 16, aiming to improve dump collection performance and ensure consistency with the fadump configuration.

gemini-code-assist · 2025-07-16T09:56:59Z

gen-kdump-sysconfig.sh

 		"hugepages hugepagesz slub_debug quiet log_buf_len swiotlb hugetlb_cma ignition.firstboot"
 	update_param KDUMP_COMMANDLINE_APPEND \
-		"irqpoll nr_cpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"
+		"irqpoll nr_cpus=16 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"


The value 16 for nr_cpus is duplicated on line 105. To improve maintainability, define this value as a variable at the beginning of the ppc64le) case block and reuse it in both update_param calls.

Suggested change

-		"irqpoll nr_cpus=16 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"
+	PPC_NR_CPUS=16
+	update_param KDUMP_COMMANDLINE_REMOVE \
+		"hugepages hugepagesz slub_debug quiet log_buf_len swiotlb hugetlb_cma ignition.firstboot"
+	update_param KDUMP_COMMANDLINE_APPEND \
+		"irqpoll nr_cpus=${PPC_NR_CPUS} noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"

One thing that comes to mind is that a larger nr_cpus enlarges the PER_CPU area, which risks OOM. So we should consider how to adjust the crashkernel value for PPC at the same time.

Thanks

The PER_CPU memory usage for the kdump kernel on RHEL9 and RHEL10 is as follows:

nr_cpus=1 :  7 MB
nr_cpus=16 : 14 MB

Even with nr_cpus=1, the number of CPUs online in the kdump kernel was 8. This might be due to SMT, which could explain why the PER_CPU usage is 7 MB for nr_cpus=1 and 14 MB for nr_cpus=16.
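One way to read the kernel's per-CPU allocation is the Percpu field of /proc/meminfo (present on Linux 4.16 and later). The sketch below is only an illustration: the thread does not say how the figures above were measured, and the helper name percpu_mb is hypothetical.

```shell
# Hypothetical helper: print the Percpu figure from a meminfo-style file, in MB.
# Defaults to /proc/meminfo; another file can be passed, e.g. a saved copy.
percpu_mb() {
	local meminfo=${1:-/proc/meminfo}
	# The relevant line looks like "Percpu:            7168 kB".
	awk '$1 == "Percpu:" { printf "%d\n", $2 / 1024 }' "$meminfo"
}
```

Running percpu_mb inside the capture kernel once with nr_cpus=1 and once with nr_cpus=16 would give two numbers that can be compared directly.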

I have no idea how to keep nr_cpus in sync between kdump_get_arch_recommend_crashkernel() and gen-kdump-sysconfig.sh.

What about using nr_cpus * PER_CPU in kdump_get_arch_recommend_crashkernel() to calculate the crashkernel value, and leaving a note in the config file to remind the user to update the crashkernel value when modifying nr_cpus?

@coiby @licliu , please share your thoughts on my idea.

Thanks

pfliu · 2025-07-18T01:34:47Z

I suggest increasing the crashkernel value by nr_cpus * PER_CPU_AREA_SIZE in kdump_get_arch_recommend_crashkernel() for PPC, where the default nr_cpus is 16.
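The proposal amounts to simple arithmetic. A minimal sketch, assuming the 1 MB per-CPU figure that comes up later in this thread; the helper name percpu_delta_mb and its optional second argument are hypothetical and are not part of kdump-lib.sh.

```shell
# Hypothetical helper: extra crashkernel memory, in MB, for a given nr_cpus.
# per_cpu_mb defaults to 1; that default is an assumption for illustration.
percpu_delta_mb() {
	local nr_cpus=$1
	local per_cpu_mb=${2:-1}
	echo $(( nr_cpus * per_cpu_mb ))
}

percpu_delta_mb 16    # prints 16
```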

sourabhjains · 2025-07-25T09:35:22Z

Thanks for the suggestion, @pfliu.
I will make the necessary changes and update the PR.

pfliu · 2025-08-22T12:37:00Z

@sourabhjains, what about:

diff --git a/gen-kdump-sysconfig.sh b/gen-kdump-sysconfig.sh
index eb5287b..a756597 100755
--- a/gen-kdump-sysconfig.sh
+++ b/gen-kdump-sysconfig.sh
@@ -29,10 +29,12 @@ KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swio
 
 # This variable lets us append arguments to the current kdump commandline
 # after processed by KDUMP_COMMANDLINE_REMOVE
-KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices novmcoredd cma=0 hugetlb_cma=0"
+# PER_CPU_AREA roughly equals 1MB. Remember to keep nr_cpus moderate
+KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices novmcoredd cma=0 hugetlb_cma=0"
 
 # This variable lets us append arguments to fadump (powerpc) capture kernel,
 # further to the parameters passed via the bootloader.
+# PER_CPU_AREA roughly equals 1MB. Remember to keep nr_cpus moderate
 FADUMP_COMMANDLINE_APPEND=""
 
 # Any additional kexec arguments required.  In most situations, this should
diff --git a/kdump-lib.sh b/kdump-lib.sh
index a3ef956..6d1e28c 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -1004,8 +1004,12 @@ _crashkernel_add()
 kdump_get_arch_recommend_crashkernel()
 {
 	local _arch _ck_cmdline _dump_mode
+	local _pcpu_area=0
+	# the basic formula is based on nr_cpus=1
+	local nr_cpus=1
 	local _delta=0
 
+
 	if [[ -z $1 ]]; then
 		if is_fadump_capable; then
 			_dump_mode=fadump
@@ -1047,6 +1051,11 @@ kdump_get_arch_recommend_crashkernel()
 			has_mlx5 && ((_delta += 150))
 		fi
 	elif [[ $_arch == "ppc64le" ]]; then
+		# per cpu area takes 1MB
+		_pcpu_area=1
+		nr_cpus=16
+
+		_delta=$(( _delta + _pcpu_area * nr_cpus ))
 		if [[ $_dump_mode == "fadump" ]]; then
 			_ck_cmdline="4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G"
 		else

sourabhjains · 2025-08-23T18:22:09Z

@pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.

Please review my latest patches and let me know if you have any review comments. And apologies for the delayed response.
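A rough sketch of what parsing the configured nr_cpus value could look like. This is not the function from the patches: the name parse_nr_cpus, the fallback of 1, and the choice to take the last occurrence on the line are all assumptions made for illustration.

```shell
# Hypothetical helper: print the value of the last nr_cpus=N argument found
# in a kernel command-line string, or a fallback (default 1) if none exists.
parse_nr_cpus() {
	local cmdline=$1
	local fallback=${2:-1}
	local arg value=""
	# Unquoted on purpose: split the string into whitespace-separated args.
	for arg in $cmdline; do
		case $arg in
			nr_cpus=*) value=${arg#nr_cpus=} ;;
		esac
	done
	echo "${value:-$fallback}"
}

append="irqpoll nr_cpus=16 noirqdistrib reset_devices numa=off"
parse_nr_cpus "$append"    # prints 16
```

Reading the value from the configured command line, rather than hard-coding it, is what lets the crashkernel calculation follow a user's change to nr_cpus.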

sourabhjains · 2025-08-23T18:27:35Z

@pfliu How do you prepare an RPM package from the upstream source?

I see a spec file in the source, but I didn't find a Makefile that can create a tar from the source.

I had a hard time testing these changes.

coiby · 2025-09-01T09:31:03Z

/packit build

coiby · 2025-09-01T09:53:12Z

Hi @sourabhjains,

Sorry to hear you had difficulty testing the changes. Which of the following three methods works best for you?

1. Let GitHub Action create the RPM for you
   FYI, the RPM build job has finished for Fedora-41 and Fedora-rawhide. You should be able to download the RPMs now.
2. Run "make install" to install the files directly
   You can use "git archive" to create a source tarball, copy it to the target machine, unpack it, and then run make install.
3. Create the RPM yourself
   tools/run-integration-tests.sh has the code to build an RPM. If you prefer this, I'll extract the code and put it into the Makefile.
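For the "git archive" route, a minimal sketch of creating the source tarball. The helper name, the kdump-utils/ prefix, and the default output file name are placeholders chosen for illustration, not names taken from the project.

```shell
# Hypothetical helper: archive HEAD of the repository in the current
# directory into a gzip-compressed tarball with a top-level directory prefix.
make_source_tarball() {
	local out=${1:-kdump-utils.tar.gz}
	git archive --format=tar.gz --prefix=kdump-utils/ -o "$out" HEAD
}
```

Run it from the top of the checkout, copy the resulting file to the target machine, unpack it with tar -xzf, and run make install inside the unpacked directory.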

sourabhjains · 2025-09-02T03:49:57Z

> Hi @sourabhjains,
>
> Sorry to hear you had difficulty testing the changes. Which of the following three methods works best for you?
>
> Let GitHub Action create the RPM for you
> FYI, the RPM build job has finished for Fedora-41 and Fedora-rawhide. You should be able to download the RPMs now.
>
> Run "make install" to install the files directly
> You can use "git archive" to create a source tarball, copy it to the target machine, unpack it, and then run make install.
>
> Create the RPM yourself
> tools/run-integration-tests.sh has the code to build an RPM. If you prefer this, I'll extract the code and put it into the Makefile.

Thanks @coiby for providing the steps to build the RPM on our own. Although I have verified this patch, I will create the RPM and try again for RHEL.

sourabhjains · 2025-09-03T16:30:50Z

@coiby Thanks for pointing me to tools/run-integration-tests.sh. I am now able to create the tar and RPM packages.

I tested this patch on RHEL10 and things are working as expected.

sourabhjains · 2025-09-03T16:34:59Z

> Create the RPM yourself
> tools/run-integration-tests.sh has the code to build an RPM. If you prefer this, I'll extract the code and put it into the Makefile.

@coiby Yes, having a Makefile target to generate the kdump-utils tar would be helpful.

pfliu · 2025-09-10T10:39:44Z

@sourabhjains ,

> @pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

Great. That is what I expected.

> I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

I suggest hiding the detail inside kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus as a parameter, since we already handle smmu and mlx5 there.

What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel()?

Besides, is 1 MB per CPU enough? When dumping over ssh, does makedumpfile consume more memory in multi-threaded mode?

> The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.
>
> Please review my latest patches and let me know if you have any review comments. And apologies for the delayed response.

Np. I really like your solution. I am looking forward to the final version.

Thanks.

coiby · 2025-09-19T02:11:24Z

>> Create the RPM yourself
>> tools/run-integration-tests.sh has the code to build an RPM. If you prefer this, I'll extract the code and put it into the Makefile.
>
> @coiby Yes, having a Makefile target to generate the kdump-utils tar would be helpful.

To clarify, do you need a Makefile target to create the kdump-utils tar or the rpm?

sourabhjains · 2025-10-13T08:57:53Z

>>> Create the RPM yourself
>>> tools/run-integration-tests.sh has the code to build an RPM. If you prefer this, I'll extract the code and put it into the Makefile.
>>
>> @coiby Yes, having a Makefile target to generate the kdump-utils tar would be helpful.
>
> To clarify, do you need a Makefile target to create the kdump-utils tar or the rpm?

A target to build the rpm. But if that is too much work, I am happy with a target that builds the tar.

sourabhjains · 2025-10-13T14:46:00Z

@pfliu Apologies for the late response - I was caught up with some personal things.

>> @pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.
>
> Great. That is what I expected.
>
>> I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.
>
> I suggest hiding the detail inside kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus as a parameter, since we already handle smmu and mlx5 there.
>
> What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel()?

Yeah, it is better to keep find_nr_cpus() in kdump-lib.sh itself.

The only thing is that I need to know the dump mode and the command-line arguments in kdump-lib.sh. I hope it is fine to access DEFAULT_DUMP_MODE and KDUMP_COMMANDLINE_APPEND/FADUMP_COMMANDLINE_APPEND in kdump-lib.sh.
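To illustrate why both the dump mode and the append strings are needed, here is a small sketch of picking the command line that applies to a given mode. The helper name and its explicit arguments are hypothetical; the real scripts may load and select these values differently.

```shell
# Hypothetical helper: print the append string that applies to a dump mode.
# The strings are passed in explicitly so the sketch does not depend on how
# the real scripts read their configuration.
append_cmdline_for_mode() {
	local mode=$1 kdump_append=$2 fadump_append=$3
	if [ "$mode" = "fadump" ]; then
		echo "$fadump_append"
	else
		echo "$kdump_append"
	fi
}

append_cmdline_for_mode kdump "irqpoll nr_cpus=16" "nr_cpus=16"    # prints irqpoll nr_cpus=16
```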

> Besides, is 1 MB per CPU enough? When dumping over ssh, does makedumpfile consume more memory in multi-threaded mode?

I will verify it before posting the next version.

>> The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.
>>
>> Please review my latest patches and let me know if you have any review comments. And apologies for the delayed response.
>
> Np. I really like your solution. I am looking forward to the final version.
>
> Thanks.

sourabhjains · 2025-10-14T07:46:06Z

> @sourabhjains ,
>
>> @pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.
>
> Great. That is what I expected.
>
>> I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.
>
> I suggest hiding the detail inside kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus as a parameter, since we already handle smmu and mlx5 there.
>
> What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel()?
>
> Besides, is 1 MB per CPU enough? When dumping over ssh, does makedumpfile consume more memory in multi-threaded mode?

makedumpfile takes around 35 MB to capture the dump to an SSH remote target for both nr_cpus=1 and nr_cpus=16. So no additional memory is needed for this case.

@pfliu I updated the patches as per your latest suggestions. Please review the changes and let me know if you have any comments or suggestions.

The next patch in the series adds more CPUs to the capture kernel, which increases the memory requirement for the capture kernel. Experiments show that powerpc needs 1 MB of additional memory for every CPU added. Therefore, while calculating the crashkernel size, make sure to include an additional 1 MB for every CPU configured in the capture kernel.

The changes are implemented in such a way that if the user changes the nr_cpus value in the kdump configuration, the script will adapt accordingly.

Signed-off-by: Sourabh Jain <[email protected]>

Configure the kdump kernel with nr_cpus=16 to enable multi-threading in the makedumpfile core collector, allowing faster dump collection.

Commit d428557 ("Use all available CPUs to collect dump") introduced multi-threading support in the core collector.

There are two reasons for choosing nr_cpus=16:
- Multiple experiments show that optimal performance is achieved when nr_cpus is between 16 and 30. The graph in commit d428557 supports this.
- nr_cpus=16 is already used for the fadump kernel, so using the same value for kdump to maintain consistency.

Signed-off-by: Sourabh Jain <[email protected]>

sourabhjains · 2025-10-14T07:47:16Z

Rebased the patches on top of today's main branch.

sourabhjains · 2025-10-31T05:45:21Z

Gentle reminder to review the pull request.

gemini-code-assist bot reviewed Jul 16, 2025

sourabhjains force-pushed the kdump-nr-cpus branch from 9a1deb0 to 488e5d1 Compare August 23, 2025 18:14

sourabhjains force-pushed the kdump-nr-cpus branch from 488e5d1 to 3afc6eb Compare October 14, 2025 07:41

sourabhjains added 2 commits October 14, 2025 13:16

sourabhjains force-pushed the kdump-nr-cpus branch from 3afc6eb to 3fca334 Compare October 14, 2025 07:46

sourabhjains requested a review from pfliu October 15, 2025 10:12

Review comments: gemini-code-assist bot (Jul 16, 2025), pfliu (Jul 17, 2025), sourabhjains (Jul 21, 2025), pfliu (Jul 21, 2025), pfliu (Jul 21, 2025)
