Skip to content

Conversation

@sourabhjains
Copy link
Contributor

Configure the kdump kernel with nr_cpus=16 to enable multi-threading in the makedumpfile core collector, allowing faster dump collection.

Commit d428557 ("Use all available CPUs to collect dump") enabled the use of multi-threading for makedumpfile core collector.

There are two reasons for choosing nr_cpus=16:

  • Multiple experiments show that optimal performance is achieved when nr_cpus is between 16 and 30. The graph in commit d428557 supports this.

  • nr_cpus=16 is already used for the fadump kernel, so using the same value for kdump to maintain consistency.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @sourabhjains, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request optimizes the kdump process on PowerPC systems by adjusting the kdump kernel's CPU configuration. By increasing the number of available CPUs, the makedumpfile utility can operate in a multi-threaded manner, leading to faster and more efficient crash dump collection. This change also ensures consistency with existing fadump configurations.

Highlights

  • Performance Enhancement: The primary objective of this change is to significantly improve the performance of the makedumpfile core collector during kdump operations by enabling it to leverage multi-threading more effectively.
  • Kernel Parameter Update: The nr_cpus kernel boot parameter for the kdump kernel on ppc64le systems has been increased from 1 to 16. This allows the kdump kernel to utilize more CPU cores for dump collection.
  • Configuration Consistency: This update also brings the nr_cpus setting for the kdump kernel in line with the existing configuration for the fadump kernel, both now set to 16 for consistency.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the nr_cpus parameter for the kdump kernel on ppc64le from 1 to 16, aiming to improve dump collection performance and ensure consistency with the fadump configuration.

"hugepages hugepagesz slub_debug quiet log_buf_len swiotlb hugetlb_cma ignition.firstboot"
update_param KDUMP_COMMANDLINE_APPEND \
"irqpoll nr_cpus=1 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"
"irqpoll nr_cpus=16 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The value 16 for nr_cpus is duplicated on line 105. To improve maintainability, define this value as a variable at the beginning of the ppc64le) case block and reuse it in both update_param calls.

Suggested change
"irqpoll nr_cpus=16 noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"
PPC_NR_CPUS=16
update_param KDUMP_COMMANDLINE_REMOVE \
"hugepages hugepagesz slub_debug quiet log_buf_len swiotlb hugetlb_cma ignition.firstboot"
update_param KDUMP_COMMANDLINE_APPEND \
"irqpoll nr_cpus=${PPC_NR_CPUS} noirqdistrib reset_devices cgroup_disable=memory numa=off udev.children-max=2 ehea.use_mcs=0 panic=10 kvm_cma_resv_ratio=0 transparent_hugepage=never novmcoredd hugetlb_cma=0"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that comes to mind is nr_cpus enlarges PER_CPU area, which risks OOM. So we should consider how to adjust the crashkernel for PPC at the same time.

Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PER_CPU memory usage for the kdump kernel on RHEL9 and RHEL10 is as follows:

nr_cpus=1 :  7 MB
nr_cpus=16 : 14 MB

Even with nr_cpus=1, the number of CPUs online in the kdump kernel was 8. This might be due to SMT, which could explain why the PER_CPU usage is 7 MB for nr_cpus=1 and 14 MB for nr_cpus=16.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea to sync nr_cpus in both kdump_get_arch_recommend_crashkernel() and gen-kdump-sysconfig.sh.

What about use nr_cpus * PER_CPU in kdump_get_arch_recommend_crashkernel() to calculate the crashkernel value and leave notes in the config file to remind the user to update the crashkernel value when modifying the nr_cpus

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coiby @licliu , please share your thoughts on my idea.

Thanks

@pfliu
Copy link
Collaborator

pfliu commented Jul 18, 2025

I suggest to add crashkernel by nr_cpus * PER_CPU_AREA_SIZE in kdump_get_arch_recommend_crashkernel() for PPC. Where the default nr_cpus is 16 here.

@sourabhjains
Copy link
Contributor Author

Thanks for the suggestion @pfliu
I will do the necessary changes and update the PR.

@pfliu
Copy link
Collaborator

pfliu commented Aug 22, 2025

@sourabhjains, what about:

diff --git a/gen-kdump-sysconfig.sh b/gen-kdump-sysconfig.sh
index eb5287b..a756597 100755
--- a/gen-kdump-sysconfig.sh
+++ b/gen-kdump-sysconfig.sh
@@ -29,10 +29,12 @@ KDUMP_COMMANDLINE_REMOVE="hugepages hugepagesz slub_debug quiet log_buf_len swio
 
 # This variable lets us append arguments to the current kdump commandline
 # after processed by KDUMP_COMMANDLINE_REMOVE
-KDUMP_COMMANDLINE_APPEND="irqpoll maxcpus=1 reset_devices novmcoredd cma=0 hugetlb_cma=0"
+# PER_CPU_AREA roughly equals 1MB. Remember to keep nr_cpus moderate
+KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices novmcoredd cma=0 hugetlb_cma=0"
 
 # This variable lets us append arguments to fadump (powerpc) capture kernel,
 # further to the parameters passed via the bootloader.
+# PER_CPU_AREA roughly equals 1MB. Remember to keep nr_cpus moderate
 FADUMP_COMMANDLINE_APPEND=""
 
 # Any additional kexec arguments required.  In most situations, this should
diff --git a/kdump-lib.sh b/kdump-lib.sh
index a3ef956..6d1e28c 100755
--- a/kdump-lib.sh
+++ b/kdump-lib.sh
@@ -1004,8 +1004,12 @@ _crashkernel_add()
 kdump_get_arch_recommend_crashkernel()
 {
 	local _arch _ck_cmdline _dump_mode
+	local _pcpu_area=0
+	# the basic formula is based on nr_cpus=1
+	local nr_cpus=1
 	local _delta=0
 
+
 	if [[ -z $1 ]]; then
 		if is_fadump_capable; then
 			_dump_mode=fadump
@@ -1047,6 +1051,11 @@ kdump_get_arch_recommend_crashkernel()
 			has_mlx5 && ((_delta += 150))
 		fi
 	elif [[ $_arch == "ppc64le" ]]; then
+		# per cpu area takes 1MB
+		_pcpu_area=1
+		nr_cpus=16
+
+		_delta=$(( _delta + _pcpu_area * nr_cpus ))
 		if [[ $_dump_mode == "fadump" ]]; then
 			_ck_cmdline="4G-16G:768M,16G-64G:1G,64G-128G:2G,128G-1T:4G,1T-2T:6G,2T-4T:12G,4T-8T:20G,8T-16T:36G,16T-32T:64G,32T-64T:128G,64T-:180G"
 		else

@sourabhjains
Copy link
Contributor Author

@pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.

Please review my latest patches and let me if you any review comment. And apologies for delayed response.

@sourabhjains
Copy link
Contributor Author

@pfliu How do you prepare rpm package out of upstream source?

I see a spec file in the source but I didn't find Makefile that can create tar out of source.

I had a hard time testing these changes.

@coiby
Copy link
Member

coiby commented Sep 1, 2025

/packit build

@coiby
Copy link
Member

coiby commented Sep 1, 2025

Hi @sourabhjains,

Sorry to hear you have difficultly testing the changes. Do you think among the following three methods, which way works best for you?

  1. Let Github Action create RPM for you
    FYI, the rpm building job has been finished for Fedora-41 and Fedora-rawhide. You shall be able to download the PRMs now.

  2. Run "make install" to install the files directly
    You can use "git archive" to create a source tarball, copy it to the target machine, unpack it and then run make install

  3. Create RPM by yourself
    tools/run-integration-tests.sh has the code to build a RPM. If you prefer this, I'll extract the code and put it into Makefile.

@sourabhjains
Copy link
Contributor Author

Hi @sourabhjains,

Sorry to hear you have difficultly testing the changes. Do you think among the following three methods, which way works best for you?

  1. Let Github Action create RPM for you
    FYI, the rpm building job has been finished for Fedora-41 and Fedora-rawhide. You shall be able to download the PRMs now.
  2. Run "make install" to install the files directly
    You can use "git archive" to create a source tarball, copy it to the target machine, unpack it and then run make install
  3. Create RPM by yourself
    tools/run-integration-tests.sh has the code to build a RPM. If you prefer this, I'll extract the code and put it into Makefile.

Thanks @coiby for providing the steps to build the RPM on our own. Although I have verified this patch, I will create the RPM and try again for RHEL.

@sourabhjains
Copy link
Contributor Author

@coiby Thanks for sharing tools/run-integration-tests.sh I am able to create tar and rpm package.

I tested this patch on RHEL10 and things are working as expected.

@sourabhjains
Copy link
Contributor Author

  1. Create RPM by yourself
    tools/run-integration-tests.sh has the code to build a RPM. If you prefer this, I'll extract the code and put it into Makefile.

@coiby Yes, having a Makefile target to generate the kdump-util tar will be helpful.

@pfliu
Copy link
Collaborator

pfliu commented Sep 10, 2025

@sourabhjains ,

@pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

Great. That is what I expect.

I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

I suggest to hide the detail in kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus through param since currently we handle smmu, mlx5 in it.

What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel().

Beside is it enough 1M per cpu? If dumping through ssh, does the makedumpile consume more memory in multi-thread?

The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.

Please review my latest patches and let me if you any review comment. And apologies for delayed response.

Np. I really like your solution. I am looking forward to the final version

Thanks.

@coiby
Copy link
Member

coiby commented Sep 19, 2025

  1. Create RPM by yourself
    tools/run-integration-tests.sh has the code to build a RPM. If you prefer this, I'll extract the code and put it into Makefile.

@coiby Yes, having a Makefile target to generate the kdump-util tar will be helpful.

To clarify, do you need a Makefile target to create kdump-utils tar or rpm?

@sourabhjains
Copy link
Contributor Author

  1. Create RPM by yourself
    tools/run-integration-tests.sh has the code to build a RPM. If you prefer this, I'll extract the code and put it into Makefile.

@coiby Yes, having a Makefile target to generate the kdump-util tar will be helpful.

To clarify, do you need a Makefile target to create kdump-utils tar or rpm?

target to build rpm. But if that is too much of work I am even happy with target to build tar.

@sourabhjains
Copy link
Contributor Author

@pfliu Apologies for the late response - I was caught up with some personal things.

@pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

Great. That is what I expect.

I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

I suggest to hide the detail in kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus through param since currently we handle smmu, mlx5 in it.

What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel().

Yeah it is better to keep find_nr_cpus() in kdump-lib.sh file itself.

The only things is I need to know the dump mode and command-line argument in kdump-lib.sh. Hope it is fine to access DEFAULT_DUMP_MODE and KDUMP_COMMANDLINE_APPEND/FADUMP_COMMANDLINE_APPEND in kdump-lib.sh

Beside is it enough 1M per cpu? If dumping through ssh, does the makedumpile consume more memory in multi-thread?

I will verify it before posting the next version.

The additional memory calculation is currently applicable only to PowerPC, but it can be easily extended if other architectures decide to increase their nr_cpus value in the future.

Please review my latest patches and let me if you any review comment. And apologies for delayed response.

Np. I really like your solution. I am looking forward to the final version

Thanks.

@sourabhjains
Copy link
Contributor Author

@sourabhjains ,

@pfliu I have used your approach for additional crashkernel memory calculation, but instead of hard-coding the nr_cpus value, I used a dynamic approach.

Great. That is what I expect.

I added a function in kdumpctl to parse the configured nr_cpus value and passed it to the respective functions so that it is considered while calculating the crashkernel value.

I suggest to hide the detail in kdump_get_arch_recommend_crashkernel() instead of passing nr_cpus through param since currently we handle smmu, mlx5 in it.

What about calling find_nr_cpus() directly from kdump_get_arch_recommend_crashkernel().

Beside is it enough 1M per cpu? If dumping through ssh, does the makedumpile consume more memory in multi-thread?

makedumpfile take around 35 MB to capture dump at SSH remote target for both nr_cpus=1 and nr_cpus=16. So no there is no additional memory needed for the above case.

@pfliu I updated the patches as per you latest suggestions. Please review the changes and let me know if you any comments/suggestion.

The next patch in the series adds more CPUs to the capture kernel,
which increases the memory requirement for the capture kernel.
Experiments show that powerpc needs 1 MB of additional memory for every
CPU added.

Therefore, while calculating the crashkernel size, make sure to include
an additional 1 MB for every CPU configured in the capture kernel.

The changes are implemented in such a way that if the user changes the
nr_cpus value in the kdump configuration, the script will adapt
accordingly.

Signed-off-by: Sourabh Jain <[email protected]>
Configure the kdump kernel with nr_cpus=16 to enable multi-threading in
the makedumpfile core collector, allowing faster dump collection.

Commit d428557 ("Use all available CPUs to collect dump")
introduced multi-threading support in the core collector.

There are two reasons for choosing nr_cpus=16:

 - Multiple experiments show that optimal performance is achieved when
   nr_cpus is between 16 and 30. The graph in commit d428557
   supports this.

 - nr_cpus=16 is already used for the fadump kernel, so using the same
   value for kdump to maintain consistency.

Signed-off-by: Sourabh Jain <[email protected]>
@sourabhjains
Copy link
Contributor Author

Rebased the patches on top of today's main branch.

@sourabhjains sourabhjains requested a review from pfliu October 15, 2025 10:12
@sourabhjains
Copy link
Contributor Author

Gentle reminder to review the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants