
Conversation

takasurazeem (Contributor)
This is a documentation update with clearer wording on what the `-t` parameter actually means.

@ericcurtin (Collaborator) commented Sep 24, 2025

Are we sure this is correct? I was chatting with @doringeman about this recently. I tested this in the past and the default was certainly low, 4 threads (unless something has changed in the meantime):

`#define GGML_DEFAULT_N_THREADS 4`

It was super apparent in the past when testing on an Ampere system with 100+ cores

@takasurazeem (Contributor Author) commented Sep 25, 2025

> Are we sure this is correct? I was chatting with @doringeman about this recently. I tested this in the past and the default was certainly low, 4 threads (unless something has changed in the meantime):
>
> `#define GGML_DEFAULT_N_THREADS 4`
>
> It was super apparent in the past when testing on an Ampere system with 100+ cores

Oh, my bad, it must have been an oversight. I will go through the code and confirm. In any case, the docs could benefit from more descriptive wording; thanks for the review.

@adhusch commented Sep 26, 2025

> Are we sure this is correct? I was chatting with @doringeman about this recently. I tested this in the past and the default was certainly low, 4 threads (unless something has changed in the meantime):
>
> `#define GGML_DEFAULT_N_THREADS 4`
>
> It was super apparent in the past when testing on an Ampere system with 100+ cores

I can confirm that it now uses 100+ threads by default when you have 100+ cores. Since this is likely not desired (though, on the other hand, the -1 default probably makes sense for the majority of users), the improved documentation is very valuable.

@ngxson (Collaborator) left a comment

This table is auto-generated, changes here will be discarded. Make your changes to arg.cpp instead.

<!-- Note for contributors: The list below is generated by llama-gen-docs -->

@takasurazeem takasurazeem requested a review from ngxson October 9, 2025 01:58
@takasurazeem (Contributor Author)

> This table is auto-generated, changes here will be discarded. Make your changes to arg.cpp instead.
>
> <!-- Note for contributors: The list below is generated by llama-gen-docs -->

Addressed and added to correct file.

common/arg.cpp (Outdated)

```diff
 add_opt(common_arg(
     {"-t", "--threads"}, "N",
-    string_format("number of threads to use during generation (default: %d)", params.cpuparams.n_threads),
+    string_format("number of CPU threads to use during generation (default: %d, use all available.)", params.cpuparams.n_threads),
```
A Member commented:
The default is not to use all available, the logic is more complex than that.

takasurazeem (Contributor Author) replied:
Ok, I have taken out the "use all available" part, but kept "CPU" because it makes the message clearer. I will take a look at the logic and put up another PR with a more descriptive message.

A Collaborator replied:
@slaren I was thinking the same; I saw it being set to 4 on an Ampere ARM machine with a bazillion cores in the past.

A Member replied:
There is some logic to avoid using logical cores (e.g. from SMT), but it may not work well on the Ampere CPU.
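The SMT-avoidance idea mentioned above can be illustrated with a deliberately simplified heuristic. The real code inspects the OS's CPU topology; this sketch, with the hypothetical `estimate_physical_cores`, just divides the logical count by an assumed threads-per-core factor, which shows why a fixed 2-per-core assumption would undercount on an Ampere chip, where every logical CPU is a physical core:

```cpp
// Illustrative heuristic only: assume each physical core exposes
// `threads_per_core` logical CPUs (2 under typical x86 SMT, 1 on many
// ARM server chips such as Ampere, which have no SMT).
int estimate_physical_cores(unsigned logical, unsigned threads_per_core) {
    if (logical == 0 || threads_per_core == 0) {
        return 1; // topology unknown: be conservative
    }
    unsigned physical = logical / threads_per_core;
    return physical > 0 ? static_cast<int>(physical) : 1;
}
```

With `threads_per_core` hard-coded to 2, a 128-core SMT-less machine would be treated as having 64 cores, so robust implementations read the actual topology from the OS rather than assuming a fixed ratio.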

@takasurazeem takasurazeem requested a review from slaren October 15, 2025 14:00
@ggerganov ggerganov merged commit 6f5d924 into ggml-org:master Oct 16, 2025
70 checks passed
@takasurazeem takasurazeem deleted the patch-2 branch October 18, 2025 02:33

6 participants