diff --git a/docs/opts.rst b/docs/opts.rst
--- a/docs/opts.rst
+++ b/docs/opts.rst
@@ -128,7 +128,9 @@ Diagnostic options
 Algorithm performance options
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-**nthreads**: Number of threads to use. This is capped at the number of available threads (eg, to prevent misuse of a single-threaded code). It then sets the number of threads FINUFFT will use in FFTW, bin-sorting, and spreading/interpolation steps. This number of threads also controls the batch size for vectorized transforms (ie ``ntr>1`` :ref:`here <c>`). Setting ``nthreads=0`` uses all threads available, usually recommended. However, for repeated small problems it can be advantageous to use a small number, even as small as 1.
+**nthreads**: (Ignored in single-threaded library builds.) If positive, sets the number of threads to use throughout (multi-threaded build of) library, or if ``0`` uses the maximum number of threads available according to OpenMP. In the positive case, no cap is placed on this number. This number of threads is passed to bin-sorting (which may choose to use less threads), but is adhered to in FFTW and spreading/interpolation steps. This number of threads (or 1 for single-threaded builds) also controls the batch size for vectorized transforms (ie ``ntr>1`` :ref:`here <c>`).
+For medium-to-large transforms, ``0`` is usually recommended.
+However, for (repeated) small transforms it can be advantageous to use a small number, even as small as ``1``.
 
 **fftw**: FFTW planner flags. This number is simply passed to FFTW's planner;
 the flags are documented `here <http://www.fftw.org/fftw3_doc/Planner-Flags.html#Planner-Flags>`_.
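
The revised ``nthreads`` semantics can be summarised as a small decision function. The following is a minimal standalone sketch, not FINUFFT code: ``choose_nthreads``, ``ThreadChoice``, and the ``openmp_build`` flag are hypothetical names introduced for illustration, and ``omp_max`` stands in for whatever OpenMP reports as the maximum thread count.

```cpp
// Hypothetical helper (not part of FINUFFT): models the documented rule for
// turning a requested opts.nthreads into the thread count actually planned for.
struct ThreadChoice {
  int nthr;     // thread count that would be used
  bool warned;  // whether a warning would be printed to stderr
};

// requested    : the user's opts.nthreads value
// omp_max      : assumed stand-in for the OpenMP maximum thread count
// openmp_build : true for a multi-threaded build, false for single-threaded
// showwarn     : models opts.showwarn
inline ThreadChoice choose_nthreads(int requested, int omp_max,
                                    bool openmp_build, bool showwarn = true) {
  if (!openmp_build) {
    // Single-threaded build: always 1 thread; requests above 1 are ignored
    // with a warning (this warning is not gated on showwarn in the diff).
    return {1, requested > 1};
  }
  if (requested > 0) {
    // Positive request: honoured without a cap; warn if it exceeds omp_max.
    return {requested, showwarn && requested > omp_max};
  }
  // Zero (or negative) request: use everything OpenMP reports.
  return {omp_max, false};
}
```

For instance, under these assumptions ``choose_nthreads(16, 8, true)`` keeps all 16 threads but flags a warning, whereas the previous behaviour would have clamped the request to 8.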
diff --git a/src/finufft.cpp b/src/finufft.cpp
--- a/src/finufft.cpp
+++ b/src/finufft.cpp
@@ -591,15 +591,22 @@ int FINUFFT_MAKEPLAN(int type, int dim, BIGINT* n_modes, int iflag,
   p->fftSign = (iflag>=0) ? 1 : -1; // clean up flag input
 
   // choose overall # threads...
-  int maxnthr = MY_OMP_GET_MAX_THREADS();
-  int nthr = maxnthr; // use as many as OMP gives us
+#ifdef _OPENMP
+  int ompmaxnthr = MY_OMP_GET_MAX_THREADS();
+  int nthr = ompmaxnthr; // default: use as many as OMP gives us
+  // (the above could be set, or suggested set, to 1 for small enough problems...)
   if (p->opts.nthreads>0) {
-    nthr = min(maxnthr,p->opts.nthreads); // user override up to max avail
-    if (p->opts.nthreads > maxnthr) // if no OMP, maxnthr=1
-      fprintf(stderr,"%s warning: user requested %d threads, but only %d threads available; enforcing nthreads=%d.\n",__func__,p->opts.nthreads,maxnthr,nthr);
+    nthr = p->opts.nthreads; // user override, now without limit
+    if (p->opts.showwarn && (nthr > ompmaxnthr))
+      fprintf(stderr,"%s warning: using opts.nthreads=%d, more than the %d OpenMP claims available; note large nthreads can be slower.\n",__func__,nthr,ompmaxnthr);
   }
+#else
+  int nthr = 1; // always 1 thread (avoid segfault)
+  if (p->opts.nthreads>1)
+    fprintf(stderr,"%s warning: opts.nthreads=%d but library is single-threaded; ignoring!\n",__func__,p->opts.nthreads);
+#endif
   p->opts.nthreads = nthr; // store actual # thr planned for
-  // (this sets all downstream spread/interp, 1dkernel, and FFT thread counts)
+  // (this sets/limits all downstream spread/interp, 1dkernel, and FFT thread counts...)
 
   // choose batchSize for types 1,2 or 3... (uses int ceil(b/a)=1+(b-1)/a trick)
   if (p->opts.maxbatchsize==0) { // logic to auto-set best batchsize
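
The last comment in the hunk mentions the integer identity ceil(b/a) = 1+(b-1)/a. A minimal standalone sketch of that trick follows. The helper names ``ceil_div`` and ``balanced_batch_size`` are hypothetical. The diff does not show the body of the batch-size logic, so the balancing step is only an assumption about how such a trick might be applied.

```cpp
// Hypothetical helpers (not taken from the diff) illustrating the integer
// ceiling-division trick ceil(b/a) = 1 + (b-1)/a.

// Valid for b >= 1 and a >= 1; relies on truncating integer division.
inline int ceil_div(int b, int a) {
  return 1 + (b - 1) / a;
}

// Assumed usage: split ntrans transforms into the fewest batches of at most
// nthr each, then shrink the batch size so the batches are as even as possible.
inline int balanced_batch_size(int ntrans, int nthr) {
  int nbatches = ceil_div(ntrans, nthr);  // fewest batches of size <= nthr
  return ceil_div(ntrans, nbatches);      // even out the batch sizes
}
```

With ``ntrans = 10`` and ``nthr = 4``, this sketch gives three batches of size at most 4 rather than two full batches and a remainder of 2.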