docs/opts.rst
+3 -1: 3 additions & 1 deletion
@@ -124,7 +124,9 @@ Diagnostic options
 Algorithm performance options
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-**nthreads**: Number of threads to use. This is capped at the number of available threads (eg, to prevent misuse of a single-threaded code). It then sets the number of threads FINUFFT will use in FFTW, bin-sorting, and spreading/interpolation steps. This number of threads also controls the batch size for vectorized transforms (ie ``ntr>1`` :ref:`here <c>`). Setting ``nthreads=0`` uses all threads available, usually recommended. However, for repeated small problems it can be advantageous to use a small number, even as small as 1.
+**nthreads**: (Ignored in single-threaded library builds.) If positive, sets the number of threads to use throughout (multi-threaded build of) library, or if ``0`` uses the maximum number of threads available according to OpenMP. In the positive case, no cap is placed on this number. This number of threads is passed to bin-sorting (which may choose to use less threads), but is adhered to in FFTW and spreading/interpolation steps. This number of threads (or 1 for single-threaded builds) also controls the batch size for vectorized transforms (ie ``ntr>1`` :ref:`here <c>`).
+For medium-to-large transforms, ``0`` is usually recommended.
+However, for (repeated) small transforms it can be advantageous to use a small number, even as small as ``1``.
 
 **fftw**: FFTW planner flags. This number is simply passed to FFTW's planner;
 the flags are documented `here <http://www.fftw.org/fftw3_doc/Planner-Flags.html#Planner-Flags>`_.
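The revised **nthreads** rule can be summarized in a few lines. The following is only a minimal sketch of the behavior described in the text, not FINUFFT code: the helper name `resolve_nthreads` and its parameters `omp_max` and `multithreaded` are hypothetical and exist only for illustration.

```cpp
// Hypothetical helper (NOT part of FINUFFT): resolves the effective thread
// count following the documented nthreads rule.
//   requested     : the user's opts.nthreads value
//   omp_max       : maximum thread count reported by OpenMP (assumed input)
//   multithreaded : false models a single-threaded library build
int resolve_nthreads(int requested, int omp_max, bool multithreaded) {
  if (!multithreaded)
    return 1;            // option is ignored in single-threaded builds
  if (requested > 0)
    return requested;    // positive value is used as-is, with no cap
  return omp_max;        // 0 means: use everything OpenMP reports
}
```

Under these assumptions, a request of 16 threads on a machine where OpenMP reports 8 yields 16 (oversubscription is permitted), while the same request against a single-threaded build yields 1.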
src/finufft.cpp
+13 -6: 13 additions & 6 deletions
@@ -604,15 +604,22 @@ int FINUFFT_MAKEPLAN(int type, int dim, BIGINT* n_modes, int iflag,
   p->fftSign = (iflag>=0) ? 1 : -1; // clean up flag input
 
   // choose overall # threads...
-  int maxnthr = MY_OMP_GET_MAX_THREADS();
-  int nthr = maxnthr; // use as many as OMP gives us
+#ifdef _OPENMP
+  int ompmaxnthr = MY_OMP_GET_MAX_THREADS();
+  int nthr = ompmaxnthr; // default: use as many as OMP gives us
+  // (the above could be set, or suggested set, to 1 for small enough problems...)
   if (p->opts.nthreads>0) {
-    nthr = min(maxnthr,p->opts.nthreads); // user override up to max avail
-    if (p->opts.nthreads > maxnthr) // if no OMP, maxnthr=1
-      fprintf(stderr,"%s warning: user requested %d threads, but only %d threads available; enforcing nthreads=%d.\n",__func__,p->opts.nthreads,maxnthr,nthr);
+    nthr = p->opts.nthreads; // user override, now without limit
+    if (p->opts.showwarn && (nthr > ompmaxnthr))
+      fprintf(stderr,"%s warning: using opts.nthreads=%d, more than the %d OpenMP claims available; note large nthreads can be slower.\n",__func__,nthr,ompmaxnthr);
   }
+#else
+  int nthr = 1; // always 1 thread (avoid segfault)
+  if (p->opts.nthreads>1)
+    fprintf(stderr,"%s warning: opts.nthreads=%d but library is single-threaded; ignoring!\n",__func__,p->opts.nthreads);
+#endif
   p->opts.nthreads = nthr; // store actual # thr planned for
-  // (this sets all downstream spread/interp, 1dkernel, and FFT thread counts)
+  // (this sets/limits all downstream spread/interp, 1dkernel, and FFT thread counts...)
 
   // choose batchSize for types 1,2 or 3... (uses int ceil(b/a)=1+(b-1)/a trick)
   if (p->opts.maxbatchsize==0) { // logic to auto-set best batchsize
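The hunk ends before the batch-size logic itself, so the following is only a sketch of how the integer-ceiling trick named in the comment, `ceil(b/a) = 1+(b-1)/a`, could feed a batch-size choice. The names `ceil_div`, `BatchPlan`, and `plan_batches`, and the balancing policy itself, are assumptions made for illustration and are not taken from the diff.

```cpp
// Integer ceiling of b/a for positive integers, using the 1+(b-1)/a trick.
// Valid only for a >= 1 and b >= 1.
int ceil_div(int b, int a) { return 1 + (b - 1) / a; }

// Hypothetical result type for illustration.
struct BatchPlan {
  int nbatch;     // number of batches
  int batchSize;  // transforms per batch (last batch may be smaller)
};

// ASSUMED policy: use at most nthr transforms per batch, then shrink the
// batch size so the batches are as evenly filled as possible.
BatchPlan plan_batches(int ntrans, int nthr) {
  BatchPlan bp;
  bp.nbatch = ceil_div(ntrans, nthr);          // fewest batches of size <= nthr
  bp.batchSize = ceil_div(ntrans, bp.nbatch);  // balance the batch sizes
  return bp;
}
```

With these assumptions, 10 transforms on 4 threads would be split into 3 batches of size at most 4, rather than two full batches of 4 plus a batch of 2 at a larger nominal size. Because `nthr` is no longer capped, a large user-supplied `nthreads` would also enlarge the batch size under such a policy.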