improve comments and macros_for_performance.md

hurchalla · hurchalla · commit a1dfea9e38da · 2025-08-26T22:27:06.000-07:00
diff --git a/README.md b/README.md
@@ -95,4 +95,4 @@ The particular ECM variant used by this library originated from Ben Buhrow's mic
 The Pollard-Rho-Brent algorithm uses an easy extra step that seems to be unmentioned in the literature.  The step is a "pre-loop" that advances as quickly as possible through a portion of the initial pseudo-random sequence before beginning the otherwise normal Pollard-Rho-Brent algorithm.  The rationale for this is that every Pollard-Rho pseudo-random sequence begins with a non-periodic segment, and trying to extract factors from that segment is mostly wasted work since the algorithm logic relies on a periodic sequence.  Using a "pre-loop" that does nothing except iterate for a set number of times through the sequence thus improves performance on average, since it quickly gets past some of that unwanted non-periodic segment.  In particular, the "pre-loop" avoids calling the greatest common divisor, which would rarely find a factor during the non-periodic segment.  This optimization would likely help any form/variant of the basic Pollard-Rho algorithm.
 
 ## Performance Notes
-If you're interested in experimenting, predefining certain macros when compiling can improve performance - see [macros_for_performance.md](macros_for_performance.md).
+If you're interested in experimenting, defining certain macros when compiling can improve performance - see [macros_for_performance.md](macros_for_performance.md).
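
The "pre-loop" described in the README paragraph above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: the names `step` and `pollard_rho_brent_sketch`, the fixed starting value 2, the restriction to 32-bit `n`, and the choice to call gcd on every iteration (rather than batching products as a tuned implementation would) are all assumptions made for brevity. It needs C++17 for `std::gcd`.

```cpp
#include <cstdint>
#include <numeric>  // std::gcd (C++17)

// Hypothetical helper: one step of the sequence x -> x*x + c (mod n).
// With n < 2^32 and x, c < n, the expression x*x + c cannot overflow 64 bits.
inline std::uint64_t step(std::uint64_t x, std::uint64_t c, std::uint64_t n)
{
    return (x * x + c) % n;
}

// Returns a nontrivial factor of n, or 0 if this (c, pre_loop_length) trial
// fails.  Assumes n > 1 and 0 < c < n.
inline std::uint32_t pollard_rho_brent_sketch(std::uint32_t n, std::uint32_t c,
                                              unsigned pre_loop_length)
{
    std::uint64_t y = 2 % n;

    // Pre-loop: advance through the start of the sequence without any gcd
    // calls, since this segment is likely still non-periodic.
    for (unsigned i = 0; i < pre_loop_length; ++i)
        y = step(y, c, n);

    // Brent-style cycle detection: compare a saved point x against the next
    // r sequence values, doubling r each round.
    for (std::uint64_t r = 1; ; r *= 2) {
        const std::uint64_t x = y;
        for (std::uint64_t i = 0; i < r; ++i) {
            y = step(y, c, n);
            const std::uint64_t diff = (x > y) ? (x - y) : (y - x);
            const std::uint64_t g = std::gcd(diff, static_cast<std::uint64_t>(n));
            if (g == n) return 0;  // cycle closed modulo n: trial failed
            if (g > 1) return static_cast<std::uint32_t>(g);
        }
    }
}
```

For example, with n = 8051 (83 * 97) and c = 1, calling `pollard_rho_brent_sketch(8051, 1, 0)` and `pollard_rho_brent_sketch(8051, 1, 4)` both return 97. An input this small does not demonstrate the performance benefit; it only shows where the pre-loop sits relative to the gcd-bearing loop.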
diff --git a/include/hurchalla/factoring/detail/experimental/README_pollard_rho.md b/include/hurchalla/factoring/detail/experimental/README_pollard_rho.md
@@ -1,13 +1,13 @@
 
 The PollardRho\*.h header files in this folder are experimental functors for Pollard-Rho trials.  
-To use an experimental functor, predefine the macro HURCHALLA_POLLARD_RHO_TRIAL_FUNCTOR_NAME and give it the name of the experimental functor.  For example, if you want PollardRhoTrial and you are compiling with clang, you could invoke clang as follows:  
+To use an experimental functor, define the macro HURCHALLA_POLLARD_RHO_TRIAL_FUNCTOR_NAME and give it the name of the experimental functor.  For example, if you want PollardRhoTrial and you are compiling with clang, you could invoke clang as follows:  
 clang++ -DHURCHALLA_POLLARD_RHO_TRIAL_FUNCTOR_NAME=PollardRhoTrial  ...more options and files...  
 
 valid names to give HURCHALLA_POLLARD_RHO_TRIAL_FUNCTOR_NAME are:  
 PollardRhoTrial  
 PollardRhoBrentTrial  
 PollardRhoBrentTrialParallel  
-PollardRhoBrentSwitchingTrial  (currently this is the default, so it's not necessary to predefine the macro to this functor)  
+PollardRhoBrentSwitchingTrial  (currently this is the default, so it's not necessary to define the macro to this functor)  
 
 The PollardRhoBrentSwitchingTrial functor or the PollardRhoBrentTrialParallel functor will likely perform best on your system, but you can try others.
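
As a rough illustration of how a macro given on the command line can select a functor by name, consider the sketch below. Everything in it is hypothetical: `DEMO_TRIAL_FUNCTOR_NAME`, `DemoPlainTrial`, `DemoSwitchingTrial`, and `selected_trial_name` are stand-ins invented for this example and are not part of the library.

```cpp
#include <string>

// Hypothetical stand-ins for trial functors; NOT the library's classes.
struct DemoPlainTrial {
    static std::string name() { return "DemoPlainTrial"; }
};
struct DemoSwitchingTrial {
    static std::string name() { return "DemoSwitchingTrial"; }
};

// Fall back to a default only when the user did not pass
// -DDEMO_TRIAL_FUNCTOR_NAME=... on the compiler command line.
#ifndef DEMO_TRIAL_FUNCTOR_NAME
#  define DEMO_TRIAL_FUNCTOR_NAME DemoSwitchingTrial
#endif

// The macro expands to a type name, so it can be used in an alias.
using SelectedTrial = DEMO_TRIAL_FUNCTOR_NAME;

inline std::string selected_trial_name() { return SelectedTrial::name(); }
```

Compiled with no extra options, `selected_trial_name()` returns "DemoSwitchingTrial". Compiling the same file with `clang++ -DDEMO_TRIAL_FUNCTOR_NAME=DemoPlainTrial` would select the other functor without editing the source.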
 
diff --git a/include/hurchalla/factoring/detail/experimental/get_single_factor.h b/include/hurchalla/factoring/detail/experimental/get_single_factor.h
@@ -16,7 +16,7 @@
 
 #ifndef NDEBUG
 // you can remove this if you are *sure* you really want to run with asserts enabled..  It is SLOW.
-#  error "Performance will be severely harmed if you don't predefine the standard macro NDEBUG."
+#  error "Performance will be severely harmed if you don't define the standard macro NDEBUG."
 #endif
 
 
diff --git a/include/hurchalla/factoring/detail/impl_factorize.h b/include/hurchalla/factoring/detail/impl_factorize.h
@@ -26,7 +26,7 @@ namespace hurchalla { namespace detail {
 
 // Do *NOT* change the values for the macro in the code immediately below.  If
 // you wish to use a different value for the macro (which is fine), please
-// predefine the macro when compiling.
+// define the macro when compiling.
 #ifndef HURCHALLA_FACTORING_ECM_THRESHOLD_BITS
 #  define HURCHALLA_FACTORING_ECM_THRESHOLD_BITS 34
 #endif
diff --git a/include/hurchalla/factoring/detail/impl_greatest_common_divisor.h b/include/hurchalla/factoring/detail/impl_greatest_common_divisor.h
@@ -25,9 +25,9 @@ struct impl_greatest_common_divisor {
 // The binary GCD algorithm is usually considerably faster than the Euclidean
 // GCD algorithm.  However for native types T, some new CPUs have very fast
 // dividers that potentially could make the Euclidean GCD implementation faster
-// than the Binary GCD implementation.  You can predefine the macro 
+// than the Binary GCD implementation.  You can define the macro
 // HURCHALLA_PREFER_EUCLIDEAN_GCD in such a case.  You will also need to make
-// sure that you have predefined HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.
+// sure that you have defined HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.
 
 #if defined(HURCHALLA_PREFER_EUCLIDEAN_GCD) && \
         defined(HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE)
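
To make the comment above concrete, here is a stand-alone sketch of the two GCD approaches and a dispatch guarded by the same pair of macros. It is an assumption-laden illustration rather than the header's actual code: `gcd_euclid`, `gcd_binary`, `count_trailing_zeros`, and `demo_gcd` are hypothetical names, and the loop-based trailing-zero count stands in for whatever intrinsic a tuned implementation would use.

```cpp
#include <cstdint>

// Euclidean GCD: one division (modulo) per iteration.
inline std::uint64_t gcd_euclid(std::uint64_t a, std::uint64_t b)
{
    while (b != 0) {
        const std::uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Hypothetical helper: number of trailing zero bits.  Precondition: v != 0.
inline unsigned count_trailing_zeros(std::uint64_t v)
{
    unsigned count = 0;
    while ((v & 1u) == 0) { v >>= 1; ++count; }
    return count;
}

// Binary (Stein) GCD: uses only shifts, comparisons and subtraction.
inline std::uint64_t gcd_binary(std::uint64_t a, std::uint64_t b)
{
    if (a == 0) return b;
    if (b == 0) return a;
    const unsigned shift = count_trailing_zeros(a | b);  // shared factors of 2
    a >>= count_trailing_zeros(a);                       // a is odd from here on
    while (b != 0) {
        b >>= count_trailing_zeros(b);                   // make b odd
        if (a > b) { const std::uint64_t t = a; a = b; b = t; }
        b -= a;                                          // odd - odd is even
    }
    return a << shift;
}

// Dispatch: the Euclidean version is chosen only when both macros are defined.
inline std::uint64_t demo_gcd(std::uint64_t a, std::uint64_t b)
{
#if defined(HURCHALLA_PREFER_EUCLIDEAN_GCD) && \
        defined(HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE)
    return gcd_euclid(a, b);
#else
    return gcd_binary(a, b);
#endif
}
```

Both functions return the same result for any inputs, so the macro choice affects only speed. For example, `demo_gcd(1071, 462)` is 21 whichever branch is compiled.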
diff --git a/include/hurchalla/factoring/detail/trial_divide_mayer.h b/include/hurchalla/factoring/detail/trial_divide_mayer.h
@@ -23,7 +23,7 @@ namespace hurchalla { namespace detail {
 
 
 #if defined(HURCHALLA_USE_TRIAL_DIVIDE_VIA_INVERSE)
-#  error "HURCHALLA_USE_TRIAL_DIVIDE_VIA_INVERSE must not be predefined"
+#  error "HURCHALLA_USE_TRIAL_DIVIDE_VIA_INVERSE must not be defined"
 #endif
 #if defined(HURCHALLA_TARGET_ISA_HAS_NO_DIVIDE) || \
     !defined(HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE)
diff --git a/macros_for_performance.md b/macros_for_performance.md
@@ -1,9 +1,9 @@
 
-Optional macros to predefine to tune performance
-------------------------------------------------
-There are a number of macros you can optionally predefine to tune the
+Optional macros you can define to tune performance
+--------------------------------------------------
+There are a number of macros you can optionally define to tune the
 performance on your system for the factoring and primality testing functions.
-You would predefine one or more of these macros when compiling *your* sources,
+You would define one or more of these macros when compiling *your* sources,
 given that this is a header-only library.
 
 For example, if you are compiling using clang or gcc from the command line, you would
@@ -28,15 +28,15 @@ HURCHALLA_TRIAL_DIVISION_SIZE - this macro specifies the number of small primes
 division stage of factoring.  The default is 139.
 
 HURCHALLA_TRIAL_DIVISION_TEMPLATE - this is the name of the template that
-performs the trial division during factoring; if you predefine it, you must set it to either
+performs the trial division during factoring; if you define it, you must set it to either
 PrimeTrialDivisionWarren or PrimeTrialDivisionMayer.  PrimeTrialDivisionWarren
 is the default.  If you have a CPU with very fast division instructions
 (some CPUs from 2019 or later), you might be able to improve performance by
-predefining this macro to PrimeTrialDivisionMayer while also predefining the
+defining this macro to PrimeTrialDivisionMayer while also defining the
 macro HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.  If you need to minimize resource
-usage, then predefine this macro to PrimeTrialDivisionMayer, since the Mayer
+usage, then define this macro to PrimeTrialDivisionMayer, since the Mayer
 template uses about a fifth of the memory of PrimeTrialDivisionWarren.  If you
-do predefine this macro, you will very likely also want to predefine
+do define this macro, you will very likely also want to define
 HURCHALLA_TRIAL_DIVISION_SIZE to a new value that works best with this macro choice.
 
 HURCHALLA_POLLARD_RHO_TRIAL_FUNCTOR_NAME - the name of the algorithm/functor
@@ -52,47 +52,47 @@ Macros for is_prime():
 HURCHALLA_ISPRIME_TRIALDIV_SIZE - the number of small primes (starting
 at 2,3,5,7, etc) that will be trialed as potential factors via trial division
 before testing primality with Miller-Rabin.  21 is the default.  You can
-predefine this macro to a different number that is better tuned for your system.
+define this macro to a different number that is better tuned for your system.
 \
 \
 Macros for IsPrimeIntensive:
 
 HURCHALLA_ISPRIME_INTENSIVE_TRIALDIV_SIZE - the number of small primes
 (starting at 2,3,5,7, etc) that will be trialed as potential factors via trial
 division before testing primality with Miller-Rabin.  75 is the default.
-You can predefine this macro to a different number that is better tuned for your
+You can define this macro to a different number that is better tuned for your
 system.  Note that for IsPrimeIntensive, when testing primality of uint32_t and
 smaller types, this macro has no effect because IsPrimeIntensive tests primality
 of those types solely using the Sieve of Eratosthenes.
 
 HURCHALLA_ISPRIME_INTENSIVE_TRIALDIV_TYPE - the name of the template
-that performs the trial division; if you predefine it, you must set it to either
+that performs the trial division; if you define it, you must set it to either
 PrimeTrialDivisionWarren or PrimeTrialDivisionMayer.  PrimeTrialDivisionWarren
 is the default.  If you have a CPU with very fast division instructions (some
 CPUs from 2019 or later), you might be able to improve performance by
-predefining this macro to PrimeTrialDivisionMayer while also predefining the
+defining this macro to PrimeTrialDivisionMayer while also defining the
 macro HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.  If you need to minimize resource
-usage, then predefine this macro to PrimeTrialDivisionMayer, since the Mayer
+usage, then define this macro to PrimeTrialDivisionMayer, since the Mayer
 template uses about a fifth of the memory of PrimeTrialDivisionWarren.  If you
-do predefine this macro, you will very likely also want to predefine
+do define this macro, you will very likely also want to define
 HURCHALLA_ISPRIME_INTENSIVE_TRIALDIV_SIZE to a new value that works well with
 this macro choice.
 \
 \
 Miscellaneous macros:
 
-HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE - predefine this macro if your CPU has very
+HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE - define this macro if your CPU has very
 fast division instructions (usually this is present only in CPUs from around
 2019 or later).
 
-HURCHALLA_TARGET_ISA_HAS_NO_DIVIDE - you should usually predefine this macro if
+HURCHALLA_TARGET_ISA_HAS_NO_DIVIDE - you should usually define this macro if
 your microprocessor lacks a division instruction.
 
-HURCHALLA_FACTORIZE_NEVER_USE_MONTGOMERY_MATH - if predefined, this macro
+HURCHALLA_FACTORIZE_NEVER_USE_MONTGOMERY_MATH - if defined, this macro
 causes the factorization functions to use standard division for modular
 arithmetic, instead of montgomery arithmetic.  By default this macro is not
 defined.  If you have a CPU with very fast division instructions, you might be
-able to improve performance by predefining this macro while also predefining the
+able to improve performance by defining this macro while also defining the
 macro HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.
 
 HURCHALLA_PRB_GCD_THRESHOLD - the maximum number of iterations of
@@ -112,19 +112,19 @@ the nonperiodic segment while doing minimal work.  This increases performance,
 so long as the 'starting_length' is not too much greater than the actual
 nonperiodic segment's length.  Note: if you expect to have only large factors,
 you will generally be able to improve the performance of Pollard-Rho-Brent
-factoring by predefining this macro to a larger value than the default, because
+factoring by defining this macro to a larger value than the default, because
 both the periodic and nonperiodic segments will tend to be large when you have
 large factors.
 
-HURCHALLA_PREFER_EUCLIDEAN_GCD - predefining this macro allows you to use the
+HURCHALLA_PREFER_EUCLIDEAN_GCD - defining this macro allows you to use the
 Euclidean algorithm for greatest common divisor (GCD), rather than the default
 Binary/Stein algorithm.  However, you will only get the Euclidean GCD if you
-also predefine HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.  Note that even with both
-macros predefined, the Binary GCD will still be used for any integer type T that
+also define HURCHALLA_TARGET_CPU_HAS_FAST_DIVIDE.  Note that even with both
+macros defined, the Binary GCD will still be used for any integer type T that
 is larger than the CPU's native bit width.
 
-HURCHALLA_FACTORING_DISALLOW_INLINE_ASM - predefining this macro will prevent
-any inline asm from being compiled.  If this macro is not predefined, this
+HURCHALLA_FACTORING_DISALLOW_INLINE_ASM - defining this macro will prevent
+any inline asm from being compiled.  If this macro is not defined, this
 library by default will use inline asm for improved performance.  However,
 inline asm is difficult to thoroughly test, because the code surrounding the
 inline asm under test in part determines what binary instructions are generated