Commit f5a220a

Sensible batch multiplier
1 parent 5ee006b commit f5a220a

3 files changed, with 9 additions and 5 deletions.

FindAFactor/find_a_factor.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ def find_a_factor(n,
                   gear_factorization_level=int(os.environ.get('FINDAFACTOR_GEAR_FACTORIZATION_LEVEL')) if os.environ.get('FINDAFACTOR_GEAR_FACTORIZATION_LEVEL') else 11,
                   wheel_factorization_level=int(os.environ.get('FINDAFACTOR_WHEEL_FACTORIZATION_LEVEL')) if os.environ.get('FINDAFACTOR_WHEEL_FACTORIZATION_LEVEL') else 5,
                   thread_count=int(os.environ.get('FINDAFACTOR_THREAD_COUNT')) if os.environ.get('FINDAFACTOR_THREAD_COUNT') else 0,
-                  batch_multiplier=int(os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER')) if os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER') else 256,
+                  batch_multiplier=float(os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER')) if os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER') else 3.0,
                   smoothness_bound_multiplier=float(os.environ.get('FINDAFACTOR_SMOOTHNESS_BOUND_MULTIPLIER')) if os.environ.get('FINDAFACTOR_SMOOTHNESS_BOUND_MULTIPLIER') else 1.0):
     return int(_find_a_factor._find_a_factor(str(n),
         use_congruence_of_squares,
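
The switch from `int(...)` to `float(...)` matters for anyone setting `FINDAFACTOR_BATCH_MULTIPLIER` to a fractional value: `int("2.5")` raises `ValueError`, whereas `float("2.5")` parses. Below is a minimal sketch of the same environment-variable fallback pattern, factored into a helper. `env_float` is a hypothetical name used only for illustration and is not part of FindAFactor.

```python
import os


def env_float(name, default):
    """Return the environment variable `name` parsed as a float.

    Falls back to `default` when the variable is unset or empty,
    mirroring the `float(x) if x else default` pattern in the diff.
    (Hypothetical helper, not part of the FindAFactor package.)
    """
    raw = os.environ.get(name)
    return float(raw) if raw else default


# Fractional values are accepted now that the parameter is a float.
os.environ["FINDAFACTOR_BATCH_MULTIPLIER"] = "2.5"
print(env_float("FINDAFACTOR_BATCH_MULTIPLIER", 3.0))  # 2.5

# When the variable is unset, the new default of 3.0 applies.
del os.environ["FINDAFACTOR_BATCH_MULTIPLIER"]
print(env_float("FINDAFACTOR_BATCH_MULTIPLIER", 3.0))  # 3.0
```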

README.md

Lines changed: 3 additions & 3 deletions
@@ -32,7 +32,7 @@ factor = find_a_factor(
     gear_factorization_level=11,
     wheel_factorization_level=5,
     thread_count=0,
-    batch_multiplier=256,
+    batch_multiplier=3.0,
     smoothness_bound_multiplier=1.0
 )
```
@@ -45,7 +45,7 @@ The `find_a_factor()` function should return any nontrivial factor of `to_factor
 - `gear_factorization_level` (default value: `11`): This is the value up to which "wheel (and gear) factorization" and trial division are used to check factors and optimize "brute force," in general. The default value of `11` includes all prime factors of `11` and below and works well in general, though significantly higher might be preferred in certain cases.
 - `wheel_factorization_level` (default value: `5`): "Wheel" vs. "gear" factorization balances two types of factorization wheel ("wheel" vs. "gear" design) that often work best when the "wheel" is only a few prime number levels lower than gear factorization. Optimized implementation for wheels is only available up to `13`. The primes above "wheel" level, up to "gear" level, are the primes used specifically for "gear" factorization.
 - `thread_count` (default value: `0` for auto): Control the number of threads used for separate Gaussian elimination or parallel brute-force instances. For value of `0`, the total number of hyper threads on the system will be detected and used. When `use_congruence_of_squares=True`, this acts as a multiplier on overall memory usage. If you exceed system memory, turn it down to some manual value. (Gaussian elimination is not easily parallelizable, except to run as many separate instances as will fit in memory.)
-- `batch_multiplier` (default value: `256`): controls how many items are processed in a batch before Gaussian elimination. `batch_multiplier` times the number of "smooth" primes is the batch size for "semi-smooth" numbers, to be collected before sieving and then Gaussian elimination. Besides thread count, this `batch_multiplier` can help tune overall memory usage and multiprocessor utilization.
+- `batch_multiplier` (default value: `3.0`): controls how many items are processed in a batch before Gaussian elimination. `batch_multiplier` times the number of "smooth" primes is the batch size for "semi-smooth" numbers, to be collected before sieving and then Gaussian elimination. Besides thread count, this `batch_multiplier` can help tune overall memory usage and multiprocessor utilization.
 - `smoothness_bound_multiplier` (default value: `1.0`): starting with the first prime number after wheel factorization, the congruence of squares approach (with Quadratic Sieve) takes a default "smoothness bound" with as many distinct prime numbers as bits in the number to factor (for default argument of `1.0` multiplier). To increase or decrease this number, consider it multiplied by the value of `smoothness_bound_multiplier`.

 All variable defaults can also be controlled by environment variables:
@@ -62,7 +62,7 @@ The developer anticipates this single-function set of parameters, as API, is the

 Advantage for `use_congruence_of_squares` is beyond the hardware scale of the developer's experiments, in practicality, but it can be shown to work correctly (at disadvantage, at small factoring bit-width scales). The anticipated use case is to turn this option on when approaching the size of modern-day RSA semiprimes in use.

-If this is your use case, you want to specifically consider `smoothness_bound_multiplier`, `batch_multiplier`, and `thread_count`. By default, as many primes are kept for "smooth" number sieving as bits in the number to factor. This is multiplied by `smoothness_bound_multiplier` (and cast to a discrete number of primes in total). `batch_multiplier` is how many times this count of primes, after `smoothness_bound_multiplier`, is multiplied for "smooth number part" batching. Turning this down uses less memory and gets to Gaussian elimination faster but decreases CPU utilization. However, the higher `batch_multiplier` is set for CPU utilization, the higher the memory used is.
+If this is your use case, you want to specifically consider `smoothness_bound_multiplier`, `batch_multiplier`, and `thread_count`. By default, as many primes are kept for "smooth" number sieving as bits in the number to factor. This is multiplied by `smoothness_bound_multiplier` (and cast to a discrete number of primes in total). `batch_multiplier` is how many times this count of primes, after `smoothness_bound_multiplier`, is multiplied for "smooth number part" batching. Turning this down uses less memory and gets to Gaussian elimination faster but might or might not decrease CPU utilization. However, the higher `batch_multiplier` is set, as to maximize CPU utilization, the higher the memory used is.

 Hence, you only want to set a manual `thread_count` below default to recover full CPU utilization within available system memory footprint. Ideally, you don't want to _have_ to change thread_count from `0`/default, indicating to automatically use all hyper threads, but keeping full utilization depends on both available system memory footprint and the scale of the number to factor, inherently.
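
To make the README's arithmetic concrete, the sketch below computes the two sizes it describes: the count of "smooth" primes (bits in the number times `smoothness_bound_multiplier`) and the batch size (`batch_multiplier` times that count). `sieve_sizes` is a hypothetical helper written for this note. Truncating with `int()` is an assumption, since the rounding used inside the native extension is not shown in this diff.

```python
def sieve_sizes(n, smoothness_bound_multiplier=1.0, batch_multiplier=3.0):
    """Estimate (prime_count, batch_size) as described in the README.

    Hypothetical helper for illustration only; the truncating casts
    are an assumption about how the library discretizes these values.
    """
    bits = n.bit_length()
    # "As many primes are kept ... as bits in the number to factor",
    # scaled by smoothness_bound_multiplier.
    prime_count = int(bits * smoothness_bound_multiplier)
    # batch_multiplier times the number of "smooth" primes.
    batch_size = int(prime_count * batch_multiplier)
    return prime_count, batch_size


n = (1 << 63) + 1  # a 64-bit number

print(sieve_sizes(n))                        # (64, 192) with the new default 3.0
print(sieve_sizes(n, batch_multiplier=256))  # (64, 16384) with the old default 256
```

Under these assumptions, the new default collects roughly 85 times fewer items per batch than the old one for the same input, which is the memory reduction the commit title refers to as "sensible."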

find_a_factor.py

Lines changed: 5 additions & 1 deletion
@@ -18,6 +18,7 @@ def main():
     gear_factorization_level = int(os.environ.get('FINDAFACTOR_GEAR_FACTORIZATION_LEVEL')) if os.environ.get('FINDAFACTOR_GEAR_FACTORIZATION_LEVEL') else 11
     wheel_factorization_level = int(os.environ.get('FINDAFACTOR_WHEEL_FACTORIZATION_LEVEL')) if os.environ.get('FINDAFACTOR_WHEEL_FACTORIZATION_LEVEL') else 5
     thread_count=int(os.environ.get('FINDAFACTOR_THREAD_COUNT')) if os.environ.get('FINDAFACTOR_THREAD_COUNT') else 0
+    batch_multiplier=float(os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER')) if os.environ.get('FINDAFACTOR_BATCH_MULTIPLIER') else 3.0
     smoothness_bound_multiplier = float(os.environ.get('FINDAFACTOR_SMOOTHNESS_BOUND_MULTIPLIER')) if os.environ.get('FINDAFACTOR_SMOOTHNESS_BOUND_MULTIPLIER') else 1.0

     if argv_len > 2:
@@ -32,7 +33,9 @@ def main():
     if argv_len > 7:
         thread_count = int(sys.argv[7])
     if argv_len > 8:
-        smoothness_bound_multiplier = float(sys.argv[8])
+        batch_multiplier = float(sys.argv[8])
+    if argv_len > 9:
+        smoothness_bound_multiplier = float(sys.argv[9])

     start = time.perf_counter()
     result = find_a_factor(
@@ -43,6 +46,7 @@ def main():
         gear_factorization_level = gear_factorization_level,
         wheel_factorization_level = wheel_factorization_level,
         thread_count = thread_count,
+        batch_multiplier = batch_multiplier,
         smoothness_bound_multiplier = smoothness_bound_multiplier
     )
     print(time.perf_counter() - start)
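
Note that this hunk shifts a positional argument: `sys.argv[8]` now sets `batch_multiplier`, and `smoothness_bound_multiplier` moves to `sys.argv[9]`. The sketch below isolates just the three trailing arguments visible in the diff to show the effect. `parse_tuning_args` is a hypothetical function, and the placeholder strings stand in for earlier positional arguments that this diff does not show.

```python
def parse_tuning_args(argv, thread_count=0, batch_multiplier=3.0,
                      smoothness_bound_multiplier=1.0):
    """Parse the trailing positional arguments as the patched script does.

    Hypothetical helper for illustration; only indices 7, 8, and 9
    are taken from the diff.
    """
    argv_len = len(argv)
    if argv_len > 7:
        thread_count = int(argv[7])
    if argv_len > 8:
        batch_multiplier = float(argv[8])
    if argv_len > 9:
        smoothness_bound_multiplier = float(argv[9])
    return thread_count, batch_multiplier, smoothness_bound_multiplier


# Placeholders occupy indices 0 through 6 (not shown in this diff).
head = ["find_a_factor.py", "_", "_", "_", "_", "_", "_"]

# Nine arguments: the value at index 8 is now read as batch_multiplier.
print(parse_tuning_args(head + ["4", "2.0"]))         # (4, 2.0, 1.0)

# Ten arguments: index 9 carries smoothness_bound_multiplier.
print(parse_tuning_args(head + ["4", "2.0", "1.5"]))  # (4, 2.0, 1.5)
```

Any existing invocation that passed a smoothness bound multiplier as the ninth token would, after this commit, have that value interpreted as the batch multiplier instead.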
