Skip to content

Commit 647967a

Browse files
Update README
1 parent 200b786 commit 647967a

File tree

1 file changed

+13
-29
lines changed

1 file changed

+13
-29
lines changed

README.md

Lines changed: 13 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -31,55 +31,39 @@ factor = find_a_factor(
3131
to_factor,
3232
method=FactoringMethod.PRIME_SOLVER,
3333
node_count=1, node_id=0,
34-
trial_division_level=2**20,
3534
gear_factorization_level=11,
3635
wheel_factorization_level=11,
36+
sieving_bound_multiplier=1.0,
3737
smoothness_bound_multiplier=1.0,
38-
batch_size_multiplier=512.0,
39-
batch_size_variance=2,
40-
ladder_multiple=5,
41-
skip_trial_division=False
38+
gaussian_elimination_row_multiplier=1.2,
39+
check_small_factors=False
4240
)
4341
```
4442

4543
The `find_a_factor()` function should return any nontrivial factor of `to_factor` (that is, any factor besides `1` or `to_factor`) if it exists. If a nontrivial factor does _not_ exist (i.e., the number to factor is prime), the function will return `1` or the original `to_factor`.
4644

4745
- `method` (default value: `PRIME_SOLVER`/`0`): `PRIME_SOLVER`/`0` will prove that a number is prime (by failing to find any factors with wheel and gear factorization). `FACTOR_FINDER`/`1` is optimized for the assumption that the number has at least two nontrivial factors.
48-
- `node_count` (default value: `1`): `FindAFactor` can perform factorization in a _distributed_ manner, across nodes, without network communication! When `node_count` is set higher than `1`, the search space for factors is segmented equally per node. If the number to factor is semiprime, and brute-force search is used instead of congruence of squares, for example, all nodes except the one that happens to contain the (unknown) prime factor less than the square root of `to_factor` will ultimately return `1`, while one node will find and return this factor. For best performance, every node involved in factorization should have roughly the same CPU throughput capacity.
46+
- `node_count` (default value: `1`): `FindAFactor` can perform factorization in a _distributed_ manner, across nodes, without network communication! When `node_count` is set higher than `1`, the search space for factors is segmented equally per node. If the number to factor is semiprime, and brute-force search is used instead of congruence of squares, for example, all nodes except the one that happens to contain the (unknown) prime factor less than the square root of `to_factor` will ultimately return `1`, while one node will find and return this factor. For best performance, every node involved in factorization should have roughly the same CPU throughput capacity. For `FACTOR_FINDER` mode, this splits the sieving range between nodes, but it does not actually coordinate Gaussian elimination rows between nodes.
4947
- `node_id` (default value: `0`): This is the identifier of this node, when performing distributed factorization with `node_count` higher than `1`. `node_id` values start at `0` and go as high as `(node_count - 1)`.
50-
- `trial_division_level` (default value: `2**20`): Trial division is carried out as a preliminary round for all primes up this number. If you need more primes for your smoothness bound, increase this level.
51-
- `gear_factorization_level` (default value: `11`): This is the value up to which "wheel (and gear) factorization" and trial division are used to check factors and optimize "brute force," in general. The default value of `11` includes all prime factors of `11` and below and works well in general, though significantly higher might be preferred in certain cases.
52-
- `wheel_factorization_level` (default value: `11`): "Wheel" vs. "gear" factorization balances two types of factorization wheel ("wheel" vs. "gear" design) that often work best when the "wheel" is only a few prime number levels lower than gear factorization. Optimized implementation for wheels is only available up to `13`. The primes above "wheel" level, up to "gear" level, are the primes used specifically for "gear" factorization.
53-
- `smoothness_bound_multiplier` (default value: `1.0`): starting with the first prime number after wheel factorization, the congruence of squares approach (with Quadratic Sieve) has a "smoothness bound" unit with as many distinct prime numbers as bits in the number to factor (for argument of `1.0` multiplier). To increase or decrease this number, consider it multiplied by the value of `smoothness_bound_multiplier`.
54-
- `batch_size_multiplier` (default value: `512.0`): For `FACTOR_FINDER`/`1` method, each `1.0` increment of the multiplier adds `k ln k` Markov mixing replacement steps for `k` count of smooth primes, before reseeding Monte Carlo.
55-
- `batch_size_variance` (default value: `2`): For `FACTOR_FINDER`/`1` method, `k ln k` is the right proportionality for a Markov mixing process, but a linear factor in front is hard to predict. As such, it can be useful to dynamically vary the batch size, as if to cover and amortize the cost of several different batch sizes at once. In sequence, each batch size will be multiplied by `2 ** i` for `i` in `range(batch_size_variance)`, repeating from `0`.
56-
- `ladder_multiple` (default value: `6`): Controls how many times randomly-selected square prime multiplication is repeated with the same square prime per random selection, in ascending a "ladder" of smooth perfect squares. A random number between `1` and `ladder_multiple` is selected for how many times the current smooth perfect square is multiplied each randomly selected square prime, while division still occurs 1 square prime multiple at a time. (Any smooth perfect square can be multiplied by any square prime in the factor base, or any other smooth perfect square, and produce a different smooth perfect square.)
57-
- `skip_trial_division` (default value: `False`): `False` performs initial-phase trial division, and `True` skips it. Note that `trial_division_level` still functions as the input to Sieve of Eratosthenes to collect a list of primes from which (`FACTOR_FINDER`) smooth primes can be selected. If `skip_trial_division=True`, simply set `trial_division_level` high enough to avoid truncation warnings, unless otherwise desired.
48+
- `gear_factorization_level` (default value: `1`): This is the value up to which "wheel (and gear) factorization" are applied to "brute force." A value of `11` includes all prime factors of `11` and below and works well for `PRIME_PROVER`, though significantly higher might be preferred in certain cases.
49+
- `wheel_factorization_level` (default value: `1`): "Wheel" vs. "gear" factorization balances two types of factorization wheel ("wheel" vs. "gear" design) that often work best when the "wheel" is only a few prime number levels lower than gear factorization. Optimized implementation for wheels is only available up to `13`. The primes above "wheel" level, up to "gear" level, are the primes used specifically for "gear" factorization. Wheel factorization is also applied to "peturb" `FACTOR_FINDER` mode into non-multiples on the wheel, if the level is set above `1` (which might not actually work, but we leave it for your experimentation).
50+
- `sieving_bound_multiplier` (default value: `1.0`): This controls the sieving bound and is calibrated such that it linearly multiplies the square root of the number to factor (for each `1.0` increment). While this might be a huge bound, remember that sieving termination is primarily controlled by when `gaussian_elimination_row_multiplier` is exactly satisfied.
51+
- `smoothness_bound_multiplier` (default value: `1.0`): This controls smoothness bound and is calibrated such that it linearliy multiplies `exp(0.5 * std::sqrt(log(N) * log(log(N))))` for `N` being the number to factor (for each `1.0` increment). This was a heuristic suggested by Elara (an OpenAI custom GPT).
52+
- `gaussian_elimination_row_multiplier` (default value: `1.2`): This controls the number of rows sieved for Gaussian elimination before terminating the sieve and is calibrated such that it linearly multiplies the number of smooth prime columns in the Gaussian elimination matrix (for each `1.0` increment). So long as this setting is appropriately low enough, `sieving_bound_multiplier` can be set basically arbitrarily high.
53+
- `check_small_factors` (default value: `False`): `True` performs initial-phase trial division up to the smoothness bound, and `False` skips it.
5854

5955
All variables defaults can also be controlled by environment variables:
6056
- `FINDAFACTOR_METHOD` (integer value)
6157
- `FINDAFACTOR_NODE_COUNT`
6258
- `FINDAFACTOR_NODE_ID`
63-
- `FINDAFACTOR_TRIAL_DIVISION_LEVEL`
6459
- `FINDAFACTOR_GEAR_FACTORIZATION_LEVEL`
6560
- `FINDAFACTOR_WHEEL_FACTORIZATION_LEVEL`
61+
- `FINDAFACTOR_SIEVING_BOUND_MULTIPLIER`
6662
- `FINDAFACTOR_SMOOTHNESS_BOUND_MULTIPLIER`
67-
- `FINDAFACTOR_BATCH_SIZE_MULTIPLIER`
68-
- `FINDAFACTOR_BATCH_SIZE_VARIANCE`
63+
- `FINDAFACTOR_GAUSSIAN_ELIMINATION_ROW_MULTIPLIER`
64+
- `FINDAFACTOR_CHECK_SMALL_FACTORS` (`True` if set at all, otherwise `False`)
6965

7066
## About
7167
This library was originally called ["Qimcifa"](https://github.com/vm6502q/qimcifa) and demonstrated a (Shor's-like) "quantum-inspired" algorithm for integer factoring. It has since been developed into a general factoring algorithm and tool.
7268

73-
Let's try to explain the algorithm of `FACTOR_FINDER` mode as briefly as possible. If you want background in Quadratic Sieve or GNFS, and what terms like "smooth numbers" and "smooth perfect squares" mean, you're likely going to have to do some background reading to have basis in Dixon/QS/GNFS-family factoring algorithms, which are regarded the fastest at cryptographically-relevant scales.
74-
75-
QS/GNFS have one _ultimate_ goal: factor via a congruence of squares. To serve this goal, they have one _penultimate_ goal: find a lot of smooth perfect squares. Anything that gets you more smooth perfect squares asymptotically faster is an improvement.
76-
77-
This is the kick in the head that no one's noticed since only the '80s: there is a (totally **non**-exotic) structure like a "ladder" (or "web," or "net") where basically all it takes is a single multiplication, or a single division, and some RNG, to successively produce all the (nearly unique and distinct) smooth perfect squares you could need, once you have any _single_ smooth perfect square: multiply (or divide) the single working smooth perfect square by any square of a smooth prime number.
78-
79-
- A positive integer `s` is a "perfect square" if some number `x` exists such that `s = x^2`.
80-
- A square prime is `p^2` for some number `p` that is prime.
81-
- This is the **kick in the head**: `s * p^2 = x^2 * p^2 = (x * p)^2 = s'`
82-
83-
It's that **painfully simple.** No sieving, no Gaussian elimination, no matrix storage and row operations, no memory requirements larger than a _single_ arbitrary precision "big integer" per thread, embarrassingly parallel, trivial to distribute, having no sequential dependence: **just apply RNG on this concept so it gets close to uniform coverage of the space**.
84-
8569
**Special thanks to OpenAI GPT "Elara," for indicated region of contributed code!**

0 commit comments

Comments
 (0)