Skip to content

Latest commit

 

History

History
261 lines (166 loc) · 9.2 KB

File metadata and controls

261 lines (166 loc) · 9.2 KB

Index-calculus algorithm

We are trying to find log_a(b) in some cyclic group G of order n generated by a (means finding x for which a^x = b).

We choose a subset S = {p_1, p_2,...,p_t) of G (named factor base), such that most/many of the elements in G can be efficiently expressed as a product of elements from S.

We compute logarithms of elements in S (that means log_a(p_i) for all p_i) using the following method. We select random integers k from {0, 1,..., n-1} until a^k can be written as a product of elements in S:

a^k = p_1^c_1 * p_2^c_2 * ... * p_t^c_t

We know:

k = c_1 * log_a(p_1) + c_2 * log_a(p_2) + ... + c_t * log_a(p_t) (mod n)

This is a linear equation where log_a(p_i) are unknown. We obtain enough such equations to be able to solve the system of equations.

We now select random integers k until a^k * b can be written as a product of elements from S.

a^k * b = p_1^d_1 * p_2^d_2 * ... * p_t^d_t

It holds:

k + log_a(b) = d_1 * log_a(p_1) + ... +  d_t * log_a(p_t) (mod n)

from where we obtain log_a(b).

Pohlig-Hellman algorithm

Pohlig-Hellman in multiplicative groups Z_p

Pohlig-Hellman algorithm computes discrete logarithm in a group G by computing discrete logarithm in subgroups of a group G. If we want to compute x from y:

y = g^l (mod p)

we can do it by computing discrete logarithms in subgroups of order p_1, p_2, ..., p_r where n = p_1^e_1 * p_2^e_2 * ... * p_r^e_r and n is order of g.

Now we define l_i = l mod p_i^e_i for each 1 <= i <= r. It holds:

  • l = l_1 (mod p_1^e_1)
  • l = l_2 (mod p_2^e_2)
  • ...
  • l = l_r (mod p_r^e_r)

We know that factors p_i^e_i are pairwise coprime, so by Chinese Remainder Theorem we know that there exists a solution l to the equations above (and all solutions are congruent modulo n.

Let's define g0 = g^(n/p_1) and y0 = y^(n/p_1). It holds y0 = g0^l. The order of g0 is p_1.

Let's observe also that g0^l = g0^l_1 (see CRT in number_theory.md).

g0^l = g0^(l_1 * M_1 * N_1 + ... + l_r * M_r * N_r) 

where N_i = n / p_i^e_i and M_i * N_i + m_i * n_i = 1. We can see that l_i * M_i * N_i for i > 1 contains p_1 (order of g0) and thus:

g0^l = g0^(l_1 * M_1 * N_1) 

We also know M_1 * N_1 + m_1 * n_1 = 1 and thus:

g0^(l_1 * M_1 * N_1) = (g0^(1 - m_1 * n_1)^l_1 = (g0/g0^(m_1 * n_1))^l_1 = g0^l_1

because g0^(m_1 * n_1) = 1 as n_1 is p_1^e_1 in our case and p_1 is order of g0.

Let's write now l_1 in base p_1:

l_1 = z_0 + z_1*p_1 + z_2*p_1^2 + ... + z_(e1-1)*p_1^(e1-1) where z_i are from {0,1,2,...,p_1-1}

We now have:

g0^l = g0^l_1 = g0^(z_0 + z_1*p_1 + z_2*p_1^2 + ... + z_(e1-1)*p_1^(e1-1)) = g0^z_0

This is now a dlog in a group of order p_1 (because we know the value of g0^l). If we are able to solve this dlog, we have z_0. Then we calculate other z_i analogously. For z_1 we define g0 = g/p_1^2 and we get g0^l_1 = g0^(z_0 + z_1*p_1), so g0^(l_1-z_0) = (g0^p_1)^z_1` which is a dlog in a group of order p_1 again ...

Pohlig-Hellman in elliptic curve groups

It is pretty much the same as in the section above. We have points P and Q on elliptic curve and it holds l * P = Q (we are solving elliptic curve discrete logarithm problem - ECDLP).

Pohlig-Hellman reduces ECDLP in <P> to ECDLP in the prime order subgroups of <P>. It constructs l using <P> subgroups.

Let us say that the order of P is n and n = p_1^e_1 * p_2^e_2 * ... * p_r^e_r. Now we define l_i = l mod p_i^e_i for each 1 <= i <= r. It holds:

  • l = l_1 (mod p_1^e_1)
  • l = l_2 (mod p_2^e_2)
  • ...
  • l = l_r (mod p_r^e_r)

We know that l < n (n is order of P, thus n*P = 0; we get all multiples of P (also Q) by multiplying P with elements from {0, 1, 2,..., n-1}). We also know that factors p_i^e_i are pairwise coprime. By Chinese Remainder Theorem we know that there exists a unique solution to the equations above inside {0,1,2,...,n-1} (there are other solutions which are congruent modulo n), thus this solution is l. We will use this fact at the end of the algorithm to construct l from l_i.

Now we will calculate l_1 (for other l_i we use the same approach). We write l_1 in base-p_1 presentation:

l_1 = z_0 + z_1*p_1 + z_2*p_1^2 + ... + z_(e1-1)*p_1^(e1-1) where z_i are from {0,1,2,...,p_1-1}

Now, we calculate z_i one at a time.

We define P_0 = (n/p_1)*P and Q_0 = (n/p_1)*Q.

The order of P_0 is p_1 (because p_1*(n/p_1)*P = n*P=0).

Q_0 = (n/p_1)*Q = (n/p_1)*l*P = l*(n/p_1)*P = l*P_0

And further (because p_1 * P_0 = 0):

l*P_0 = (z_0 + z_1*p_1 + z_2*p_1^2 + ... + z_(e1-1)*p_1^(e1-1)) * P_0 = z_0 * P_0

Thus: Q_0 = z_0 * P_0

This means if we are able to break ECDLP in <P_0>, we partially break ECDLP in <P>. We continue to extract other parts.

We define P_1 = (n/(p_1^2))*P and Q_1 = (n/(p_1^2))*Q.

The order of P_1 is p_1^2.

Q_1 = (n/(p_1^2))*Q = (n/(p_1^2))*l*P = l*(n/(p_1^2))*P = l*P_1

And further (because p_1^2 * P_1 = 0):

l*P_1 = (z_0 + z_1*p_1 + z_2*p_1^2 + ... + z_(e1-1)*p_1^(e1-1)) * P_1 = (z_0 + z_1*p_1) * P_1

Thus, when we solve Q_1 = x * P_1, we can obtain z_1 as:

x = z_0 + z_1*p_1
z_1 = (x - z_0)/p_1

Similarly, we compute all coefficients z and we have l_1. We repeat the process for l_2, l_3,..., l_r. Once we have all l_i, we can construct l using Chinese Remainder Theorem. See scripts/pohlig-hellman.py.

Pollard's rho attack

Once again we have points P and Q on elliptic curve and it holds l * P = Q (we are solving ECDLP).

Let us define n as order of P.

If we have f:<P> -> <P>, we can define a finite set: {X_i; i>0} where X_i = f(X_(i-1)) and X_0 from <P> (simply put: {X_0, f(X_0), f(f(X_0)), f(f(f(X_0))),...})

This set is finite because <P> is finite. So there exist t and s such that X_t = X_(t+s). It holds X_i = X_(i-s) for all i >= t+s.

We can find a pair X_i, X_j where X_i = X_j and i != j by searching pairs X_i, X_(2*i). We know that in the interval [t, t+s) there is a multiple of s. Let us say t <= u*s < t+s. Now we know X_(u*s) = X_(u*s + u*s) = X_(2*u*s).

This is because X_m = X_(m + u*s) for m >= t and each integer u.

Let us construct f now. We split <P> into partitions {S_1, S_2,...,S_32} of roughly the same size (for example according to the five least significant bits of x-coordinate of the point - the point goes into S_j if these five bits represents integer j-1).

We define function H: H(X) = j if X from S_j. We define some a_j, b_j from [0, n-1] for 1 <= j <= 32.

Function f is defined as: f(X) = X + a_j * P + b_j * Q where j = H(X)

Function f is random due to how function H is defined. Randomness is needed for performance reasons (to diminish the probability that we end up having some patological f which makes finding the desired pair slow).

So, in two steps: X -> H(X) -> X + a_(H(X)) * P + b_(H(X)) * Q

Is f(X) in <P>? Yes, because X_0 = u*P for some integer u. And:

f(u*P) = u*P + a_j * u * P + b_j * l * P = (u + a_j * u + b_j * l) * P

Thus, f(X_0) = f(u*P) is from <P>. Similarly, f(f(X_0)) is from <P>...

We can take P as X_0 and observe {P, f(P), f(f(P)),...}. But we do a little trick and observe elements of this set as being written in coordinates (a, b):

X_i = a * P + b * Q

The first element P is (1, 0): P = 1 * P + 0 * Q. We define two pointers: tortoise and hare. The tortoise will move one step at a time, hare will move two steps at a time. At the beginning both point to P. We make function f to receive coordinates and return the resulting point and its coordinates.

tortoise = P
hare = P
tortoise_coordinates = (1,0)
hare_coordinates = (1,0)
while True:
    tortoise, tortoise_coordinates = f(tortoise_coordinates)
    hare, hare_coordinates = f(hare_coordinates)
    hare, hare_coordinates = f(hare_coordinates)
    if tortoise == hare:
	break	

When tortoise == hare, we find X_i = X_(2i). Note that tortoise is most probably not the first element that gets repeated, so in most cases it will hold also X_(i-1) = X_(2i-1). However, each X_i can be expressed in many different ways using coordinates, for example:

X_i = a * P + b * Q
X_i = a * P + b * l * P # the same as above
X_i = (a - 1) * P + (b * l + 1) * P # an example how it can be expressed using different coordinates than (a, b)

So in most cases we will have X_i and X_(2*i) expressed with different coordinates:

X_i = a1 * P + b1 * Q
X_(2*i) = a2 * P + b2 * Q

and

a1 * P + b1 * Q = a2 * P + b2 * Q
(a1 - a2) * P = (b2 - b1) * Q
(a1 - a2) * inverse_mod(b2 - b1, n) * P = Q

Thus, l = (a1 - a2) * inverse_mod(b2 - b1, n). See bellow the output of pollard_rho.py script for points and their coefficients:

(155 : 63 : 1) (30, 45)
(132 : 86 : 1) (6, 72)
(20 : 222 : 1) (178, 29)
(125 : 100 : 1) (111, 225)
(40 : 45 : 1) (140, 31)
(176 : 184 : 1) (73, 227)
(57 : 124 : 1) (6, 184)
(116 : 119 : 1) (35, 229)
(115 : 4 : 1) (207, 186)
(170 : 83 : 1) (183, 213)
(147 : 22 : 1) (66, 105)
(170 : 146 : 1) (42, 132)
----
(9 : 211 : 1) (164, 24)
(170 : 83 : 1) (193, 69)
(147 : 22 : 1) (76, 200)
(170 : 146 : 1) (52, 227)
(9 : 211 : 1) (174, 119)
(170 : 83 : 1) (203, 164)
(147 : 22 : 1) (86, 56)
(170 : 146 : 1) (62, 83)
(9 : 211 : 1) (184, 214)
(170 : 83 : 1) (213, 20)
(147 : 22 : 1) (96, 151)
(170 : 146 : 1) (72, 178)