Modular Mathematics, Discrete Logs

Discrete Logs

If n is prime, and a primitive element b has been found, and the primes dividing n-1 are relatively small, an efficient algorithm computes discrete logarithms base b.

If c = by for some unknown y, we can first ask whether y is even. If so, then c(n-1)/2 = 1. If y is odd then divide c by b, so that y becomes even, and add 1 to an accumulator that will, at the end of the day, equal y, which is the log of c.

If y is divisible by 4 than c(n-1)/4 = 1. If not then divide c by b2 and add 2 to the accumulator. Repeat until we have exhausted all the factors of 2 in n-1.

If p is a small odd prime dividing (n-1)/2k, we can do the same for p. If y is divisible by p then c(n-1)/2k/p = 1, and we can move on to p2, or the next odd prime. Otherwise, divide c by b2k, and add 2k to the accumulator. Within p iterations we will reach our goal.

Do this for all the prime powers dividing n-1. The log of c is computed in time proportional to the sum of the primes (including multiplicities) dividing n-1. Of course computation time also increases with the number of bits in n. I won't go into the details here; let's just say discrete logs can be computed quickly when the prime factors of n-1 are reasonably small.

There is nothing special about the integers mod n; the same approach works for arbitrary cyclic groups, assuming you can multiply an element by an integer and see if it becomes the group identity. Factor the order of the group and apply the foregoing procedure. For example, take logs mod nk, since the group of units is cyclic.

Public Key Cryptography

Alice and Bob agree on a large prime p, and a primitive root b. Alice generates a random number k, which is her private key, and computes bk mod p, which is her public key. Bob does the same.

Alice sees Bob's public key, and raises it to the power of her private key. Bob does the same. Both parties now have a number between 0 and p, which is b raised to the product of their private keys. They do not know each other's private keys, but they do know b raised to the k1k2. This is a secret that they share - their "shared" key.

The secret that Alice and Bob know, that is inaccessible to everyone else, is typically used to run other, simpler encryption schemes that require a shared key. (These are not described here.) If repeated use of the shared key might jeopardize its security, Alice and Bob can renogotiate a new key any time they wish. They select new random numbers, which act as private keys, post the corresponding public keys, combine their results to create a new shared key, and off they go.

In summary, the intractability of discrete logs can be used to jump-start a public key cryptography system, where any party can communicate securely with any other party, and/or authenticate his own communications (digital signature). The security rests on the difficulty of computing discrete logs and reverse engineering the private keys. Be sure to select p so that p-1 contains large prime factors, and is basically unfactorable. Thus private keys remain private, and secure communication is assured.

1 mod p inside pk

The group of units mod pk consists of two cycles in parallel, having orders pk-1 and p-1. Force the second cycle to 0 and find a subgroup of order pk-1. These are the units that are 1 mod p.

This is a cyclic group, and multiplication within the cycle is easily implemented by exponentiation mod pk, hence discrete logs are well defined. Given a generator b, the log of c base b is y, if by = c.

You may be wondering about p = 2. The multiplicative group of odd elements mod 2k is not cyclic, and logs don't make sense. However, the units that are 1 mod 4 do form a cyclic group. When this theorem is applied to 2k, b and c are assumed to be 1 mod 4.

The subgroups inside our cyclic group are the elements that are 1 mod pi, as i runs from 1 to k. This is a descending chain of subgroups, from the largest cycle of order pk-1 down to the element 1. An element is a generator iff it belongs to the largest cycle, iff it is 1 mod p and not 1 mod p2. It is often convenient to set b = 1+p.

In general, b could be 1 mod pi, and not 1 mod pi+1, whence b acts as a base for discrete logs in the group of units equal to 1 mod pi. In the formula that follows, c-1 has at least as many factors of p as b-1, and if p = 2, b is at least 1 mod 4.

If p is small, we can use the earlier procedure to derive discrete logs, but p may be large, perhaps 50 digits, so what do we do?

Strange as it may seem, the formula from calculus gives the right answer. At this point I'm going to change notation, regarding b and c. Let 1+bp be the base, and use the following formula to find log(1+cp) with respect to this base.

cp - (cp)2/2 + (cp)3/3 - (cp)4/4 + …
over
bp - (bp)2/2 + (bp)3/3 - (bp)4/4 + …

Let y be the discrete log, which is an integer between 0 and pk-1, not yet determined. As integers, (1+bp)y = 1+cp+jpk. Set d = c+jpk-1. Thus (1+bp)y = 1+dp. Calculus now tells us y = log(1+dp)/log(1+bp).

dp - (dp)2/2 + (dp)3/3 - (dp)4/4 + …
over
bp - (bp)2/2 + (bp)3/3 - (bp)4/4 + …

Each term in either infinite series is a rational number, which embeds in the reals using the traditional metric. The partial sums are also rational numbers, which approach a real number, since the series is convergent in a complete metric space. At the same time, each term, and each partial sum, is a p-adic number. Even a rational like 3/67 mod 7 is 3 over 4+2×7+49, which expands into a convergent series, which is a p-adic number.

At this point we need to do some bookkeeping on the terms of log(x), to show that the powers of p in the numerator overrun the powers of p in the denominator, whence the series converges in the p-adic numbers. Assume d has no powers of p - a worst case scenario. The first term, dp, has valuation 1. The nth term has n powers of p on top, and down below we find at most logp(n) factors of p. Clearly the valuation approaches infinity. Also, the valuation of each term is positive (we'll need this later).

In summary, log(1+dp) and log(1+bp) are well defined, as real numbers and as p-adic numbers. Their quotient is also well defined, and when viewed as a real number, it equals y.

The quotient of the limit of two sequences is the limit of the quotients, so take the limit of the quotients of the partial sums as n approaches infinity. Let this sequence be q1 q2 q3 etc. Each point in this sequence is a rational number, which can be viewed as a real number or a p-adic number. The limit, under the euclidean metric, is y. The limit, under the p-adic metric, exists.

Subtract y from the terms of the sequence and the limit becomes 0. As p-adic numbers, the limit is z, yet to be determined.

What would it look like if z also equals 0? The fractions in q have more and more powers of p in the numerator, and even larger denominators, devoid of p, to assure convergence in the reals. Something as simple as qn = pn/(p+1)n will do, so this seems plausible.

Suppose z is nonzero. Select an ε, and one can move out along the sequence q, such that qn is within ε of 0, and the first m terms of z are fixed. These first m terms sum to a rational number v. Decrease ε further, and the next approximation to z is v plus an integer. Any additional terms are powers of p with coefficients in [0,p-1]. If v is 234.3, for example, it will not be possible to shrink v down to 0.2. This is a contradiction, hence z = 0. The p-adic limit equals the real limit.

At this point we know that log(1+dp)/log(1+bp) equals y, as a p-adic number. Cancel p from both logs. Since each term has a positive valuation, this is no problem. The valuation of d is at least the valuation of b, and the same holds for their respective series.

Unlike the reals, the representation of a p-adic number as a series is unique. The two logs divide perfectly, leaving a finite linear combination of powers of p that is equal to y. We only care about y mod pk-1, since that is the order of the cyclic multiplicative group generated by 1+bp. Change each d back to c, and every term in the difference has valuation k-1 or higher. The quotient could change, but it is still equal to y mod pk-1. For our purposes, we can return to log(1+cp)/log(1+bp), and nothing has changed.

Next, turn each series into a finite sum by ignoring terms with valuation k-1 or higher. In most cases we can simply cut off the series at k-1 terms. However, if p divides k, we must include ckpk-1/k, as its valuation is below k-1. This is the formula for discrete logs, and it is quite manageable.

1 mod t inside m

A similar formula produces discrete logs base 1+bt mod m, provided m has the same prime factors as t, raised to higher powers. Verify that the units equal to 1 mod t form a cyclic group, and 1+bt is a generator iff b and t are coprime. If y is the log, reduce the problem mod pk, and y is still the log, or more accurately, y mod pk-1 becomes the log. Thus the formula above gives us the value of y mod pk-1. Do this for all primes dividing m and apply the chinese remainder theorem. The result has to be y. The best way to evaluate the formula for all the primes in parallel is to operate mod m. Use the largest exponent k, extant in the factorization of m/t, to turn each infinite series, above and below, into a finite sum.

If you are deriving many discrete logs relative to a fixed base, precompute log(1+bt). This becomes a constant in your formula.