Quaternions, The Quaternions mod p

The Quaternions Mod p

We would like to understand the quaternions mod p. This is a finite ring of order p⁴. Addition in the ring is straightforward, but what about multiplication? The units form a nonabelian group, but what kind of group? As usual, x is a unit iff its norm is a unit. In this case |X| is nonzero. Thus x is a unit or a zero divisor.

If a quaternion integer is a zero divisor mod p, it is divisible by one of the primes over p, but how many primes are there? In a commutative ring extension of the integers there would be at most four, but in the quaternions there could be millions.

While we're in the neighborhood, we may as well analyze the quaternions over an arbitrary finite field. Set the finite field to Z_p to produce the quaternions mod p. It isn't that much harder to solve the problem for finite fields, so why not? If you're not familiar with finite fields, you can pretend our base field F is always Z_p, the integers mod p.

This theorem begins with a lemma, which characterizes the sum of squares mod p. We'll slog through that, then analyze the quaternions mod p.

Three Types of Quadratic Extensions Within the Quaternions

Let F be a finite field of order w. Adjoining an indeterminant x creates F[x], a pid. Let p(x) be a quadratic polynomial that splits into two distinct factors x-a and x-b. Now x-a and x-b span 1, so by the chinese remainder theorem, F[x]/p is isomorphic to the direct product of F[x]/(x-a) and F[x]/(x-b). These rings are, in turn, isomorphic to F. Thus the units in F[x]/p are the cross product of F^* with F^*. Since the units of F form a cyclic group, the group of units in this extension is Z_w-1². This is a ring extension of type 1.

Next assume p has a double root, that is, p = x²-2bx+b². Associate x-b with y, and the ring is isomorphic to F[y]/y². Again F^* is a group of units of order w-1. For any a, and b nonzero, ay-b is a unit; multiply by ay+b to find something in F. So the order of the group is (w-1)w. These are relatively prime, so the w-1 cycle of F^* is joined by a group of order w. If each unit is written ay+b, set y to 0 to map to the quotient group F^*. The kernel, with b set to 1, is the additive group of F, which is simply p cycles in parallel. Thus the group is the product of cycles of order w-1 and order p. This is a ring extension of type 2.

Finally let p(x) be irreducible over F. This produces a field extension of order w², and the units form a cyclic group of order w²-1. This is a ring extension of type 3.

In each case F[x] is a ring of size w².

We could have adjoined any other element in this ring, outside of F, and obtained the same ring. Adjoin x+b, subtract b, and get x back again. Thus the elements that produce quadratic extensions can be gathered together into equivalence classes, where each class defines a commutative ring extension of F.

Every x in the quaternion extension of F, other than x ∈ F, defines a quadratic extension. So the quaternions comprise F and a number of ring extensions, each bringing in another w²-w elements. There are w⁴ elements in total, so there are w²+w+1 different ring extensions of F in the quaternions. These are all of type 1 2 or 3.

Distribution of the Sum of Squares

Let F be a finite field of order w. Consider ordered pairs a,b drawn from F. Let two ordered pairs be in the same equivalence class if one is a scale multiple of the other, and the scalar is a unit in F. The class 0,0 has size 1, and all other classes have size w-1.

If a²+b² = c, the other members of this class yield c times the various squares of F. When c = 0 the entire class produces 0, so assume c ≠ 0. If w is even, each unit has a unique square root, hence the class covers all the units of F evenly. In other words, the members of the class map 1-1 onto the units of F. When w is odd, each unit in F has two square roots or none. If the discrete log is even there are two square roots; if it is odd there are none. thus all the squares or all the nonsquares are hit exactly twice, depending on the residue class of c.

Assume w is even. Show that a²+b² = 0 iff a = b. Thus w different ordered pairs satisfy a²+b² = 0. Meanwhile, w classes (select any a), each of size w-1 (select any b ≠ a), spread their values evenly across the nonzero values of F. The distribution is perfectly flat.

Next assume w is odd. Suppose only squares are accessible via a²+b². That means 2 is a square, since it is the sum 1+1. By induction, every integer mod p is a square. Using similar reasoning, y+n is a square iff y is a square, for every integer n mod p. Squares and nonsquares group together in equivalence classes of size p. Yet we know there are (w-1)/2 nonsquares, corresponding to g^k, where g generates F and k is odd. This quantity cannot be divisible by p, so we have reached a contradiction. There are cases when a²+b² produces a nonsquare.

Since both squares and nonsquares are generated, every value of F is the image of a²+b² for some a and b. Eachclass (that does not lead to 0) coveres the squares, or the nonsquares, evenly; we need to show that nonsquares are generated as often as squares.

First assume -1 is not a square. So we don't have to worry about a²+b² = 0, except for a = b = 0.

Adjoin i to produce the unique field of order w². Let g be a generator for this larger field. Let complex conjugation, an automorphism, take g to g^k, where k is relatively prime to w²-1. Let the exponent e range from 1 to w²-1. Thus g^e covers all the units in the larger field - all the nontrivial values of a+bi.

Now a²+b² = g^e times its conjugate, or g^(k+1)e.

Let h = g^k+1. Since h is fixed by conjugation it is in F, and k+1 is some multiple c times w+1. In other words, the discrete log of h in the group F^* is c.

Let x be the gcd of c and w-1. If x is greater than 1, then all the powers of h have a discrete log divisible by x. This means some of the values of F are not accessible as powers of h, and cannot be represented as a²+b². Since all the values of F are accessible, this is a contradiction. Therefore c and w-1 are coprime.

As we raise h to various powers, from 1 to w-1, all the values in F^* are hit exactly once. Let the exponent run all the way up to w²-1, and each value is hit w+1 times. The distribution of a²+b² is even across F^*, and 0 is produced only once, when a = b = 0.

Next let -1 be a square in F. Thus the classes represented by 1,i and 1,-i yield 0. Since p is odd, i cannot equal -i; these are different classes. And any a²+b² that equals 0 belongs to one of these two classes. Of course 0,0 produces 0. Thus 0 is created in 2w-1 different ways.

Build a map from a,b to a+bi,b. Verify that this function is 1-1, and since it maps a finite set into itself, it is onto. If we consider all the pairs a+bi,b we are considering all the pairs a,b.

Evaluate the sum of squares and get a²+2abi. Fix a nonzero value of a. As b runs through its paces, a²+2abi covers all of F. And this holds for each nonzero value of a. Thus each value of F is hit w-1 times. Set a = 0 and 0 is hit another w times. Therefore, 0 is hit 2w-1 times, and each value in F^* is hit w-1 times.

Scaling one of the Squares

For m nonzero, the distribution of a²+mb² is even across F^*. This is a generalization of the above.

If m is a square we can simply scale b by sqrt(m), and apply the previous lemma. We only need consider m a nonsquare. This means w is odd.

As before, each class of values hits the squares or the nonsquares twice, or else a² + mb² = 0. Set a = 1 and b = 0 to produce a square - set b = 1 and a = 0 to produce a nonsquare. So squares and nonsquares are hit. Are they both hit equally?

First assume -m is not a square, and adjoin y = sqrt(-m) to produce the unique field of order w². Now we don't have to worry about a²+mb² = 0, except for a = b = 0. As before, g^e covers all the units in the field extension, all the nonzero pairs of a and b. And a² + mb² is g^e times its conjugate, or g^(k+1)e. The earlier argument demonstrates an even distribution, hence each unit in the base field F is generated w+1 times.

Next let -m be a square in F, so that y² = -m. Build a 1-1 correspondence as above, mapping a,b to a+yb,b. Fix some nonzero value of a and run b through its paces, generating a²+2aby, which is all of F. Set a = 0 to find another w instances of 0. As before, 0 is produced 2w-1 times, while every unit appears w-1 times.

By symmetry, the distribution of ma²+b² is the same as the distribution of a²+mb².

Counting Quaternion Units and Zero Divisors

Let F be a finite field of order w. We already showed that its quaternion extension consists of units and zero divisors. If we know how many units we know how many zero divisors, and vice versa. Let's count the zero divisors, to determine the number of units, and hence the size of the quaternion multiplicative group.

Recall that x is a unit iff its norm is a unit, iff its norm is nonzero in F. Thus x is a zero divisor iff its norm is 0.

Assume w is even. Whatever the value of a²+b², there are w combinations of c²+d² that yield the same value. That's w³ quaternions with norm 0. With w³ zero divisors, we are left with w³(w-1) units. This is confirmed when w = 2, 8 units and 8 zero divisors.

Next assume w is odd. Choose the first two coefficients randomly, and assume their squares sum to c. How many ways can we choose the last two coefficients so that their squares sum to -c?

Assume -1 is not a square in F. If c is 0 then the first two coefficients are 0, as are the last two. If c is nonzero there are w+1 combinations for the last two. Thus we have 1 + (w²-1)*(w+1).

When -1 is a square, c comes up nonzero (w-1)² times, and each time we generate -c w-1 times. To this we add (2w-1)². For any odd w, there are w³+w²-w zero divisors, and (w-1)²*w*(w+1) units.

The Size of the Kernel

Remember that the size of the kernel (which I will call K) is the size of the quaternion group divided by the size of the image under the norm homomorphism. Every square in F is covered, and we showed above that some a+bi has a nonsquare for a norm, hence the nonsquares are also covered. The order of K is (w-1)×w×(w+1).

Quaternions Over Z₂

In characteristic 2, the quaternions form a commutative ring. Thus the group of units is abelian.

There are 8 units, when an odd number of components are set to 1. These all have norm 1, which is the only unit in Z₂.

The square of any unit is 1. Thus the group of units is Z₂³.

Quaternion Units Over An Even Finite Field

If the order of the finite field F is w, and w is even, the order of the group of units is w³(w-1).

The norm is a group homomorphism into F^*. Since everything in F has a square root in F, the norm maps onto F^*. Thus the group of units has a homomorphic image that is Z_w-1. This is only meaningful when w > 2, but we already analyzed w = 2 above.

Since the group is abelian, with quotient group Z_w-1, all we need do is characterize the kernel K, which has order w³.

Verify that x times x = x times its conjugate, which is the norm of x. Thus x² = 1 for every x in K. Everything in K has order 2, and K is the direct product of 3log₂(w) copyies of Z₂. Join this with Z_w-1 and you have the entire group of units.

Elements of Order p

Let x = a+bi+cj+dk, and assume x has order p. Since a commutes with bi+cj+dk, it is ok to use the frobenius homomorphism. Thus x^p = a^p + (bi+cj+dk)^p. The square of the second factor is -b²-c²-d², an element in F. Raise this to the (p-1)/2 power to get another element in F. Finally multiply by bi+cj+dk. If x^p = 1, then this piece is suppose to be 0. That can only happen if b²+c²+d² = 0. Finally, a^p = 1, whence a = 1.

Conversely, let b²+c²+d² = 0, and a = 1. Expand this as above to get 1. We can tell at a glance whether x has order p.

How many different elements have order p? How many triples have b²+c²+d² = 0? We need to run separate calculations for w = 1 or 3 mod 4.

Let -1 be a square in F. Set b = 0, and c and d can be chosen in 2w-1 different ways. When b is nonzero we need a sum of squares that equals -b², and this can be done in w-1 different ways. That's 2w-1 + (w-1)(w-1) = w².

Next assume -1 is not a square in F. When b = 0, c = d = 0. When b is nonzero, there are w+1 choices for c and d. That's 1 + (w-1)(w+1), or w² possibilities. Either way there are w² elements of order p. Actually we must subtract one, since 1 fits our criteria, yet it has order 1, rather than order p. So there are w²-1 elements of order p.

p-Sylow Subgroups

By macay's theorem, let x have order p. The ring extension F[x] has to be an extension of type 2, with units isomorphic to Z_w-1 cross the additive group of F. There are w-1 units of order p in this group. And this holds for every p-sylow subgroup.

Let g generate one of these sylow subgroups. Let g have norm z in F, and raise g to the p, again and again, until you reach g^w. This raises z to the p, again and again, and yields 1. Yet only 1, raised to the p, yields 1 in a group of order w-1. Therefore the norm of g is 1. All the sylow subgroups of order w live in K.

As you recall, the quadratic ring extensions are disjoint, except for F. If two sylow subgroups intersect in anything other than 1, adjoin that common element, and both sylow subgroups live in F[x], hence they are the same. Therefore sylow subgroups are disjoint.

Each sylow subgroup contains w-1 elements of order p, and we showed above that there are w²-1 elements of order p, hence there are w+1 sylow subgroups. Use the third sylow theorem as a sanity check. The number of sylow subgroups, w+1, is 1 mod p, and is a factor of w²-1, which is the index of the sylow subgroup within K.

Multiply 1+uy by 1+vy, and throw away y², since that equals 0. The result is 1+(u+v)y. Thus the sylow subgroup associated with 1+y comprises 1 + all the scale multiples of y. This confirms what we saw earlier; each sylow subgroup looks like addition in F.

Counting Field Extensions

We've counted the ring extensions of type 2, how about the ring extensions of type 3, i.e. the field extensions? Each extension inccludes two square roots of v, which is a nonsquare in F. How many ways can x be the square root of v?

If x² lies in F then it has no constant term, and the value of x² is -b²-c²-d². How many ways can b²+c²+d² = -v?

Let -1 be a square, hence -v remains a nonsquare, and -v-b² is never 0. Select b in w different ways, and c²+d² in w-1 different ways. That's w²-w combinations. Two of these group together in each field extension, so there are (w²-w)/2 extensions of type 3.

If -1 is not a square then -v-b² could equal 0 in two different ways. That's 2 + (w-2)(w+1), or w²-w. Again, (w²-w)/2 field extensions.

There are w+1 extensions of type 2, and w²+w+1 extensions all together. That leaves (w²+w)/2 extensions of type 1.

Semidirect Product

When w is even, the quaternions are a direct product of kernel and quotient under the norm homomorphism. When p is odd, we have a semidirect product. Build a reverse homomorphism that takes F^* back into the quaternions. If g generates F, we are looking for the reverse image of g.

Earlier we showed that a²+b² covers squares and nonsquares. The same holds for a²-b², using essentially the same proof. If only squares are covered, and y is a square, then so is y-1, and y-2, and so on, and the reasoning flows from there. So there is a difference of squares equal to a nonsquare, and when you scale a and b appropriately, you have a²-b² = g.

Find three squares that sum to -b². Since two squares can sum to anything, we have no trouble doing this. Make these three numbers the coefficients on i j and k, and call this y.

Show that y² = b², and |a+y| = a²-b² = g. If a+y has order w-1, we're done.

Since |a+y| is a unit, a+y is a unit. Raise this to the w, then divide by a+y to get (a+y)^w-1. Raise a+y to the p using the frobenius homomorphism, and get a^p+y^p. Do this again and again and get a^w+y^w. Replace a^w with a. Since y² is a square, y^w-1 = 1. Thus y^w = y. We get a+y back again, and when you divide by a+y, you get 1. That completes the reverse homomorphism, and the quaternions are split exact - a semidirect product.

If this is a direct product then a+y commutes with everything, which means y commutes with everything, which means iy = yi. If y contains cj then we have ck = -ck, hence c = 0. Similarly, the coefficient on k is 0. Thus y is a multiple of i, and then it doesn't commute with j. It is indeed a semidirect product.

Odd Sylow Subgroups Inside K

Recall that the p-sylow subgroups all live in K. Let's look for more sylow subgroups in K. Let q be an odd prime that divides w+1, hence it does not divide w-1. Map an element of order q through the norm homomorphism, into a group of size w-1, and the norm must equal 1. The q-sylow subgroups all live in K. Each belongs to an extension of type 3, and each is cyclic, even if we're talking about q raised to some power. If g generates the quadratic field extension, then the sylow subgroup is generated by g to the (w²-1)/q, or perhaps a power of q if appropriate. This is part of a larger cycle of order w+1 that lives in K. We already counted the number of field extensions of type 3, hence there are (w²-w)/2 q-sylow subgroups. Reduce this mod w+1 and get -1 times -2/2, or 1. The count is 1 mod q, as it should be.

The last possibility is q, or a power of q, dividing w-1. Let z be an element with this order and place z in an extension of type 2. Remember that these units map onto the units of F, with the p cycles living in the kernel. It's a direct product, and |z| also has order q in F. The norm is not 1, and z is not in K.

Place z in an extension of type 3. A generator g, raised to the w-1, again and again and again, produces the elements with norm 1. So a multiple of w-1 = (w²-1)/q. Some integer times q = w+1. But q does not divide w+1, so this is impossible.

Finally, z lives in an extension of type 1. We will show this extension is cyclic when restricted to K. Adjoin y, where y² = m² in F. By the chinese remainder theorem, the extension is isomorphic to F[s]/(s-m) cross F[t]/(t+m). The isomorphism is realized by reducing a+by mod y±m. The image is a±bm. These are the variables I call s and t. The norm of a+by is a+by times a-by, or a²-m²b². This also happens to be s times t. We want this to be 1. Let s step through the units of F in multiplicative order. Let t be the inverse of s, stepping through F in the reverse order. This is a cyclic group in K. The q-sylow subgroup lives in this cyclic group, and is cyclic. There are (w²+w)/2 such extensions, and hence this many sylow subgroups. This is 1 mod w-1, and 1 mod q, as it should be.

K mod its Center is Simple

Two sections from now, I will build an isomorphism between the quaternions and the 2 by 2 matrices. In fact, quaternions with norm 1 map to matrices with determinant 1. Mod out by the center ±1, and that matrix group is simple. So you get, as a corrolary, that the quaternions with norm 1, mod ±1, is a simple group. (An exception is made for p = 3.) You can read about matrix groups here.

Units mod p²

Consider the quaternions mod p². Once again x is a unit iff |x| is a unit. Downwstairs, the norm is a unit iff it is a unit mod p. Thus the ring homomorphism that reduces the quaternions mod p induces a group homomorphism from units onto units. The quotient group is the group of units that we analyzed above. To find the kernel, consider tuples that are 1 plus a quaternion multiple of p. The product of two such elements adds the coefficients on p. Thus the kernel is Z_p⁴.

Is the group a semidirect product? Let's try to build the reverse homomorphism that proves the sequence is split exact. Expand (-1+sp)², throwing out s²p², and s must equal 0. (This assumes p is odd.) Thus the preimage of -1 is -1.

Expand (j+sp)². The real and j components of s must be 0. The i and k components could be any multiple of p. Similarly, the preimage of k is k+tp, where t has i and j components. Now consider (j+sp)*(k+tp). The i component of s survives as j, and that can't happen. Similarly the i component of t survives as k. These have to be 0. Do the same for i*k and j*k, and the preimage of Q₈ is Q₈.

Let v be primitive mod p and evaluate (v+up)^p-1. Since v and up commute, you can use the binomial theorem. Solve for u and get u = (v^p-1-1)/p*v; In the reals this would be enough for a semidirect product, actually a direct product, but we have i j and k to worry about.

How about (1+j)². This is a group homomorphism, not a ring homomorphism, so the preimage isn't necessarily 1+j. Throw in an sp and expand. The result has to be 2 (with its extra real multiple of p), times j. The i and k components of s are 0. The real and j components are equal. Their sum is the piece that is brought in by 2. Call the preimage 1+j+rp+rpj, where 2rp goes with 2. A similar analysis shows the preimage of 1-j is 1-j+rp-rpj. Evaluate (1+j)*(1-j), and the extra piece is 4rp. yet it should be 2rp. Therefore r is 0. The extra piece brought in by 2 is 0. Raise 2 to its order and get 1 mod p². However, 4 is not 1 mod 9, and 16 is not 1 mod 25, and 8 is not 1 mod 49, and 1024 is not 1 mod 121, and so on. Even if there is a prime where this works out, something else will probably break down. I'm going to stop here and just say that its not a semidirect product.

The quaternions mod p³ have the quaternions mod p² as a homomorphic image, and so on through the higher powers of p.

with this in hand, the quaternions mod n can be analyzed by the chinese remainder theorem.

Primes over Primes

Let p be an odd prime. There are p³+p²-p zero divisors in the quaternions mod p. If z is such a divisor then zy becomes a quaternion divisible by p. This means one of the quaternion primes lying over p divides z. Another zero divisor will have another prime lying over p, and so on.

How many primes lie over p anyways? I'm going to make a really gross assumption, that the zero divisors are scattered uniformly throughout the hypercube of volume p⁴. The true primes are now the zero divisors that are close to one of the 16 corners. The norm is p, and the distance is sqrt(p).

The volume of a 4 dimensional sphere of radius sqrt(p) is basically 5p². Divide this by p⁴, then multiply by the number of zero divisors, and there are approximately 5p primes lying over p. In practice it equals 8(p+1). Why the discrepancy? Because we can't have primes too close to the origin. the norm must be exactly p, not ≤ p, so a volume argument is a bit bogus.

Why 8(p+1)? Quaternions of norm p naturally clump together in sets of size 8, since there are 8 units. Let x represent such a set of 8. Now x generates a left ideal whose elements all have norms divisible by p. Conversely a left ideal generated by some g with |g| divisible by p can also be generated by some x with |x| = p. Run the gcd algorithm on g and p to find x. And if x and y are in the same left ideal then their ratio is a unit, and they belong to a set of 8. So there is a correspondence between elements with nor p and their left ideals. Finally count left ideals by switching to the 2 by 2 matrices. I know, you haven't seen that yet, but it is coming up. The quaternions mod p are the same as the 2 by 2 matrices mod p, and there are p+1 left ideals in the latter ring. That completes the proof.

Supported by a computer program.

This function is multiplicative. If n = p*q then a quaternion x with norm n is not irreducible, as shown earlier. Write x as a quaternion p₁ times a quaternion q₁. Then let g be the gcd of x and p. Now g is a multiple of p₁, and its norm is a multiple of p. The norm can't be any larger, hence g is another quaternion over p. The factor that takes you from p₁ to g is a unit, hence g is simply an associate of p₁. This implies q₁. The separation is unique, and x is uniquely a product of a quaternion with norm p by a quaternion with norm q. From here it is simply combinatorics. Thus there are 768 quaternions with norm 77, and 4608 quaternions with norm 385.

Be careful of p². A prime and its conjugate commute, so the formula is not simply 8*(p+1)². In fact it seems to be 8*(p+1)*p. Assume x, other than p, has norm p², hence x and its associates generate a left ideal, which becomes a left ideal in the quotient ring mod p². This left ideal has index p⁴ in the ring. (This is the volume of the hypercube spanned by x.) Conversely, let H be a left ideal with index p⁴. It is principal, hence generated by g, where |g| = p². If y, having the same norm, is in H then y is merely an associate of g. Once again we are counting left ideals in the ring of 2 by 2 matrices mod p², having index p⁴, or size p⁴, and not counting the ideal where all four entries are multiples of p. The right column could be l times the left, for any scaling factor l from 0 to p²-1, or the left column could be l times the right, where l is a multiple of p. That's p²+p ideals, giving the formula 8p(p+1). You get the same formula through combinatorics; start with a prime over p and multiply by any prime other than its conjugate. This proves primes over p do not commute unless they are conjugate.

Continuing on, p³ yields 8*(p+1)*(p²+1). I haven't analyzed this one, or the higher powers of p.