Jordan Forms, Nilpotent Transformation

Nilpotent Transformation

A transformation is nilpotent, with index p, if applying the transformation p times squashes the entire vector space down to 0, and p is the smallest positive integer with this property. If the transformation is implemented as a matrix M, then M^p = 0 and M^(p-1) ≠ 0.
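As a small illustration of the definition, the index can be found by raising M to successive powers until the zero matrix appears. This is only a sketch in plain Python; the helper names mat_mul and nilpotency_index, and the sample matrix, are assumptions made for this example and do not come from the text.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(m):
    """Return the least p >= 1 with m^p == 0, or None if no power up to n is 0.

    An n x n nilpotent matrix has index at most n, so n powers suffice.
    """
    n = len(m)
    zero = [[0] * n for _ in range(n)]
    power = m
    for p in range(1, n + 1):
        if power == zero:
            return p
        power = mat_mul(power, m)
    return None


# Hypothetical example: a strictly lower triangular matrix.
M = [[0, 0, 0],
     [1, 0, 0],
     [2, 3, 0]]
print(nilpotency_index(M))  # prints 3
```

The zero matrix returns 1, matching the trivial case, and a matrix such as the identity returns None because it is not nilpotent.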

If a nilpotent transformation M has index p, some vector z satisfies M^(p-1)z ≠ 0. This is a consequence of p being minimal; otherwise M^(p-1) would already be 0.

Build a set of vectors z, Mz, M²z, M³z, and so on up to M^(p-1)z. We will show that these p vectors are independent, and hence span a space of dimension p.

Suppose a linear combination of these vectors produces 0. Multiply through by M^(p-1) on the left. Every term drops to 0 except the first, which becomes c*M^(p-1)z, where c is the coefficient on z. This must still equal 0, and M^(p-1)z is nonzero, hence c = 0. Remove that term from the combination, and multiply through by M^(p-2). This leaves only the second term, and its coefficient must be 0. Continue this process until all coefficients are 0. Thus the p vectors are linearly independent.

Since M moves each of these vectors onto the next one, and the last one onto 0, the space spanned by these vectors is invariant under M.
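The chain z, Mz, M²z, ... and its two properties, independence and invariance, can be checked numerically. The sketch below uses exact rational arithmetic from the standard library; the helper names (mat_vec, rank, chain) and the sample M and z are assumptions chosen for illustration.

```python
from fractions import Fraction


def mat_vec(m, v):
    """Apply the matrix m (a list of rows) to the column vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def chain(m, z):
    """Return [z, Mz, M^2 z, ...], stopping before the first zero vector.

    Assumes m is nilpotent; the length cap guards against other input.
    """
    vectors = []
    v = list(z)
    while any(v) and len(vectors) < len(z):
        vectors.append(v)
        v = mat_vec(m, v)
    return vectors


# Hypothetical example: M has index 3 and z is not killed by M^2.
M = [[0, 0, 0],
     [1, 0, 0],
     [2, 3, 0]]
z = [1, 0, 0]
vectors = chain(M, z)

# Independence: the rank equals the number of vectors in the chain.
print(len(vectors), rank(vectors))  # prints 3 3

# Invariance: adding M*v for any chain vector v does not raise the rank.
print(all(rank(vectors + [mat_vec(M, v)]) == rank(vectors) for v in vectors))  # prints True
```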

Set M = 0 for the trivial case, a nilpotent map with index 1. Since Mz = 0, any nonzero z will serve, spanning a space of dimension 1.

Complementary Space Exists

With M and z as above, the repeated images of z span a space of dimension p. This space always has a complementary subspace that is also invariant under M. The proof is a bit tedious, so hang on.

Let W be a vector space, and let M be a nilpotent transformation on W with index p. Select a vector z such that M^(p-1)z is nonzero. Recall that Mⁱz, as i runs from 0 to p-1, forms a set of p independent vectors.

Let the subspace N_i be spanned by M^jz, for all j ≥ i. This is a descending chain of spaces, where N_i has dimension p-i, and N_p = 0.

Let T_i be Mⁱ*W. This is another descending chain of spaces, with T₀ = W, and T_p = 0. Note that T_i contains N_i.

For each i we seek a subspace V_i of T_i such that V_i and N_i are disjoint (they intersect only in 0), their direct sum V_i ⊕ N_i is T_i, and V_i is invariant under M. (We already know N_i is M invariant.)

When i = p, V_p exists, and is equal to 0. Since M has index p, M^p maps all of W into 0, thus T_p = 0. With N_p and V_p = 0, we have N_p ⊕ V_p = T_p. Of course N_p and V_p are M invariant. Therefore V_p satisfies our criteria.

Take one step backwards, setting i = p-1. Now N_i is the one dimensional space spanned by y, where y = Mⁱz. Build a basis for T_i, starting with y, then let V_i be the space spanned by the basis vectors of T_i other than y. Clearly V_i and y are linearly independent, and together they span T_i. Also, V_i and y are invariant under M; in fact they are squashed down to 0. Thus V_i meets our criteria.

If p = 1 we are done. For p > 1, take the inductive step, working backwards from p-1 down to 0. For notational convenience, step back from 3 to 2.

By induction, there is a subspace V₃ corresponding to N₃. Let K₃ be the set of vectors x in T₂ such that Mx lies in V₃, so that M*K₃ ⊆ V₃. Verify that K₃ is indeed a subspace.

If x is in V₃ then it is in T₃, which is the image of T₂ under M. Thus x = My for some y in T₂, and y belongs to K₃ by definition. Hence M maps K₃ onto V₃. As spaces, M*K₃ = V₃.

Since V₃ is invariant, applying M to V₃ remains inside V₃. This means V₃ meets the definition of K₃, hence V₃ lies in K₃. In other words, K₃ is an extension of V₃.

Applying M to K₃ yields V₃, which is in K₃, hence K₃ is M invariant.

Next we need to show that K₃ and N₂ span T₂. Let x be any vector in T₂. Let y = Mx. By induction, y is a unique sum of two vectors, one from N₃ and one from V₃. Call these vectors q and r respectively. Remember that q is spanned by M³z, M⁴z, M⁵z, etc. Pull out a common factor of M, and q = Ms, where s is spanned by M²z, M³z, M⁴z, etc. Thus q = Ms, where s is in N₂. Write Mx = Ms+r, or r = M*(x-s), and x-s is in K₃. An arbitrary x is spanned by N₂ and K₃.

Next, show N₂∩K₃ is contained in N₃. Apply M to the intersection; the result lies in both M*N₂ and M*K₃, that is, in N₃∩V₃, which is 0. Our intersection lives in N₂, and when multiplied by M, the result is 0. Elements of N₂ that are one step away from 0 are in fact in N_(p-1), i.e. scalar multiples of M^(p-1)z. Vectors of this form belong to N₃, hence N₂∩K₃ lies in N₃.

Consider N₂∩K₃∩V₃. With the first intersection in N₃, and N₃ and V₃ disjoint, the result is 0. Since V₃ is contained in K₃, N₂∩V₃ = 0. We built V₃ to be disjoint from N₃, and it is, but it is also disjoint from N₂.

Let S be the direct sum of N₂∩K₃ and V₃. With N₂ and V₃ disjoint, the direct sum is well defined. Clearly S contains V₃.

Elements of S are sums of members of N₂∩K₃ and V₃, both of which lie in K₃, hence S is contained in K₃. In other words, V₃ ⊆ S ⊆ K₃.

If we have a basis for S, extend it to include all of K₃. Let the additional vectors span a space called K₂. Thus the direct sum of S and K₂ gives K₃. Furthermore, K₂ and V₃ are disjoint, with V₃ entirely in S.

Finally we are ready to build V₂. Let V₂ be the direct sum of K₂ and V₃.

Both K₂ and V₃ lie in K₃, hence V₂ is in K₃. Now M*V₂ lies in V₃, which is part of V₂, hence V₂ is M invariant.

Consider N₂∩V₂. Suppose x lies in both, and write x as q+r, where q is in V₃ and r is in K₂. Since V₂ lies in K₃, x is in N₂∩K₃, which is part of S. Now q is in V₃, which is also part of S, so r = x-q lies in S. But r is in K₂, and S and K₂ are disjoint, so r = 0. That leaves x = q, a vector in both N₂ and V₃, which are disjoint, so x = 0. Therefore N₂ and V₂ are disjoint.

Finally we have the last criterion: N₂ and V₂ span T₂. Remember that V₂ is the direct sum of K₂ and V₃. We need to show N₂, K₂, and V₃ span T₂. We know that N₂ and K₃ span T₂ (shown earlier), and K₃ is the direct sum of S and K₂. Thus an arbitrary vector x in T₂ can be written as u+q+r, where u is in N₂, q is in S, and r is in K₂. Review the definition of S. Part of q comes from N₂, and the other part comes from V₃. The former is obviously in our span. Since V₃ lies in V₂, the latter is also in our span. There is no trouble with u and r, hence x is spanned. Conversely, N₂, V₃, and K₂ (inside K₃) are all in T₂. Thus the span of N₂ and V₂ is precisely T₂. The space V₂ satisfies our criteria, and that completes the inductive step.

March all the way down to T₀, which equals W. The space N₀ has a complementary space V₀, such that W is their direct sum, and both subspaces are M invariant.

Build a basis for W, starting with N₀ (i.e. z and its successive images), and then V₀. Relative to this basis, the matrix of M is block diagonal, with N₀ in the upper left and V₀ in the lower right. In fact we know what the upper block looks like. The transformation moves each basis vector onto the next one, hence it has ones just below the main diagonal, and zeros elsewhere. Multiply by one of the basis vectors on the right, and the 1 shifts down the column, giving the next basis vector. The last basis vector in N₀ maps to 0, as you would expect from a nilpotent transformation.
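The shape of the upper block can be confirmed on a small case. In the sketch below, the columns of B are z, Mz, M²z, and J is the matrix with ones just below the main diagonal. Checking M*B = B*J is the same as saying that M becomes J in the new basis, provided B is invertible. The matrix M, the vector z, and the helper names are assumptions for illustration, and the chain fills the whole space here, so V₀ is 0.

```python
def mat_mul(a, b):
    """Multiply matrices given as lists of rows (a is r x k, b is k x c)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def shift_block(p):
    """p x p matrix with ones just below the main diagonal, zeros elsewhere."""
    return [[1 if i == j + 1 else 0 for j in range(p)] for i in range(p)]


# Hypothetical example: M has index 3 on a 3 dimensional space.
M = [[0, 0, 0],
     [1, 0, 0],
     [2, 3, 0]]
z = [1, 0, 0]

# Collect the chain z, Mz, M^2 z as the columns of B.
cols = [z]
for _ in range(2):
    prev = cols[-1]
    cols.append([sum(M[i][k] * prev[k] for k in range(3)) for i in range(3)])
B = [[cols[j][i] for j in range(3)] for i in range(3)]

J = shift_block(3)

# B is lower triangular with nonzero diagonal, hence invertible,
# so M*B == B*J means the matrix of M in the new basis is J.
print(mat_mul(M, B) == mat_mul(B, J))  # prints True
```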

Repeat the Process

When M is restricted to V₀, as shown by the lower right block, M is once again a nilpotent transformation. Apply the above procedure to V₀, giving two new complementary subspaces. This splits the lower right block into two blocks. Thus our matrix now has three blocks. The first is unchanged, and the second looks like the first. It has ones on the subdiagonal and zeros elsewhere. The third block, in the lower right, represents yet another subspace with a nilpotent transformation. Apply the process again, and again, until the entire matrix consists of these blocks.

All you have to do is change the basis, i.e. establish a new coordinate system, and the matrix becomes block diagonal, where each block has ones on its subdiagonal. In other words, every nilpotent matrix is similar to a matrix that is zero everywhere, except for scattered ones along the subdiagonal. The intermittent zeros on the subdiagonal delimit the blocks. For instance, consider a 6×6 matrix with 2 blocks [1,2][3,6]. The first block comes from N₀ and the second from V₀, which does not split further in this example. Notice that 0, in position 3,2, separates the two blocks.

0	0	0	0	0	0
1	0	0	0	0	0
0	0	0	0	0	0
0	0	1	0	0	0
0	0	0	1	0	0
0	0	0	0	1	0
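A matrix of this shape is determined by its block sizes alone. The following sketch builds the 6×6 example with blocks [1,2] and [3,6], i.e. sizes 2 and 4; the function name nilpotent_block_matrix is an assumption made for this illustration.

```python
def nilpotent_block_matrix(sizes):
    """Block diagonal matrix whose blocks have ones on the subdiagonal.

    sizes lists the block sizes from upper left to lower right.
    """
    n = sum(sizes)
    m = [[0] * n for _ in range(n)]
    start = 0  # index of the first row and column of the current block
    for size in sizes:
        for k in range(start + 1, start + size):
            m[k][k - 1] = 1
        start += size
    return m


M = nilpotent_block_matrix([2, 4])
for row in M:
    print(*row)

# Position 3,2 (row 3, column 2, counting from 1) separates the blocks.
print(M[2][1])  # prints 0
```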

Geometric Interpretation

Let M be a nilpotent transformation on n space. There are n independent vectors that act as a basis, and these vectors can be partitioned into sets. Within a set, M moves one vector onto the next, onto the next, onto the next, and so on, until the last vector is squashed down to 0. A set of size one is a lone vector that is mapped to 0. Each set becomes a block in the block diagonal matrix that defines M. If all sets are lone vectors, all vectors are mapped to 0, and M is the zero matrix.

The index of M, as a nilpotent transformation, is the size of the largest set, or block. Thus the maximum index of a nilpotent transformation is n, the dimension of the space. This happens when all n vectors participate in the march to zero, and all entries in the subdiagonal of M are 1. Verify that the powers of M shift the ones down and to the left, until M^(n-1) has 1 in the lower left corner, and Mⁿ is zero.
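Both claims can be verified by listing the powers of M until the zero matrix appears. This is a sketch only; block_matrix, mat_mul, and powers_until_zero are names assumed for this example.

```python
def block_matrix(sizes):
    """Block diagonal matrix with ones on the subdiagonal of each block."""
    n = sum(sizes)
    m = [[0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for k in range(start + 1, start + size):
            m[k][k - 1] = 1
        start += size
    return m


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def powers_until_zero(m):
    """Return [M, M^2, ..., M^p], where M^p is the first zero power.

    Assumes m is nilpotent, so the length of the list is the index p.
    """
    n = len(m)
    zero = [[0] * n for _ in range(n)]
    powers = [m]
    while powers[-1] != zero and len(powers) <= n:
        powers.append(mat_mul(powers[-1], m))
    return powers


# A single block of size n = 4: the ones march toward the lower left corner.
for k, power in enumerate(powers_until_zero(block_matrix([4])), start=1):
    print(f"M^{k}:")
    for row in power:
        print(*row)

# The index equals the size of the largest block.
print(len(powers_until_zero(block_matrix([2, 4]))))     # prints 4
print(len(powers_until_zero(block_matrix([1, 1, 1]))))  # prints 1
```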