If a nilpotent transformation M has index p, then some vector z satisfies M^(p-1)z ≠ 0. This is a consequence of p being minimal: if M^(p-1) sent every vector to 0, the index would be at most p-1.
Build the set of vectors z, Mz, M^2 z, M^3 z, and so on up to M^(p-1)z. We will show that these p vectors are independent, and hence span a space of dimension p.
Suppose a linear combination of these vectors produces 0. Multiply through by M^(p-1) on the left. Every term drops to 0 except the first, which becomes c M^(p-1)z, where c is the coefficient on z. This must still equal 0, and M^(p-1)z is nonzero, hence c = 0. Remove that term from the combination, and multiply through by M^(p-2). This leaves only the second term, and its coefficient must be 0. Continue this process until all coefficients are 0. Thus the p vectors are linearly independent.
Since M moves each of these vectors onto the next one, and the last one to 0, the space spanned by these vectors is invariant under M.
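The independence argument can be checked numerically on a small case. The following is a minimal sketch in plain Python; the matrix M and the starting vector z are hypothetical choices made for illustration, and the helper names (mat_vec, rank, chain) are not part of the original text.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def chain(M, z):
    """Return [z, Mz, M^2 z, ...], stopping just before the first zero vector."""
    out, v = [], list(z)
    for _ in range(len(z)):  # a nilpotent chain has at most n nonzero vectors
        if not any(v):
            break
        out.append(v)
        v = mat_vec(M, v)
    return out

# Hypothetical example: strictly lower triangular, hence nilpotent.
M = [[0, 0, 0, 0],
     [1, 0, 0, 0],
     [2, 1, 0, 0],
     [3, 2, 1, 0]]
z = [1, 0, 0, 0]

vectors = chain(M, z)      # z, Mz, M^2 z, M^3 z
p = len(vectors)           # length of the chain starting at z
print(p, rank(vectors))    # the chain is independent when these agree
```

Here the chain has length 4 and rank 4, so the four vectors are independent and span the whole space.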
Set M = 0 for the trivial case, a nilpotent map with index 1. Since Mz = 0, any nonzero z will serve, spanning a space of dimension 1.
Let W be a vector space, and let M be a nilpotent transformation on W with index p. Select a vector z such that M^(p-1)z is nonzero. Recall that M^i z, as i runs from 0 to p-1, forms a set of p independent vectors.
Let the subspace Ni be spanned by M^j z, for all j ≥ i. This is a descending chain of spaces, where Ni has dimension p-i, and Np = 0.
Let Ti be M^i W, the image of W under M^i. This is another descending chain of spaces, with T0 = W, and Tp = 0. Note that Ti contains Ni.
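The two descending chains can be compared by dimension, since the dimension of Ti is the rank of M^i. The sketch below is an illustration only: the 6×6 matrix is a hypothetical example with one chain of length 2 and one of length 4, and the helper names are assumptions of this sketch.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(rows):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def image_dims(M):
    """dim Ti = rank(M^i) for i = 0, 1, ..., stopping at the first zero power."""
    n = len(M)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # M^0 = identity
    dims = []
    for _ in range(n + 1):
        d = rank(P)
        dims.append(d)
        if d == 0:
            break
        P = mat_mul(M, P)
    return dims

# Hypothetical nilpotent matrix: e1 -> e2 -> 0 and e3 -> e4 -> e5 -> e6 -> 0.
M = [[0, 0, 0, 0, 0, 0],
     [1, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0]]

t_dims = image_dims(M)                    # dimensions of T0, T1, ..., Tp
p = len(t_dims) - 1                       # index of M
n_dims = [p - i for i in range(p + 1)]    # dimensions of N0, N1, ..., Np
print(t_dims, n_dims)
```

In this example each Ti is at least as large as Ni, as it must be when Ti contains Ni, and both chains reach 0 at i = p.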
We claim that for each i there is a subspace Vi, disjoint from Ni (the two meet only in 0), such that the direct sum Vi ⊕ Ni = Ti, and Vi is invariant under M. (We already know Ni is M invariant.)
When i = p, Vp exists, and is equal to 0. Since M has index p, M^p maps all of W into 0, thus Tp = 0. With Np and Vp = 0, we have Np ⊕ Vp = Tp. Of course Np and Vp are M invariant. Therefore Vp satisfies our criteria.
Take one step backwards, setting i = p-1. Now Ni is the one dimensional space spanned by y, where y = M^i z. Build a basis for Ti, starting with y, then let Vi be the space spanned by the basis vectors of Ti other than y. Clearly Vi and Ni meet only in 0, and together they span Ti. Also, Vi and Ni are invariant under M; in fact M carries Ti into Tp = 0, so both are squashed down to 0. Thus Vi meets our criteria.
If p = 1 we are done. For p > 1, take the inductive step, working backwards from p-1 down to 0. For notational convenience, step back from 3 to 2.
By induction, there is a subspace V3 corresponding to N3. Let K3 be the set of vectors x in T2 such that Mx lies in V3, so that M*K3 ⊆ V3. Verify that K3 is indeed a subspace.
If x is in V3 then it is in T3, which is the image of T2 under M. Thus x = My for some y in T2, and y is in K3, hence M maps K3 onto V3. As spaces, M*K3 = V3.
Since V3 is invariant, applying M to V3 remains inside V3. This means V3 meets the definition of K3, hence V3 lies in K3. In other words, K3 is an extension of V3.
Applying M to K3 yields V3, which is in K3, hence K3 is M invariant.
Next we need to show that K3 and N2 span T2. Let x be any vector in T2. Let y = Mx, which lies in T3. By induction, y is a unique sum of two vectors, one from N3 and one from V3. Call these vectors q and r respectively. Remember that q is spanned by M^3 z, M^4 z, M^5 z, etc. Pull out a common factor of M, and q = Ms, where s is spanned by M^2 z, M^3 z, M^4 z, etc., so s is in N2. Write Mx = Ms+r, or r = M*(x-s). Since x-s is in T2, and M carries it into V3, x-s is in K3. Thus x = s + (x-s), and an arbitrary x is spanned by N2 and K3.
Next, show N2∩K3 is contained in N3. Apply M to the intersection; the result lies in both M*N2 and M*K3, hence in N3∩V3, which is 0. So our intersection lives in N2, and when multiplied by M, the result is 0. Elements in N2 that are one step away from 0 are in fact in Np-1, i.e. scalar multiples of M^(p-1)z, because M carries the other basis vectors of N2 onto independent vectors. Vectors of this form belong to N3, hence N2∩K3 lies in N3.
Consider N2∩K3∩V3. With the first intersection in N3, and N3 and V3 disjoint, the result is 0. Since V3 is contained in K3, N2∩V3 = 0. We built V3 to be disjoint from N3, and it is, but it is also disjoint from N2.
Let S be the direct sum of N2∩K3 and V3. With N2 and V3 disjoint, the direct sum is well defined. Clearly S contains V3.
Elements of S are sums of members of N2∩K3 and V3, both of which lie in K3, hence S is contained in K3. In other words, V3 ⊆ S ⊆ K3.
If we have a basis for S, extend it to include all of K3. Let the additional vectors span a space called K2. Thus the direct sum of S and K2 gives K3. Furthermore, K2 and V3 are disjoint, with V3 entirely in S.
Finally we are ready to build V2. Let V2 be the direct sum of K2 and V3.
Both K2 and V3 lie in K3, hence V2 is in K3. Now M*V2 lies in V3, which is part of V2, hence V2 is M invariant.
Consider N2∩V2. Suppose x lies in both, and write x as q+r, where q is in V3 and r is in K2. Since V2 lies in K3, x is in N2∩K3, which is part of S. Now q is in V3, which is also part of S, hence r = x-q is in S. But S and K2 are disjoint, so r = 0. That leaves x = q, a vector in both N2 and V3, and these are disjoint, so x = 0. Therefore N2 and V2 are disjoint.
Finally we verify the last criterion: N2 and V2 span T2. Remember that V2 is the direct sum of K2 and V3. We need to show N2, K2, and V3 span T2. We know that N2 and K3 span T2 (shown earlier), and K3 is the direct sum of S and K2. Thus an arbitrary vector x in T2 can be written as u+q+r, where u is in N2, q is in S, and r is in K2. Review the definition of S. Part of q comes from N2, and the other part comes from V3. The former is obviously in our span. Since V3 lies in V2, the latter is also in our span. There is no trouble with u and r, hence x is spanned. Conversely, N2, V3, and K2 (inside K3) all lie in T2. Therefore the span of N2 and V2 is precisely T2. The space V2 satisfies our criteria, and that completes the inductive step.
March all the way down to T0, which equals W. The space N0 has a complementary space V0, such that W is their direct sum, and both subspaces are M invariant.
Build a basis for W, starting with N0 (i.e. z and its successive images under M), and then V0. Relative to this basis, the matrix of M is block diagonal, with N0 in the upper left and V0 in the lower right. In fact we know what the upper block looks like. The transformation moves each basis vector onto the next one, hence it has ones just below the main diagonal, and zeros elsewhere. Multiply by one of the basis vectors on the right, and the 1 shifts down the column, giving the next basis vector. The last basis vector in N0 maps to 0, as you would expect from a nilpotent transformation.
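The shape of the upper block can be confirmed without inverting anything: if P has the chain z, Az, A^2 z, ... as its columns and J has ones just below the main diagonal, then AP = PJ, which says that A acts as J in the new coordinates. The sketch below assumes a hypothetical 4×4 nilpotent matrix A whose chain fills the whole space, so V0 is 0 and there is only one block; the names A, P, and J are choices of this sketch.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_vec(M, v):
    """Multiply a square matrix by a column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Hypothetical nilpotent matrix of index 4 (strictly lower triangular).
A = [[0, 0, 0, 0],
     [1, 0, 0, 0],
     [2, 1, 0, 0],
     [3, 2, 1, 0]]
z = [1, 0, 0, 0]

# The chain z, Az, A^2 z, A^3 z serves as the new basis.
basis = [z]
for _ in range(3):
    basis.append(mat_vec(A, basis[-1]))

# P holds the basis vectors as columns. Here P is lower triangular
# with ones on the diagonal, so it is invertible.
P = [list(row) for row in zip(*basis)]

# J has ones just below the main diagonal and zeros elsewhere.
n = len(A)
J = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

# AP = PJ is equivalent to (P inverse) A P = J.
print(mat_mul(A, P) == mat_mul(P, J))
```

Column j of AP is A applied to the j-th basis vector, and column j of PJ is the next basis vector (or 0 for the last column), which is exactly the statement that M moves each basis vector onto the next one.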
All you have to do is change the basis, i.e. establish a new coordinate system, and the matrix becomes block diagonal, where each block has ones on its subdiagonal. (The restriction of M to V0 is again nilpotent, so apply the same construction to V0, and repeat until the space is used up.) In other words, every nilpotent matrix is similar to a matrix that is zero everywhere, except for runs of ones along the subdiagonal. The intermittent zeros on the subdiagonal delimit the blocks. For instance, consider a 6×6 matrix with 2 blocks [1,2][3,6]. Here M has index 4, so N0 has dimension 4 and supplies the larger block [3,6], while the block [1,2] comes from V0; the blocks may be listed in either order. Notice that 0, in position 3,2, separates the two blocks.
000000
100000
000000
001000
000100
000010
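A matrix of this shape can be generated from its list of block sizes. The following is a small sketch; the function name nilpotent_from_blocks is hypothetical, and the block sizes 2 and 4 are chosen to reproduce the 6×6 example above.

```python
def nilpotent_from_blocks(sizes):
    """Block diagonal matrix where each block has ones on its subdiagonal."""
    n = sum(sizes)
    M = [[0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        # Within a block, each basis vector maps to the next one.
        for k in range(start + 1, start + size):
            M[k][k - 1] = 1
        start += size  # the subdiagonal entry at the block boundary stays 0
    return M

rows = ["".join(str(x) for x in row) for row in nilpotent_from_blocks([2, 4])]
for row in rows:
    print(row)
```

The subdiagonal entry in row 3, column 2 is left at 0 because it sits on the boundary between the two blocks.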
The index of M, as a nilpotent transformation, is the size of the largest block. Thus the maximum index of a nilpotent transformation is n, the dimension of the space. This happens when all n basis vectors participate in a single march to zero, and all entries in the subdiagonal of M are 1. Verify that the powers of M shift the ones down and to the left, until M^(n-1) has 1 in the lower left corner, and M^n is zero.
000000
100000
010000
001000
000100
000010
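The shifting of the ones under successive powers can be verified directly. This is a minimal sketch for n = 6; the helper names are assumptions of the sketch, and positions are reported with rows and columns numbered from 0.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def shift_matrix(n):
    """n by n matrix with ones on the subdiagonal and zeros elsewhere."""
    return [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]

def ones_positions(M):
    """List the (row, column) positions holding a 1, numbering from 0."""
    return [(i, j) for i, row in enumerate(M) for j, x in enumerate(row) if x == 1]

n = 6
M = shift_matrix(n)

# positions[k] records where the ones of M^k sit, for k = 1, ..., n.
positions = {}
P = M
for k in range(1, n + 1):
    positions[k] = ones_positions(P)
    P = mat_mul(P, M)

for k in range(1, n + 1):
    print(k, positions[k])
```

Each power pushes the diagonal of ones one step further from the main diagonal, so the k-th power has n-k ones, the power n-1 has a single 1 in the lower left corner, and the n-th power has none.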