Matrix Polynomials, Power Series and Exponential Functions

Power Series

Let f be an analytic function with a radius of convergence of r. Let M be a matrix with spectral radius less than r. Expand f into its power series and evaluate it at M. Does it converge?

Let a nonsingular matrix P convert M to Jordan canonical form. Apply P to the power series that defines f(M), conjugating each term. The original series converges iff the new series converges. If the original sum was l, the new sum is $P l P^{-1}$.

Concentrate on a simple Jordan block with eigenvalue z. Remember that z is inside our circle of convergence. The main diagonal becomes f(z), which is convergent. If there are ones below the main diagonal, apply the binomial theorem. Writing $a_n$ for the nth coefficient of the series, the subdiagonal of the nth term is $a_n n z^{n-1}$. Add these up and the subdiagonal becomes f′(z). Since f is analytic, the derivative exists, hence the subdiagonal converges. The next diagonal converges to f′′(z)/2!, and so on. Put this all together and f(M) converges.
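
As a worked instance of the argument above, here is a 3×3 Jordan block J with ones below the main diagonal. The size 3 is an arbitrary choice for illustration, and $a_n$ denotes the nth coefficient of the power series of f.

```latex
\[
J = \begin{pmatrix} z & 0 & 0 \\ 1 & z & 0 \\ 0 & 1 & z \end{pmatrix},
\qquad
J^{n} = \begin{pmatrix}
z^{n} & 0 & 0 \\
n z^{n-1} & z^{n} & 0 \\
\binom{n}{2} z^{n-2} & n z^{n-1} & z^{n}
\end{pmatrix}
\]
\[
f(J) = \sum_{n} a_n J^{n} = \begin{pmatrix}
f(z) & 0 & 0 \\
f'(z) & f(z) & 0 \\
f''(z)/2! & f'(z) & f(z)
\end{pmatrix}
\]
```

Each entry is a series in z that converges inside the circle of convergence, which is the content of the paragraph above.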

Exponential

The exponential of a matrix M, written exp(M), or $e^{M}$, is computed using the power series for $e^{z}$. The circle of convergence is the entire plane, hence $e^{M}$ converges for every matrix M.
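
The series definition can be sketched directly in plain Python, with matrices stored as lists of rows. The helper names `mat_mul` and `mat_exp`, and the cutoff of 40 terms, are assumptions made for this illustration; a truncated series is a naive approximation, not a production algorithm.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Approximate exp(m) by summing the first `terms` terms of the series."""
    n = len(m)
    # term holds m^k / k!, starting from the identity matrix (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        # Turn m^(k-1)/(k-1)! into m^k/k! with one product and one division.
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


# Nilpotent example: m*m = 0, so the series stops and exp(m) = I + m.
nilpotent = [[0.0, 0.0], [1.0, 0.0]]
print(mat_exp(nilpotent))
```

For a diagonal matrix the same function returns the ordinary exponentials of the diagonal entries, up to truncation and rounding error.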

When A and B Commute

It doesn't happen often, but assume the matrices A and B commute, and expand $e^{A+B}$. Then expand each term using the binomial theorem, which applies because A and B commute. On the other side, expand $e^{A}$ and $e^{B}$, and multiply these series together. This gives a product series, which converges, regardless of the order of the terms, because $e^{A}$ and $e^{B}$ converge absolutely.

How do we make the terms correspond? Consider the nth diagonal on the right, that is, the terms of the product series whose total degree is n. This includes $A^{n}/n!$ times 1, and 1 times $B^{n}/n!$, and everything in between. Multiply this by n!, for our convenience. The result is the sum of $\binom{n}{i} A^{i} B^{n-i}$, which is simply $(A+B)^{n}$. Divide by n! and get $(A+B)^{n}/n!$. This is the nth term of $e^{A+B}$. The terms correspond, and $e^{A+B} = e^{A}e^{B}$.
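
A numerical spot check of this identity, as a sketch in plain Python. The helper names, the 40-term cutoff, and the sample matrices are all assumptions chosen for illustration. The first pair commutes because B is a power of A; the second pair does not commute, and the identity visibly fails for it.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(a, b):
    """Add two matrices entry by entry."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def mat_exp(m, terms=40):
    """Approximate exp(m) by a truncated power series (cutoff is arbitrary)."""
    n = len(m)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = mat_add(total, term)
    return total


def max_gap(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# A and B commute, since B = A*A.
A = [[1.0, 2.0], [0.0, 1.0]]
B = mat_mul(A, A)
commuting_gap = max_gap(mat_exp(mat_add(A, B)),
                        mat_mul(mat_exp(A), mat_exp(B)))

# C and D do not commute: C*D and D*C are different matrices.
C = [[0.0, 1.0], [0.0, 0.0]]
D = [[0.0, 0.0], [1.0, 0.0]]
noncommuting_gap = max_gap(mat_exp(mat_add(C, D)),
                           mat_mul(mat_exp(C), mat_exp(D)))

print("commuting pair, gap:", commuting_gap)
print("noncommuting pair, gap:", noncommuting_gap)
```

The first gap is at the level of rounding error, while the second is of order one, which shows that the commuting hypothesis is doing real work.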

Using Jordan Form to Compute the Exponential

Given a matrix M, convert to Jordan form, evaluate $e^{M}$, then apply the inverse transformation to recover the true sum.

Let's consider a simple Jordan block. If the eigenvalue is l, the diagonal becomes $e^{l}$, and the kth subdiagonal converges to the kth derivative over k!, which is $e^{l}/k!$.

Adjusting the Exponential by t

Hold M fixed, and let s and t be scalars, i.e. complex numbers, or if you prefer, scaled versions of the identity matrix. Consider $e^{sM}$.

Use P to convert M to Jordan form, then multiply by s. Concentrate on a Jordan block with eigenvalue z. Multiply by s and evaluate, and the diagonal becomes $e^{sz}$.

Move to the subdiagonal. Remember that the diagonal of the Jordan block has become sz, and the subdiagonal has become s, instead of 1. Now the subdiagonal on the nth term becomes $s\,n\,(sz)^{n-1}/n!$. Add these terms together, factor out s, and get s times the derivative of the exponential, evaluated at sz. This is $s e^{sz}$. The kth subdiagonal becomes $s^{k} e^{sz}/k!$.
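
The closed form for the kth subdiagonal can be compared against a truncated series. This is a sketch in plain Python; the helper names, the 3×3 block size, and the sample values of z and s are assumptions picked for illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Approximate exp(m) by a truncated power series (cutoff is arbitrary)."""
    n = len(m)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


z, s = 0.5, 2.0  # sample eigenvalue and scalar

# Jordan block with ones below the diagonal, already multiplied by s:
# the diagonal is s*z and the subdiagonal is s.
scaled_block = [[s * z, 0.0, 0.0],
                [s, s * z, 0.0],
                [0.0, s, s * z]]
series = mat_exp(scaled_block)

# Closed form: the entry k steps below the diagonal is s^k exp(sz) / k!.
closed = [[s ** (i - j) * math.exp(s * z) / math.factorial(i - j)
           if i >= j else 0.0
           for j in range(3)] for i in range(3)]

gap = max(abs(series[i][j] - closed[i][j])
          for i in range(3) for j in range(3))
print("largest difference:", gap)
```

The two matrices agree up to rounding and truncation error.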

Use the same procedure to find $e^{tM}$, and consider $e^{sM}$ times $e^{tM}$. Base change commutes with product, so apply the transformation that turns M into Jordan form. Look at the product of two corresponding blocks. The diagonals are, respectively, $e^{sz}$ and $e^{tz}$. The diagonal of the product is $e^{sz}e^{tz}$, or $e^{(s+t)z}$.

Bravely move on to the subdiagonal. The subdiagonal of the product is the main diagonal of the first block times the subdiagonal of the second, plus the subdiagonal of the first times the main diagonal of the second. This simplifies to $(s+t)e^{(s+t)z}$. This happens to agree with the subdiagonal that results from $e^{(s+t)M}$.

You can probably see where this is going, but let's check the next diagonal down. The common factor $e^{(s+t)z}$ is multiplied by $s^{2}/2 + st + t^{2}/2$, or $(s+t)^{2}/2$. This agrees with the evaluation of $e^{(s+t)M}$.

You'll need some combinatorics to verify the kth subdiagonal. Pull out the common factor $e^{(s+t)z}$, and look at the remaining sum. When multiplied by k!, you have the sum of $\binom{k}{i} s^{k-i} t^{i}$. This is $(s+t)^{k}$. Divide by k! and the expression becomes $(s+t)^{k} e^{(s+t)z}/k!$, which agrees with $e^{(s+t)M}$. In conclusion, $e^{sM}e^{tM} = e^{(s+t)M}$.

Of course we didn't have to work this hard. Since sM and tM commute, we can invoke the earlier theorem, whence $e^{sM+tM} = e^{sM}e^{tM}$.

As a corollary, set s = 1 and t = -1. The product $e^{M}e^{-M}$ is then $e^{0}$, which is the identity matrix, so $e^{M}$ is invertible, with inverse $e^{-M}$.
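
Both the addition rule and the corollary can be spot checked numerically. The following is a sketch in plain Python; the helper names, the 40-term cutoff, and the sample values of M, s and t are assumptions made for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(s, m):
    """Multiply every entry of m by the scalar s."""
    return [[s * x for x in row] for row in m]


def mat_exp(m, terms=40):
    """Approximate exp(m) by a truncated power series (cutoff is arbitrary)."""
    n = len(m)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def max_gap(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


M = [[2.0, 0.0], [1.0, 2.0]]  # one Jordan block, ones below the diagonal
s, t = 0.3, -1.1

# exp(sM) exp(tM) compared with exp((s+t)M).
sum_rule_gap = max_gap(mat_mul(mat_exp(scale(s, M)), mat_exp(scale(t, M))),
                       mat_exp(scale(s + t, M)))

# exp(M) exp(-M) compared with the identity matrix.
identity = [[1.0, 0.0], [0.0, 1.0]]
inverse_gap = max_gap(mat_mul(mat_exp(M), mat_exp(scale(-1.0, M))), identity)

print("sum rule gap:", sum_rule_gap)
print("inverse gap:", inverse_gap)
```

Both gaps come out at the level of floating point rounding.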

Making t a Variable

Instead of being a fixed scalar, let t be a real or complex variable. Consider $e^{Mt}$ as a matrix function of t, and differentiate. Expand $e^{Mt}$ using its power series, which converges for every t, so that each entry of the matrix is an analytic function of t. The entry functions can be differentiated term by term, hence each matrix function in the power series can be differentiated term by term. The nth term becomes $M^{n}t^{n-1}/(n-1)!$. The first term was constant, and dropped out, so we're really starting with n = 1. Reindex, and start with n = 0. Now the terms are $M^{n+1}t^{n}/n!$. This is M times $e^{Mt}$.

Since M commutes with $M^{n}$, M commutes with $e^{Mt}$. Therefore the derivative of $e^{Mt}$ can be expressed as $Me^{Mt}$, or $e^{Mt}M$.
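
The derivative formula can be checked against a central difference quotient. This is a sketch in plain Python; the helper names, the step size h, and the sample matrix are assumptions made for illustration, and a finite difference only approximates the derivative.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(s, m):
    """Multiply every entry of m by the scalar s."""
    return [[s * x for x in row] for row in m]


def mat_exp(m, terms=40):
    """Approximate exp(m) by a truncated power series (cutoff is arbitrary)."""
    n = len(m)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def max_gap(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


M = [[0.0, 1.0], [-2.0, -3.0]]
t, h = 0.7, 1e-5

# Central difference approximation to d/dt exp(Mt).
plus = mat_exp(scale(t + h, M))
minus = mat_exp(scale(t - h, M))
numeric = [[(p - q) / (2 * h) for p, q in zip(rp, rq)]
           for rp, rq in zip(plus, minus)]

# The two closed forms from the text: M exp(Mt) and exp(Mt) M.
left = mat_mul(M, mat_exp(scale(t, M)))
right = mat_mul(mat_exp(scale(t, M)), M)

print("numeric vs M exp(Mt):", max_gap(numeric, left))
print("M exp(Mt) vs exp(Mt) M:", max_gap(left, right))
```

The first gap is limited by the finite difference step; the second is at rounding level, reflecting that M commutes with exp(Mt).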

Differential Equation

Consider the differential equation y′(t) = y(t)M, where M is a fixed matrix, and y(t) is a matrix function of t. This looks like one differential equation in one variable, when written in matrix notation, but there are really many functions of t, one for each entry in the matrix, and many equations that relate these functions to their derivatives. So this is really a system of linear differential equations. Whatever you call it, let's look for solutions.

The function y(t) = $e^{Mt}$ is a solution, as shown above. Furthermore, if c is a constant matrix, $ce^{Mt}$ is a solution. If matrices are n×n, then c provides $n^{2}$ linearly independent solutions. This makes sense, since there are $n^{2}$ individual functions of t inside the matrix function y(t). To complete the characterization, we need to show there are no other solutions.

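Suppose there is some other solution z(t). Since $e^{Mt}$ is invertible for every t, with inverse $e^{-Mt}$, multiply z on the right by $e^{-Mt}$ and call the result q. In other words, our solution is q(t)$e^{Mt}$. Use the product rule to differentiate, and set the result equal to zM, which is $qe^{Mt}M$.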

$q' e^{Mt} + q e^{Mt} M = q e^{Mt} M$

$q' e^{Mt} = 0$

$q' = 0$  (multiplying both sides on the right by $e^{-Mt}$)

q = c  (for a constant matrix c)

All solutions have been identified, and the solutions form a vector space of dimension $n^{2}$, as dictated by the constant matrix c.

The analogous differential equation y′ = My has solutions $e^{Mt}c$, where the constant matrix c is multiplied on the right.
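
A small worked example may help here. The 2×2 matrix M below is a choice made for illustration; it squares to -I, so the exponential series collapses into sines and cosines.

```latex
\[
M = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix},
\qquad M^{2} = -I,
\qquad
e^{Mt} = I\cos t + M\sin t
       = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}
\]
\[
\frac{d}{dt}\, e^{Mt}
  = \begin{pmatrix} -\sin t & \cos t \\ -\cos t & -\sin t \end{pmatrix}
  = e^{Mt} M = M e^{Mt}
\]
```

Multiplying by a constant matrix c on the left gives the solutions of y′ = yM, and multiplying on the right gives the solutions of y′ = My. For this M, every entry of a solution is a combination of cos t and sin t.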

Generalizing Mt

If M is a matrix of constants, the derivative of Mt, with respect to t, is M. Thus the derivative of $e^{Mt}$ is $e^{Mt}$ times the derivative of the exponent. This begins to look like the familiar formula from calculus. Let's try to generalize this to arbitrary functions in the exponent.

Let U(t) be a matrix of functions in t, where each function is differentiable, or analytic if t is complex. Consider the derivative of $e^{U(t)}$, at a particular time t, which I will call t = 0 for convenience. Let U′(0) = M, a matrix of constants.

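Start with the difference quotient: $(e^{U(h)} - e^{U(0)})/h$. Focus on the first exponent U(h). Replace this with h×(U(h)-U(0))/h + U(0). The first term becomes h times something arbitrarily close to M, say h times Q, where Q is M±ε. Splitting the exponential of this sum into a product relies on the earlier theorem for commuting matrices, so assume that hQ commutes with U(0), i.e. that U(h) commutes with U(0). Now the difference quotient looks like this.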

$(e^{hQ} e^{U(0)} - e^{U(0)}) / h$

Replace the first exponential with its power series, and pull 1 out. This yields 1 times $e^{U(0)}$, which cancels $-e^{U(0)}$ in the numerator. That leaves us free to divide through by h.

$(Q + hQ^{2}/2 + h^{2}Q^{3}/6 + \cdots)\, e^{U(0)}$

Now it's a matter of continuity. As h moves to 0, all the terms, other than Q, move to 0. To be rigorous, bound the absolute value of every entry of Q by some positive number b. Then, for n×n matrices, each entry of $Q^{2}$ is at most $nb^{2}$ in absolute value, and each entry in $Q^{3}$ is at most $n^{2}b^{3}$, and so on. Absolute convergence is no problem, bounded by $e^{nb}$. Every term after Q carries a factor of h, and h is bigger than any higher power of h for small h, so factor out h, leaving a bounded sum. Then let h go to 0 and these terms go to 0. That leaves Q, which goes to M as h approaches 0. Therefore, as long as U(h) commutes with U(0) for small h, which the splitting of the exponential required, the derivative of $e^{U}$ is $Me^{U}$, or $U'e^{U}$.

It is easy to run the same proof with the terms in a different order, so that the derivative becomes $e^{U}U'$.
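
The difference quotient argument splits exp(hQ + U(0)) into exp(hQ) times exp(U(0)), a step that leans on the earlier theorem for commuting matrices. The sketch below, in plain Python, compares a numerical derivative of exp(U(t)) with U′ exp(U) for two sample functions: one whose values all commute, and one whose values do not. The helper names, the step size, and both sample functions are assumptions chosen for illustration; this is a numerical experiment, not a proof.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=40):
    """Approximate exp(m) by a truncated power series (cutoff is arbitrary)."""
    n = len(m)
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in term]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, m)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def max_gap(a, b):
    """Largest absolute difference between corresponding entries."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def numeric_derivative(u, t, h=1e-5):
    """Central difference approximation to d/dt exp(u(t))."""
    plus, minus = mat_exp(u(t + h)), mat_exp(u(t - h))
    return [[(p - q) / (2 * h) for p, q in zip(rp, rq)]
            for rp, rq in zip(plus, minus)]


M = [[0.0, 1.0], [-1.0, 0.0]]


def u_commuting(t):
    """U(t) = t^2 M: every value is a multiple of M, so all values commute."""
    return [[t * t * x for x in row] for row in M]


def du_commuting(t):
    return [[2 * t * x for x in row] for row in M]


def u_general(t):
    """A sample U(t) that does not commute with its own derivative."""
    return [[0.0, t], [t * t, 0.0]]


def du_general(t):
    return [[0.0, 1.0], [2 * t, 0.0]]


t = 1.0
gap_commuting = max_gap(numeric_derivative(u_commuting, t),
                        mat_mul(du_commuting(t), mat_exp(u_commuting(t))))
gap_general = max_gap(numeric_derivative(u_general, t),
                      mat_mul(du_general(t), mat_exp(u_general(t))))

print("commuting U, gap:", gap_commuting)
print("noncommuting U, gap:", gap_general)
```

For the commuting sample the formula matches the numerical derivative to within the finite difference error. For the noncommuting sample the two differ by an amount of order one, so the formula should be read with the commuting hypothesis attached.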

Now return to our differential equation. This time consider y′ = yV, where V is a function of t. A proof, similar to the one shown above, characterizes the solutions as $ce^{U}$, where c is a matrix of constants, and U′ = V, provided U commutes with V so that the derivative formula applies. Similarly, the solutions to y′ = Vy are $e^{U}c$.

This $e^{U}$ result is bizarre! And beautiful! And useful! Useful, because it is essential for the proof of the fundamental theorem of ordinary differential equations. If the nth derivative of y is a linear combination of lesser derivatives of y, where functions of x act as coefficients, then there is indeed a unique solution. The connection to $e^{U}$ is not obvious at all, until you have seen the proof; then it all kinda makes sense.