Differential Equations, Existence and Uniqueness

Existence and Uniqueness

This is sometimes called the fundamental theorem of ordinary differential equations, because it asserts the existence, and uniqueness, of a solution.

You are given a linear equation of order n, where all coefficients are continuous functions of x, well defined throughout an open interval, or an open disk if you are working in the complex plane. Assume this interval, or disk, contains the origin. After all, we can always move the origin by replacing x with x+s, for some s in the domain, then replace x with x-s in the solution.

The lead coefficient (on the n^th derivative) must be 1. If the lead coefficient is some other function of x, assume that function is nonzero about the origin, whence we can divide through to get a lead coefficient of 1.

Finally, the initial conditions are known at the origin. We are given y(0), y′(0), y′′(0), and so on.

When all these conditions are met, the equation has a solution, and the solution is unique.

The procedure outlined below may not be particularly useful for solving the equation. It merely demonstrates the existence, and uniqueness, of a solution.

Although it is counterintuitive, we need to turn the problem into a system of n differential equations in n variables. That actually makes the problem easier.

Let y₀(x) be y(x), the function that we seek. Let y₁(x) equal y′. Let y₂ = y′′, and so on up to y_{n-1}, which is the (n-1)^st derivative. (We won't need a separate variable for the n^th derivative.)

The initial value for the i^th derivative of y is assigned to y_i(0). These are the initial constraints that are provided along with the differential equation.

For each i in [0,n-2], write the equation:

y_i′ = y_{i+1}

The n functions y₀, y₁, y₂, etc, used to be independent, but they're not anymore. Each is the derivative of the previous, and by induction, the i^th function y_i equals the i^th derivative of y(x).

For the last equation, rewrite the original differential equation, substituting y_i for each of the derivatives in turn. The leading term, the n^th derivative of y, becomes y_{n-1}′. We now have a system of n first order differential equations in n variables.

Verify that the system of first order equations has a solution iff the original differential equation has a solution.
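Here is a small sketch of that conversion in Python. The function name system_rhs and the sample equation y′′ + y = 0 are chosen purely for illustration; they are not part of the theorem. Given values for y₀ through y_{n-1} at some x, the function returns the derivatives that the system prescribes.

```python
import math

def system_rhs(coeffs, r, x, ys):
    """Derivatives y_0', ..., y_{n-1}' prescribed by the first order system.

    coeffs[i] is the coefficient function on the i-th derivative of y
    (the lead coefficient 1 is implied); r is the right hand side.
    """
    n = len(coeffs)
    rhs = [ys[i + 1] for i in range(n - 1)]      # y_i' = y_{i+1}
    # Last equation: the original equation solved for y_{n-1}'.
    rhs.append(r(x) - sum(coeffs[i](x) * ys[i] for i in range(n)))
    return rhs

# Illustrative equation (an assumption, not from the text): y'' + y = 0,
# which is solved by y = sin x.
coeffs = [lambda x: 1.0, lambda x: 0.0]
r = lambda x: 0.0
x = 0.7
ys = [math.sin(x), math.cos(x)]          # y_0 = y, y_1 = y'
derivs = system_rhs(coeffs, r, x, ys)    # should equal [y', y'']
```

For y = sin x the two returned values should match cos x and -sin x, which is what the original second order equation demands.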

The task before us is to solve the system of simultaneous first order equations. If you think we're going to use matrices, you're right.

Build an n×n matrix P that is zero everywhere, except for the following entries. The n-1 entries just above the main diagonal are set to -1. The bottom row contains the coefficients on the derivatives of the original differential equation, starting with the coefficient on y at the lower left, and ending with the coefficient on the (n-1)^st derivative of y at the lower right. Remember, these coefficients are functions of x, hence the bottom row of P contains functions of x, and P is a matrix of functions.

Let y be the column vector [y₀,y₁,y₂…y_{n-1}].

Let y′ be the term by term derivative of y. Thus y′ is a column vector [y₀′,y₁′,y₂′…y_{n-1}′].

Consider the expression y′ + Py, where Py is evaluated using matrix multiplication. The result is a column vector; let's take it from the top.

Take the dot product of y and the top row of P to get -y₁. This is added to the first entry in y′, which is y₀′. The result, y₀′-y₁, is equal to 0.

Move on to the next entry. Take the dot product of y and the second row of P to get -y₂. Add this to y₁′, and again, the sum is 0. This continues through the penultimate row of P. In other words, the first n-1 entries in y′+Py are zero.

To compute the last entry, take the dot product of y with the bottom row of P, then add the derivative of y_{n-1}. The result mirrors the left hand side of our original differential equation, and the answer has to be r(x), the right hand side of that equation.

Let r be the column vector that is all zeros except for r(x) at the bottom. Verify that our system of n equations has a solution iff we can find a column vector y satisfying y′ + Py = r. The system of simultaneous equations is the same, we're just writing it using concise matrix notation.
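As a sanity check, the matrix P can be built and tested numerically. The sketch below uses an equation picked only for illustration, y′′ - xy′ - y = 0, which has the solution y = E^(x²/2). The helper names build_P and mat_vec are likewise assumptions of this example.

```python
import math

def build_P(coeffs, x):
    """Evaluate the n x n matrix P at the point x.

    coeffs[i] is the coefficient function on the i-th derivative of y.
    """
    n = len(coeffs)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        P[i][i + 1] = -1.0               # entries just above the diagonal
    P[n - 1] = [c(x) for c in coeffs]    # bottom row holds the coefficients
    return P

def mat_vec(M, v):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Illustrative equation: y'' - x y' - y = 0, solved by y = exp(x^2 / 2).
coeffs = [lambda x: -1.0, lambda x: -x]
x = 0.5
e = math.exp(x * x / 2)
y = [e, x * e]                    # [y, y']
dy = [x * e, (1 + x * x) * e]     # [y', y'']
P = build_P(coeffs, x)
residual = [d + p for d, p in zip(dy, mat_vec(P, y))]   # y' + Py
```

Since r is zero for this equation, every entry of residual should vanish, up to rounding.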

At this point I'm going to skate quickly across an important branch of mathematics that deals with matrix functions and their derivatives. You can read all the details, or you can simply follow along as I hit the highlights.

Consider matrices of functions, similar to P. Matrices are added and multiplied in the usual way. They can be inverted as well, although the function that implements the determinant of a nonsingular matrix may fall to zero for certain values of x, and these values represent singularities in the inverse matrix. However, as long as the determinant is not the zero function, the matrix is invertible.

Finally, a matrix can be differentiated or integrated term by term. If u and v are two such matrices, the derivative of the sum is the sum of the derivatives. You can probably prove this in your head, but less obvious is the fact that the derivative of uv is u′v+uv′ (just like calculus). Consider one entry in the product uv, and differentiate it. This is a sum of products, so differentiate each product in turn, using the product rule that we know and love. Then rearrange the terms into two sets. The first set contains the terms where entries of u are differentiated, and the second set contains terms where entries of v are differentiated. These sets contain, respectively, terms from u′v, and terms from uv′. This holds across the entire matrix, hence the product rule is valid for matrices.

Don't assume that the derivative of u² is 2uu′. Multiplication is not commutative, and uu′ may not be the same as u′u. Build a 2x2 matrix with zeros on the diagonal, x in the upper right, and 1 in the lower left. Note that u′u has 1 in the upper left corner, while uu′ has 1 in the lower right.
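The 2x2 example is easy to check by machine. This sketch evaluates u and u′ at an arbitrary sample point (x = 3, chosen for illustration) and multiplies them in both orders; mat_mul is a helper written for this example.

```python
def mat_mul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = 3                      # any sample point will do
u = [[0, x], [1, 0]]       # zeros on the diagonal, x upper right, 1 lower left
du = [[0, 1], [0, 0]]      # term by term derivative of u

left = mat_mul(du, u)      # u'u
right = mat_mul(u, du)     # uu'

# Product rule: (u^2)' = u'u + uu'.  Here u^2 = [[x, 0], [0, x]],
# so its derivative is the identity matrix.
square_deriv = [[left[i][j] + right[i][j] for j in range(2)] for i in range(2)]
```

The two products differ, yet their sum is the identity matrix, which is indeed the derivative of u². By contrast, 2uu′ would have a 2 in the lower right and zeros elsewhere.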

If u is a differentiable matrix, and v is its inverse matrix, v can be written as sums of products of entries in u, over the determinant of u. Since the determinant is nonzero, each entry in v is a differentiable function, and v is differentiable. Write uv = 1 and differentiate, giving uv′ = -u′v, or v′ = -vu′v.
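The formula v′ = -vu′v can be checked on a small case. The matrices below are an illustrative choice, not taken from the text: u has 1 on the diagonal and x in the upper right, so its inverse v has -x in the upper right.

```python
def mat_mul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

x = 4                        # arbitrary sample point
u = [[1, x], [0, 1]]
du = [[0, 1], [0, 0]]        # derivative of u
v = [[1, -x], [0, 1]]        # inverse of u
dv = [[0, -1], [0, 0]]       # derivative of v, computed term by term

product = mat_mul(u, v)      # should be the identity
formula = [[-entry for entry in row]
           for row in mat_mul(mat_mul(v, du), v)]   # -v u' v
```

Here formula agrees with dv, as the identity predicts.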

Next, consider sequences and series of matrices. Since matrices are added term by term, this is just n² little series running in parallel. If they all converge, the matrix series is convergent. If even one of them diverges, the entire series diverges. And remember, we're talking about series of functions. If each of the n² component series is uniformly convergent, then the matrix series is uniformly convergent. If each component series is integrable, or differentiable, i.e. we can interchange integration and summation, then the same holds true for the matrix series.

If U is a matrix of functions, let E^U be the Taylor series expansion for E^x, evaluated at U.

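A common way to show convergence is to diagonalize the matrix U. If t is any Taylor series, show that t(U) = bt(D)/b, where D is diagonal, and bD/b = U. Now the Taylor series can be applied to the components of D, term by term. Not every matrix can be diagonalized, but a direct bound covers the general case: if every entry of U is at most M in absolute value, then every entry of U^k is at most (nM)^k in absolute value, so each component series is dominated by the series for E^(nM). Thus E^U converges everywhere.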
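To watch the series converge, here is a minimal sketch that sums the first few dozen terms of the Taylor series for E^U. The function mat_exp and the cutoff of 30 terms are assumptions of this example, and the sample matrix was chosen because its exponential is known in closed form: it is a rotation through the angle t.

```python
import math

def mat_mul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(U, terms=30):
    """Partial sum of 1 + U + U^2/2! + ... with the given number of terms."""
    n = len(U)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term U^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, U)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

t = 1.2
U = [[0.0, -t], [t, 0.0]]
E = mat_exp(U)     # expected: [[cos t, -sin t], [sin t, cos t]]
```

A fixed number of terms is good enough for small matrices with modest entries. A production routine would use a more careful algorithm.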

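It can be shown that E^U can be differentiated and integrated (provided the same holds for U), and that E^U always produces an invertible matrix, whose inverse is E^-U. When U commutes with U′, the derivative of E^U is E^UU′. Without that assumption the formula can fail, for the same reason that the derivative of u² need not be 2uu′. You can see the proof here.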

Review the procedure for solving a first order equation, and apply it to this situation. We are trying to solve y′+Py = r, which looks like a first order linear differential equation, except the variables are matrices of functions, rather than individual functions. Still, the same reasoning applies.

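Let S be the matrix produced by integrating -P from 0 to x and taking the exponential of the result. Let y = Sz, as we did before. Here z is a column vector of functions, just like y. Multiply S by z, using matrix multiplication, to get y. There is always such a z, because S is invertible. Then apply the product rule to (Sz)′, using S′ = -PS. (This formula for S′ relies on P commuting with its integral, which holds, for instance, when the coefficients are constants. In general the argument needs an invertible matrix S satisfying S′ = -PS that equals the identity at 0.) This leads to the same equations we saw before: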

(Sz)′ + PSz = r

S′z + Sz′ + PSz = r

Sz′ - PSz + PSz = r

z′ = S⁻¹r

z′ = E^(∫P) r

When we solve for z, a constant of integration appears. In this case the constant c is a vector of length n. Take z to be the integral of E^(∫P) r from 0 to x, so that z(0) = 0. Thus y = S(z+c). At x = 0, the integral of -P is 0, and E to this power is 1, hence S, at x = 0, is the identity matrix, and y(0) = c. Therefore the first entry of c is y(0) (in the original equation), or y₀(0) in our system of first order equations. The second entry has to be y′(0), or y₁(0). The third entry is y′′(0), or y₂(0), and so on. Thus the function y is completely determined. It exists, and it is unique, and that completes the proof.
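The whole procedure can be traced on a small case. The sketch below uses the constant coefficient equation y′′ + y = 0 with y(0) = 0 and y′(0) = 1, chosen only for illustration because its solution, sin x, is known. The names mat_exp and solve are assumptions of this example. Since r = 0, z is the zero vector and y is simply S times c.

```python
import math

def mat_mul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(U, terms=40):
    """Partial sum of the Taylor series 1 + U + U^2/2! + ..."""
    n = len(U)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, U)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

def solve(x, c):
    """Return [y(x), y'(x)] for y'' + y = 0 with initial vector c."""
    # Here P = [[0, -1], [1, 0]], so the integral of -P from 0 to x is:
    minus_int_P = [[0.0, x], [-x, 0.0]]
    S = mat_exp(minus_int_P)
    # r = 0, hence z = 0 and y = S(z + c) = Sc.
    return [sum(S[i][j] * c[j] for j in range(2)) for i in range(2)]

c = [0.0, 1.0]              # y(0) = 0, y'(0) = 1
y0, y1 = solve(0.9, c)      # expect sin(0.9) and cos(0.9)
```

At x = 0 the matrix S is the identity, so solve(0.0, c) hands back the initial conditions unchanged, just as in the argument above.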