Multivariable Calculus, Local Maximum and Minimum

Local Maximum and Minimum

If f is a function of several variables, where are the local maxima and minima? We already answered this question in one variable. The first derivative must be 0, and sometimes the second derivative can be pressed into service to distinguish between a minimum, maximum, and point of inflection. As we shall see, similar results hold in multiple dimensions.

For convenience, suppose the point in question is the origin, and move away from it in the direction of v. The speed doesn't matter; we are moving along the line determined by v, so we can assume v is a unit vector. The composition f(vt) is a function of the single variable t, the height of the surface as you move in the direction of v. If f has a local minimum or maximum at the origin, then f(vt) has the same at t = 0 for every direction v, and its derivative there must be 0. Apply the chain rule, and ∇f.v has to equal 0. This holds for every v, and in particular, it holds for the coordinate vectors in n space. These in turn extract the partial derivatives. Therefore all partial derivatives must equal 0. This is a necessary (though not sufficient) condition for a local minimum or maximum.
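The necessary condition can be checked numerically. The following is a minimal sketch, not a definitive method: it approximates each partial derivative with a central difference and compares two sample functions at the origin. The names gradient, bowl, and tilted, and the step size h, are illustrative choices that do not come from the text.

```python
def gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # step a little way along coordinate i
        backward[i] -= h  # and the same distance the other way
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad


def bowl(p):
    # x^2 + y^2 has a local minimum at the origin.
    x, y = p
    return x * x + y * y


def tilted(p):
    # Adding 3x tilts the surface, so the origin is no longer an extremum.
    x, y = p
    return x * x + y * y + 3 * x


print(gradient(bowl, [0.0, 0.0]))    # both partials vanish
print(gradient(tilted, [0.0, 0.0]))  # first partial is close to 3
```

Because the partial with respect to x is nonzero for the tilted surface, the origin cannot be a local minimum or maximum of it, whatever the second derivatives say.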

If all partials are 0 then ∇f = 0, and ∇f.v = 0, and the surface is level in every direction. If you are an ant, sufficiently small, the surface looks like the plains of Nebraska. Of course it may be curving away, as the earth does; you just can't see it.

Saddle Point

If the partials are 0, yet the surface is not constant, and the point is not a maximum or minimum, the point is called a saddle point. This is similar to an inflection point in one dimension, though there are more possibilities. Certainly z = x³ qualifies, with its point of inflection at the origin, since z falls on one side and rises on the other as x varies, but this doesn't really look like a saddle. A more typical example is z = x²-y². Again, the partials are 0 at the origin. Along the x axis, you have a parabola opening up, and along the y axis you have a parabola opening down. Put your legs over this downward parabola and you are sitting in a saddle, whence the origin is a saddle point. (If you don't know anything about horses or saddles, picture a Pringles potato chip.)
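To see the two parabolas of z = x²-y² concretely, here is a small sketch that samples the surface along each axis. The function name saddle and the sample points in steps are illustrative assumptions, not part of the text.

```python
def saddle(x, y):
    # The surface z = x^2 - y^2 from the example above.
    return x * x - y * y


steps = [-0.2, -0.1, 0.0, 0.1, 0.2]

# Heights along the x axis (y = 0): a parabola opening up.
along_x = [saddle(t, 0.0) for t in steps]

# Heights along the y axis (x = 0): a parabola opening down.
along_y = [saddle(0.0, t) for t in steps]

print(along_x)
print(along_y)
```

Along the x axis the origin is the lowest sample, and along the y axis it is the highest, so the origin is neither a local minimum nor a local maximum of the surface.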

Second Derivative

To find the second derivative of f(vt), look again at the first derivative. The chain rule gives us a linear combination of partials, using the components of v as coefficients. Each partial is itself a function of n variables, with its own gradient. Thus, one of the terms of the first derivative might look like v₁f₁(vt), where f₁ is the first partial of f. Apply the chain rule again and get v₁×∇f₁.v. Do the same for v₂f₂, v₃f₃, and so on through vₙfₙ, and add the resulting expressions together. Here is a concise formulation.

Let M be an n×n matrix whose entries are the second partials of f. In other words, the entry in row i, column j, is the second partial of f with respect to x_i and then x_j. This is called the Hessian matrix of f. Write the expression v*M*v, where v is a row vector on the left and a column vector on the right, and * is matrix multiplication. The result is the expression derived above, i.e. the second derivative of f(vt). Evaluate M at the origin to find the second derivative, at t = 0, of the path through the origin in the direction of v.
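The claim that v*M*v equals the second derivative of f(vt) can be checked numerically. The sketch below is only an illustration under stated assumptions: the sample function f(x, y) = x² + 3xy + 2y², the unit vector v = (0.6, 0.8), the helper names, and the finite-difference step h are all hypothetical choices, not taken from the text.

```python
def hessian(f, point, h=1e-4):
    """Approximate the matrix of second partials of f at point."""
    n = len(point)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Evaluate f after nudging coordinates i and j.
                p = list(point)
                p[i] += di
                p[j] += dj
                return f(p)
            M[i][j] = (shifted(h, h) - shifted(h, -h)
                       - shifted(-h, h) + shifted(-h, -h)) / (4 * h * h)
    return M


def quadratic_form(M, v):
    """Compute v*M*v: row vector times matrix times column vector."""
    n = len(v)
    return sum(v[i] * M[i][j] * v[j] for i in range(n) for j in range(n))


def second_derivative_along(f, v, h=1e-4):
    """Approximate the second derivative of t -> f(vt) at t = 0."""
    def g(t):
        return f([t * c for c in v])
    return (g(h) - 2 * g(0.0) + g(-h)) / (h * h)


def f(p):
    # Sample surface; its exact Hessian is [[2, 3], [3, 4]].
    x, y = p
    return x * x + 3 * x * y + 2 * y * y


v = [0.6, 0.8]  # a unit vector
M = hessian(f, [0.0, 0.0])
print(quadratic_form(M, v))           # approximately 6.16
print(second_derivative_along(f, v))  # approximately 6.16
```

By hand, f(vt) = 3.08t² for this v, so the second derivative is 6.16, which agrees with 2(0.36) + 2(3)(0.48) + 4(0.64) from the quadratic form.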

Taking advantage of the theorems in one dimension, a positive second derivative implies a local minimum along the line through v, and a negative second derivative implies a local maximum along that line. Yet v could be any unit vector, so how can we evaluate the sign of v*M*v for all of them at once?

Assume the mixed partials exist near the origin and are continuous at the origin; then the mixed partials are equal, i.e. the order of differentiation does not matter. This makes M a symmetric matrix.

At this point I refer you to the theory of quadratic forms, which says v*M*v is positive for every nonzero v iff all the eigenvalues of M are positive, and v*M*v is negative for every nonzero v iff all the eigenvalues of M are negative. This provides a minimum/maximum test similar to the concavity test in one variable. An extremal point has to have first partials equal to 0. If the matrix of second partials has only positive eigenvalues, the point is a local minimum. If the matrix of second partials has only negative eigenvalues, the point is a local maximum. If some eigenvalues are positive and some are negative, we have a saddle point. If some eigenvalues are 0 and the rest all share one sign, or if all eigenvalues are 0, we don't know. One can employ higher order derivative tests, just as we did in one dimension, but that gets very complicated very quickly.
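For two variables the eigenvalue test can be carried out in closed form. The following is a minimal sketch, assuming a symmetric 2×2 matrix with rows (a, b) and (b, c) and that the first partials already vanish at the point. The function names and the tolerance tol are illustrative. It also assumes the usual convention that a zero eigenvalue with no mix of signs leaves the test inconclusive.

```python
import math


def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], smaller first."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean - radius, mean + radius


def classify(a, b, c, tol=1e-12):
    """Classify a critical point from its matrix of second partials."""
    low, high = eigenvalues_2x2(a, b, c)
    if low > tol:
        return "local minimum"   # all eigenvalues positive
    if high < -tol:
        return "local maximum"   # all eigenvalues negative
    if low < -tol and high > tol:
        return "saddle point"    # eigenvalues of both signs
    return "inconclusive"        # a zero eigenvalue and no sign mix


print(classify(2, 0, 2))    # x^2 + y^2 at the origin
print(classify(-2, 0, -2))  # -(x^2 + y^2) at the origin
print(classify(2, 0, -2))   # x^2 - y^2 at the origin
print(classify(0, 0, 0))    # x^4 + y^4 at the origin: second partials all 0
```

The last case shows why the test can fail to decide: x⁴ + y⁴ does have a minimum at the origin, but its second partials there are all 0, so the eigenvalues carry no information.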