Quadratic Forms, Rotation

Rotation

Ok, we've talked about quadratic curves and surfaces in great detail; but why is all this stuff under linear algebra? Except for an occasional light ray bouncing off a mirror, I haven't seen a straight line yet.

Quadratic surfaces are here, under linear algebra, because the complete analysis of an arbitrary quadratic form involves matrices, eigen vectors, and eigen values. These are used to rotate the surface into position, so that the axes of the surface line up with the coordinate axes. Turn your head, and suddenly the conic section looks familiar. From there you can quickly determine if it is an ellipse, parabola, or hyperbola.

The equation xy = 1 is a classic example. It's a hyperbola, but how do you know that? It doesn't look like ax²-by² = 1.

Well - you can transform the equation into standard form by turning your head 45 degrees. Replace x with u+v, and y with u-v, and get u²-v² = 1, which is obviously a hyperbola.

Now this isn't precisely fair, because the change of basis shrinks the image. Ideally, the linear transformation should rotate the plane without stretching, shrinking, or distorting the shape in any way. We want a rigid rotation, as implemented by an orthonormal matrix. No problem; replace x with (u+v)/sqrt(2), and y with (u-v)/sqrt(2). The new equation is u²-v² = 2. This is exactly the same curve as xy = 1, but we're looking at it through a new coordinate system, defined by the u and v axes, which are oriented 45 degrees to the x and y axes. (Strictly speaking, this particular substitution has determinant -1, so it also flips the plane over, but it is still rigid, and that is what matters here.) Note that the hyperbola is sqrt(2) units away from the origin at its closest point. This is true in the uv plane or the xy plane. And it ought to be, because the linear transformation moves the plane rigidly, with no stretching or shrinking in any direction.
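Here is a quick numerical check of the rigid substitution, as a minimal Python sketch. The function name to_xy and the sample values of v are illustrative choices, not part of the original discussion. Points on u²-v² = 2 should land on xy = 1, and their distance from the origin should not change.

```python
import math

def to_xy(u, v):
    """Map uv coordinates to xy using x = (u+v)/sqrt(2), y = (u-v)/sqrt(2)."""
    r = math.sqrt(2)
    return (u + v) / r, (u - v) / r

# Sample points on the right-hand branch of u^2 - v^2 = 2.
for v in [0.0, 0.5, 1.0, 3.0]:
    u = math.sqrt(2 + v * v)
    x, y = to_xy(u, v)
    assert math.isclose(x * y, 1.0)                          # lands on xy = 1
    assert math.isclose(math.hypot(x, y), math.hypot(u, v))  # distance preserved

# The closest point to the origin, u = sqrt(2) and v = 0, maps to x = y = 1.
print(to_xy(math.sqrt(2), 0.0))
```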

Can this be done for any quadratic expression in two dimensions? In n dimensions? It can, but to prove it you must be familiar with the theory of similar matrices. That is a rather advanced topic in linear algebra. If you're not familiar with it, you may not understand the following proof. That's ok, as long as you can visualize the concept. Rotate the axes in just the right way, and the mixed terms go away, leaving a quadratic expression that is easy to analyze. Its terms are all squares, with perhaps one linear term and perhaps one constant.

Here is the proof in n dimensions. Start with a quadratic form in n variables. This is an equation of degree 2, i.e. at least one term has degree 2. Move the linear and constant terms to the side. We'll bring them back later. This leaves a series of squared and mixed terms in the variables x₁, x₂, x₃, …, xₙ. In the introduction, which was a ways back I know, we represented this expression using a fixed symmetric matrix M. The sum of squared and mixed terms is equal to xMx^T, where x is a row vector containing the n variables.

Since M is a symmetric matrix, there is a change of basis, a rigid rotation to be precise, that turns M into a diagonal matrix. Let Q be the orthonormal basis that implements the rotation in Rⁿ. Run the quadratic form through Q, and the matrix is diagonal, and all the mixed terms go away.
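The step "run the quadratic form through Q" can be written out as follows. This is a sketch under stated assumptions: the columns of Q are taken to be the orthonormal eigen vectors of M, u is the row vector of new coordinates, D is the resulting diagonal matrix, and λ₁ … λₙ are the eigen values. These symbols are introduced here for illustration.

```latex
\begin{align*}
x &= uQ^{T} \\
xMx^{T} &= uQ^{T}MQu^{T} = uDu^{T}
         = \lambda_1 u_1^2 + \lambda_2 u_2^2 + \cdots + \lambda_n u_n^2
\end{align*}
```

Since D = QᵀMQ is diagonal, only squared terms survive.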

Once the mixed terms are gone, bring the linear terms back in. These can be folded into the squared terms using translation. The result is a quadratic form consisting of squared terms, a constant, and at most one linear term. We've already analyzed these equations in 2 and 3 dimensions. They define the conic sections (2 dimensions) and the quadratic surfaces (3 dimensions).

Efficient numerical procedures establish the eigen values and eigen vectors of any symmetric matrix; once an eigen value is known, Gaussian elimination yields its eigen vectors. The normalized eigen vectors of M are the new coordinate axes that put our quadratic expression into standard form. Even if there are hundreds of dimensions, a computer can find the rotation that eliminates the mixed terms, leaving only the squared terms and at most one linear term.
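To make "a computer can find the rotation" concrete, here is a minimal sketch in plain Python. It uses the classical Jacobi rotation method, which is a different technique from the one mentioned above and is chosen only because it is short and works by repeated plane rotations. The function name jacobi_eigen and its parameters sweeps and tol are hypothetical.

```python
import math

def jacobi_eigen(M, sweeps=50, tol=1e-20):
    """Diagonalize a symmetric matrix M (list of lists) by Jacobi rotations.

    Returns (eigenvalues, Q); the columns of Q are orthonormal eigenvectors.
    """
    n = len(M)
    A = [row[:] for row in M]                                  # working copy
    Q = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(sweeps):
        # Stop when the mixed (off-diagonal) terms are negligible.
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                # Angle that zeroes A[p][q]: tan(2*theta) = 2*A[p][q]/(A[q][q]-A[p][p])
                diff = A[q][q] - A[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * A[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                 # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):                 # A <- J^T A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
                for k in range(n):                 # Q <- Q J (accumulate rotation)
                    qkp, qkq = Q[k][p], Q[k][q]
                    Q[k][p] = c * qkp - s * qkq
                    Q[k][q] = s * qkp + c * qkq
    return [A[i][i] for i in range(n)], Q

# Matrix of the form 21x^2 + 6xy + 13y^2 (mixed coefficient halved off the diagonal).
vals, Q = jacobi_eigen([[21.0, 3.0], [3.0, 13.0]])
print(vals)
```

For this 2 by 2 matrix the printed values should be close to 22 and 12. A production program would more likely call a tested numerical library than hand-roll the loop.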

Let's follow this through in 2 dimensions. Here is a general formula for a conic section.

ax² + by² + cxy + dx + ey + f = 0

Assume a, b, or c is nonzero, else we simply have a line in the plane.

The first task is to build the symmetric matrix M. It is presented below. Note that the row vector (x,y) times M times the column vector (x,y) produces ax² + by² + cxy.

M = (a   | c/2)
    (c/2 | b  )

Subtract s from the main diagonal and the characteristic polynomial becomes s²-(a+b)s+ab-c²/4. The discriminant h is (a-b)²+c², which is never negative. There are always real eigen values. The eigen values are (a+b±sqrt(h))/2. We always have two different eigen values, unless a = b and c = 0.
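The eigen value formula is easy to try out. The following is a small Python sketch; the function name conic_eigenvalues is made up for illustration.

```python
import math

def conic_eigenvalues(a, b, c):
    """Eigenvalues of M = [[a, c/2], [c/2, b]], smaller one first."""
    h = (a - b) ** 2 + c ** 2      # discriminant, never negative
    root = math.sqrt(h)
    return (a + b - root) / 2, (a + b + root) / 2

print(conic_eigenvalues(21, 13, 6))   # (12.0, 22.0): same sign, as for an ellipse
print(conic_eigenvalues(0, 0, 1))     # (-0.5, 0.5): opposite signs, as for xy = 1
```

The two eigen values coincide only when a = b and c = 0, for instance conic_eigenvalues(5, 5, 0).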

As a sanity check, set c = 0. Thus h = (a-b)², its square root is |a-b|, and the eigen values are a and b, as we would expect from a diagonal matrix. The eigen vectors are 1,0 and 0,1, and when we normalize the matrix, we essentially replace x with u and y with v, and nothing changes. Nor should it; the expression is already in standard form.

We could find the eigen values s using the quadratic formula, solve for the eigen vectors, invert the matrix to represent x and y in terms of the new coordinates, and substitute, but that's a lot of algebra. Let's cut to the chase. If the new coordinates are u and v, x and y will be perpendicular vectors in the uv plane. Set x = u+gv and y = -gu+v, for some number g yet to be determined. After substitution the mixed terms should disappear. This implies the following equation.

-cg² + 2(a-b)g + c = 0

-½cg² + (a-b)g + ½c = 0

h = sqrt((a-b)² + c²) (from here on, h is the square root of the discriminant we saw before)

g = (a-b ±h) over c (quadratic formula)
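For readers who want the intermediate step, the expansion behind the equation for g can be written out as below. It uses only the substitution x = u+gv, y = -gu+v and the quadratic part ax² + by² + cxy.

```latex
\begin{align*}
x^2 &= u^2 + 2g\,uv + g^2 v^2 \\
y^2 &= g^2 u^2 - 2g\,uv + v^2 \\
xy  &= -g\,u^2 + (1-g^2)\,uv + g\,v^2 \\
ax^2 + by^2 + cxy &= (bg^2 - cg + a)\,u^2 + (ag^2 + cg + b)\,v^2
    + \bigl(-cg^2 + 2(a-b)g + c\bigr)\,uv
\end{align*}
```

Setting the coefficient of uv to zero gives the quadratic in g, and the quadratic formula then gives the two values of g. Either root removes the mixed term.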

So we have g, and x and y are now functions of u and v. Of course we would like x and y to be unit vectors. Let s be the length of u+gv, that is, s is the square root of g²+1.

s² = g²+1 = 2((a-b)² + (a-b)h + c²) over c²

Now replace x with (u+gv)/s and y with (-gu+v)/s in the general conic equation. We know the terms that contain uv will drop out. Here is what remains.

(bg²-cg+a)/s² u² +
(ag²+cg+b)/s² v² +
(d-eg)/s u + (e+dg)/s v + f = 0
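The substitution can be packaged as a short routine. This is a sketch, assuming c is nonzero and taking the + root for g. The name rotate_conic, the sample coefficients, and the test point are all illustrative. The check at the end evaluates the original expression and the rotated one at the same point.

```python
import math

def rotate_conic(a, b, c, d, e, f):
    """Return (A, B, D, E, F, g, s) so that the conic becomes
    A u^2 + B v^2 + D u + E v + F = 0 under
    x = (u + g v)/s, y = (-g u + v)/s.  Assumes c != 0.
    """
    h = math.sqrt((a - b) ** 2 + c ** 2)   # square root of the discriminant
    g = (a - b + h) / c                    # the '+' root of the quadratic in g
    s = math.sqrt(g * g + 1)               # length used to normalize
    A = (b * g * g - c * g + a) / (s * s)
    B = (a * g * g + c * g + b) / (s * s)
    D = (d - e * g) / s
    E = (e + d * g) / s
    return A, B, D, E, f, g, s

coeffs = (2.0, -1.0, 3.0, 4.0, -5.0, 6.0)   # arbitrary illustrative values
a, b, c, d, e, f = coeffs
A, B, D, E, F, g, s = rotate_conic(*coeffs)

u, v = 0.7, -1.3                             # arbitrary test point
x, y = (u + g * v) / s, (-g * u + v) / s
original = a * x * x + b * y * y + c * x * y + d * x + e * y + f
rotated = A * u * u + B * v * v + D * u + E * v + F
print(original, rotated)
```

The two printed numbers should agree up to rounding, which confirms that no uv term is needed in the rotated expression.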

To illustrate, consider this conic, which is, at the start, rather difficult to predict.

21x² + 6xy + 13y² - 114x + 34y + 73 = 0

Follow along as we apply the earlier formulas.

a = 21, b = 13, c = 6, d = -114, e = 34, f = 73

h = 10 (square root of discriminant)

g = (21-13+10)/6 = 3

s = sqrt(10)

12u² + 22v² - 216u/s - 308v/s + 73 = 0

12(u-9/s)² + 22(v-7/s)² = 132
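The last step, completing the square, can be sketched in Python as follows. The function name complete_square is hypothetical, and the routine assumes both squared coefficients are nonzero.

```python
import math

def complete_square(A, B, D, E, F):
    """Rewrite A u^2 + B v^2 + D u + E v + F = 0 as
    A (u - u0)^2 + B (v - v0)^2 = R.  Assumes A and B are nonzero."""
    u0 = -D / (2 * A)
    v0 = -E / (2 * B)
    R = D * D / (4 * A) + E * E / (4 * B) - F
    return u0, v0, R

s = math.sqrt(10)
u0, v0, R = complete_square(12, 22, -216 / s, -308 / s, 73)
print(u0 * s, v0 * s, R)   # close to 9, 7, 132
```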

Now we know a lot about our conic section. The coefficients on the squared terms are positive, and the constant is also positive, so it's an ellipse.

The semimajor axis is the square root of 132/12, or about 3.32. The semiminor axis is the square root of 132/22, or about 2.45. The ellipse is about 35% longer than it is wide.

We know where the center of the ellipse is in the uv plane, but what about the xy plane? Remember that x = (u+3v)/s and y = (-3u+v)/s. Substitute u = 9/s and v = 7/s and get x = 3 and y = -2.

The major axis runs along the u coordinate, u = 1 and v = 0. Substitute to get the vector 1,-3 in the xy plane. The ellipse is tilted downward at a steep angle. You can draw the major axis by starting at the center (3,-2) and drawing a segment of length sqrt(11) down and to the right, having slope -3. Verify that the end of this segment, at x = 4.0488 and y = -5.1464, satisfies the original equation. Another segment of the same length is drawn up and to the left to complete the major axis. Draw perpendicular segments of length sqrt(6) for the minor axis.
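The suggested verification is easy to do by machine. Here is a small Python sketch; the names conic, major, minor, and endpoints are illustrative. It computes the four ends of the axes from the center, the directions 1,-3 and 3,1, and the lengths sqrt(11) and sqrt(6), then plugs each into the original equation.

```python
import math

def conic(x, y):
    """Left side of 21x^2 + 6xy + 13y^2 - 114x + 34y + 73 = 0."""
    return 21 * x * x + 6 * x * y + 13 * y * y - 114 * x + 34 * y + 73

cx, cy = 3.0, -2.0                  # center of the ellipse
s = math.sqrt(10)
major = (1 / s, -3 / s)             # unit vector along the u axis
minor = (3 / s, 1 / s)              # unit vector along the v axis
semi_major, semi_minor = math.sqrt(11), math.sqrt(6)

# Two ends of the major axis, then two ends of the minor axis.
endpoints = [
    (cx + t * length * dx, cy + t * length * dy)
    for (dx, dy), length in ((major, semi_major), (minor, semi_minor))
    for t in (1, -1)
]
for x, y in endpoints:
    # Each endpoint should satisfy the equation up to rounding error.
    print(round(x, 4), round(y, 4), abs(conic(x, y)) < 1e-9)
```

The first line printed corresponds to the lower right end of the major axis.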

We now have a clear picture of the shape, size, location, and orientation of our conic section. Obviously all this can, and should, be automated by a computer.