Metric Spaces, Cauchy Schwarz Inequality, Triangular Inequality

Cauchy Schwarz Inequality, Triangular Inequality

In the last section we proved the triangular inequality for n dimensional space, but that was rather indirect. A finite product of arbitrary metric spaces is again a metric space, and when this general result is applied to the coordinate axes you get n dimensional space with its Euclidean metric. Hence the triangular inequality is valid in Rn. But that proof is hard to follow without a substantial background in point-set topology, and it doesn't address the Cauchy Schwarz inequality. In this section we will prove the triangular inequality and the Cauchy Schwarz inequality using pure algebra.

Remember that the triangular inequality says that one side of a triangle is never longer than the sum of the other two sides. Well, this is obvious really. Try drawing a triangle whose sides are 2, 3, and 17. But sometimes we need to prove the obvious, because sometimes the obvious isn't true in higher dimensions, or in spaces that are substantially different from the world around us. An electron obviously can't pass through two slits at once, but it does. So let's prove the triangular inequality in n dimensions. You'll sleep better at night.

Follow Your Nose

I'd like to begin with a simple, follow-your-nose proof, where you assume the triangular inequality, do some algebra, and run into something that has to be true; hence the original inequality is true. I'll also restrict attention to the xy plane, where the algebra is simpler. Keep in mind, however, that this proof does not generalize to other spaces, such as the continuous functions on a closed interval. (We'll deal with those spaces later.)

Although this proof is restricted to the xy plane, it is actually more powerful than it first appears. Start with any triangle in 3 space, or in n space for that matter, and move it down into the xy plane. You can always move the triangle, as a rigid body, without changing the lengths of its sides. So if the triangular inequality holds in the xy plane, it applies to the same triangle floating in 3 space, or n space. This simple assertion masks a great deal of linear algebra, but the intuition is so compelling, you almost don't need to prove it. We'll just assume you can move the triangle down into the xy plane, or if you prefer, relabel the coordinates of n space so that the (new) xy plane contains the triangle. Prove the theorem in R2, and you have proved it for Rn as well.

Given any triangle in the xy plane, move the triangle, or relabel the coordinates, so that one of the three corners is at the origin, another is at (a,0) on the positive x axis (so a > 0), and the third is at (b,c). Now the sides have lengths:

a
sqrt(b^2 + c^2)
sqrt((a-b)^2 + c^2)

We want to show that the first distance plus the second is at least as large as the third (and usually larger).

a + sqrt(b^2 + c^2) ≥ sqrt((a-b)^2 + c^2)

Both sides are lengths, hence nonnegative, and for nonnegative numbers, as a number gets larger its square gets larger. So we can square both sides without disturbing the inequality.

a^2 + b^2 + c^2 + 2a×sqrt(b^2 + c^2) ≥ a^2 + b^2 + c^2 - 2ab

Cancel a^2 + b^2 + c^2 from both sides and divide through by 2a (remember a is positive). This simplifies to:

sqrt(b^2 + c^2) ≥ -b

The left side is nonnegative. If b is nonnegative we're done. So let b be negative. Now the right side is the absolute value of b. The left side would be the same, if c were 0. If c is nonzero then b^2 + c^2 is larger, and its square root is larger than |b|. The inequality holds.

It is a precise equality when c is 0 and b is negative (or zero). This makes sense; just draw the triangle. It's flat, and the sum of two of the sides equals the third.
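
If you want to see the inequality in action before we discuss reversibility, here is a quick numeric check in Python. The coordinates are random sample values, not anything dictated by the proof; the last lines exercise the flat triangle where the bound becomes an equality.

import math, random

def sides(a, b, c):
    # triangle with corners (0,0), (a,0), and (b,c)
    s1 = a                                # from (0,0) to (a,0)
    s2 = math.sqrt(b*b + c*c)             # from (0,0) to (b,c)
    s3 = math.sqrt((a-b)**2 + c*c)        # from (a,0) to (b,c)
    return s1, s2, s3

for _ in range(10000):
    a = random.uniform(0.1, 10.0)         # a is a positive length
    b = random.uniform(-10.0, 10.0)
    c = random.uniform(-10.0, 10.0)
    s1, s2, s3 = sides(a, b, c)
    assert s1 + s2 >= s3 - 1e-12          # the triangular inequality

s1, s2, s3 = sides(3.0, -4.0, 0.0)        # b negative, c zero: a flat triangle
print(s1 + s2, s3)                        # both are 7.0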

The thing about a follow-your-nose proof is, you start by assuming what you want to prove, then run into something that has to be true. This is ok, as long as you know all the steps are reversible, so that the statement that is always true implies what you want to prove. In this case everything is reversible; the steps are all iff, and we're ok.

Here's a follow-your-nose proof that doesn't work. Start with 3 = -3, something we want to prove. Square both sides, and sure enough, 9 = 9, which is always true. So I guess 3 = -3, right? Wrong, because the steps cannot be reversed. But we can reverse the steps in the triangular inequality proof, because I knew the distances were all positive from the outset. It's ok to take the square root of both sides of the second inequality, to get back to the triangular inequality. We'll see this again in the next section, when we connect the triangular inequality with Cauchy Schwarz. All steps are reversible, and either implies the other.

Cauchy Schwarz Inequality

The Cauchy Schwarz inequality is equivalent to the triangular inequality, at least in n space. But Cauchy Schwarz generalizes to other spaces, such as complex continuous functions on closed intervals. That's why advanced textbooks often prove Cauchy Schwarz first, then extend this result to various abstract spaces, then bring the triangular inequality back in, almost as a corollary. Let's give it a whirl.

Let a1 a2 … an and b1 b2 … bn be two vectors in n space. Let p be the sum of ai^2, let q be the sum of ai×bi, and let r be the sum of bi^2. Using the notation of dot products, p = a.a, q = a.b, and r = b.b. We will show that q^2 ≤ p×r. This is the Cauchy Schwarz inequality.
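
In code, p, q, and r are nothing more than dot products. Here is a minimal sketch in Python; the two vectors are arbitrary sample values, not anything special.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a = [1.0, 2.0, -3.0, 0.5]
b = [4.0, -1.0, 2.0, 2.5]

p = dot(a, a)        # sum of ai^2
q = dot(a, b)        # sum of ai*bi
r = dot(b, b)        # sum of bi^2

print(q*q <= p*r)    # True, as Cauchy Schwarz promises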

Let x be any real number. Consider the sum of (ai×x+bi)^2. Since this is a sum of squares, it is 0 only if each term is 0, that is, only if bi = -x×ai for every i, making the vector b a multiple of the vector a. In every other case the sum is positive.

Write our nonnegative sum as px^2 + 2qx + r, where p, q, and r are defined as above. Note that p and r are sums of squares, and cannot be negative. If either p or r is 0, then a or b is the zero vector, q is 0, and we have q^2 = pr.

With p nonzero, the quadratic expression attains its minimum when x = -q/p. Substitute x = -q/p: the expression becomes q^2/p - 2q^2/p + r, that is, r - q^2/p, and it is still nonnegative. Multiply through by p, which is positive, to obtain pr - q^2 ≥ 0. This gives q^2 ≤ pr, which proves the Cauchy Schwarz inequality.
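
You can watch the quadratic do its job. The vectors below are again arbitrary samples; the point is that the value at x = -q/p equals r - q^2/p, and that value is never negative.

a = [2.0, -1.0, 3.0]
b = [0.5, 4.0, 1.0]
p = sum(ai*ai for ai in a)
q = sum(ai*bi for ai, bi in zip(a, b))
r = sum(bi*bi for bi in b)

x = -q / p                        # where the quadratic bottoms out
m = p*x*x + 2*q*x + r             # the minimum value, r - q*q/p
print(m >= 0)                     # True; multiply by p to recover q*q <= p*r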

As mentioned before, the inequality is strict, unless one vector is a scalar multiple of the other.

So what does this have to do with the triangular inequality? Let a and b be vectors in n space. We would like to show that the length of a+b is less than or equal to the length of a plus the length of b. In fact they are only equal if one vector is a nonnegative scalar multiple of the other. Square |a+b| ≤ |a| + |b|, and cancel the squared terms ai^2 and bi^2 from each side. The left side is twice the sum of ai×bi. This was 2q in our earlier notation. Continuing that notation, the right side becomes 2sqrt(pr). Square both sides again to get q^2 ≤ pr. Reverse these steps to show Cauchy Schwarz implies the triangular inequality in n space.
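
Here is the same bookkeeping in Python, with arbitrary sample vectors; the assertions mirror the chain of equalities and inequalities above.

import math

a = [1.0, -2.0, 0.5, 3.0]
b = [2.0, 1.0, -1.5, 0.5]

p = sum(x*x for x in a)
q = sum(x*y for x, y in zip(a, b))
r = sum(y*y for y in b)

lhs = math.sqrt(sum((x + y)**2 for x, y in zip(a, b)))   # |a+b|
rhs = math.sqrt(p) + math.sqrt(r)                        # |a| + |b|

assert abs(lhs*lhs - (p + 2*q + r)) < 1e-9   # |a+b|^2 = p + 2q + r
assert q*q <= p*r                            # Cauchy Schwarz
assert lhs <= rhs + 1e-12                    # hence the triangular inequality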

Verify that the set of absolutely convergent series is a real vector space. For any two such series a and b, we can build the product series with terms ai×bi. Once the terms of a and b drop below 1 in absolute value, each of them dominates the corresponding term of the product series, hence the product series converges absolutely. The same is true when the terms of a series are squared (the special case b = a).

To find the distance between two series, subtract them, giving another absolutely convergent series, square the terms, take the sum, and take the square root. The distance is 0 only if the two series are identical.

Extend the definitions of p, q, and r above. We are still adding ai^2, ai×bi, and bi^2, but this time the sums are infinite. We would like to compare q^2 and p×r.

Start with the sum of (ai×x+bi)^2 as before, but this time the sum is infinite. Rewrite the sum as px^2 + 2qx + r ≥ 0. This rearrangement is possible thanks to absolute convergence. From here the proof of Cauchy Schwarz, and the triangular inequality, is the same. Therefore the absolutely convergent series form a metric space.
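
A sketch for series, with sample terms ai = 1/2^i and bi = (-1/3)^i; truncating at a few hundred terms stands in for the infinite sums, which is an assumption of the sketch, not part of the proof.

import math

N = 400                                   # truncation point for the sketch
a = [0.5**i for i in range(1, N)]
b = [(-1.0/3.0)**i for i in range(1, N)]

p = sum(x*x for x in a)
q = sum(x*y for x, y in zip(a, b))
r = sum(y*y for y in b)

print(q*q <= p*r)                                          # Cauchy Schwarz
dist = math.sqrt(sum((x - y)**2 for x, y in zip(a, b)))    # distance between the series
print(dist)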

Next consider continuous functions on a closed interval. The product of two continuous functions is continuous, and hence integrable. If p, q, and r are the integrals of a^2, a×b, and b^2, we can prove Cauchy Schwarz. Divide the interval into n subintervals; Cauchy Schwarz holds for each Riemann sum, so the inequality must hold in the limit. Use continuity to show equality holds iff the functions a and b are scalar multiples of each other. Thus we have a metric space, where distance is given by the square root of the integral of (a-b)^2.
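
A Riemann-sum sketch on [0,1], using the sample functions a(x) = sin(x) and b(x) = x^2; the number of subintervals is arbitrary.

import math

n = 10000
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]        # midpoints of the subintervals

fa = [math.sin(x) for x in xs]
fb = [x*x for x in xs]

p = dx * sum(u*u for u in fa)                  # approximates the integral of a^2
q = dx * sum(u*v for u, v in zip(fa, fb))      # approximates the integral of a*b
r = dx * sum(v*v for v in fb)                  # approximates the integral of b^2

print(q*q <= p*r)                              # Cauchy Schwarz for the Riemann sums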

This theorem generalizes to piecewise continuous functions over a closed interval. We leave the functions undefined at the borders, where the continuous pieces touch. Thus two functions are equal across the entire interval iff their distance is 0.

Sometimes we can measure the distance between functions over an open or infinite interval, using improper integrals, but keep in mind, the product of two integrable functions need not be integrable. This is illustrated by 1/sqrt(x): it is integrable on (0,1], yet its product with itself, 1/x, is not.

Vectors in Complex Space

Let a and b be vectors in complex space, i.e. arrays of complex numbers. In this case the dot product, a.b, is the sum of (ai × bi conjugate). Thus a.a is a real number, in fact a nonnegative real number, and sqrt(a.a) is a good candidate for our distance function.

Realize that an array of n complex numbers is also an array of 2n real numbers; just take the real and imaginary parts separately. This provides a 1-1 correspondence between complex space and real space, the latter having twice as many dimensions as the former. The distance functions in the two spaces give exactly the same answer; both give the square root of the sum of the squares of the components. And subtraction (the difference between two points) also gives the same vector in both spaces. Since the triangular inequality holds in real space, it is valid in complex space.
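
The correspondence is easy to check in Python; the complex vector below is an arbitrary sample.

import math

a = [3+4j, 1-2j, -0.5+1j]

# length via the complex dot product a.a (conjugate the second factor)
len_complex = math.sqrt(sum((z * z.conjugate()).real for z in a))

# length of the corresponding real vector with twice as many components
flat = [t for z in a for t in (z.real, z.imag)]
len_real = math.sqrt(sum(t*t for t in flat))

print(len_complex, len_real)                  # the same number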

Moving from the triangular inequality back to Cauchy Schwarz is a bit trickier in the world of complex numbers. Let's review our notation. Remember that p = a.a, which is the sum of ai times ai conjugate. And r is still b.b. No problem there. Let q1 equal a.b, which is the sum of ai times bi conjugate. Let q2 be the conjugate of q1, that is, the sum of bi times ai conjugate. Now start again with the triangular inequality.

|a+b| ≤ |a| + |b|

Note that this is an equality iff one vector is a nonnegative real multiple of the other.

Square the triangular inequality and subtract a.a and b.b from both sides. The left side becomes the sum of ai times bi conjugate, plus the sum of bi times ai conjugate. In other words, the left side is q1+q2.

The right side is twice the square root of (a.a)×(b.b). Put this all together to get the following.

q1 + q2 ≤ 2sqrt(pr)

Divide through by 2, and remember that q1 and q2 are conjugate. So the left side becomes the real component of q1, or q2 if you prefer.

re(q1) ≤ sqrt(pr)

Square both sides again and get complex Cauchy Schwarz. (If re(q1) happens to be negative, run the same argument with -b in place of b; this bounds |re(q1)| by sqrt(pr), so the squared inequality still holds.)

re(q1)^2 ≤ pr

Again, we have equality only when b is a real scalar multiple of a.

Can we say anything about re(a.b)? If one of the components of a is u+vi, and the corresponding component of b is x+yi, multiply the former by the conjugate of the latter and get a complex number whose real component is ux+vy. This begins to look like the traditional dot product. In fact, the real part of a.b is exactly the same as the traditional dot product of a and b when they are viewed as real vectors, having twice as many dimensions. In this sense, Cauchy Schwarz is no surprise at all. We have derived exactly the same inequality. We just took a walk through complex arithmetic to get there.
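
To close the loop, here is a small Python check that re(a.b) agrees with the traditional dot product of the flattened real vectors, and that its square is bounded by p×r; the vectors are arbitrary samples.

a = [1+2j, -3+0.5j, 2-1j]
b = [0.5-1j, 2+2j, -1+3j]

q1 = sum(x * y.conjugate() for x, y in zip(a, b))      # a.b
p = sum((x * x.conjugate()).real for x in a)           # a.a
r = sum((y * y.conjugate()).real for y in b)           # b.b

# flatten each complex vector into 2n real components
ra = [t for z in a for t in (z.real, z.imag)]
rb = [t for z in b for t in (z.real, z.imag)]
real_dot = sum(x*y for x, y in zip(ra, rb))

print(abs(q1.real - real_dot) < 1e-12)    # re(a.b) is the traditional dot product
print(q1.real**2 <= p * r)                # complex Cauchy Schwarz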