Linear Transforms, Generalized Matrix

Generalized Matrix

When a linear transform acts on traditional vectors, rather than functions, it is equivalent to a matrix. Write the input as a row vector, multiply by the matrix, and obtain the output as another row vector. (You can transpose the matrix and use column vectors if you like; it doesn't really matter.) Each linear map is implemented by a matrix, and each matrix implements a different linear map. Let's generalize this to an infinite series.

Let a linear transform t accept a series of real numbers, producing another series of real numbers. (You can use complex numbers if you like.) If you want some peace of mind, assume all series are absolutely convergent, then there is no trouble.

Assume t is hyperlinear, that is, linear across infinite sums. Now t can be implemented as an infinite matrix, and the representation, as a matrix, is unique.

Let's go from transform to matrix, then back again. Take the series [1,0,0,0,0,0…] and feed it to t(), and place the output series, the result of t(), in the first row of an infinite matrix m. Thus the top row of m is a series that goes on forever. Our matrix has a left edge, but it has no right edge; there are infinitely many columns.

As you might guess, t([0,1,0,0,0,0,0…]) becomes the second row of m, t([0,0,1,0,0,0,0…]) becomes the third row of m, and so on. When the 1 is in position j, the output becomes the j^th row of m. This continues forever, hence m has infinitely many rows.

By linearity, t([3,1,7,0,0,0,0…]) has to be 3 times the first row of m, plus the second row, plus 7 times the third. The rows of m are scaled and addded together. This can be realized by matrix multiplication. Rotate the input series 90 degrees, so it becomes a column vector, and lay it along the first column of m. Multiply corresponding entries and add everything up, producing the first term of the output series. Then move the input vector over to the second column of m. Multiply, and add, giving the second term of the output series. This process continues forever, generating t(a). This is simply matrix multiplication, but the matrix is infinite, and the sums are infinite too.

Conversely, an infinite matrix m implies a hyperlinear transform t. Use matrix multiplication to map an input series onto an output series.

If s and t are different transforms, they must disagree on one of the basis vectors. Suppose s and t disagree on the series [0,0,0,0…1…0,0,0,0…], where the 1 is in position j. The matrix associated with s has, as its j^th row, s(b_j), while the matrix associated with t has, as its j^th row, t(b_j). These rows differ at some point, hence the matrices are distinct. Different transforms lead to different matrices. In other words, transforms and matrices coincide 1-1, just as they did in finite dimensions.

Our matrix m has a main diagonal that goes on forever. Thus m has a transpose, and if m equals its transpose, it is symmetric. If m equals its tranjugate, m is hermitian. The linear operator t is hermitian iff its infinite matrix m is hermitian. Apply the finite dimensional proof. It works just as well when m is infinite.

Next, I would like to change the matrix convention just a bit. Reflect the matrix m, so that the rows go up, not down. The bottom row is row 1, the next row is row 2, and so on up the page. When you are ready to premultiply by a row vector v, you need to rotate it up, not down. Thus v runs up the page, just like m. Lay v along the first column of m, whence the first entry of v can be multiplied by the lower left entry of m, denoted m_1,1. Pairwise products are calculated, and added together in an infinite sum, going up the page. This produces the first entry in the output series, and subsequent values are computed in the same way, as we move v to the right. This helps us visualize the continuous case, where the y coordinate increases as you move up the page, not down.

Next, let t act on the vector space of real valued functions that are continuous on [0,1]. The result is another continuous function on [0,1]. (This works for piecewise continuous functions as well.)

We will build t using a real valued function m(x,y), such that m is continuous on the unit square in the xy plane.

Given an input function f, the output function t(f), evaluated at x, is the integral of m(x,y)×f(y), as y runs from 0 to 1. If you look at this geometrically, it is just matrix multiplication, but the vectors, and the matrix, are continuous, and summation becomes integration. Take the input function f and tip it up 90 degrees, then lay it along the xy plane at the designated x coordinate. Move up the vertical line, multipllying f(y) by m(x,y), and add up the results. Since this is a continuous operation, we're talking about integration.

Since integration is the limit of summation, we expect t to be linear, but let's verify it. Let f and g be input functions, and let h(y) = m(x,y) for a fixed x. As a function, (f+g)×h is the same as fh + gh. Integrals are additive, so the integral of (f+g)h is the sum of the integral of fh plus the integral of gh. At the value of x, t(f+g) = t(f)+t(g), and this holds for every x, hence t is linear. (You can verify t(c×f) = c×t(f).)

Let's show that t(f) is continuous, i.e. t(f) belongs to our vector space of functions. Select h very small, so that a change in x, no larger than h, does not change m(x,y) by more than ε. We need uniform continuity for this step, but m is continuous on a closed bounded reagion, hence it is uniformly continuous. Let be be an upper bound for |f|. The integral, as y runs from 0 to 1, of m(x,y)×f(y), cannot change by more than b×ε, as long as x does not stray by more than h. Yet the integral is the value of t(f) at x, hence t(f) is continuous. The transform, as defined by m, is valid.

A transform, built from a function m(x,y), is hermitian iff m is hermitian. The proof is almost the same as the one you have seen before, but one direction is a little harder, because we don't have a handy set of basis vectors to extract and examine the point m(x,y). We must use the properties of continuity instead.

Suppose t is hermitian, and for some x and y, m(x,y) is not the conjugate of m(y,x). In fact m(x,y) differs from the conjugate of m(y,x) by a value we will call d. Since m is continuous, we can move x and y about, just a little bit, and m(x,y) will still differ fromt the conjugate of m(y,x) by at least d/2.

Let f be the function that is zero everywhere, except for a spike at y. This is a tall skinny rectangle, with area 1, centered at y. Lay this function along a vertical line at or near x. The resulting integral is pretty close to m(x,y). Then take the dot product with a function g, which has a skinny rectangle of area 1 centered at x. The result is close to m(x,y). By making the rectangles of f and g taller and skinnier, we can make t(f).g as close to m(x,y) as we like.

At the same time, f.t(g) is arbitrarily close to the conjugate of m(y,x). Since t is hermitian, these are equal. They cannot differ by d/2. Therefore m(x,y) and m(y,x) are conjugate, and m is hermitian.

So far we have stressed the similarities between the continuous and discrete models, but there are some important differences. Here is an example.

When dealing with discrete vectors, every transform had an associated matrix. This does not hold in the continuous world. Let t() be the differentiation operator, and suppose there is a continuous function m(x,y) that implements this operator. Let b be the upper bound of m. Let f be a function that floats between 0 and 1, yet for some point x, f′(x) exceeds b. Thus t(f), evaluated at x, is larger than b. However, the integral of m(x,y)×f(y) is bounded by b.