This course, Applications in Linear Algebra, describes various ways we can use linear algebra in the real world.

We begin by describing a brief but important theorem for this course.

Theorem: Uniqueness of Invertible Linear Systems

Suppose we have a linear system
$$A\vec{x} = \vec{b}$$
with $n$ variables and $n$ equations. Then, if $A$ is an invertible matrix, the linear system has one and only one solution, and we can find it by taking $\vec{x} = A^{-1}\vec{b}$.
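
As a quick numerical illustration (the matrix and right-hand side below are made-up values, not from the course), numpy's solver returns this unique solution, and it agrees with $A^{-1}\vec{b}$:

    import numpy as np

    # A hypothetical invertible 2x2 system A x = b
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([3.0, 5.0])

    x = np.linalg.solve(A, b)       # the one and only solution
    x_alt = np.linalg.inv(A) @ b    # same thing, computed as x = A^{-1} b

    print(np.allclose(x, x_alt))    # True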

Notes

Notes for this course are given below.


Matrix Exponentials and Rotations

Contextualization

Suppose we want to solve the system of differential equations
$$\vec{x}\,'(t) = A\vec{x}(t), \qquad A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

Recall from Heat Diffusion that the solution is $\vec{x}(t) = e^{At}\vec{x}(0)$, and to compute this, we want to diagonalize $A$ to find
$$e^{At} = P e^{Dt} P^{-1}.$$

But when we try to compute this, we find complex eigenvalues $\lambda = \pm i$ with complex eigenvectors!

This will give us a diagonalization $A = PDP^{-1}$ with $D = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$ and a complex matrix $P$,

and the subsequent solution $\vec{x}(t) = P e^{Dt} P^{-1} \vec{x}(0)$, which involves the entries $e^{it}$ and $e^{-it}$.

By a theorem (Euler's Formula), it is true that for any real number $\theta$, $e^{i\theta} = \cos\theta + i\sin\theta$. So, the complex exponentials can be rewritten in terms of sines and cosines.

This gives us the final (real!) solution
$$\vec{x}(t) = e^{At}\vec{x}(0), \qquad e^{At} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.$$

We make some interesting notes from this:

  • Even though we had to work with imaginary numbers, our final answer is real! This should make sense, as $e^{At}$ has to be a real matrix given that $A$ is real (a convergent series of real matrices must also be real).
  • Our matrix $e^{At}$ gives us a CCW rotation by $t$ radians!

We ask, why does this happen? What properties of $A$ cause this to happen?

It’s not because $A$ started as a rotation matrix! If we tried this with another rotation matrix, we may not get the same answer.
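
As a concrete check (using the matrix reconstructed above; scipy's expm evaluates the matrix exponential numerically):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])                # the matrix from the contextualization above

    t = 0.7                                    # any real number of radians
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])    # CCW rotation by t radians

    print(np.allclose(expm(A * t), R))         # True: e^{At} is the rotation by t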

Matrix Exponentials

We first continue our discussion of matrix exponentials.


Analogous to $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \cdots$, for an $n \times n$ matrix $A$,
$$e^{A} = \sum_{k=0}^{\infty} \frac{A^k}{k!} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$


What about $e^{A+B}$? Is it true for matrices that $e^{A+B} = e^{A} e^{B}$?

No! And in fact, it fails because matrix multiplication is not commutative. While in $e^{A} e^{B}$, all terms are of the form $A^j B^k$ (all $A$ matrices first, then $B$ matrices after), powers like
$$(A + B)^2 = A^2 + AB + BA + B^2$$

flip the order of the matrix multiplication!

However, if we have matrices $A$ and $B$ whose product is commutative ($AB = BA$), then this property holds!

Proposition

If $AB = BA$, then
$$e^{A+B} = e^{A} e^{B}.$$

Corollary

Let $A$ be any square matrix.

  1. For any scalars $s, t$, $e^{(s+t)A} = e^{sA} e^{tA}$ (since $sA$ and $tA$ always commute).
  2. So, $e^{A}$ has an inverse, given as $\left(e^{A}\right)^{-1} = e^{-A}$, for any square matrix $A$.
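
A quick numerical illustration of these facts (the matrices $A$ and $B$ are arbitrary examples chosen only because they do not commute):

    import numpy as np
    from scipy.linalg import expm

    # Two arbitrary matrices that do NOT commute
    A = np.array([[0.0, 1.0],
                  [0.0, 0.0]])
    B = np.array([[0.0, 0.0],
                  [1.0, 0.0]])
    print(np.allclose(expm(A + B), expm(A) @ expm(B)))                # False: e^{A+B} != e^A e^B here

    # sA and tA always commute, so e^{(s+t)A} = e^{sA} e^{tA}
    s, t = 0.3, 1.2
    print(np.allclose(expm((s + t) * A), expm(s * A) @ expm(t * A)))  # True

    # In particular, e^{-A} is the inverse of e^{A}
    print(np.allclose(expm(A) @ expm(-A), np.eye(2)))                 # True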

Theorem

For an $n \times n$ matrix $A$, $\left(e^{A}\right)^T = e^{(A^T)}$.

This happens because the transpose operation is linear and continuous, so we can transpose the series term-by-term!

Theorem

Let $A$ be an $n \times n$ matrix.

  1. If $\lambda$ is an eigenvalue for $A$, then $e^{\lambda}$ is an eigenvalue for $e^{A}$.
  2. More precisely, if $\vec{v}$ is an eigenvector for $A$ with eigenvalue $\lambda$, then $\vec{v}$ is an eigenvector for $e^{A}$ with eigenvalue $e^{\lambda}$.

The trace of an $n \times n$ matrix $A$ is the sum of the diagonal entries of $A$, denoted $\operatorname{tr}(A)$.

Theorem

For an $n \times n$ matrix $A$,

  1. $\operatorname{tr}(A)$ equals the sum of the eigenvalues of $A$ (counted with multiplicity).
  2. $\det(A)$ equals the product of the eigenvalues of $A$ (counted with multiplicity).

Using these facts, we can explain the following.

Theorem

For any $n \times n$ matrix $A$, $\det(e^{A}) = e^{\operatorname{tr}(A)}$. In particular, if $A$ has real entries, then the determinant of $e^{A}$ will always be strictly positive.

We can also use this to tell when a matrix isn't the exponential of any real matrix (for example, when its determinant is not strictly positive)!
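
A quick numerical check of the determinant-trace identity (the matrix is an arbitrary example):

    import numpy as np
    from scipy.linalg import expm

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])        # an arbitrary real matrix

    print(np.isclose(np.linalg.det(expm(A)), np.exp(np.trace(A))))    # True

    # Consequence: any real matrix with determinant <= 0 cannot be e^A
    # for a real matrix A, since e^{tr(A)} is always strictly positive.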

Rotations

So, for what $A$ is $e^{A}$ a rotation? To answer this question, we must first define what exactly a “rotation” matrix is.

An $n \times n$ matrix $Q$ is orthogonal if it satisfies
$$Q^T Q = I \qquad (\text{equivalently, } Q^{-1} = Q^T).$$

Theorem

Let $Q$ be $n \times n$. The following are equivalent:

  1. $Q$ is an orthogonal matrix.
  2. The columns of $Q$ form an orthonormal basis for $\mathbb{R}^n$, meaning they are orthogonal to each other and are unit vectors.
  3. $\|Q\vec{x}\| = \|\vec{x}\|$ for all $\vec{x} \in \mathbb{R}^n$. In other words, $Q$ preserves the length of vectors.

Rotations and reflections are orthogonal matrices by the length-preservation condition of the theorem! But how do we tell whether a given orthogonal matrix is a rotation or a reflection?

Note that every orthogonal $Q$ satisfies
$$\det(Q)^2 = \det(Q^T)\det(Q) = \det(Q^T Q) = \det(I) = 1, \qquad \text{so} \qquad \det(Q) = \pm 1.$$

It turns out, rotations have determinant 1, and reflections have determinant -1.

So, we can define a rotation matrix as an orthogonal matrix with determinant 1. So, for $A$ such that $e^{A}$ is a rotation, we need $e^{A}$ to be orthogonal with $\det(e^{A}) = 1$.

We know that $\det(e^{A}) = e^{\operatorname{tr}(A)} > 0$ for any real $A$. Furthermore, for $e^{A}$ to be orthogonal, we need that
$$\left(e^{A}\right)^T e^{A} = e^{A^T} e^{A} = I, \qquad \text{i.e.,} \qquad e^{A^T} = \left(e^{A}\right)^{-1} = e^{-A}.$$

So, this relation holds whenever $A^T = -A$. $A$ is called skew-symmetric if $A^T = -A$.

Theorem

If $A$ is skew-symmetric, then $e^{tA}$ will be a rotation matrix for any real $t$!
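
A minimal sketch of this theorem in code (the skew-symmetric matrix is generated at random):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    M = rng.standard_normal((3, 3))
    A = M - M.T                      # any M - M^T is skew-symmetric: A^T = -A

    t = 2.5
    R = expm(t * A)

    print(np.allclose(R.T @ R, np.eye(3)))     # True: R is orthogonal
    print(np.isclose(np.linalg.det(R), 1.0))   # True: det(R) = 1, so R is a rotation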

Example: 2-Dimensional Rotations

Consider again the matrix
$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
from the contextualization above. Observe that this is a skew-symmetric matrix, and we saw that
$$e^{tA} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.$$

We ask, are there any other $2 \times 2$ skew-symmetric matrices? For $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to be skew-symmetric, we need $A^T = -A$, i.e.
$$\begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}.$$

So, we need $a = -a$, $d = -d$, $c = -b$, and $b = -c$. This forces $a = d = 0$, and $b, c$ must be opposites of each other. So, we have the general form
$$A = \begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix}.$$

So, up to the choice of the scalar $c$, there are no other $2 \times 2$ skew-symmetric matrices!
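
As a sanity check (this computation is not spelled out above, but it follows directly from the series definition of the matrix exponential), we can exponentiate this general form. Since $A^2 = \begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix}^2 = -c^2 I$, the even and odd terms of the series collapse to cosines and sines:
$$e^{tA} = \left(1 - \frac{(ct)^2}{2!} + \frac{(ct)^4}{4!} - \cdots\right) I + \frac{1}{c}\left(ct - \frac{(ct)^3}{3!} + \cdots\right) A = \cos(ct)\, I + \frac{\sin(ct)}{c}\, A = \begin{pmatrix} \cos(ct) & -\sin(ct) \\ \sin(ct) & \cos(ct) \end{pmatrix}$$
(for $c \ne 0$), which is the CCW rotation by $ct$ radians. So scaling the matrix scales the angle of rotation.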


Similarly, for the 3-dimensional case, we can find the general form
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$

So, $e^{tA}$ will give us a family of rotations! But what rotation does it actually represent? In other words, what is the rotation's axis and angle?

Note that the axis must always go through the origin, as the transformation is linear.

Consider a skew-symmetric matrix of the form
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$

Observe that the vector $\vec{v} = (a, b, c)$ is an eigenvector for the matrix $A$ with eigenvalue 0! This implies that $\vec{v}$ is also an eigenvector for $e^{tA}$ (similarly, for $e^{A}$), with eigenvalue $e^{0} = 1$.

This means that the transformation $e^{tA}$ does nothing to the vector $\vec{v}$! Meaning, if $e^{tA}$ is a rotation matrix, then $\vec{v}$ must lie along the axis of rotation!

So, the family of rotations given by $e^{tA}$ has its axis of rotation along the line through the origin in the direction of $\vec{v} = (a, b, c)$. Furthermore, as $t$ changes, so does the angle of rotation.

$t$ may not be exactly the angle of rotation though, since $A$ may stretch vectors (if $\vec{v}$ is not a unit vector)!

Theorem: 3D Rotation Matrices

Let $\vec{u} = (a, b, c)$ be a unit vector, and let
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$

Then, the rotation about the line through the origin in the direction of $\vec{u}$ by $\theta$ radians is given by $e^{\theta A}$!

This theorem is 3D specific! Don’t try to generalize it to higher dimensions.
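
A small numerical sketch of the theorem (the axis and angle below are arbitrary choices for illustration): for the unit vector $\vec{u} = (0, 0, 1)$, $e^{\theta A}$ should be the familiar rotation about the $z$-axis by $\theta$ radians.

    import numpy as np
    from scipy.linalg import expm

    def skew(u):
        # build the skew-symmetric matrix associated with u = (a, b, c)
        a, b, c = u
        return np.array([[0.0,  -c,   b],
                         [  c, 0.0,  -a],
                         [ -b,   a, 0.0]])

    theta = 0.9
    A = skew([0.0, 0.0, 1.0])                 # unit axis: the z-axis

    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])

    print(np.allclose(expm(theta * A), Rz))   # True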

Example: Rotation Matrices

Use the above theorem to find the matrix of a rotation about a given unit vector by a given angle.

Here, we want to rotate about a unit vector $\vec{u} = (a, b, c)$ by $\theta$ radians. We can find our rotation by taking the skew-symmetric matrix
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}$$

and taking the exponential $e^{\theta A}$.

Example: Rotation Matrices (3)

We know $e^{A}$ is some 3D rotation, for a given skew-symmetric matrix $A$. What is its axis / angle?

We can use the theorem to find this. First, we need to recognize $A$ as a scalar multiple of a matrix built from a unit vector, by writing $A = \theta B$, where $B$ is the skew-symmetric matrix of a unit vector $\vec{u}$.

So, we find that we have a rotation about the line through the origin in the direction of $\vec{u}$,

with rotation angle $\theta$ radians.

Singular Value Decomposition

Review: Symmetric Matrices and the Spectral Theorem

An $n \times n$ matrix $A$ is symmetric if $A^T = A$.

Theorem: Spectral Theorem for Real Symmetric Matrices

Let $A$ be an $n \times n$ symmetric matrix with real entries. Then,

  1. It has all real eigenvalues
  2. Any pair of eigenvectors for $A$ with different eigenvalues are going to be orthogonal
  3. $A$ is diagonalizable
  4. There is an orthonormal basis for $\mathbb{R}^n$ that consists of eigenvectors for this matrix.
  5. $A$ can be orthogonally diagonalized, in other words, $A = QDQ^{-1} = QDQ^T$ for an orthogonal matrix $Q$ and a diagonal matrix $D$.

    Recall that $Q^{-1} = Q^T$ is equivalent to $Q$ having orthonormal columns.

Example: Orthogonally Diagonalizing a Symmetric Matrix

Notice how $A$ is a symmetric matrix, so it's orthogonally diagonalizable. Let's orthogonally diagonalize it.

  • We find the eigenvalues of $A$.
  • We find the corresponding eigenvectors.

Our eigenvectors are orthogonal, but not orthonormal! Thus, we need to rescale them to get unit eigenvectors.

So, we orthogonally diagonalize $A$ as $A = QDQ^T$, where the columns of $Q$ are the unit eigenvectors and the diagonal of $D$ holds the eigenvalues.
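
A minimal numpy sketch of orthogonal diagonalization (the symmetric matrix is an arbitrary example; eigh is numpy's eigensolver for symmetric matrices):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                 # an arbitrary symmetric matrix

    # eigh returns real eigenvalues and an orthonormal basis of eigenvectors
    eigvals, Q = np.linalg.eigh(A)
    D = np.diag(eigvals)

    print(np.allclose(Q.T @ Q, np.eye(2)))     # True: Q is orthogonal
    print(np.allclose(A, Q @ D @ Q.T))         # True: A = Q D Q^T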

Singular Value Decompositions (SVDs)

Let $A$ be an $m \times n$ matrix with real entries. A singular value decomposition (SVD) for $A$ is a factorization of the form
$$A = U \Sigma V^T$$

Where

  • $U$ is an $m \times m$ orthogonal matrix, whose columns are known as the left singular vectors of $A$.
  • $\Sigma$ is an $m \times n$ diagonal matrix, with non-negative (real) diagonal entries $\sigma_1, \sigma_2, \dots$
    known as the singular values of $A$.
  • $V$ is an $n \times n$ orthogonal matrix, whose columns are known as the right singular vectors of $A$.

Singular value decompositions are very general!

Note that even for a square diagonalizable matrix $A$, its diagonalization need not be its singular value decomposition!

Example

How do we find singular value decompositions? First, note that for any $m \times n$ matrix $A$:

  • $A^T A$ and $A A^T$ are symmetric matrices of size $n \times n$ and $m \times m$, respectively. Hence, by the spectral theorem, they are orthogonally diagonalizable.
  • The eigenvalues of $A^T A$ and $A A^T$ are non-negative real numbers.
  • $A^T A$ and $A A^T$ have the same eigenvalues (with the same multiplicities), except possibly for the eigenvalue 0.

Now suppose that $A$ has a singular value decomposition $A = U\Sigma V^T$.

Then,
$$A^T A = (U\Sigma V^T)^T (U\Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V (\Sigma^T \Sigma) V^T.$$

This is an orthogonal diagonalization of $A^T A$! Similarly, we can find
$$A A^T = U (\Sigma \Sigma^T) U^T.$$

So, we can find

  • $\vec{v}_1, \dots, \vec{v}_n$ as an orthonormal basis of eigenvectors for $A^T A$ ($V$ plays the role of $Q$ in the orthogonal diagonalization)
  • $\sigma_i$ as the (positive) square roots of $A^T A$'s eigenvalues
  • $\vec{u}_1, \dots, \vec{u}_m$ as the orthonormal basis of eigenvectors for $A A^T$ ($U$ plays the role of $Q$ in the orthogonal diagonalization)

Note that the equation $A = U\Sigma V^T$ is equivalent to
$$AV = U\Sigma.$$

So, an additional requirement for these systems is that for any $i$, $A\vec{v}_i = \sigma_i \vec{u}_i$, and furthermore, if $\sigma_i \ne 0$, then this implies
$$\vec{u}_i = \frac{A\vec{v}_i}{\sigma_i}.$$

Theorem: Singular Value Decompositions

Every $m \times n$ matrix $A$ has a singular value decomposition
$$A = U \Sigma V^T,$$

Where

  • The diagonal entries of $\Sigma$ are the square roots of the eigenvalues of $A^T A$
  • The columns of $V$ are an orthonormal basis of eigenvectors for $A^T A$
  • The columns of $U$ are an orthonormal basis of eigenvectors for $A A^T$

Chosen such that
$$A\vec{v}_i = \sigma_i \vec{u}_i$$
for each $i$.

So, one strategy to find an SVD for $A$ is as follows:

  1. Find the eigenvectors and eigenvalues of $A^T A$ to get the $\vec{v}_i$'s and $\sigma_i$'s.
  2. Use the fact that $\vec{u}_i = \frac{A\vec{v}_i}{\sigma_i}$ to get the $\vec{u}_i$'s for when $\sigma_i$ is non-zero.
  3. If necessary, get the rest of the $\vec{u}_i$'s (for $\sigma_i = 0$) by finding eigenvectors for $A A^T$.

By convention, we order the singular values in decreasing order.

This can become very unreasonable to do by hand for many matrices! We can use MATLAB to do an SVD for us, using the command [U, S, V] = svd(A).
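
The same strategy can also be carried out (or double-checked) in numpy. Below is a minimal sketch with an arbitrary full-column-rank matrix, so step 3 is not needed and we recover the “reduced” SVD that keeps only the first $n$ columns of $U$:

    import numpy as np

    A = np.array([[3.0, 0.0],
                  [0.0, 2.0],
                  [0.0, 0.0]])                        # arbitrary 3x2 example of rank 2

    # Step 1: eigen-decompose A^T A to get the v_i's and sigma_i's
    eigvals, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]                 # sort singular values in decreasing order
    eigvals, V = eigvals[order], V[:, order]
    sigma = np.sqrt(eigvals)

    # Step 2: u_i = A v_i / sigma_i for the non-zero singular values
    U = A @ V / sigma

    print(np.allclose(A, U @ np.diag(sigma) @ V.T))   # True: A = U Sigma V^T (reduced form)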

Inverses and Pseudoinverses

Motivation

Using SVDs, we can define the concept of a “pseudoinverse” for non-invertible matrices!

Suppose $A$ is $n \times n$ and invertible with SVD
$$A = U \Sigma V^T,$$

where all $\sigma_i > 0$ (otherwise, $A$ would be non-invertible). Then,
$$A^{-1} = (U \Sigma V^T)^{-1} = V \Sigma^{-1} U^T.$$

This is an SVD of $A$'s inverse matrix!

Note how $U$ and $V$ got swapped, and all the $\sigma_i$'s get inverted!

This gives us a way to define an inverse, even for matrices that don't have an inverse! This defines a “pseudoinverse”.

Now, consider a general $m \times n$ matrix $A$ with SVD
$$A = U \Sigma V^T.$$

We can define the Moore-Penrose Pseudoinverse of $A$ to be
$$A^{+} = V \Sigma^{+} U^T,$$

where $\Sigma^{+}$ is the transpose of the matrix $\Sigma$, where all non-zero singular values are inverted ($\sigma_i \mapsto 1/\sigma_i$) and zero entries are left as zeros.

Some properties of the pseudo-inverse are as follows:

  • If $A$ is $n \times n$ and invertible, then $A^{+} = A^{-1}$.
  • If $A$ is $m \times n$ with linearly independent columns, then $A^{+} = (A^T A)^{-1} A^T$.

Now consider a linear system $A\vec{x} = \vec{b}$.

  • If $A$ is $n \times n$ and invertible, then $A^{+}\vec{b} = A^{-1}\vec{b}$, the unique solution to the system.
  • If $A$ is $m \times n$ with linearly independent columns, then $A^{+}\vec{b} = (A^T A)^{-1} A^T \vec{b}$,
    the unique least-squares solution to the system!

What if $A\vec{x} = \vec{b}$ has infinitely many solutions? Or what if it is inconsistent but has infinitely many least-squares solutions? What does $A^{+}\vec{b}$ mean in these cases?

Theorem

The vector $A^{+}\vec{b}$ is the least-squares solution of the system $A\vec{x} = \vec{b}$ with the smallest possible norm $\|A^{+}\vec{b}\|$.
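
A minimal sketch of this minimum-norm property (the system below is an arbitrary underdetermined example with infinitely many solutions):

    import numpy as np

    A = np.array([[1.0, 2.0]])      # one equation, two unknowns
    b = np.array([5.0])

    x_pinv = np.linalg.pinv(A) @ b                  # pseudoinverse "solution"
    x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]  # numpy's least-squares solver

    print(x_pinv)                                   # [1. 2.] -- the minimum-norm solution
    print(np.allclose(x_pinv, x_lstsq))             # True
    print(np.allclose(A @ x_pinv, b))               # True: it actually satisfies the system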

Image Compression

Matrix Approximations

Consider an $m \times n$ matrix $A$, with SVD $A = U\Sigma V^T$.

We ask, how could we approximate $A$ with a lower rank matrix?

$A$ has rank equal to the number of non-zero singular values!

Well, for a rank $k$, one way we could approximate $A$ is by dropping all singular values from $\sigma_{k+1}$ to $\sigma_r$ (setting them to 0); call the resulting matrix $A_k$!

We can find that this is actually the best rank $k$ approximation to $A$ possible.

Theorem: Eckart-Young Theorem

$A_k$ is the best rank $k$ approximation to $A$, with error
$$\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$$
given by the magnitude of the singular values we dropped.

Here $\|\cdot\|_F$ is known as the Frobenius norm, and we can also find it by taking the sum of the squares of the entries in the matrix (then square rooting).
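
A quick numerical check of the Eckart-Young error formula (the matrix is an arbitrary example):

    import numpy as np

    A = np.array([[3.0, 1.0],
                  [1.0, 3.0],
                  [0.0, 2.0]])                       # arbitrary 3x2 matrix

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    k = 1
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # best rank-k approximation

    error = np.linalg.norm(A - A_k, 'fro')
    print(np.isclose(error, np.sqrt(np.sum(s[k:] ** 2))))   # True: error = sqrt(sum of dropped sigma_i^2)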

Example

Find the best rank 1 approximation to a given matrix $A$.

We can find its SVD $A = U\Sigma V^T$.

With this, we can find the rank 1 approximation $A_1$ by dropping the smallest singular value.

And furthermore, according to the theorem above, we can find the error $\|A - A_1\|_F$ as the magnitude of the dropped singular value.

Now say we do a rank 1 approximation on a different matrix.

We find a larger error! Because the dropped singular value is much larger than in our previous example, we find a higher error, so we should expect our approximation to be a lot worse.

Image Compression

But why do we want to be able to approximate matrices like this?

Well, if we write
$$A = U\Sigma V^T = \sigma_1 \vec{u}_1 \vec{v}_1^{\,T} + \sigma_2 \vec{u}_2 \vec{v}_2^{\,T} + \cdots + \sigma_r \vec{u}_r \vec{v}_r^{\,T},$$

each of these terms yields a rank 1 matrix!

Theorem

If $A$ has rank $r$, then
$$A = \sigma_1 \vec{u}_1 \vec{v}_1^{\,T} + \cdots + \sigma_r \vec{u}_r \vec{v}_r^{\,T}.$$

Consequently, the lower rank approximations to $A$ are:
$$A_k = \sigma_1 \vec{u}_1 \vec{v}_1^{\,T} + \cdots + \sigma_k \vec{u}_k \vec{v}_k^{\,T}.$$

We can find the lower rank approximations by dropping the terms with the smallest singular values!

This gives us a way to store lower rank approximations! For example, instead of explicitly storing $A_1$, we only need to store $\sigma_1$, $\vec{u}_1$, and $\vec{v}_1$, and the computer can reconstruct the matrix $A_1$ for us!

This is a lot cheaper than storing $A$. If $A$ is $1000 \times 1000$, for example, then instead of storing the entire matrix (1 million entries), we only need to store $1 + 1000 + 1000 = 2001$ entries! If we store these entries in a file, then the computer can take these entries and regenerate (an approximation of) the original image!

Generalizing, we can compute the matrix $A_k$ provided we know and store the collection
$$\{(\sigma_i, \vec{u}_i, \vec{v}_i)\}_{i=1}^{k},$$

which would take $k(1 + m + n)$ entries, as opposed to the original $mn$ entries of the matrix! In fact, we can have our approximation take only $k(m + n)$ entries, if we multiply the $\sigma_i$'s into one of the vectors!

This will be useful if $k(m + n) \ll mn$.
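
A small sketch of this storage idea (the sizes and the random “image” below are arbitrary; the variable names are my own):

    import numpy as np

    m, n, k = 300, 200, 10
    A = np.random.default_rng(1).random((m, n))      # stand-in for an image matrix

    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Store only the first k singular triples: k(1 + m + n) numbers...
    stored = (s[:k], U[:, :k], Vt[:k, :])
    print(k * (1 + m + n), "entries instead of", m * n)

    # ...and reconstruct the rank-k approximation from them later
    s_k, U_k, Vt_k = stored
    A_k = (U_k * s_k) @ Vt_k                         # sigma_i's multiplied into U's columns
    print(A_k.shape, np.linalg.matrix_rank(A_k))     # (300, 200) 10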

So suppose we have a (grayscale) picture that is $m \times n$ pixels (which could be quite large). The color of each pixel is a shade of gray, encoded as a number between 0 (black) and 1 (white).

This gives us an $m \times n$ matrix $A$, which we can compress using rank $k$ approximations!

Now, is there a quantitative way to gauge the quality of our compressed image?

  1. The error in the approximation is found as the Frobenius norm,
     $$\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}.$$
  2. But the above error can be large if you have many pixels! So, normalizing the above error, we can find
     $$\frac{\|A - A_k\|_F}{\|A\|_F}.$$
    It is convenient to work with the square of this! This tells us how bad the compression is:
     $$\frac{\|A - A_k\|_F^2}{\|A\|_F^2} = \frac{\sigma_{k+1}^2 + \cdots + \sigma_r^2}{\sigma_1^2 + \cdots + \sigma_r^2}.$$
    Now, if we want to know how good it is, we can subtract it from 1 (see the sketch below)! This is known as the compression rate / image quality, tracking the percentage of variance preserved by the compression:
     $$1 - \frac{\sigma_{k+1}^2 + \cdots + \sigma_r^2}{\sigma_1^2 + \cdots + \sigma_r^2} = \frac{\sigma_1^2 + \cdots + \sigma_k^2}{\sigma_1^2 + \cdots + \sigma_r^2}.$$
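
A minimal sketch of this quality computation (a random matrix stands in for an image; the names quality and badness are my own):

    import numpy as np

    A = np.random.default_rng(2).random((200, 300))  # stand-in for a grayscale image
    s = np.linalg.svd(A, compute_uv=False)           # just the singular values

    k = 25
    badness = np.sum(s[k:] ** 2) / np.sum(s ** 2)    # normalized squared error
    quality = 1.0 - badness                          # fraction of variance preserved

    print(np.isclose(quality, np.sum(s[:k] ** 2) / np.sum(s ** 2)))  # True
    print(round(quality, 3))                         # a number between 0 and 1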