This course, Applications in Linear Algebra, describes various ways we can use linear algebra in the real world.
We begin by describing a brief but important theorem for this course.
Theorem: Uniqueness of Invertible Linear Systems
Suppose we have a linear system $A\mathbf{x} = \mathbf{b}$ with $n$ variables and $n$ equations. Then, if $A$ is an invertible matrix, the linear system has one and only one solution, and we can find it by taking $\mathbf{x} = A^{-1}\mathbf{b}$.
Notes
Notes for this course are given below.
- Leontief Input-Output
- Applications of Graphics
- Least Squares
- Markov Chains
- Heat Diffusion
- Matrix Exponentials and Rotations
- ..
- ..
Matrix Exponentials and Rotations
Contextualization
Suppose we want to solve the system of differential equations
$$\frac{d\mathbf{x}}{dt} = A\mathbf{x}, \qquad A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
Recall from Heat Diffusion that we can find a solution $\mathbf{x}(t) = e^{At}\mathbf{x}(0)$, and to compute this, we want to diagonalize $A = PDP^{-1}$ to find $e^{At} = P e^{Dt} P^{-1}$.
But when we try to compute this, we find complex eigenvalues $\lambda = \pm i$ with complex eigenvectors!
This will give us a diagonalization $A = PDP^{-1}$ with $D = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$ and a complex matrix of eigenvectors $P$, and the subsequent solution
$$e^{At} = P \begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix} P^{-1}$$
By a theorem (Euler's Formula), it is true that for any real number $\theta$, $e^{i\theta} = \cos\theta + i\sin\theta$. So, $e^{\pm it} = \cos t \pm i \sin t$.
This gives us final solution
$$e^{At} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}!$$
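As a quick numerical sanity check, here is a minimal MATLAB sketch of the computation above. It assumes the example matrix is $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ as reconstructed above; expm computes the matrix exponential directly.

% Sanity check (assumes the example matrix above): e^(tA) should be the
% counter-clockwise rotation by t radians.
A = [0 -1; 1 0];
t = pi/3;                                    % any real t
R = expm(t*A);                               % matrix exponential e^(tA)
R_expected = [cos(t) -sin(t); sin(t) cos(t)];
disp(norm(R - R_expected))                   % ~0 up to rounding error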
We make some interesting notes from this:
- Even though we had to work with imaginary numbers, our final answer is a real matrix! This should make sense, as $e^{At}$ has to be a real matrix given that $A$ is real (a series of powers of a real matrix must also be real).
- Our matrix $e^{At}$ gives us a CCW rotation by $t$ radians!
We ask, why does this happen? What properties of $A$ cause this to happen?
It's not because $A$ started as a rotation matrix! If we tried this with another rotation matrix, we may not get the same answer.
Matrix Exponentials
We first continue our discussion of matrix exponentials.
Analogous to $e^x = \sum_{k=0}^{\infty} \frac{x^k}{k!}$, for an $n \times n$ matrix $A$,
$$e^A = \sum_{k=0}^{\infty} \frac{A^k}{k!} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$
What about $e^{A+B}$? Is it true for matrices that $e^{A+B} = e^A e^B$?
No! And in fact, this fails because matrix multiplication is not commutative. While in $e^A e^B$, all terms are of the form $A^j B^k$ (all $A$ matrices first, then $B$ matrices after), powers like
$$(A+B)^2 = A^2 + AB + BA + B^2$$
flip the order of the matrix multiplication!
However, if we have matrices $A, B$ whose products commute ($AB = BA$), then this property holds!
Proposition
If $AB = BA$, then $e^{A+B} = e^A e^B$.
Corollary
Let $A$ be any $n \times n$ matrix.
- For any scalars $s, t$, $e^{sA} e^{tA} = e^{(s+t)A}$ (since $sA$ and $tA$ always commute).
So, $e^A$ has an inverse, given as $(e^A)^{-1} = e^{-A}$, for any square matrix $A$: taking $s = 1$ and $t = -1$ gives $e^A e^{-A} = e^{0} = I$.
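A minimal MATLAB sketch of these facts (the matrices here are small examples of my own, not ones from the notes): expm(A+B) and expm(A)*expm(B) disagree for a non-commuting pair, while scalar multiples of the same matrix always commute, which gives the corollary and the inverse formula.

% Non-commuting pair: the product rule fails.
A = [0 1; 0 0];  B = [0 0; 1 0];             % AB ~= BA
disp(norm(expm(A+B) - expm(A)*expm(B)))      % clearly nonzero
% Commuting case: sA and tA commute for any scalars s, t.
s = 2;  t = -3;
disp(norm(expm((s+t)*A) - expm(s*A)*expm(t*A)))  % ~0
disp(norm(expm(A)*expm(-A) - eye(2)))            % ~0: e^(-A) inverts e^A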
Theorem
For an $n \times n$ matrix $A$, $(e^A)^T = e^{A^T}$.
This happens because the transpose operation is linear and continuous, so we can transpose the series term-by-term!
Theorem
Let $A$ be an $n \times n$ matrix.
- If $\lambda$ is an eigenvalue for $A$, then $e^{\lambda}$ is an eigenvalue for $e^A$.
- More precisely, if $\mathbf{v}$ is an eigenvector for $A$ with eigenvalue $\lambda$, then $\mathbf{v}$ is an eigenvector for $e^A$ with eigenvalue $e^{\lambda}$.
The trace of an $n \times n$ matrix $A$ is the sum of the diagonal entries of $A$, denoted $\operatorname{tr}(A)$.
Theorem
For an $n \times n$ matrix $A$,
- $\operatorname{tr}(A)$ equals the sum of the eigenvalues of $A$.
- $\det(A)$ equals the product of the eigenvalues of $A$.
Using these facts, we can explain the following.
Theorem
For any $n \times n$ matrix $A$, $\det(e^A) = e^{\operatorname{tr}(A)}$.
In particular, if $A$ has real entries, then the determinant of $e^A$ will always be strictly positive.
We can also use this to tell when a matrix isn't the exponential of any real matrix (for example, when its determinant is negative or zero)!
Proof
If the eigenvalues of $A$ are $\lambda_1, \dots, \lambda_n$, then the eigenvalues of $e^A$ are $e^{\lambda_1}, \dots, e^{\lambda_n}$, and as the determinant is the product of the eigenvalues,
$$\det(e^A) = e^{\lambda_1} e^{\lambda_2} \cdots e^{\lambda_n} = e^{\lambda_1 + \cdots + \lambda_n} = e^{\operatorname{tr}(A)}.$$
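A one-line MATLAB check of this theorem on a random matrix (an illustration of mine, not part of the notes):

A = randn(4);                                % any real square matrix
disp(det(expm(A)) - exp(trace(A)))           % ~0 up to rounding error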
Rotations
So, for what $A$ is $e^A$ a rotation? To answer this question, we must first define what exactly a "rotation" matrix is.
An $n \times n$ matrix $Q$ is orthogonal if $Q$ satisfies $Q^T Q = I$ (equivalently, $Q^{-1} = Q^T$).
Theorem
Let $Q$ be $n \times n$. The following are equivalent:
- $Q$ is an orthogonal matrix.
- The columns of $Q$ form an orthonormal basis for $\mathbb{R}^n$, meaning they are orthogonal to each other and are unit vectors.
- $\|Q\mathbf{x}\| = \|\mathbf{x}\|$ for all $\mathbf{x} \in \mathbb{R}^n$. In other words, $Q$ preserves the length of vectors.
Rotations and reflections are orthogonal matrices by the length-preservation condition of the theorem! How do we tell whether a given orthogonal matrix is a rotation or a reflection?
Note that every orthogonal $Q$ satisfies $\det(Q) = \pm 1$.
Proof
Since $Q^T Q = I$, we have $\det(Q)^2 = \det(Q^T)\det(Q) = \det(Q^T Q) = \det(I) = 1$, so $\det(Q) = \pm 1$.
It turns out, rotations have determinant $1$, and reflections have determinant $-1$.
So, we can define a rotation matrix as an orthogonal matrix with determinant $1$. So, for $e^A$ to be a rotation, we need $e^A$ to be orthogonal with $\det(e^A) = 1$.
We know that $\det(e^A) = e^{\operatorname{tr}(A)} > 0$ for any real $A$. Furthermore, for $e^A$ to be orthogonal, we need that
$$(e^A)^T e^A = I, \quad \text{i.e.,} \quad e^{A^T} = (e^A)^{-1} = e^{-A}.$$
So, this relation holds whenever $A^T = -A$. $A$ is called skew-symmetric if $A^T = -A$.
Theorem
If $A$ is skew-symmetric, then $e^{tA}$ will be a rotation matrix for any real $t$!
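A small MATLAB sketch of this theorem, using a randomly generated skew-symmetric matrix (an illustration of mine, not an example from the notes):

M = randn(3);
A = M - M';                                  % A' = -A, so A is skew-symmetric
Q = expm(0.7*A);                             % any real t; here t = 0.7
disp(norm(Q'*Q - eye(3)))                    % ~0: Q is orthogonal
disp(det(Q))                                 % ~1: determinant 1, so a rotation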
Example: 2-Dimensional Rotations
Consider
$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
Observe that this is a skew-symmetric matrix, and we saw that
$$e^{tA} = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix},$$
a rotation by $t$ radians.
We ask, are there any other $2 \times 2$ skew-symmetric matrices? For $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ to be skew-symmetric, we need $A^T = -A$, i.e.
$$\begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}.$$
So, we need $a = -a$, $d = -d$, $b = -c$, and $c = -b$. This forces $a = d = 0$, and $b, c$ must be opposites of each other. So, we have general form
$$A = \begin{pmatrix} 0 & -c \\ c & 0 \end{pmatrix} = c \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
So, up to scalar multiples, there are no other $2 \times 2$ skew-symmetric matrices!
Similarly, for the 3-dimensional case, we can find general form
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$
So, $e^{tA}$ will give us a family of rotations! But what rotation does it actually represent? In other words, what are the rotation's axis and angle?
Note that the axis must always go through the origin, as the transformation is linear.
Consider a skew-symmetric matrix of the form
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$
Observe that we can make a vector $\mathbf{v} = (a, b, c)$, which is an eigenvector for the matrix $A$ with eigenvalue $0$! This implies that $\mathbf{v}$ is also an eigenvector for $e^{A}$ (similarly, $e^{tA}$), with eigenvalue $e^0 = 1$.
This means that the transformation does nothing to the vector $\mathbf{v}$! Meaning, if $e^{tA}$ is a rotation matrix, then $\mathbf{v}$ must span the axis of rotation!
So, the family of rotations given by $e^{tA}$ has its axis of rotation along the line through the origin in the direction of $\mathbf{v} = (a, b, c)$. Furthermore, as $t$ changes, so does the angle of rotation.
$t$ may not be exactly the angle of rotation though, since $A$ may stretch vectors (as the next theorem shows, $t$ is exactly the angle when $(a, b, c)$ is a unit vector)!
Theorem: 3D Rotation Matrices
Let $\mathbf{u} = (a, b, c)$ be a unit vector, and let
$$A = \begin{pmatrix} 0 & -c & b \\ c & 0 & -a \\ -b & a & 0 \end{pmatrix}.$$
Then, the rotation about the line through the origin in the direction of $\mathbf{u}$ by $\theta$ radians is given by $e^{\theta A}$!
This theorem is 3D specific! Don’t try to generalize it to higher dimensions.
Example: Rotation Matrices
Use the above theorem to find .
Here, we want to rotate about unit vector . We can find our rotation by taking
And taking the exponential
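Since the example's specific axis and angle are not reproduced above, here is a hedged MATLAB sketch of the same workflow with made-up values (the vector u and angle theta below are hypothetical placeholders, not the example's data):

u = [1; 2; 2] / norm([1; 2; 2]);             % hypothetical unit vector (a, b, c)
theta = pi/4;                                % hypothetical angle in radians
a = u(1);  b = u(2);  c = u(3);
A = [ 0 -c  b;
      c  0 -a;
     -b  a  0 ];                             % skew-symmetric matrix built from u
R = expm(theta*A);                           % rotation about u by theta radians
disp(norm(R*u - u))                          % ~0: the axis u is left fixed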
Example: Rotation Matrices (2)
Find the matrix for the rotation by 27 degrees around the axis through the origin in the direction of
First, we take a unit vector in the same direction
Furthermore, we have angle $\theta = 27 \cdot \frac{\pi}{180}$ radians.
We can use this to find
And find our rotation matrix as
Example: Rotation Matrices (3)
We know this matrix is some 3D rotation. What is its axis / angle?
We can use the theorem to find this. First, we need to recognize the given exponent as $\theta A$, where $A$ is the matrix built from a unit vector, by factoring out the appropriate scalar multiple $\theta$.
So, we find that we have a rotation about
With rotation radians.
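For the reverse direction (recovering the axis and angle from a known rotation), here is a MATLAB sketch mirroring this example; the rotation R below is a hypothetical stand-in, and logm is used to recover the skew-symmetric exponent $\theta A$:

R = expm((pi/4) * [0 -1 0; 1 0 0; 0 0 0]);   % stand-in: rotation about the z-axis
S = real(logm(R));                           % S = theta*A, skew-symmetric
w = [S(3,2); S(1,3); S(2,1)];                % equals theta * (a, b, c)
theta = norm(w);                             % angle of rotation
u = w / theta;                               % unit vector along the axis
disp(theta);  disp(u.')                      % expect pi/4 and (0, 0, 1)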
Singular Value Decomposition
Review: Symmetric Matrices and the Spectral Theorem
An $n \times n$ matrix $A$ is symmetric if $A^T = A$.
Theorem: Spectral Theorem for Real Symmetric Matrices
Let $A$ be an $n \times n$ symmetric matrix with real entries. Then,
- $A$ has all real eigenvalues.
- Any pair of eigenvectors for $A$ with different eigenvalues are going to be orthogonal.
- $A$ is diagonalizable.
- There is an orthonormal basis for $\mathbb{R}^n$ that consists of eigenvectors for this matrix.
- $A$ can be orthogonally diagonalized; in other words, $A = QDQ^T$ for an orthogonal matrix $Q$ and a diagonal matrix $D$.
Recall that $Q^{-1} = Q^T$ is equivalent to $Q$ having orthonormal columns.
Example: Orthogonally Diagonalizing a Symmetric Matrix
Notice how this is a symmetric matrix, so it's orthogonally diagonalizable. Let's orthogonally diagonalize it.
- We find eigenvalues .
- We find eigenvectors .
Our eigenvectors are orthogonal, but not orthonormal! Thus, we need to rescale them to get unit eigenvectors.
So, we orthogonally diagonalize as
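Numerically, MATLAB's eig returns an orthonormal set of eigenvectors for a symmetric input, so an orthogonal diagonalization can be checked directly. A minimal sketch, using a stand-in symmetric matrix since the example's entries are not reproduced above:

A = [2 1; 1 2];                              % stand-in symmetric matrix
[Q, D] = eig(A);                             % Q has orthonormal columns here
disp(norm(Q'*Q - eye(2)))                    % ~0: Q is orthogonal
disp(norm(Q*D*Q' - A))                       % ~0: A = Q D Q'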
Singular Value Decompositions (SVDs)
Let $A$ be an $m \times n$ matrix with real entries. A singular value decomposition (SVD) for $A$ is a factorization of the form
$$A = U \Sigma V^T$$
Where
- $U$ is an $m \times m$ orthogonal matrix.
- $\Sigma$ is an $m \times n$ diagonal matrix, with non-negative (real) diagonal entries $\sigma_1, \sigma_2, \dots$, known as the singular values of $A$.
- $V$ is an $n \times n$ orthogonal matrix, where the columns are known as the right singular vectors of $A$.
Singular value decompositions are very general: every matrix, square or not, has one!
Note that even for a square diagonalizable $A$, its diagonalization need not be its singular value decomposition!
Example
How do we find singular value decompositions? First, note that for any $m \times n$ matrix $A$,
- $A^T A$ and $A A^T$ are symmetric matrices of size $n \times n$ and $m \times m$, respectively. Hence, by the spectral theorem, they are orthogonally diagonalizable.
- The eigenvalues of $A^T A$ and $A A^T$ are non-negative real numbers.
- $A^T A$ and $A A^T$ have the same eigenvalues (with the same multiplicities), except for the eigenvalue $0$ (the larger of the two matrices picks up extra zero eigenvalues).
Now suppose that $A$ has a singular value decomposition $A = U \Sigma V^T$.
Then,
$$A^T A = (U \Sigma V^T)^T (U \Sigma V^T) = V \Sigma^T U^T U \Sigma V^T = V (\Sigma^T \Sigma) V^T$$
This is an orthogonal diagonalization of our matrix! Similarly, we can find
$$A A^T = U (\Sigma \Sigma^T) U^T$$
So, we can find
- $V$ as an orthonormal basis of eigenvectors for $A^T A$ ($V$ plays the role of $Q$ in the orthogonal diagonalization).
- The singular values $\sigma_i$ as the (positive) square roots of $A^T A$'s eigenvalues.
- $U$ as an orthonormal basis of eigenvectors for $A A^T$ ($U$ plays the role of $Q$ in the orthogonal diagonalization).
Note that the equation $A = U \Sigma V^T$ is equivalent to $AV = U \Sigma$, i.e., to
$$A \mathbf{v}_i = \sigma_i \mathbf{u}_i$$
So, an additional requirement is that $A \mathbf{v}_i = \sigma_i \mathbf{u}_i$ for each $i$, and furthermore, if $\sigma_i \neq 0$, then this implies
$$\mathbf{u}_i = \frac{1}{\sigma_i} A \mathbf{v}_i.$$
Theorem: Singular Value Decompositions
Every $m \times n$ matrix $A$ has a singular value decomposition
$$A = U \Sigma V^T$$
Where
- The diagonal entries of $\Sigma$ are $\sigma_i = \sqrt{\lambda_i}$, the square roots of the eigenvalues $\lambda_i$ of $A^T A$.
- The columns of $V$ are an orthonormal basis of eigenvectors for $A^T A$.
- The columns of $U$ are an orthonormal basis of eigenvectors for $A A^T$,
Chosen such that
$$A \mathbf{v}_i = \sigma_i \mathbf{u}_i$$
For each $i$.
So, one strategy to find an SVD for $A$ is as follows:
- Find the eigenvectors and eigenvalues of $A^T A$ to get the $\mathbf{v}_i$'s and $\sigma_i$'s.
- Use the fact that $\mathbf{u}_i = \frac{1}{\sigma_i} A \mathbf{v}_i$ to get the $\mathbf{u}_i$'s for when $\sigma_i$ is non-zero.
- If necessary, get the rest of the $\mathbf{u}_i$'s (for $\sigma_i = 0$) by finding eigenvectors for $A A^T$.
By convention, we order the singular values in decreasing order: $\sigma_1 \ge \sigma_2 \ge \cdots \ge 0$.
This can become very unreasonable to do by hand for many matrices! We can use MATLAB to do an SVD for us, using the command
[U, S, V] = svd(A)
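A minimal MATLAB sketch comparing the by-hand strategy above with the built-in svd command, on a small stand-in matrix of my own (both report the same singular values; the singular vectors may differ in sign):

A = [1 1; 0 1; 1 0];                         % hypothetical 3x2 matrix
[V, L] = eig(A'*A);                          % eigenvectors/eigenvalues of A'A
[lam, order] = sort(diag(L), 'descend');     % order singular values decreasingly
V = V(:, order);
sigma = sqrt(lam);                           % singular values
U1 = (A*V) ./ sigma';                        % u_i = (1/sigma_i) A v_i (sigma_i ~= 0)
disp(sigma')
[U, S, V2] = svd(A);                         % built-in SVD for comparison
disp(diag(S)')                               % same singular values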
Example: Singular Value Decompositions
Find the SVD of
We start by finding $A^T A$, and then its eigenvalues and eigenvectors.
These (after normalizing) are our right singular vectors $\mathbf{v}_i$, with singular values $\sigma_i = \sqrt{\lambda_i}$!
We now find our $\mathbf{u}_i$'s, using $\mathbf{u}_i = \frac{1}{\sigma_i} A \mathbf{v}_i$ for the non-zero $\sigma_i$.
Finally, we need one more $\mathbf{u}_i$: a unit eigenvector for $A A^T$ with eigenvalue $0$. So, we solve $A A^T \mathbf{u} = \mathbf{0}$.
This gives us final result
Inverses and Pseudoinverses
Motivation
Using SVDs, we can define the concept of a “pseudoinverse” for non-invertible matrices!
Suppose $A$ is $n \times n$ and invertible, with SVD
$$A = U \Sigma V^T$$
Where all $\sigma_i > 0$ (otherwise, $A$ would be non-invertible). Then,
$$A^{-1} = (U \Sigma V^T)^{-1} = V \Sigma^{-1} U^T$$
This is the SVD of $A$'s inverse matrix!
Note how $U$ and $V$ got swapped, and all of the $\sigma_i$'s get inverted!
This gives us a way to define "inverse" matrices, even for matrices that don't have an inverse! This defines a "pseudoinverse".
Now, consider a general $m \times n$ matrix $A$ with SVD $A = U \Sigma V^T$.
We can define the Moore-Penrose Pseudoinverse of $A$ to be
$$A^+ = V \Sigma^+ U^T$$
Where $\Sigma^+$ is the transpose of the matrix $\Sigma$, with all non-zero singular values inverted ($\sigma_i \mapsto 1/\sigma_i$).
Some properties of the pseudo-inverse are as follows:
- If $A$ is $n \times n$ and invertible, then $A^+ = A^{-1}$.
- If $A$ is $m \times n$ with linearly independent columns, then $A^+ = (A^T A)^{-1} A^T$.
Example: Pseudo-Inverses
A has SVD
So, we can find its pseudo-inverse as
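Since the example's SVD is not reproduced above, here is a MATLAB sketch that builds the pseudoinverse from an SVD of a stand-in matrix and compares it with the built-in pinv:

A = [1 1; 0 1; 1 0];                         % hypothetical 3x2 matrix
[U, S, V] = svd(A);
Splus = S';                                  % transpose Sigma ...
Splus(Splus ~= 0) = 1 ./ Splus(Splus ~= 0);  % ... and invert the non-zero sigmas
Aplus = V*Splus*U';                          % Moore-Penrose pseudoinverse from the SVD
disp(norm(Aplus - pinv(A)))                  % ~0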
Now consider a linear system $A\mathbf{x} = \mathbf{b}$.
- If $A$ is $n \times n$ and invertible, then $A^+ \mathbf{b} = A^{-1} \mathbf{b}$ is the unique solution to the system.
- If $A$ is $m \times n$ with linearly independent columns, then $A^+ \mathbf{b} = (A^T A)^{-1} A^T \mathbf{b}$ is the unique least-squares solution to the system!
What if $A\mathbf{x} = \mathbf{b}$ has infinitely many solutions? Or what if it is inconsistent but has infinitely many least-squares solutions? What does $A^+ \mathbf{b}$ mean in these cases?
Theorem
The vector $A^+ \mathbf{b}$ is the least-squares solution of the system $A\mathbf{x} = \mathbf{b}$ with the smallest possible norm $\|\mathbf{x}\|$.
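A short MATLAB sketch of this theorem, using a hypothetical rank-deficient system (so the least-squares solution is not unique): pinv(A)*b satisfies the normal equations and, among all least-squares solutions, has the smallest norm.

A = [1 1; 2 2; 3 3];                         % rank 1: least-squares solutions are not unique
b = [1; 0; 1];
x = pinv(A)*b;                               % minimum-norm least-squares solution
disp(norm(A'*(A*x - b)))                     % ~0: x satisfies the normal equations A'Ax = A'b
disp(norm(x))                                % smallest norm among all least-squares solutions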
Image Compression
Matrix Approximations
Consider an $m \times n$ matrix $A$, with SVD $A = U \Sigma V^T$.
We ask, how could we approximate $A$ with a lower rank matrix?
$A$ has rank $r$ equal to the number of non-zero singular values!
Well, for a rank $k < r$, one way we could approximate $A$ is by dropping (setting to zero) all singular values from $\sigma_{k+1}$ to $\sigma_r$! Call the resulting matrix $A_k$.
We can find that this is actually the best rank-$k$ approximation to $A$ possible.
Theorem: Eckart-Young Theorem
$A_k$ is the best rank-$k$ approximation to $A$, with error
$$\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$$
given by the magnitude of the singular values we dropped.
This is known as the Frobenius norm, and we can also find it by taking the sum of the squares of the entries in the matrix $A - A_k$ (then square rooting).
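A minimal MATLAB sketch of this statement on a small stand-in matrix: build the best rank-k approximation by keeping the k largest singular values, and check that the Frobenius error matches the dropped singular values.

A = [3 1 1; 1 3 1];                          % hypothetical 2x3 matrix
[U, S, V] = svd(A);
s = diag(S);                                 % singular values, largest first
k = 1;
Ak = U(:, 1:k) * S(1:k, 1:k) * V(:, 1:k)';   % best rank-k approximation
disp(norm(A - Ak, 'fro'))                    % Frobenius-norm error
disp(norm(s(k+1:end)))                       % sqrt(sum of dropped sigma_i^2): matches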
Example
Find the best rank 1 approximation to
We can find SVD
With this, we can find rank 1 approximation by dropping the smallest singular value.
And furthermore, according to the theorem above, we can find error
Now say we do a rank 1 approximation on
We find a much larger error! Because the dropped singular value is much larger than in our previous example, we get a higher error, so we should expect our approximation to be a lot worse.
Image Compression
But why do we want to be able to approximate matrices like this?
Well, if we write
$$A = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^T + \cdots + \sigma_r \mathbf{u}_r \mathbf{v}_r^T$$
Each of these terms yields an $m \times n$ matrix of rank 1!
Theorem
If $A$ has rank $r$, then
$$A = \sum_{i=1}^{r} \sigma_i \mathbf{u}_i \mathbf{v}_i^T$$
Consequently, the lower rank approximations to $A$ are:
$$A_k = \sum_{i=1}^{k} \sigma_i \mathbf{u}_i \mathbf{v}_i^T, \quad k < r$$
We can find the lower rank approximations by dropping the terms with the smallest $\sigma_i$!
This gives us a way to store lower rank approximations! For example, instead of explicitly storing $A_1 = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T$, we only need to store $\sigma_1$, $\mathbf{u}_1$, and $\mathbf{v}_1$, and the computer can reconstruct the matrix for us!
This is a lot cheaper than storing $A$. If $A$ is $1000 \times 1000$, for example, then instead of storing the entire matrix (1 million entries), we only need to store $1 + 1000 + 1000 = 2001$ entries! If we store these entries in a file, then the computer can take these entries and regenerate the (approximate) image!
Generalizing, we can compute the matrix $A_k$ provided we know and store the collection
$$\{(\sigma_i, \mathbf{u}_i, \mathbf{v}_i)\}_{i=1}^{k}$$
Which would take $k(1 + m + n)$ entries, as opposed to the original $mn$ entries of the matrix! In fact, we can have our approximation take only $k(m + n)$ entries, if we multiply the $\sigma_i$'s into one of the vectors!
This will be useful if $k(m + n)$ is much smaller than $mn$.
So suppose we have a (grayscale) picture that is $m \times n$ pixels (could be $1000 \times 1000$, say). The color of each pixel is a shade of gray, encoded as a number between 0 (black) and 1 (white).
This gives us a matrix $A$, which we can compress using rank-$k$ approximations!
Now, is there a quantitative way to gauge the quality of our compressed image?
- The error in approximation is found as the Frobenius norm,
$$\|A - A_k\|_F = \sqrt{\sigma_{k+1}^2 + \cdots + \sigma_r^2}$$
- But the above error can be large if you have many pixels! So, normalizing the above error, we can find the relative error
$$\frac{\|A - A_k\|_F}{\|A\|_F}$$
It is convenient to work with the square of this! This tells us how bad the compression is. Now, if we want to know how good it is, we can subtract it from 1! This is known as the compression rate / image quality,
$$1 - \frac{\|A - A_k\|_F^2}{\|A\|_F^2} = \frac{\sigma_1^2 + \cdots + \sigma_k^2}{\sigma_1^2 + \cdots + \sigma_r^2},$$
tracking the percentage of variance preserved by the compression.
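A closing MATLAB sketch of these metrics, using a random stand-in for the image matrix (a real photograph has rapidly decaying singular values and compresses far better than random data):

A = rand(200, 300);                          % stand-in grayscale "image" with entries in [0, 1]
s = svd(A);                                  % singular values
k = 20;                                      % rank of the approximation
quality = sum(s(1:k).^2) / sum(s.^2);        % fraction of variance preserved (image quality)
storage = k*(200 + 300 + 1) / (200*300);     % stored entries relative to the original matrix
disp([quality, storage])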