(Linear) Discrete Dynamical Systems
We first review some concepts around eigenvalues and eigenvectors.
An eigenvector for $A$ is a non-zero vector $\vec{v}$ with the property that
$$A\vec{v} = \lambda\vec{v}$$
where $\lambda$ is some scalar value. In other words, $A\vec{v}$ is a scalar multiple of $\vec{v}$.
We can solve for eigenvectors using the following system:
$$(A - \lambda I)\vec{v} = \vec{0}$$
where $I$ is the identity matrix.
If $\lambda$ is an eigenvalue for $A$, then the following are also true.
- There is a non-zero $\vec{v}$ such that $A\vec{v} = \lambda\vec{v}$.
- There is a non-zero $\vec{v}$ such that $(A - \lambda I)\vec{v} = \vec{0}$.
- $A - \lambda I$ is not invertible.
Additionally, if $\lambda$ is an eigenvalue, then we know that $\lambda$ satisfies the characteristic equation
$$\det(A - \lambda I) = 0$$
So, generally, the process is:
- Find the eigenvalues for $A$ by solving the characteristic equation $\det(A - \lambda I) = 0$ for $\lambda$.
- Find eigenvectors for each eigenvalue by solving the system $(A - \lambda I)\vec{v} = \vec{0}$.
Example: Eigenvectors and Eigenvalues
We find the eigenvalues first by solving the characteristic equation $\det(A - \lambda I) = 0$.
We can now find our eigenvectors. Say we want an eigenvector for the eigenvalue $\lambda$. Then, we solve the system $(A - \lambda I)\vec{v} = \vec{0}$; any non-zero solution $\vec{v}$ is an eigenvector for $\lambda$.
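As a quick computational sketch (the $2 \times 2$ matrix below is a made-up example, not necessarily the one from these notes), numpy can carry out both steps:

```python
import numpy as np

# A hypothetical 2x2 matrix, used only to illustrate the process.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)
print("eigenvalues:", eigenvalues)      # the eigenvalues 5 and 2, in some order

# Check the defining property A v = lambda v for the first eigenpair.
lam, v = eigenvalues[0], eigenvectors[:, 0]
print(np.allclose(A @ v, lam * v))      # True
```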
An $n \times n$ matrix $A$ is diagonalizable if there is an invertible matrix $P$ and a diagonal matrix $D$ such that
$$A = PDP^{-1}$$
Theorem: Eigenvectors of Diagonalizable Matrices
An $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors $\vec{v}_1, \dots, \vec{v}_n$.
In this case, $A = PDP^{-1}$ where $P$’s columns are the eigenvectors, and $D$’s diagonal values are their respective eigenvalues.
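Continuing the sketch above, we can verify the factorization numerically for the same assumed example matrix:

```python
import numpy as np

# Same hypothetical matrix as above; P's columns are eigenvectors, D holds the eigenvalues.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigenvalues, P = np.linalg.eig(A)
D = np.diag(eigenvalues)

# If A is diagonalizable, reassembling P D P^{-1} recovers A.
print(np.allclose(P @ D @ np.linalg.inv(P), A))  # True
```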
Let’s see how we can use these concepts.
Suppose we have a population of predators (e.g. hawks) who live among a population of prey (e.g. rats). Let $h_n$ be the number of hawks after $n$ months have passed, and $r_n$ be the number of rats (in thousands) after $n$ months. We’ll only consider whole numbers of months.
We can represent and make inferences about this system as follows. Collect the populations into the vector
$$\vec{x}_n = \begin{bmatrix} h_n \\ r_n \end{bmatrix}$$
Example: Finding $\vec{x}_n$
Assume that these populations evolve according to a linear model of the form
$$h_{n+1} = a\,h_n + b\,r_n, \qquad r_{n+1} = c\,h_n + d\,r_n,$$
that is, $\vec{x}_{n+1} = A\vec{x}_n$ for a fixed $2 \times 2$ matrix $A$ of coefficients.
Note that if we know this, then given the populations in some month $n$, we should be able to compute the populations in the next month $n + 1$!
Now say we know the initial populations $h_0$ and $r_0$, and hence the vector $\vec{x}_0$.
To find $\vec{x}_n$, we find that
$$\vec{x}_n = A\vec{x}_{n-1} = A(A\vec{x}_{n-2}) = \cdots = A^n\vec{x}_0$$
This gives us a nice way to find any $\vec{x}_n$ given our initial conditions $\vec{x}_0$!
We can plug in our values to find what the populations are for any month $n$.
For this to be useful for large $n$, we want to be able to find a formula for $A^n$. How could we do this?
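As a rough sketch of this iteration, assuming a made-up model matrix and made-up initial populations (not the values from this example):

```python
import numpy as np

# Hypothetical model matrix and initial populations (hawks, rats in thousands);
# these numbers are illustrative placeholders only.
A = np.array([[0.5, 0.4],
              [-0.1, 1.1]])
x = np.array([10.0, 20.0])       # x_0 = [h_0, r_0]

# Step forward one month at a time: x_{n+1} = A x_n.
for n in range(12):
    x = A @ x
print("populations after 12 months:", x)

# Equivalently, x_12 = A^12 x_0 in one shot.
x0 = np.array([10.0, 20.0])
print(np.linalg.matrix_power(A, 12) @ x0)
```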
Diagonal Matrices
If $A$ were diagonal, this would be really easy to solve! The power of any diagonal matrix is just its entries raised to each power individually:
$$\begin{bmatrix} d_1 & 0 \\ 0 & d_2 \end{bmatrix}^n = \begin{bmatrix} d_1^n & 0 \\ 0 & d_2^n \end{bmatrix}$$
What if $A$ is diagonalizable?
Then, interestingly enough,
$$A^n = (PDP^{-1})^n = PD^nP^{-1},$$
since all of the inner $P^{-1}P$ factors cancel. This can be used to conveniently find $A^n$! We can then multiply this with $\vec{x}_0$ to find a closed formula for $\vec{x}_n$.
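A small sketch of this trick, again with an assumed example matrix:

```python
import numpy as np

# Computing A^n via A^n = P D^n P^{-1}; A is again a made-up example.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigenvalues, P = np.linalg.eig(A)
n = 10

# D^n is cheap: just raise each diagonal entry (eigenvalue) to the n-th power.
Dn = np.diag(eigenvalues ** n)
An = P @ Dn @ np.linalg.inv(P)

print(np.allclose(An, np.linalg.matrix_power(A, n)))  # True
```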
Example: Finding $A^n$
In our above system, we can diagonalize our matrix as $A = PDP^{-1}$, with the eigenvectors of $A$ as the columns of $P$ and the eigenvalues on the diagonal of $D$, and find
$$\vec{x}_n = A^n\vec{x}_0 = PD^nP^{-1}\vec{x}_0$$
We can see what happens as $n \to \infty$: the term involving the largest eigenvalue dominates, and in this example the populations won’t die off. For large $n$,
$$\vec{x}_n \approx c\,\lambda_1^n\,\vec{v}_1,$$
where $\lambda_1$ is the largest eigenvalue, $\vec{v}_1$ is its eigenvector, and $c$ is a constant determined by $\vec{x}_0$. Note that the entries of $\vec{x}_n$ are then (approximately) proportional to the entries of $\vec{v}_1$, which tells us what the stable proportion of hawks to rats is in the long term!
Another application of these systems is in Recurrence Relations. Consider the following example.
Example: Recurrence Relations
The Fibonacci Numbers are given as
$$F_0 = 0, \quad F_1 = 1, \quad F_{n+1} = F_n + F_{n-1} \text{ for } n \geq 1$$
Can we find a closed formula for $F_n$?
To know what comes next, we need a pair of consecutive Fibonacci numbers. Let $\vec{x}_n = \begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix}$. Then,
$$\vec{x}_{n+1} = \begin{bmatrix} F_{n+2} \\ F_{n+1} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix} = A\vec{x}_n$$
We’ve found a formula for $\vec{x}_{n+1}$ in terms of $\vec{x}_n$ (with initial condition $\vec{x}_0 = \begin{bmatrix} F_1 \\ F_0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$).
We can solve for $\vec{x}_n$ as
$$\vec{x}_n = A^n\vec{x}_0 = PD^nP^{-1}\vec{x}_0$$
to find a closed form solution.
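A short sketch of both routes, assuming the convention $F_0 = 0$, $F_1 = 1$ (the golden-ratio constants below come from diagonalizing the matrix):

```python
import numpy as np

# Fibonacci via the companion matrix: [F_{n+1}, F_n]^T = A^n [F_1, F_0]^T.
A = np.array([[1, 1],
              [1, 0]])

def fib(n):
    x = np.array([1, 0])                      # x_0 = [F_1, F_0]
    x = np.linalg.matrix_power(A, n) @ x
    return x[1]                               # second entry is F_n

print([fib(n) for n in range(10)])            # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Diagonalizing A gives Binet's closed form: F_n = (phi^n - psi^n) / sqrt(5).
phi, psi = (1 + 5**0.5) / 2, (1 - 5**0.5) / 2
print(round((phi**9 - psi**9) / 5**0.5))      # 34, matches fib(9)
```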
Markov Chains
Suppose every year, we have the following changes:
- 10% of those living in the city decide to move to the suburbs.
- 5% of those living in the suburbs decide to move to the city.
Let $c_n$ denote the proportion of the total population living in the city after $n$ years, and $s_n$ denote the proportion of the total population living in the suburbs after $n$ years. Note that by this definition,
$$c_n + s_n = 1$$
Let $\vec{x}_n = \begin{bmatrix} c_n \\ s_n \end{bmatrix}$. This gives us the system
$$\vec{x}_{n+1} = \begin{bmatrix} 0.90 & 0.05 \\ 0.10 & 0.95 \end{bmatrix} \vec{x}_n = M\vec{x}_n$$
This is a special kind of linear discrete system called a Markov Chain! The matrix $M$ has the special property that all of its column sums equal 1.
Now suppose we initially have 40% of the population in the city and 60% of the population in the suburbs, so $\vec{x}_0 = \begin{bmatrix} 0.4 \\ 0.6 \end{bmatrix}$. Then, using our matrix, we can deduce things about our system! For example, what happens to $\vec{x}_n$ as $n \to \infty$?
While we could diagonalize our matrix like in linear discrete systems, the unique properties of Markov Chains give us a more convenient way to do this!
Theorem
Say we have a Markov Chain $\vec{x}_{n+1} = M\vec{x}_n$ such that
$$\vec{x}_\infty = \lim_{n \to \infty} \vec{x}_n$$
exists and is non-zero. Then, $\vec{x}_\infty$ is an eigenvector for $M$ with eigenvalue $1$.
Given the above theorem, we can find what we converge to by solving for the eigenvalues and eigenvectors! In our previous example, we find eigenvalue $\lambda = 1$ and eigenvectors that are scalar multiples of $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$.
Note that by the assumption that the proportions should sum to 1, there is only one eigenvector that satisfies our requirements: $\vec{q} = \begin{bmatrix} 1/3 \\ 2/3 \end{bmatrix}$.
This is what our system will converge to for our input, and in fact, for any valid input to the system!
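A quick numerical check of this example (the matrix below is assembled from the stated 10% and 5% moving rates):

```python
import numpy as np

# Transition matrix: columns are "from city" and "from suburbs".
M = np.array([[0.90, 0.05],
              [0.10, 0.95]])
x = np.array([0.40, 0.60])    # 40% city, 60% suburbs initially

# Iterating the chain converges to the steady state.
for _ in range(200):
    x = M @ x
print(x)                      # approximately [1/3, 2/3]

# The same vector is the eigenvector of M for eigenvalue 1, rescaled to sum to 1.
vals, vecs = np.linalg.eig(M)
q = vecs[:, np.argmin(np.abs(vals - 1))]
print(q / q.sum())            # approximately [0.333..., 0.666...]
```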
A probability vector is a vector $\vec{x}$ such that $x_i \geq 0$ for all $i$, and
$$\sum_i x_i = 1$$
An $n \times n$ matrix $M$ is called a stochastic matrix if every column of $M$ is a probability vector.
Theorem: Stochastic Matrices and Probability Vectors
If $M$ is stochastic and $\vec{x}$ is a probability vector, then $M\vec{x}$ is a probability vector.
Furthermore, the product of two stochastic matrices is stochastic (as each column of the product is a matrix-vector product of a stochastic matrix with a probability vector).
So, if $M$ is stochastic, then $M^k$ is stochastic for any positive integer $k$.
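A quick numerical sanity check of this fact, reusing the city/suburbs matrix from above:

```python
import numpy as np

# Powers of a stochastic matrix stay stochastic: every column still sums to 1.
M = np.array([[0.90, 0.05],
              [0.10, 0.95]])
for k in (1, 2, 5, 10):
    print(k, np.linalg.matrix_power(M, k).sum(axis=0))  # each column sum is 1.0
```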
A Markov Chain is a dynamical system
$$\vec{x}_{n+1} = M\vec{x}_n$$
where $M$ is stochastic, and $\vec{x}_0$ is a probability vector. By the above theorem, it follows that all $\vec{x}_n$ are also probability vectors.
In a Markov Chain, the stochastic matrix $M$ is sometimes called a transition matrix.
Theorem
If $M$ is stochastic, then $\lambda = 1$ is an eigenvalue for $M$.
Proof (Sketch)
It can be shown that $M$ and $M^T$ always have the same eigenvalues. Because the rows of $M^T$ are probability vectors (they are the columns of $M$), the all-ones vector times $M^T$ is itself, so $1$ is an eigenvalue of $M^T$, and hence of $M$.
A stochastic matrix $M$ is called regular if there is some integer $k \geq 1$ such that all entries of $M^k$ are strictly positive (nonzero).
If $M^k$ has all positive entries, then so does $M^j$ for every $j \geq k$. So, one way to verify that a matrix is regular is just to raise it to a high power and check whether all of the entries are positive!
Example: Regular Stochastic Matrices
The first example matrix has all positive entries, so it is regular with $k = 1$.
The second example matrix has some zero entries, but one of its powers has all positive entries, so it is regular (with $k$ equal to that power).
The third example matrix is upper triangular, and a product of upper triangular matrices is upper triangular, so every power of it keeps a zero entry below the diagonal. Thus, it is not regular.
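A sketch of that check; the two matrices below are made-up illustrations mirroring the regular and non-regular cases above:

```python
import numpy as np

def is_regular(M, max_power=50):
    # A stochastic matrix is regular if some power has strictly positive entries.
    # Checking powers up to max_power is a practical (not exhaustive) test.
    P = np.eye(M.shape[0])
    for k in range(1, max_power + 1):
        P = P @ M
        if np.all(P > 0):
            return True, k
    return False, None

# M1 has a zero entry but M1^2 is all-positive (regular, k = 2);
# M2 is upper triangular, so no power is ever all-positive (not regular).
M1 = np.array([[0.0, 0.5],
               [1.0, 0.5]])
M2 = np.array([[1.0, 0.3],
               [0.0, 0.7]])
print(is_regular(M1))   # (True, 2)
print(is_regular(M2))   # (False, None)
```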
A steady state vector for a stochastic matrix $M$ is a probability vector $\vec{q}$ such that
$$M\vec{q} = \vec{q}$$
In other words, a probability vector that is an eigenvector for $M$ for $\lambda = 1$.
Theorem
Suppose $M$ is a regular stochastic matrix. Then, there is a unique steady state vector $\vec{q}$ for $M$.
Further, if $\vec{x}_0$ is any probability vector, then
$$\lim_{n \to \infty} M^n \vec{x}_0 = \vec{q}$$
This is an important result we use to solve Markov Chains!
Example
The weather in Columbus is either good, indifferent, or bad on any given day.
- If good today, then for tomorrow we have a 60% chance of good, 30% chance of indifferent, and 10% chance of bad weather.
- If indifferent today, then for tomorrow we have a 40% chance of good, 30% chance of indifferent, and 30% chance of bad weather.
- If bad today, then for tomorrow we have a 40% chance of good, 50% chance of indifferent, and 10% chance of bad weather.
What is the probability that any given day has good weather?
We start by creating the transition matrix (rows and columns ordered good, indifferent, bad; column $j$ gives tomorrow’s probabilities given today’s weather $j$)
$$P = \begin{bmatrix} 0.6 & 0.4 & 0.4 \\ 0.3 & 0.3 & 0.5 \\ 0.1 & 0.3 & 0.1 \end{bmatrix}$$
This is a Markov Chain with a regular $P$! So, by our theorem, we can find a unique steady state vector $\vec{q}$.
Recall the steady state vector has to be a probability vector! So, we need to normalize the eigenvector we find for $\lambda = 1$. Solving $(P - I)\vec{q} = \vec{0}$ and scaling so the entries sum to 1 gives
$$\vec{q} = \begin{bmatrix} 1/2 \\ 1/3 \\ 1/6 \end{bmatrix}$$
So in the long term, $1/2$ of the days are good, $1/3$ are indifferent, and $1/6$ of the days are bad.
Note that if $\vec{x}_0$ contains probabilities for the weather today, then $P^n \vec{x}_0$ contains the probabilities $n$ days from now.
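A sketch of this computation, with the transition matrix encoding the probabilities listed above:

```python
import numpy as np

# Weather transition matrix (rows/columns ordered good, indifferent, bad).
P = np.array([[0.6, 0.4, 0.4],
              [0.3, 0.3, 0.5],
              [0.1, 0.3, 0.1]])

# Steady state = eigenvector for eigenvalue 1, rescaled into a probability vector.
vals, vecs = np.linalg.eig(P)
q = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
q = q / q.sum()
print(q)   # approximately [0.5, 0.3333, 0.1667] = [1/2, 1/3, 1/6]
```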
Interpreting $P^n$
The $(i, j)$ entry of $P$, $P_{ij}$, represents the probability of moving from state $j$ to state $i$. What about the entries of $P^n$?
The $(i, j)$ entry of $P^n$ represents the probability of starting at state $j$ and ending at state $i$ after $n$ steps.
Based on this, if $P$ is regular, then we know that for every pair of states $i, j$, there is some integer $n$ such that it is possible to get from state $j$ to state $i$ in $n$ steps!
Example: Random Walks
Consider the following maze with 5 rooms.
```mermaid
graph LR
  1 o--o 2 & 3;
  2 o--o 3 & 4;
  4 o--o 3;
  5 o--o 3 & 4;
```
A mouse runs through this maze with 5 rooms. At each time step (every second), the mouse will leave its current room and move to an adjacent room, choosing the next room uniformly at random.
This is a particular type of Markov Chain called a random walk. This gives us the matrix
$$M = \begin{bmatrix} 0 & 1/3 & 1/4 & 0 & 0 \\ 1/2 & 0 & 1/4 & 1/3 & 0 \\ 1/2 & 1/3 & 0 & 1/3 & 1/2 \\ 0 & 1/3 & 1/4 & 0 & 1/2 \\ 0 & 0 & 1/4 & 1/3 & 0 \end{bmatrix}$$
where column $j$ gives an equal chance of moving to each room adjacent to room $j$.
We can analyze this matrix as we’ve done previously!
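A sketch of building this matrix directly from the maze’s adjacency structure and finding its steady state:

```python
import numpy as np

# Column j of the random-walk matrix is a uniform distribution over the rooms
# adjacent to room j, read off the maze diagram above.
neighbors = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4, 5], 4: [2, 3, 5], 5: [3, 4]}

M = np.zeros((5, 5))
for j, adj in neighbors.items():
    for i in adj:
        M[i - 1, j - 1] = 1 / len(adj)

# The steady state spends more time in rooms with more doors.
vals, vecs = np.linalg.eig(M)
q = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
print(q / q.sum())   # proportional to each room's number of doors: [2, 3, 4, 3, 2] / 14
```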
Google Pagerank Algorithm
The Google Pagerank algorithm is an application of Markov Chains.
Google needs to rank webpages based on which are “important”. This will be based entirely on how the webpages link to each other. The basic idea is that a webpage is important if many other pages link to it. However, being linked to by an important webpage should carry more weight than being linked to by a page no one cares about.
Google’s idea is to treat websurfing like a random walk in which the websurfer is randomly clicking links. A page is ranked based on the percentage of time the random walk spends on that page (in the steady state).
Some issues with this:
- Some pages don’t have outbound links.
- Sometimes, websurfers load up a new page without following a link.
We assume the following about the random websurfer (RW):
- RW starts at some page.
- If there are outbound links, there is an 85% chance that RW follows one of the links, each link being equally likely. There is a 15% chance that RW instead visits a page chosen at random from all possible pages (not following a link).
- If the page has no outbound links, there is a 100% chance RW visits a random page, chosen from all possible pages.
- RW continues this forever.
Example: Google Pagerank
Suppose there are 4 webpages, linked as follows:
```mermaid
graph LR
  3 --> 1;
  2 --> 3;
  1 --> 2 & 3;
  4 --> 2 & 3;
```
This gives us the transition matrix
$$G = 0.85\begin{bmatrix} 0 & 0 & 1 & 0 \\ 1/2 & 0 & 0 & 1/2 \\ 1/2 & 1 & 0 & 1/2 \\ 0 & 0 & 0 & 0 \end{bmatrix} + 0.15\begin{bmatrix} 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \end{bmatrix}$$
where column $j$ describes where a surfer currently on page $j$ goes next.
Note that if a page had no outbound links, then that page would have a 100% chance of going somewhere random. We handle this by filling that page’s column, in both the 0.85 (link) matrix and the 0.15 (random jump) matrix, with an equal chance of going to any website.
We can then use this matrix and find its steady state vector!
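A sketch of the whole computation, with the link matrix read off the diagram above:

```python
import numpy as np

# Link matrix A: column j spreads page j's outbound links equally.
A = np.array([[0,   0, 1, 0  ],    # page 1 is linked to by page 3
              [0.5, 0, 0, 0.5],    # page 2 is linked to by pages 1 and 4
              [0.5, 1, 0, 0.5],    # page 3 is linked to by pages 1, 2, and 4
              [0,   0, 0, 0  ]])   # nothing links to page 4

# Google matrix: 85% follow a link, 15% jump to a uniformly random page.
G = 0.85 * A + 0.15 * np.full((4, 4), 1 / 4)

# The PageRank vector is the steady state of G.
vals, vecs = np.linalg.eig(G)
q = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
print(q / q.sum())   # page 4 ranks lowest, since nothing links to it
```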
Absorbing Markov Chains
Here we consider a different type of Markov chain, where the underlying system has distinct ending points. In these systems, we ask about the ending state of the system, rather than (as we’re used to) the equilibrium point as time goes to infinity.
In tennis, you have to win by at least 2 points. If both players have scored 3 points, the game enters “deuce”.
During a deuce, the game flow is
```mermaid
graph LR
1[Deuce] -. Player A Scores .-> 2[Advantage A];
1 -. Player B Scores .-> 3[Advantage B];
2 -. Player B Scores .-> 1;
3 -. Player A Scores .-> 1;
2 -. Player A Scores .-> 4[A Wins];
3 -. Player B Scores .-> 5[B Wins];
```
Suppose at any given moment, Player A scores the next point with probability $p$, so Player B scores with probability $1 - p$. We’ll assume $p = 0.6$.
What is the probability that Player A wins?
We can model this as a Markov Chain
```mermaid
graph LR
1[1] -. "0.6" .-> 2[2];
1 -. "0.4" .-> 3[3];
2 -. "0.4" .-> 1;
3 -. "0.6" .-> 1;
2 -. "0.6" .-> 4[4];
3 -. "0.4" .-> 5[5];
4 -. "1" .-> 4;
5 -. "1" .-> 5;
```
Note the transitions at the end points back to themselves. These are called absorbing states, and we define them so that we have a valid Markov Chain.
This gives us the transition matrix (states ordered 1 through 5 as in the diagram)
$$T = \begin{bmatrix} 0 & 0.4 & 0.6 & 0 & 0 \\ 0.6 & 0 & 0 & 0 & 0 \\ 0.4 & 0 & 0 & 0 & 0 \\ 0 & 0.6 & 0 & 1 & 0 \\ 0 & 0 & 0.4 & 0 & 1 \end{bmatrix}$$
States 4,5 are called absorbing states, as they are states you can’t leave. States 1,2,3 are called transient states. This is an absorbing Markov Chain.
Note that $T$ is not regular, and we are not interested in simply finding the steady state vectors of the system.
If the game starts in deuce, then our starting point is $\vec{x}_0 = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \end{bmatrix}^T$, and
$$\vec{x}_n = T^n\vec{x}_0$$
gives the probabilities of being in each of the 5 states after $n$ scores. If we carry out some of these computations for increasingly large $n$, we start to see our final probabilities for the game’s outcome, and we can say that Player A has a 69.23% chance to win, and Player B has a 30.77% chance to win!
In general, we find that our answers came from
$$\lim_{n \to \infty} T^n \vec{x}_0$$
So, we want a better understanding of $T^n$. Given our $T$, let’s first separate our transient states from our absorbing states, both in columns and rows. This gives us a $2 \times 2$ grid of blocks,
$$T = \begin{bmatrix} Q & 0 \\ S & I \end{bmatrix}$$
where $Q$ (transitions among transient states) and $S$ (transitions from transient to absorbing states) are matrices, $0$ is a zero matrix, and $I$ is the identity matrix.
Consider powers of $T$:
$$T^2 = \begin{bmatrix} Q^2 & 0 \\ S(I + Q) & I \end{bmatrix}, \qquad T^3 = \begin{bmatrix} Q^3 & 0 \\ S(I + Q + Q^2) & I \end{bmatrix}$$
Continuing this for any $n$, we find the general form
$$T^n = \begin{bmatrix} Q^n & 0 \\ S(I + Q + \cdots + Q^{n-1}) & I \end{bmatrix}$$
So, since $Q^n \to 0$ (every transient state is eventually left for good),
$$\lim_{n \to \infty} T^n = \begin{bmatrix} 0 & 0 \\ S(I + Q + Q^2 + \cdots) & I \end{bmatrix}$$
Interestingly, the sum $I + Q + Q^2 + \cdots$ is equivalent to $(I - Q)^{-1}$! This is where we’ll find all of our interesting probabilities.
Theorem: Absorbing Markov Chains
For an absorbing Markov Chain,
- The $(i, j)$ entry of $S(I - Q)^{-1}$ contains the probability of ending in absorbing state $i$ given that we start in transient state $j$.
- The $(i, j)$ entry of $(I - Q)^{-1}$ contains the expected number of visits to transient state $i$ given we start in transient state $j$.
- The sum of column $j$ of $(I - Q)^{-1}$ is the expected number of time steps until we reach an absorbing state, given we start in transient state $j$.
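As a closing sketch, here is this computation for the tennis example, with the transient states (deuce, advantage A, advantage B) ordered before the absorbing states (A wins, B wins), using the $p = 0.6$ assumption from above:

```python
import numpy as np

# Blocks of T for the tennis chain.
Q = np.array([[0.0, 0.4, 0.6],    # transient -> transient
              [0.6, 0.0, 0.0],
              [0.4, 0.0, 0.0]])
S = np.array([[0.0, 0.6, 0.0],    # transient -> absorbing (A wins, B wins)
              [0.0, 0.0, 0.4]])

N = np.linalg.inv(np.eye(3) - Q)   # (I - Q)^{-1}: expected visits to transient states
B = S @ N                          # absorption probabilities

print(B[:, 0])        # starting from deuce: [0.6923..., 0.3077...]
print(N.sum(axis=0))  # expected number of scores until the game ends, per starting state
```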