MATH411 is a continuation of MATH410, but into higher dimensions.
Derivatives in Several Variables
Limits
Let $A \subseteq \mathbb{R}^n$. Then, $x$ is a limit point of $A$ if there is a sequence $\{x_k\}$ in $A \setminus \{x\}$ such that $x_k \to x$.
In other words, there is a sequence not containing converging to it.
If we have function , and is a limit point of , then the limit of the function is defined as
if ,
Example: Existence of Limits (1)
.
does not exist, as we can choose , and , which have different limits as .
Example: Existence of Limits (2)
In this case, exists and equals 0.
Let not containing . Then,
Theorem: Compositions of Limits
Let , be a limit point. Let , be functions, such that
Then:
- If for all , and ,
The quotient rule for limits is the most interesting of the 3, and there is a broad study of limits of quotients
Where .
These limits can occur frequently, and we commonly ask if such limits exist (think of derivatives!).
Example: Limit Example
We ask if this limit exists. To determine this, we will establish a bound on the function.
For any where and are both nonzero, our function is bounded by ! Thus, as , , so by the Comparison Lemma, .
Thus, the limit exists and is equal to 0!
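As a quick numerical sanity check (an addition, not from the lecture, and using hypothetical stand-in functions since the original examples' formulas are not shown here), we can evaluate candidate functions along several paths into the origin: if two paths give different limiting values, the limit cannot exist.

```python
import numpy as np

# Hypothetical stand-ins (assumptions, not the course's exact functions):
#   g(x, y) = x*y / (x^2 + y^2)      -- limit at (0,0) does NOT exist
#   f(x, y) = x^2*y / (x^2 + y^2)    -- limit at (0,0) exists and equals 0
def g(x, y):
    return x * y / (x**2 + y**2)

def f(x, y):
    return x**2 * y / (x**2 + y**2)

t = np.array([10.0**-k for k in range(1, 8)])  # points approaching 0

# Approach (0,0) along two different lines: y = 0 and y = x.
print("g along (t, 0):", g(t, 0 * t))   # -> 0
print("g along (t, t):", g(t, t))       # -> 1/2, so the limit of g does not exist
print("f along (t, 0):", f(t, 0 * t))   # -> 0
print("f along (t, t):", f(t, t))       # -> 0, consistent with the limit being 0
```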
Theorem: Limit Equivalences
Let and let be a limit point of . For a function , and , the following assertions are equivalent:
In other words, for any , if , then- , there exists some such that
We can also use the following property to show that such limits exist.
A function $f$ is homogeneous of degree $k$ if, for all $t > 0$ and all $x \neq 0$, $f(tx) = t^k f(x)$.
Basically, we should be able to replace $x$ with $tx$, and pull the $t$ out as a $t^k$ factor.
Example: Homogeneous Functions
Is homogeneous of degree 1, because
Is homogeneous of degree 0, because
Curiously, is homogeneous of degree 1 and has a limit to , whereas is homogeneous of degree 0 and doesn't. Does this suggest some generalization?
Proposition: Limits of Homogeneous Functions
If is continuous, and homogeneous of degree , then
Proof
We look at , and we try to make the case that as , .
We write this as a product of something that approaches 0 and something that is bounded.
By assumption, is homogeneous of degree , so letting ,
As , ! Also, (the unit sphere of dimension ) is sequentially compact, and continuous functions on sequentially compact sets are bounded.
Example: Continuity with Homogeneity
We show that is by homogeneity.
We know that is homogeneous of degree 2. So, is homogeneous of degree 1.
It is a fact that if and is homogeneous of degree , then is homogeneous of degree .
Let , . Then, say is homogeneous of degree .
By a theorem, continuous homogeneous functions of positive degree go to 0 as . So, .
This does not necessarily mean that homogeneous functions of degree don’t have a limit at 0! Just that some don’t.
Any constant function (e.g. ) has a defined limit as !
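The sketch below (an addition, with hypothetical example functions) estimates the degree of homogeneity numerically by comparing $f(tx)$ with $f(x)$; a positive estimated degree is the situation in which the proposition guarantees a limit of 0 at the origin.

```python
import numpy as np

def homogeneity_degree(f, x, t=3.0):
    """Estimate k such that f(t*x) = t**k * f(x) (assumes f(x) != 0)."""
    return np.log(abs(f(t * x)) / abs(f(x))) / np.log(t)

# Hypothetical examples (assumptions): one of degree 1 and one of degree 0.
f1 = lambda v: v[0]**2 * v[1] / (v[0]**2 + v[1]**2)   # degree 1
f0 = lambda v: v[0] * v[1] / (v[0]**2 + v[1]**2)      # degree 0

x = np.array([0.7, -0.3])
print(homogeneity_degree(f1, x))  # ~1.0 -> the proposition gives a limit of 0 at the origin
print(homogeneity_degree(f0, x))  # ~0.0 -> the proposition does not apply
```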
Partial Derivatives
Let , .
For $1 \le i \le n$, we define the partial derivative of $f$ with respect to $x_i$ at $x$ as
$$\frac{\partial f}{\partial x_i}(x) = \lim_{t \to 0} \frac{f(x + t e_i) - f(x)}{t}$$
if the latter limit exists. Note that $e_i$ is the $i$th standard basis vector, $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, with the 1 in the $i$th slot.
Keep all variables constant except , and take !
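Here is a small numerical sketch (an addition, not from the notes) of this difference-quotient definition; the function `f` is a hypothetical example, and `partial_derivative` is a helper name chosen for illustration.

```python
import numpy as np

def partial_derivative(f, x, i, h=1e-6):
    """Difference-quotient approximation of the i-th partial derivative of f at x."""
    e = np.zeros_like(x, dtype=float)
    e[i] = 1.0
    return (f(x + h * e) - f(x - h * e)) / (2 * h)

# Hypothetical example (assumption): f(x, y) = x**2 * y + sin(y)
f = lambda v: v[0]**2 * v[1] + np.sin(v[1])
x = np.array([1.0, 2.0])
print(partial_derivative(f, x, 0))  # ~ 2*x*y = 4
print(partial_derivative(f, x, 1))  # ~ x**2 + cos(y) = 1 + cos(2)
```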
Example: Partial Derivatives and Continuity
Let ,
We noticed that is not continuous at , and exist (by the quotient rule) at all .
Then, does exist at ?
Yes! It exists and is equal to 0.
Similarly, we can find .
This is very interesting! Even though is not continuous at , our partial derivatives still exist! This goes against our understanding of differentiability and continuity in the single variable case.
Let open, . Then, we say has first-order partial derivatives if for all , the function has a partial derivative with respect to its component, at every point in its domain.
Differentiability Need Not Imply Continuity
In the single-variable case, a function with a derivative was continuous. However, this is no longer true in multiple variables!
A function with first-order derivatives need not be continuous. Consider the following example.
Example: Differentiability Need Not Imply Continuity
Define
We show that the partial derivatives of the function exist at . For all ,
So,
However, this function is not continuous! For sequence , for all , but !
It is only if all partials are continuous, that our theorems from the single-variable case hold.
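The sketch below illustrates this phenomenon numerically. The function used is a hypothetical stand-in (an assumption, since the example's formula is not shown here): its partial derivatives at the origin exist, yet it takes the value 1/2 along a sequence converging to the origin.

```python
# Hypothetical stand-in (assumption) for the counterexample:
#   f(x, y) = x*y / (x^2 + y^2) for (x, y) != (0, 0),  f(0, 0) = 0.
def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x**2 + y**2)

# Both partial derivatives exist at the origin (the difference quotients are 0):
t = 1e-6
print((f(t, 0) - f(0, 0)) / t)   # partial in x at (0,0): 0
print((f(0, t) - f(0, 0)) / t)   # partial in y at (0,0): 0

# ...yet f is not continuous at the origin: along (1/k, 1/k), f is identically 1/2.
print([f(1 / k, 1 / k) for k in (10, 100, 1000)])  # [0.5, 0.5, 0.5] != f(0, 0)
```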
We say that is continuously differentiable, if it has first-order partial derivatives such that each partial derivative is continuous for .
Let’s now consider second-order partial derivatives, denoted like
Where we apply the partial derivative of first, then after.
Order matters! There are some functions where swapping the order of derivatives changes the result.
- We say has second-order partial derivatives if it has first-order partials, such that for , each also has first-order partial derivatives (of every variable).
- We say has continuous second-order partial derivatives if it has second-order partial derivatives, and each are continuous.
Theorem: Partial Derivative Order
Let open, and let have continuous second-order partial derivatives. Then, for any two , and any ,
Proof (TODO)
Let . Then,
Define . Then,
By the mean value theorem, this is equal to
We create two equivalent functions that converge to the two partial derivatives, respectively, forcing equality.
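As a numerical illustration of the theorem (an addition, with a hypothetical smooth function), we can approximate both mixed second-order partials by central differences and observe that they agree.

```python
import numpy as np

def mixed_partial(f, x, i, j, h=1e-4):
    """Central-difference approximation of the mixed second partial of f in x_i and x_j."""
    ei, ej = np.zeros_like(x), np.zeros_like(x)
    ei[i], ej[j] = h, h
    return (f(x + ei + ej) - f(x + ei - ej)
            - f(x - ei + ej) + f(x - ei - ej)) / (4 * h * h)

# Hypothetical smooth example (assumption): f(x, y) = exp(x)*sin(y) + x*y**3
f = lambda v: np.exp(v[0]) * np.sin(v[1]) + v[0] * v[1]**3
x = np.array([0.3, -0.7])
print(mixed_partial(f, x, 0, 1))  # differentiate in x, then y
print(mixed_partial(f, x, 1, 0))  # differentiate in y, then x -- same value, as the theorem asserts
```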
Directional Derivatives and MVT
Recall that in the single variable case, we had the Mean Value Theorem.
Theorem: Mean Value Theorem
Let be continuous, and differentiable on . Then, such that
This is a really useful theorem! In this section, we generalize it to multiple variables.
This generalization requires we use the single-variable MVT!
Lemma: Mean Value Lemma
Let open, and let . Let have a partial derivative with respect to for all .
Let , and be a real number such that the segment between and lies in . Then, such that
Intuition: If we view our function along an axis, we get a function on one-variable. On this, we can apply single-variable MVT!
Proof
Let be the open interval of real numbers containing 0 and . Note that by assumption, , is in our open set .
Now, define . Then, as has a partial derivative with respect to , we find that is differentiable, whose derivative is given as
Thus, we can apply the single-variable MVT to find a such that
We use this Lemma to prove the following.
Proposition: Mean Value Proposition
Let be a function. Assume all partials exist , .
Choose an , and an offset . Then, there exists a in the ball around of radius () such that
Proof
We prove this for , though the proof can very easily be extended to more dimensions.
Let , . Look at the difference. Our goal is to expand this difference into a sum of differences along one variable (only one variable changes), so that we can apply the Mean Value Lemma on each term!
This gives us two differences, where only one variable is changing in each. In other words, we have two differences in one-dimension!
Thus, by the Mean Value Lemma,
Let , and . Note that each is within the ball of . We are done!
Recall that in our definitions of partial derivatives, we differentiate a function with respect to one of the axes
But what if we wanted to differentiate in a direction that isn’t aligned with the axes? This is where directional derivatives come in!
Let open, and consider the function $f$. For a point $x$, and direction $p$, we define the directional derivative of $f$ at $x$ in the direction $p$ as
$$\frac{\partial f}{\partial p}(x) = \lim_{t \to 0} \frac{f(x + tp) - f(x)}{t},$$
If the limit exists.
Now let's define the gradient of the function, $\nabla f(x)$, as the row vector
$$\nabla f(x) = \left( \frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x) \right).$$
In some cases, we can calculate the directional derivative using the gradient, which can be a lot easier than taking a limit!
Theorem: Directional Derivative Theorem
Let open, and let be .
Then, , and all directions , the function has a directional derivative at in the direction , which can be calculated as
In other words, the inner product of with the gradient of the function!
Proof
By the Mean Value Proposition,
For . Then, as , the ball of will shrink towards , forcing all ’s to converge to ! Thus, as we have
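A quick numerical check of the theorem (an addition, using a hypothetical $C^1$ function): the difference-quotient directional derivative matches the inner product of the gradient with the direction.

```python
import numpy as np

def grad(f, x, h=1e-6):
    """Numerical gradient of f at x (central differences)."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

def directional_derivative(f, x, p, h=1e-6):
    """Difference-quotient definition of the directional derivative."""
    return (f(x + h * p) - f(x - h * p)) / (2 * h)

# Hypothetical C^1 example (assumption): f(x, y) = x**2 + x*y
f = lambda v: v[0]**2 + v[0] * v[1]
x = np.array([1.0, 2.0])
p = np.array([3.0, -1.0])
print(directional_derivative(f, x, p))  # limit definition
print(np.dot(grad(f, x), p))            # inner product with the gradient -- agrees with the theorem
```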
Note that sometimes the directional derivative may be denoted as
Theorem: The Mean Value Theorem (Multi-Variable)
Let be continuously differentiable. Also let , where .
Then, if the segment joining lies in , then there exists such that
This is the Mean Value Proposition, with the additional assertion that the partial derivatives are all evaluated at the same point.
Proof
Let , . We know that for , we have
Then, , , by the single-variable MVT, and furthermore, as the derivative of is the directional derivative,
We can also use directional derivatives to make a few extra inferences.
Note that if is a vector of norm 1, we can interpret the directional derivative as the rate of change in a particular direction!
Theorem: Fastest Rate of Change
Let , . Fix , and assume . Then, the maximum of the directional derivative at is given as
Is attained for
In other words, the direction of the gradient.
Proof
If , we have
By Cauchy-Schwarz, this is
We have an upper bound on our directional derivative! We can attain our upper bound if .
We’ve found a maximizer.
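The sketch below (an addition, with a hypothetical function and point) samples many unit directions and confirms that no directional derivative exceeds the norm of the gradient, and that the bound is attained in the gradient direction.

```python
import numpy as np

# Hypothetical example (assumption): f(x, y) = x**2 + x*y, at the point (1, 2).
f = lambda v: v[0]**2 + v[0] * v[1]
x = np.array([1.0, 2.0])
grad_f = np.array([2 * x[0] + x[1], x[0]])           # gradient computed by hand

rng = np.random.default_rng(0)
dirs = rng.normal(size=(1000, 2))
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # random unit directions

rates = dirs @ grad_f                                # directional derivatives in each unit direction
print(rates.max())                                   # never exceeds the norm of the gradient
print(np.linalg.norm(grad_f))                        # the Cauchy-Schwarz upper bound
print(np.dot(grad_f / np.linalg.norm(grad_f), grad_f))  # attained in the gradient direction
```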
Furthermore, we can use directional derivatives to prove a notion of continuity in multiple variables.
Theorem: Partial Derivatives and Continuity
Let , and assume is continuously differentiable. Then, is continuous.
Recall that if is , then all partials exist and are continuous.
Proof
We look at , and claim that as , .
By MVT, for some ,
By Cauchy Schwarz, we can bound this by
But as is convergent, and the functions are continuous, we can find a bound for the first term!
So, this drops to 0.
By this proof, in fact, if all the partials exist and are bounded, then is still continuous!
We end with a small remark that will segue into the next section. Let . Then,
This can be proven by using Cauchy-Schwarz.
We use this to define differentiable functions! is differentiable at if such that
as .
This is a stronger notion than partial differentiation! So,
- implies that is differentiable
- differentiable implies that all partials of exist
But, the converses are not true!
Local Approximation of Real-Valued Functions
First Order Approximations
Motivation
Say we have some function, and we want to analyze its behavior in an area around the point . One way to do this is to choose another function that approximates , yet is simpler! We can then work with to see what properties it has (and inherits from ).
Let , and . For a positive integer , we say that functions are order approximations of one another at if
We ask, can we find a first-order approximation for a given function ?
Theorem: First Order Approximation Theorem
Let open, be . Then, for , we have first order approximation of
Proof
Recall previously that by MVT, we find such that
We can subtract both sides by and apply Cauchy Schwarz to obtain
Because is continuously differentiable, we know that
So by the Comparison Lemma, we can force our original limit to be 0.
We can alternatively write this in a few ways.
- Let denote some error depending on and . Then, our approximation can be given as
As the error drops to 0 when dividing by , we can also say that the error is of first order, .
- Letting , fixed, we can also write our error as
So if is fixed, and is sufficiently close to , then we have a close approximation!
We can also interpret this formula geometrically. In fact, interestingly enough, our first order approximation is equivalent to a tangent plane approximation of our function!
Proof
Define to be the function . This defines a surface in 3-dimensions. We will define the tangent plane at .
At , we find tangent directions by differentiating with respect to variables and .
We take the cross product, to find a vector orthogonal to both. This will give us a vector that is a normal to our surface.
We can use this to define the tangent plane at as
Thus is the same as our first-order approximation formula! Simply redefine to be offsets , .
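As a numerical illustration (an addition, with a hypothetical $C^1$ function and expansion point), we can check that the first-order error, divided by $\|h\|$, tends to 0.

```python
import numpy as np

# Hypothetical C^1 example (assumption): f(x, y) = exp(x) * y, expanded at x* = (0, 1).
f = lambda v: np.exp(v[0]) * v[1]
xstar = np.array([0.0, 1.0])
grad = np.array([np.exp(xstar[0]) * xstar[1], np.exp(xstar[0])])  # gradient at x*, here (1, 1)

for k in range(1, 7):
    h = np.array([1.0, -2.0]) * 10.0**(-k)
    err = f(xstar + h) - (f(xstar) + np.dot(grad, h))   # first-order approximation error
    print(k, err / np.linalg.norm(h))                    # ratio -> 0, so the error is o(||h||)
```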
Second Order Approximations and Second Derivatives
Motivation
In the single-variable case, we had the second-derivative test for determining minimums and maximums. Here, we develop the corresponding test for multiple variables.
Definitions and Context
Let be an matrix. Note that for any vector , the matrix-vector product
Is equivalent to the vector of inner products of the rows of with ! If denotes the row of , then
This fact will be useful later!
Let be an matrix. Then, the function given by
Is known as the quadratic function associated with the matrix .
This function gives us a clean notation for generalizing directional derivatives into higher orders!
Let be . We define the Hessian Matrix of , denoted , as the matrix where for each pair of indices ,
In other words,
Note that if has continuous second-order partials, then the Hessian Matrix is symmetric because the entry would equal the entry!
We use the quadratic function notation to define higher order directional derivatives. If , fixed, then
Notice the pattern!
Proof
For (1), this is a chain rule.
For (2), we have
Remark
If , then
In the above formulas, (2) will be quite useful in establishing a second-derivative criterion for the multi-variable case. However, we will also need some way to estimate the sizes of the values that quadratic functions can take on! These tools are given as follows.
Let $A = (a_{ij})$ be an $n \times n$ matrix. The Hilbert-Schmidt norm of $A$ is given as
$$\|A\|_{HS} = \left( \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2 \right)^{1/2}.$$
We think of the matrix as a long vector, and take the vector norm.
With this norm for a matrix, we can generalize the Cauchy-Schwarz Inequality!
Theorem: Generalized Cauchy Schwarz Inequality
Let be , and . Then,
Proof
We can also define the operator norm of as
Based on this, and the Generalized Cauchy-Schwarz Inequality, we can find that for ,
Let $A$ be a matrix. $A$ is positive definite if $\langle Ah, h \rangle > 0$ for all $h \neq 0$.
Similarly, $A$ is negative definite if $\langle Ah, h \rangle < 0$ for all $h \neq 0$.
Proposition: Properties of Positive Definite Matrices
Let be a positive definite matrix. Then, there exists a such that
For all .
Proof
Note that the LHS and the RHS are both homogeneous degree 2.
Thus, our equation is true for if and only if it is true for any other , .
Thus, it suffices to show that our equation is true for unit vectors , as by the earlier proposition, the argument applies for all . Then,
This creates a continuous function along a hypersphere in , which is sequentially compact. Thus, it must have a minimum and maximum. Choose the minimum to find our .
In fact, we can find by taking the minimum of the eigenvalues.
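The sketch below (an addition, with a hypothetical positive definite matrix) computes $c$ as the smallest eigenvalue and verifies the inequality $\langle Ah, h \rangle \ge c \|h\|^2$ on random samples.

```python
import numpy as np

# Hypothetical positive definite matrix (assumption): A = [[2, 1], [1, 3]].
A = np.array([[2.0, 1.0], [1.0, 3.0]])
c = np.linalg.eigvalsh(A).min()           # smallest eigenvalue of the symmetric matrix

rng = np.random.default_rng(1)
H = rng.normal(size=(10000, 2))
quad = np.einsum('ij,jk,ik->i', H, A, H)  # the quadratic function <A h, h> for each sample h
print(c)                                  # the constant from the proposition
print((quad >= c * (H**2).sum(axis=1) - 1e-9).all())  # <A h, h> >= c ||h||^2 holds
```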
Second Order Approximation and Second Derivative Test
Let , . Also, let . Then, we have the following definitions:
- is a local minimizer if there exists a such that
- is a local maximizer if there exists a such that
- is a local extreme point if it is either a local minimizer or a local maximizer for .
Note that is a strict minimizer / maximizer if the inequality is strictly less than or greater than.
In the single-variable case, we found that for a local extremum to occur, the derivative must be 0. We define the analogous case for multiple variables.
Theorem: Necessity for Local Extremum
Let open, and let have first-order partial derivatives. If is a local extreme point for , then
But unlike the single variable case, finding the ’s such that this holds is very difficult, as we get a system of equations! To help us with this, we need a more formal way to define the behaviors of functions! We define a test analogous to the single-variable Second-Derivative Test to help us with this.
By the Lagrange Remainder Theorem, recall that if , exists for every , then for all , there exists a such that
We can generalize this to the multi-variable case!
Theorem: Multi-Variable Remainder Theorem
Let , . Then, for , there exists such that
Proof
Let . Then,
For some .
Notice that this holds only because we assumed the second order derivatives are continuous. We find each term to be
This is in fact a second order approximation of !
Theorem: Second Order Approximation Theorem
Let , . Then,
Proof
With the Second-Order Approximation Theorem, we can define a multi-variable analogy to the Second-Derivative test.
Theorem: Second Derivative Test
Let , .
Let be a point such that .
If the Hessian Matrix is positive definite, then is a strict local minimizer.
If the Hessian Matrix negative definite, then is a strict local maximizer.
Proof (TODO)
We know
With .
By assumption, , and is positive definite matrix. By definition, such that .
So,
But we can’t guarantee that is positive, and in fact, it can be negative! So now, we show that could be negative, but its magnitude cannot be too large.
Since $\lim_{h \to 0} \frac{R(h)}{\|h\|^2} = 0$, $\exists \delta > 0$ such that
Thus,
By definition, is a strict local minimizer.
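As a worked illustration of the test (an addition, with a hypothetical function whose Hessian is constant), we classify a critical point by checking the signs of the Hessian's eigenvalues.

```python
import numpy as np

# Hypothetical example (assumption): f(x, y) = x**2 + x*y + 2*y**2, critical point (0, 0).
# Gradient: (2x + y, x + 4y), which vanishes at the origin.
hessian = np.array([[2.0, 1.0],
                    [1.0, 4.0]])         # constant Hessian of this quadratic
eigs = np.linalg.eigvalsh(hessian)
print(eigs)                               # all eigenvalues > 0 -> positive definite
if (eigs > 0).all():
    print("strict local minimizer")       # conclusion of the second derivative test
elif (eigs < 0).all():
    print("strict local maximizer")
```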
Is the converse of this theorem also true? In other words, let . Assume is a local minimizer. Then . But what about ? Does it have to be positive definite?
No! As a counterexample, let . Then, we have a strict local minimizer at , but the is not positive definite.
But then, what’s a necessary condition for a minimizer? We discuss this below.
Let be a symmetric matrix.
- is positive semi-definite if for all .
- is negative semi-definite if for all .
Note that the inner product can now be 0, where it couldn’t be before!
Theorem: Necessity Condition for Extremum
If has a minimizer at , then , (can be 0). In other words, has to be positive semi-definite.
Proof
Look at . Then, , and has a minimizer at . So, , and . In other words, the matrix is positive semi-definite.
In particular, if , has a local minimum at , then all
And similarly, if has a local maximum at , then all
Proposition (IMPORTANT FOR EXAM)
Let be open in , . Assume the Laplacian of at is positive for all .
Then, has no maximizer in .
Proof
Assume by contradiction that has a maximizer in , given as . Then, , and is positive semi-definite. So, , meaning
Which is a contradiction.
Theorem
Let be a bounded open set, (giving us a sequentially compact set).
Let , continuous on , satisfy
Then,
In other words, the maximum always occurs at the boundary.
Proof
Because , trivially.
Look at . Note that by this, . Then,
Thus, has no interior maximum, and so the maximum of occurs at the boundary of .
Where depends on . Let to get
We showed the inequalities in both directions, so we have equality. We are done.
Higher Order Approximations
Let , and let there be a multi-index where (a vector of nonnegative integers).
The multi-index will be used to “select” things we want later on!
With the multi-index, we define operations
Proposition: Multinomial Formula
Proof
We prove this by induction. Start with . Then, we have
Now, suppose that our formula is true for (). Prove it for .
Define , having length . Then, has length , and
So,
Let . Look at
We know that
The one dimensional Taylor expansion!
Express in terms of partials of , .
Let . Then, we have
We have proven the following.
Theorem: Higher Order Approximations
Let . Then,
For some
Notice the similarity with .
For , which is the 1-dimensional approximation formula!
Linear Map Approximations of Non-Linear Mappings
Motivation
Before, we studied linear mappings, or in other words, functions that can be expressed as linear transformations. Now, we turn to examine mappings that may not necessarily be linear!
Linear Mappings
We say a function $T : \mathbb{R}^n \to \mathbb{R}^m$ is linear if for all $x, y \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$,
$$T(ax + by) = aT(x) + bT(y).$$
Theorem: Linear Mappings as Matrices
If is linear, then there exists a unique matrix such that
Proof
Uniqueness
Let there be two matrices satisfying our property. Then,
Now, observe that for any row of , we are taking the dot product of with that row. Let be the row, to see that the norm must be 0. This is only possible if the row is the 0 vector.
Existence
Let be the standard basis. Represent as
Then,
As given above, linear transformations can be given as their matrices, and in fact, many of their properties can be expressed in terms of matrices as well!
First, we consider compositions of transformations. Let us have linear transformations , . Let be matrices such that
Then, the matrix of the composition of these transformations is
Which is the product of the matrices!
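A small numerical sketch (an addition, with hypothetical matrices) of this fact: applying the two transformations in sequence agrees with applying the single product matrix.

```python
import numpy as np

# Hypothetical matrices (assumptions):
A = np.array([[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]])   # matrix of a map R^2 -> R^3
B = np.array([[2.0, 0.0, 1.0], [1.0, 1.0, -1.0]])      # matrix of a map R^3 -> R^2

x = np.array([1.0, -2.0, 0.5])
print(A @ (B @ x))        # apply the two transformations step by step
print((A @ B) @ x)        # apply the single product matrix once -- same result
```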
Let’s now consider inverses of transformations.
Theorem: Invertible Transformations
linear, is invertible (as a function) if and only if the corresponding matrix is invertible as a matrix if and only if .
We commonly determine that transformations are invertible by checking the matrices!
Theorem: Properties of Invertible Matrices
Let be an matrix. Then, is invertible if and only if such that
By definition, is invertible if there exists a matrix such .
Proof
Proof ()
If for all , then the null space of is , as if , then forces .
From linear algebra, we know that if is linear on a finite-dimensional space , and the null space is , then the range of is and is invertible.
Proof ()
Conversely, if such that , then we wish to find a such that .
Because , then .
This is the generalized Cauchy Schwarz inequality!
Thus,
Let .
Recall that .
If is a vector space with bases , and also , and the bases are related by
If the matrix of with respect to is , is , then .
Proof
We know that , then
So, .
The Derivative Matrix and Differential
We consider the following classes of mappings. These are mappings that may be non-linear, but can be approximated by linear mappings.
Let , and consider mapping represented as component functions
We have the following definitions for
- is said to have first-order partial derivatives at , provided that for all , has first-order partial derivatives at .
- is said to have first-order partial derivatives, if it has first-order partial derivatives for all .
- is said to be continuously differentiable provided that all of the ’s are continuously differentiable.
Theorem: Continuity on Mappings
Let , and .
Let be continuously differentiable. Then, is continuous.
Now, define with first-order partials at . We define the derivative matrix of at , denoted , as the matrix whose th entry is given by
We define the gradient of a function as a row vector.
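Below is a small numerical sketch (an addition, not the course's construction) that approximates the derivative matrix by difference quotients; `derivative_matrix` and the mapping `F` are illustrative names.

```python
import numpy as np

def derivative_matrix(F, x, h=1e-6):
    """Numerical derivative matrix DF(x): entry (i, j) is the j-th partial of the i-th component."""
    Fx = np.asarray(F(x))
    D = np.zeros((len(Fx), len(x)))
    for j in range(len(x)):
        e = np.zeros_like(x); e[j] = h
        D[:, j] = (np.asarray(F(x + e)) - np.asarray(F(x - e))) / (2 * h)
    return D

# Hypothetical mapping (assumption): F(x, y) = (x*y, x + y, sin(x))
F = lambda v: np.array([v[0] * v[1], v[0] + v[1], np.sin(v[0])])
print(derivative_matrix(F, np.array([1.0, 2.0])))
# rows are the gradients of the component functions:
# [[2, 1], [1, 1], [cos(1), 0]]
```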
We use this derivative matrix to generalize our findings in earlier sections.
Theorem: Mean Value Theorem for Mappings
Let . Then, for , we find such that
This is the multi-variable MVT applied to each component!
Proof
Apply the MVT for each .
Mean Value Theorem Misconception
Note that above, if we chose all ’s to be equal, then we would have
Which seems like a very clean generalization of the MVT! However, it is not guaranteed that we can find a single that works for each .
Theorem: First-Order Approximation Theorem for Mappings
Let . Then,
Proof
The component of the above quantity, from the previous chapter, is
It can be shown that at , is the only matrix in which this limit holds.
Theorem
Let . Fix , and suppose there exists an matrix such that
Then, the mapping has first-order partial derivatives at , and .
Proof
Look at the component.
In particular, for , , we get
We also say is differentiable at if there exists an , matrix such that
So, implies that F is differentiable , which implies that exists .
These are strict implications! See the examples below.
Example: Counterexamples
The below function is an example of for which exists , but there does not exist an such that
Example
Let . Assume satisfies .
Prove that such that .
We know that
Since
We know that such that
The Chain Rule
From the single-variable case, recall that for , we can find the derivative of as
We can generalize this rule to higher dimensions!
Theorem: The Chain Rule
Let , and let be continuously differentiable. Also let to define continuously differentiable.
Suppose that . Then, the composition is also continuously differentiable, and for , we can find its partial derivative as
Theorem: The Chain Rule for General Mappings
Let open. Let , and let open to define . Let be continuously differentiable.
Suppose that . Then, their composition is also continuously differentiable, and for each , we can find
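As a numerical check of the chain rule (an addition, with hypothetical mappings `F` and `G`), we can compare the derivative matrix of the composition against the product of the derivative matrices; the `derivative_matrix` helper from the earlier sketch is repeated here so the snippet is self-contained.

```python
import numpy as np

def derivative_matrix(F, x, h=1e-6):
    """Numerical derivative matrix of F at x (central differences)."""
    Fx = np.asarray(F(x))
    D = np.zeros((len(Fx), len(x)))
    for j in range(len(x)):
        e = np.zeros_like(x); e[j] = h
        D[:, j] = (np.asarray(F(x + e)) - np.asarray(F(x - e))) / (2 * h)
    return D

# Hypothetical mappings (assumptions): G maps R^2 -> R^2, F maps R^2 -> R^3.
G = lambda v: np.array([v[0] * v[1], v[0] - v[1]])
F = lambda u: np.array([u[0]**2, u[0] + u[1], np.exp(u[1])])

x = np.array([0.5, 1.5])
lhs = derivative_matrix(lambda v: F(G(v)), x)               # derivative matrix of the composition
rhs = derivative_matrix(F, G(x)) @ derivative_matrix(G, x)  # product of the derivative matrices
print(np.allclose(lhs, rhs, atol=1e-4))                     # True: the chain rule
```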
The Inverse Function Theorem
Motivation
The Inverse Function Theorem provides a sufficiency condition for when a function is one-to-one and invertible, and when we can compute the inverse. We generalize this to higher dimensions here!
Inverse Function Theorem: 1D, 2D
Recall the single-variable Inverse Function Theorem.
Theorem: Inverse Function Theorem (One Dimension)
Let continuously differentiable, and let such that .
Then, there is an open interval around , and an open interval containing such that the function
Is 1-1 and onto. Furthermore, in these intervals, we can define the function inverse continuously differentiable as
for all .
This theorem is pretty important, as it tells us when we can find a function’s inverse! More importantly, it asserts that even if an entire function is not invertible, it may have smaller intervals where it is invertible.
We now generalize this theorem to higher dimensions!
We say that an open subset of containing is a neighborhood of the point . Using this definition, we will now generalize the Inverse Function Theorem to 2 dimensions.
Theorem: Inverse Function Theorem (Two Dimensions)
Let , . Suppose at , is invertible.
Then, there exists a neighborhood around , and a neighborhood around such that
Is 1-1 and onto. Furthermore, is , and for a point where , we can find the derivative matrix of the inverse as
The inverse of the derivative matrix at !
Example: Inverse Function Theorem (2D)
We find that
So for all , our derivative matrix is non-zero! So, by the 2D Inverse Function Theorem, for any , there exists a neighborhood of , of such that
Is 1-1 and onto.
We ask, what happens at ? Does there exist a neighborhood of such that is 1-1 on ?
In fact, the answer is that this is impossible, because . So, for any neighborhood around , is not one-to-one.
Example: Inverse Function Theorem (2D, 2)
Let , and
We find derivative matrix
As the determinant of this matrix is always 0, we find that is not invertible anywhere.
Example: Inverse Function Theorem (2D, 3)
Note that the Inverse Function Theorem provides a sufficiency condition, and is not a necessity.
To show this, we find a such that
Yet is 1-1 and onto. Let . Then, even though the determinant of the derivative matrix at or , is 1-1 and onto.
However, we can prove that for all 3 conclusions of our theorem to hold, our assumptions must hold. Let and assume that is not invertible, and is 1-1, is onto.
By way of contradiction, as
Then , but is not invertible at some point, which is not possible, as its inverse exists!
Example: Locality of Inverse Function Theorem (2D)
We have
Where . So, the 2D Inverse Function Theorem holds for all ! In other words, for all we can find a local neighborhood such that is 1-1 and onto.
However, note that this theorem does not hold globally, just locally.
- is not 1-1 globally as we can find .
- is not onto globally, as there does not exist any such that .
Stability of Non-Linear Mappings
We will now introduce concepts necessary to generalize the inverse function theorem.
Theorem
For an matrix , the following are equivalent:
- is invertible
- such that , .
We say that a mapping , open, is stable if such that
Remark
stable implies that is 1-1.
As a brief proof, if , then .
Note that is stable if and only if is Lipschitz (as the inequalities are flipped!)
Interestingly, we find that matrices that are sufficiently close to an invertible matrix are also invertible. In other words, matrices that are close to 1-1 matrices are also 1-1!
Lemma
Let be an matrix, and assume that such that
Now let be an matrix such that . Then, .
Proof
This lets us prove the 1-1 condition on the General Inverse Function Theorem.
Theorem: Nonlinear Stability Theorem
Let , open. Assume that we have a point such that , is invertible.
Then there exists a neighborhood of such that
1. is stable on (implying is 1-1 on ).
2. The derivative matrix of is invertible .
Proof
For point 2, look at . Thus, neighborhood of such that on .
For point 1, look at . If belong to the ball , we have
For some on the line from to .
As we know that is invertible, then such that
If is so small that
For all . Then,
Minimization Principle, General Inverse Function Theorem
To prove the General Inverse Function Theorem, we will introduce an auxiliary function such that its minimizers are solutions of some given equation.
Suppose we have a , open, and where is invertible. Then, from the previous section, we can find a neighborhood of such that is invertible for and $c > 0$ such that
Using this, we can show the following.
Proposition: A Minimization Principle
Let open in , , . Assume that the derivative matrix of is invertible .
Let , the distance between and . If has an (interior) minimizer at , then .
Proof
Suppose we have:
- A function that transforms to .
- A function that takes the squared norm of its input,
Then, . Assume is a minimizer of . Then,
Because is invertible, then
Note that because , . So,
And as .
Lemma: The Open-Image Lemma
Suppose we have a , open, and where is invertible. Then, is open.
Proof
Let . By assumption, we know that such that .
Let , the sphere of radius centered around . We know that all points along this sphere are greater than .
We will show that if , then there exists an such that .
Look at , where is the closed ball of radius around (includes the border). A minimizer exists, because we have a continuous function on a compact set. We furthermore rule out a boundary minimizer.
Let (so, ). We know that . So,
But , , so no point on can be the minimizer. So, the minimizer must be in the interior, so by the previous lemma, the minimizer must be such that .
Suppose we have a , open, and where is invertible. Using the previous proofs, we have found that there exists a neighborhood of , a neighborhood of $F(x^*)$, such that:
- is invertible for all
- such that
- .
By general properties of functions, is well defined. Finally, to show that the inverse is , we will prove that
Proof
To prove this, it suffices to show that
We use the notation . On the LHS, we have
We want to show that .
So, we have
This is a first order approximation! So, this goes to 0 as , as then .
This gives us the General Inverse Function Theorem.
Theorem: General Inverse Function Theorem
Let open, and let be . Now, let be invertible for some .
Then, there is a neighborhood of , a neighborhood of , such that is 1-1 and onto. Furthermore, is also , and for such that ,
We also give a second proof of the inverse function theorem based on the contraction mapping principle.
Proof (Contraction Mapping Principle)
Let . Let where is invertible. We will show that such that if , then such that .
In other words, is locally one-to-one and onto in a local neighborhood of !
We want to solve if and only if . We will use the contraction mapping principle to show that there exists a fixed point of .
Create a sequence
Remark
Notice the similarity to Newton's method and its root-finding iteration.
The main step is as follows: such that
The left hand side equals
Choose such that
So, we found a such that
Next, we will show that maps its domain onto itself.
Let . Look at . This is equal to
So, is a contraction, and has a fixed point.
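The sketch below (an addition) mimics the iteration from this proof on a hypothetical mapping: the derivative matrix is frozen at the base point, and the iteration converges to a solution of $F(x) = y$ for $y$ near $F(x^*)$.

```python
import numpy as np

# Hypothetical mapping (assumption): F(x, y) = (x + y**2, x*y), base point x* = (1, 1).
F = lambda v: np.array([v[0] + v[1]**2, v[0] * v[1]])

xstar = np.array([1.0, 1.0])
DF = np.array([[1.0, 2 * xstar[1]],
               [xstar[1], xstar[0]]])         # DF(x*), computed by hand; invertible here
y = F(xstar) + np.array([0.05, -0.02])        # a target value near F(x*)

x = xstar.copy()
for _ in range(50):
    x = x - np.linalg.solve(DF, F(x) - y)     # x_{k+1} = x_k - DF(x*)^{-1} (F(x_k) - y)
print(x, np.linalg.norm(F(x) - y))            # the fixed point solves F(x) = y
```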
The Implicit Function Theorem
We now discuss the Implicit Function Theorem. This lets us create local descriptions of the set of points where a function is equal to 0, , also known as a level-curve!
2D Case: Dini’s Theorem
Let us have function . We ask, when is the set
A curve?
Examples
This will yield an empty set, so we don’t have a curve.
This will yield 1 point, so we don’t have a curve.
What does it actually mean for a set to be a curve?
Intuitively, such a set is a curve if we can define a function to represent the points. More formally, we say is a curve if for all points in the set , there is a neighborhood of , and function such that
In other words, the points in the neighborhood can be represented by some localized output of a function!
Theorem: Dini's Theorem
Let open in , . Let be a point in , and assume .
Then, , and a function such that , and if
And , then .
In this box, the 0-set of takes on a function .
Proof
We know
Without loss of generality, suppose . Then, such that
in the box . So, because , we know that along the vertical line , is strictly increasing, so
We can find an interval around this vertical line where it is always negative around and positive around . In other words, such that if , and if .
By IVT, for we can find a such that for all fixed in our interval , we find a in the vertical line such that . Define , the unique such that , .
Basically, we find a 0 by IVT for every vertical line in this interval!
We have constructed a such that
And if in our box, then .
We have our function . We must now show that is , as we cannot guarantee that our 's on every sliver of the interval form a continuous function.
To do this, we first show that is continuous. Let . We look at as the difference between two points , and apply MVT. So, on the line from to such that
So, we get
We can find an upper bound for the numerator as we have a continuous function on a compact set! As we assumed that the denominator is bounded away from 0, we can find a bound for the fraction, so such that
Thus, is continuous, as , and thus
This also gives us a formula for !
If we know is , differentiate
To get
We can solve for with this!
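A symbolic sketch of this computation (an addition, with a hypothetical $f$), using the relation $f_x + f_y \, g'(x) = 0$ obtained by differentiating $f(x, g(x)) = 0$:

```python
import sympy as sp

# Hypothetical example (assumption): f(x, y) = x**2 + y**3 - 2, which vanishes at (1, 1).
x, y = sp.symbols('x y')
f = x**2 + y**3 - 2

# Differentiating f(x, g(x)) = 0 gives  f_x + f_y * g'(x) = 0,  so  g'(x) = -f_x / f_y.
gprime = -sp.diff(f, x) / sp.diff(f, y)
print(gprime)                       # -2*x/(3*y**2)
print(gprime.subs({x: 1, y: 1}))    # slope of the implicit curve at (1, 1): -2/3
```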
Implicit Function Theorem
This can be generalized to higher dimensions!
Remark
We can generalize this!
Let ,
And assume the gradient at this point is not 0. Then, by the same proof, we can find such that such that !
Theorem: The Implicit Function Theorem
We look at points , where , . Let be open in , .
Let such that , invertible. Then, , and a function
Such that
And if , and , then . Also, can be computed by the chain rule.
Proof
Let ,
Clearly, , and
Because is invertible, this matrix has a non-zero determinant, so the inverse function theorem applies!
So, and a neighborhood of such that is 1-1, onto with a inverse . We define this inverse as
We use the fact that . By plugging into , we get
So,
Define . Pick such that . Because which is , is too . Subbing this in, we get
Now, if , and , we write that
If , then .
Example: Implicit Function Theorem
Let . Assume and
Which of the following is true? , , such that
The second one! In the implicit function theorem, we need a such that is invertible. So, choose them to be , with free variable . Then, we can apply our implicit function theorem to get result (2).
We ask, is it possible for ? No. If the above holds, then by the chain rule, we find
And at ,
But this gives us , which is impossible!
Finally, we will show a formula for , . We use the property that . We know that starting with , we map
So, by chain rule,
We can use this to solve for !
Example
Describe solutions to
On the LHS, we have .
We expect this to be a curve through . We will try to describe this curve locally. We find
Define . We need invertible, so we choose (as column and in will give us an invertible matrix).
We get , or in other words, , so our solutions look like .
Surfaces and Paths in
Let . Look at the level set of this function, the set of points where the function is 0.
Assume for all . Then is a surface
Recall that for to be a surface, , there exists a neighborhood of such that is a function.
Proof
Let . Without loss of generality, assume that .
By the implicit function theorem, there exists a and a function such that
And furthermore, these are the only solutions to in .
So we have a transformation parameterizing near . To figure out the tangent vectors at , look at the change in one variable along a particular direction.
- Fixing , we differentiate in the direction to get tangent
- Fixing , we differentiate in the direction to get tangent
We claim that . Look at for all . By the chain rule,
So, . By a similar argument, it is also orthogonal to .
We can find another vector orthogonal to by taking the cross product! Take . Then, such that
Let . Define the intersection of the two functions' level sets,
Intuitively, we’re intersecting 2 2-dimensional surfaces. So we should expect a 1-dimensional curve!
A sufficient condition for to be a 1-dimensional curve in is
Equivalently, let . Then, we require
Has rank 2 for all .
If so, without loss of generality, the derivative matrix with respect to , is invertible. By the implicit function theorem, , and
Such that for all , and these are the only solutions in .
Thus, agrees with the graph in , and we can parameterize it as
With tangent vector at given as , and
These are two normals to our curve!
So, is a non-zero tangent vector to the curve, so such that
We generalize.
An -dimensional manifold embedded in , . Let , and assume that the matrix has maximal rank if .
If so, we will represent the level set
Locally, as a graph.
Let , . Without loss of generality, (the rightmost entries) is invertible. Thus, and such that
And these are the only solutions if . Thus, agrees with the graph .
This is an -dimensional manifold!
We need linearly independent tangent vectors at . The process is the same as before: fix all variables but one, and differentiate with respect to that remaining variable. These are our tangent vectors!
The range of at is the tangent space above, as
So, the tangent space to at is the null space of .
We have , and an open set . We want to know if looks like a smooth surface.
Recall that we say that if
Then is a smooth surface at .
This is equivalent to saying the derivative matrix of has rank 2, as the first and second row are linearly independent!
Theorem
In the general case, let ,
Assume that has rank . Denote . Without loss of generality, assume is invertible.
Then, neighborhood of and neighborhood of such that
For some .
Proof
We know that is invertible. By the inverse function theorem, neighborhood of and neighborhood of such that is one-to-one, onto, and has a inverse .
We compose
Lagrange Multipliers
Case 1: Surfaces in
Let , and let . Furthermore, define surface
Where if ().
Let , and let be such that (or ), . Then, such that
Proof
Without loss of generality, we have that
By the implicit function theorem, there exists a function such that
Is equal to in the neighborhood of . Then, look at the composition , , which has an unconstrained (interior) minimum (or maximum) at . Thus, .
Let . By the Chain Rule,
Each column of the derivative matrix is a tangent vector! So, is orthogonal to both columns.
We know that is also normal to our surface at . So,
But the orthogonal set to the span of the tangent vectors is 1-dimensional! So, is a basis for . As a basis, if is in this space, then we can form as a linear combination of .
Note that this same argument works for the case of ,
Assuming if (then is an dimensional manifold in ).
If and is such that (or ), then such that
Case 2: Curves in
Let . Define curve
And assume
If (then is a curve in ).
Let . Let such that (or ) for all . Then there exists such that
Proof
Without loss of generality, say is invertible. By the implicit function theorem, such that is equal to in a neighborhood .
Let , , has an unconstrained min (or max) at , .
By the Chain Rule,
Is a basis for the tangent space to at .
Recall are linearly independent vectors orthogonal to . So,
We also know is in this space. Thus, there exists a linear combination of that form .
Let be an symmetric real matrix. Let
We look at the minimum of the quadratic function in the compact set given by the unit sphere.
Let be a minimizer (). Then,
Proof
. We try to minimize the function . By the above theorem, at a minimizer, such that .
We show that . This completes our proof.
Lemma
If , then .
We find
This is the ith component of !
The last equality is because !
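As a numerical illustration of this result (an addition, with a hypothetical symmetric matrix), the minimum of the quadratic function over the unit sphere is the smallest eigenvalue, attained at a corresponding eigenvector.

```python
import numpy as np

# Hypothetical symmetric matrix (assumption): the minimum of <Ax, x> over the unit
# sphere is the smallest eigenvalue, attained at a corresponding eigenvector.
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals[0])                                    # smallest eigenvalue

rng = np.random.default_rng(2)
X = rng.normal(size=(100000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)        # random points on the unit sphere
quad = np.einsum('ij,jk,ik->i', X, A, X)             # the quadratic function at each sample
print(quad.min())                                    # approximately the smallest eigenvalue

v = eigvecs[:, 0]                                    # minimizer: an eigenvector, where Av = lambda*v
print(v @ A @ v)                                     # equals the smallest eigenvalue exactly
```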