Week 7
Subspace
For a matrix $A = \begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix}$:
Kernel
$$\ker(A) = \set{\vector{x}\in\R^n : A\vector{x}=\vector{0}}$$
To find a basis for $\ker(A)$:
1. Find $\operatorname{rref}(A)$.
2. Get the solution set.
3. Write the solution set as a span of vectors.
Then, the vectors in the span form a basis of $\ker(A)$ (see the sketch below).
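As a sanity check, here is a minimal SymPy sketch of this procedure; the matrix is made up for illustration:

```python
from sympy import Matrix

# A hypothetical matrix whose rref has free variables.
A = Matrix([
    [1, 2, 3],
    [2, 4, 6],
])

# rref() returns the reduced form and the pivot column indices.
R, pivots = A.rref()
print(R, pivots)  # pivot in column 0; columns 1 and 2 are free

# nullspace() solves A*x = 0 and returns a basis of ker(A)
# as a list of column vectors -- the span form above.
for v in A.nullspace():
    print(v.T)
```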
Image
$$\begin{align*}
\operatorname{Im}(A) &= \set{A\vector{x}: \vector{x}\in\R^n} \\
&= \operatorname{span}\set{\vector{v}_1,\dots,\vector{v}_n}
\end{align*}$$
To find a basis for $\operatorname{Im}(A)$:
1. Find $\operatorname{rref}(A)$.
2. Then, the $\set{\vector{v}_i}$ corresponding to the pivot columns form a basis for $\operatorname{Im}(A)$ (see the sketch below).
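Again as a sketch, SymPy's `columnspace()` picks out exactly the pivot columns of the original matrix; the example matrix is made up:

```python
from sympy import Matrix

# Hypothetical matrix: column 1 is 2 * column 0, so only
# columns 0 and 2 should survive as pivot columns.
A = Matrix([
    [1, 2, 0],
    [1, 2, 1],
])

R, pivots = A.rref()
print(pivots)  # (0, 2)

# columnspace() returns the original columns at the pivot
# indices -- a basis for Im(A).
for v in A.columnspace():
    print(v.T)
```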
Given a set of a hundred vectors $\set{\vector{v}_1, \ldots, \vector{v}_{100}}$ in $\R^5$, find a basis for its span.
First, put the vectors into a matrix $A$:
$$A = \begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_{100} \\
| & & |
\end{pmatrix}$$
Then, do Gaussian to get its (reduced) row echelon form. Let’s say there are pivots in columns one, three, and 37:
$$\operatorname{rref}(A) = \begin{pmatrix}
1 & & 0 & & 0 & \\
0 & & 1 & & 0 & \\
0 & \cdots & 0 & \cdots & 1 & \cdots\\
0 & & 0 & & 0 & \\
0 & & 0 & & 0 &
\end{pmatrix}$$
Then, the corresponding vectors in $A$,
$$\set{\vector{v}_1, \vector{v}_3, \vector{v}_{37}},$$
form a basis of $\operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_{100}}$.
$$\ker(A) = \set{\vector{x}\in\R^n : A\vector{x} = \vector{0}} \\
\operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_n} =
\operatorname{Im}\begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix} \\
A\vector{x} = \sum_{i=1}^n x_i\vector{v}_i$$
$$\begin{align*}
W &= \operatorname{span}\Set{
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix},
\begin{pmatrix} 3 \\ 4 \\ 2 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
} \\
&= \operatorname{Im}\begin{pmatrix}
1 & 2 & 3 & 0 \\
1 & 3 & 4 & 1 \\
1 & 1 & 2 & 1
\end{pmatrix} \subseteq \R^3
\end{align*}$$
Also note here that the four spanning vectors must be linearly dependent, because there are four vectors in $\R^3$.
To find a basis of $W$, do Gaussian on the matrix:
$$\operatorname{rref}\begin{pmatrix}
1 & 2 & 3 & 0 \\
1 & 3 & 4 & 1 \\
1 & 1 & 2 & 1
\end{pmatrix} = \begin{pmatrix}
\boxed{1} & 0 & 1 & 0 \\
0 & \boxed{1} & 1 & 0 \\
0 & 0 & 0 & \boxed{1}
\end{pmatrix}$$
The pivots are in columns one, two, and four. Then, go back to the original matrix: $\set{\vector{v}_1, \vector{v}_2, \vector{v}_4}$ form a basis of $W$:
$$\Set{
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
}$$
Which also means that
$$W = \operatorname{span}\Set{
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
}$$
and also that $\Set{
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix},
\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}
}$ is linearly independent.
So, for an $m\times n$ matrix $A$,
$$\dim(\ker(A)) = \operatorname{nullity}(A) = n - \operatorname{rank}(A) \\
\dim(\operatorname{Im}(A)) = \operatorname{rank}(A)$$
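A quick numeric check of rank-nullity on the matrix from the example above, as a minimal SymPy sketch:

```python
from sympy import Matrix

A = Matrix([
    [1, 2, 3, 0],
    [1, 3, 4, 1],
    [1, 1, 2, 1],
])

n = A.cols                    # number of columns, n = 4
rank = A.rank()               # dim(Im(A)) = 3
nullity = len(A.nullspace())  # dim(ker(A)) = 1

assert nullity == n - rank    # rank-nullity
print(rank, nullity)          # 3 1
```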
A summary of findings from the three chapters
For an $m\times n$ matrix $A=\begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & |\end{pmatrix}$:

| | Chapter 1: Applying algorithms $A\to\operatorname{rref}(A)$ | Chapter 2: Linear transformation $T(\vector{x})=A\vector{x}$ | Chapter 3: $\set{\vector{v}_1, \ldots, \vector{v}_n}$ |
| --- | --- | --- | --- |
| Only trivial solutions: $A\vector{x}=\vector{0}\implies\vector{x}=\vector{0}$ | No free variables: $\operatorname{rank}(A)=n$ | $T$ is injective: $\ker(A)=\set{\vector{0}}$ | $\set{\vector{v}_1, \ldots, \vector{v}_n}$ is linearly independent |
| $A\vector{x}=\vector{b}$ is solvable for all $\vector{b}\in\R^m$ | Full row rank: $\operatorname{rank}(A)=m$ | $T$ is surjective: $\operatorname{Im}(A)=\R^m$ | $\operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_n}=\R^m$ |
| If both | $A$ is a square matrix: $\operatorname{rank}(A)=m=n$ | $T$ is bijective | $\set{\vector{v}_1, \ldots, \vector{v}_n}$ is a basis |
No matter what you are trying to do… bring the vectors into a matrix and Gaussian that bitch.
Change of basis
For the standard basis:
$$\R^n = \operatorname{span}\set{\vector{e}_1, \ldots, \vector{e}_n}$$
$$\begin{pmatrix}
x_1 \\ \vdots \\ x_n
\end{pmatrix} = x_1\vector{e}_1 + \cdots + x_n\vector{e}_n$$
If we have a basis $\mathfrak{B}=\set{\vector{v}_1,\ldots,\vector{v}_n}$ of $\R^n$, then
$$\begin{array}{ll}
\forall\vector{x}\in\R^n & \vector{x}=\tilde{x}_1\vector{v}_1 + \cdots + \tilde{x}_n\vector{v}_n \\
& [\vector{x}]_\mathfrak{B} \stackrel{\Delta}{=} \begin{pmatrix}
\tilde{x}_1 \\ \vdots \\ \tilde{x}_n
\end{pmatrix}_\mathfrak{B}
\end{array}$$
Let $P=\begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix}$. Then,
$$\begin{align*}
\vector{x} &= \tilde{x}_1\vector{v}_1 + \cdots + \tilde{x}_n\vector{v}_n \\
&= P\begin{pmatrix}
\tilde{x}_1 \\ \vdots \\ \tilde{x}_n
\end{pmatrix} \\
&= P[\vector{x}]_\mathfrak{B}
\end{align*}$$
So, $\boxed{\vector{x} = P[\vector{x}]_\mathfrak{B}}$.
Let $\mathfrak{B}=\Set{
\begin{pmatrix} 1 \\ 2 \end{pmatrix},
\begin{pmatrix} 2 \\ 1 \end{pmatrix}
}$. What is $\vector{x}$ if $[\vector{x}]_\mathfrak{B}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}$?
$$\begin{align*}
\vector{x} &= 3\begin{pmatrix} 1 \\ 2 \end{pmatrix} +
(-1)\begin{pmatrix} 2 \\ 1 \end{pmatrix} \\
&= \begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix} \begin{pmatrix} 3 \\ -1 \end{pmatrix} \\
&= \begin{pmatrix} 1 \\ 5 \end{pmatrix}
\end{align*}$$
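The same computation as a short SymPy sketch (nothing here beyond the example above):

```python
from sympy import Matrix

# Basis vectors of B as the columns of P.
P = Matrix([
    [1, 2],
    [2, 1],
])

x_B = Matrix([3, -1])    # [x]_B

x = P * x_B              # x = P [x]_B
print(x.T)               # Matrix([[1, 5]])

# Inverting recovers the coordinates: [x]_B = P^{-1} x.
assert P.inv() * x == x_B
```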
Homework 6 hint — Book question #55
We want a $P$ such that $[\vector{x}]_\mathfrak{R}=P[\vector{x}]_\mathfrak{B}$.
Let $U = \begin{pmatrix}
1 & 1 \\ 1 & 2
\end{pmatrix}$ and $V = \begin{pmatrix}
1 & 3 \\ 2 & 4
\end{pmatrix}$.
By definition:
$$\vector{x} = U[\vector{x}]_\mathfrak{B} \\
\vector{x} = V[\vector{x}]_\mathfrak{R}$$
Rewrite the second one by applying the inverse on both sides:
$$\vector{x} = V[\vector{x}]_\mathfrak{R} \implies V^{-1}\vector{x} = [\vector{x}]_\mathfrak{R}$$
Then plug in the expression for $\vector{x}$ in terms of the $U$ transformation.
$$\begin{align*}
[\vector{x}]_\mathfrak{R} &= V^{-1}\vector{x} \\
&= V^{-1}U[\vector{x}]_\mathfrak{B}
\end{align*}$$
By comparison, $V^{-1}U$ is the transformation $P$.
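If you want to check the arithmetic, here is a minimal SymPy sketch with the two matrices from the hint (the test vector is arbitrary):

```python
from sympy import Matrix

U = Matrix([[1, 1], [1, 2]])
V = Matrix([[1, 3], [2, 4]])

# [x]_R = V^{-1} U [x]_B, so P = V^{-1} U.
P = V.inv() * U
print(P)

# Sanity check: U [x]_B and V (P [x]_B) give the same x.
x_B = Matrix([1, 1])
assert U * x_B == V * (P * x_B)
```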
Now, we look at the change of basis as a linear transformation. Using the definition and applying the inverse, we can find the transformation matrices $A$ and $B$.
$$\begin{CD}
\vector{x} @>A>> T(\vector{x}) \\
@VPVV @VVPV \\
[\vector{x}]_\mathfrak{B} @>B>> [T(\vector{x})]_\mathfrak{B}
\end{CD}$$
$$\vector{x} = P[\vector{x}]_\mathfrak{B} \\
[\vector{x}]_\mathfrak{B} = P^{-1}\vector{x}$$
And so:
$$\begin{align*}
B[\vector{x}]_\mathfrak{B} &= [T(\vector{x})]_\mathfrak{B} \\
BP^{-1}\vector{x} &= P^{-1}T(\vector{x}) \\
&= P^{-1}A\vector{x}
\end{align*}$$
We have that the compositions
$$BP^{-1} = P^{-1}A,$$
from which we can derive:
$$A = PBP^{-1} \\
B = P^{-1}AP$$
Let $\mathfrak{B}=\Set{
\begin{pmatrix} 1 \\ 2 \end{pmatrix},
\begin{pmatrix} 2 \\ 1 \end{pmatrix}
}$. What is the linear transformation under the standard basis if $[T(\vector{x})]_\mathfrak{B}=\begin{pmatrix}
3 & 0 \\ 0 & -1
\end{pmatrix}[\vector{x}]_\mathfrak{B}$?
Here, we want to find the matrix $A$ because it is the transformation under the standard basis. Recall from the diagram above:
$$\vector{x}\xrightarrow{A}T(\vector{x})$$
So, we can use the derived equation $A=PBP^{-1}$ where:
$$\begin{align*}
\mathfrak{B}&=\Set{
\begin{pmatrix} 1 \\ 2 \end{pmatrix},
\begin{pmatrix} 2 \\ 1 \end{pmatrix}
}
\implies P=\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix} \\
B &= \begin{pmatrix}
3 & 0 \\ 0 & -1
\end{pmatrix}
\end{align*}$$
Then, just plug in:
$$\begin{align*}
A &= PBP^{-1} \\
&= \begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}
\begin{pmatrix}
3 & 0 \\ 0 & -1
\end{pmatrix}
\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}^{-1} \\
&= \begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}
\begin{pmatrix}
3 & 0 \\ 0 & -1
\end{pmatrix}
\begin{pmatrix}
-\frac{1}{3} & \frac{2}{3} \\
\frac{2}{3} & -\frac{1}{3}
\end{pmatrix} \\
&= \begin{pmatrix}
-\frac{7}{3} & \frac{8}{3} \\[.5em]
-\frac{8}{3} & \frac{13}{3}
\end{pmatrix}
\end{align*}$$
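A three-line SymPy check of that product:

```python
from sympy import Matrix

P = Matrix([[1, 2], [2, 1]])
B = Matrix([[3, 0], [0, -1]])

A = P * B * P.inv()  # A = P B P^{-1}
print(A)             # Matrix([[-7/3, 8/3], [-8/3, 13/3]])

# The inverse relation recovers B.
assert P.inv() * A * P == B
```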
Why do we need to change basis?
Just because we can, we will jump ahead to Chapter 7: Diagonalization to explore why we do this.
Say, if we are given a square matrix $A$, how do we find $A^n$?
$$A^n = \underbrace{A\cdot A \cdots A\cdot A}_n$$
Take an easy example, where $A=I$. Then,
$$A^n = I^n = I$$
Or, how about diagonal matrices? For example: $A = \begin{pmatrix}
2 & 0 \\ 0 & 3
\end{pmatrix}$.
We can just multiply the diagonal — easy!
$$A = \begin{pmatrix}
2 & 0 \\ 0 & 3
\end{pmatrix} \implies
A^2 = \begin{pmatrix}
2^2 & 0 \\ 0 & 3^2
\end{pmatrix} \implies
A^n = \begin{pmatrix}
2^n & 0 \\ 0 & 3^n
\end{pmatrix}$$
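A tiny sketch confirming that powering a diagonal matrix just powers its entries:

```python
from sympy import diag

A = diag(2, 3)   # the diagonal matrix from the example
n = 5

# Matrix power equals entry-wise power on the diagonal.
assert A**n == diag(2**n, 3**n)
print(A**n)      # Matrix([[32, 0], [0, 243]])
```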
But… how would we deal with non-diagonal matrices? There is no obvious pattern to exploit, so we cannot simply do a proof by induction to find a “general” form for the powers. And so, this is where change of basis is needed.
Under the standard basis, given a transformation matrix $\begin{pmatrix}
2 & 0 \\ 0 & 3
\end{pmatrix}$:
$$\begin{array}{ccc}
\includegraphics[height=150px]{../assets/notes_wk7_stdbasis_1.svg}
& \raisebox{75px}{$\xrightarrow{T}$}
& \includegraphics[height=150px]{../assets/notes_wk7_stdbasis_2.svg}
\end{array}$$
We transform the vectors $\vector{e}_1 = \begin{pmatrix}
1 \\ 0
\end{pmatrix}$ and $\vector{e}_2 = \begin{pmatrix}
0 \\ 1
\end{pmatrix}$, such that:
$$\begin{align*}
T(\vector{e}_1) =
\begin{pmatrix}
2 & 0 \\ 0 & 3
\end{pmatrix} \begin{pmatrix}
1 \\ 0
\end{pmatrix} = \begin{pmatrix}
2 \\ 0
\end{pmatrix} = 2\vector{e}_1 \\
T(\vector{e}_2) =
\begin{pmatrix}
2 & 0 \\ 0 & 3
\end{pmatrix} \begin{pmatrix}
0 \\ 1
\end{pmatrix} = \begin{pmatrix}
0 \\ 3
\end{pmatrix} = 3\vector{e}_2
\end{align*}$$
What we want to achieve in general
Given a matrix $A$ under the standard basis, we want to find another basis $\mathfrak{B}=\set{\vector{v}_1, \vector{v}_2}$ and $\lambda_1, \lambda_2 \in \R$ such that:
$$[A\vector{x}]_\mathfrak{B} = \begin{pmatrix}
\lambda_1 & 0 \\ 0 & \lambda_2
\end{pmatrix}[\vector{x}]_\mathfrak{B}$$
Let $P=\begin{pmatrix}
| & | \\
\vector{v}_1 & \vector{v}_2 \\
| & |
\end{pmatrix}$. And recall that $\vector{x}=P[\vector{x}]_\mathfrak{B}$. Then:
$$\begin{align*}
P^{-1}A\vector{x} &= \begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix} P^{-1}\vector{x} \\
P^{-1}AP &= \begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix}
\end{align*}$$
(The second line follows because the first holds for every $\vector{x}$: substitute $\vector{x}=P[\vector{x}]_\mathfrak{B}$.)
First, we need to solve for $\vector{v}_1$ and $\lambda_1$.
Let $[\vector{x}]_\mathfrak{B}=\begin{pmatrix}
1 \\ 0
\end{pmatrix}_\mathfrak{B}$, i.e., $\vector{x}=\vector{v}_1$. We have that
$$[A\vector{v}_1]_\mathfrak{B} = \begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix} \begin{pmatrix}
1 \\ 0
\end{pmatrix}_\mathfrak{B} = \begin{pmatrix}
\lambda_1 \\ 0
\end{pmatrix}_\mathfrak{B}$$
Where $\vector{v}_1,\vector{v}_2\neq\vector{0}$,
$$A\vector{v}_1 = \lambda_1\vector{v}_1 \\
A\vector{v}_2 = \lambda_2\vector{v}_2.$$
So in general, we need to solve $A\vector{v}=\lambda\vector{v}$ such that $\vector{v}\neq\vector{0}$, i.e., $(A-\lambda I)\vector{v}=\vector{0}$. Or, in other words, where $\ker(A-\lambda I)$ is non-trivial.
Eigenvalues
Let $A$ be an $n\times n$ matrix. We say that $A$ is diagonalizable if there exists an invertible matrix $P$ such that
$$P^{-1}AP = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}$$
where $\lambda_1,\ldots,\lambda_n$ are called eigenvalues of $A$.
Why diagonalize?
Because for diagonal matrices, we are easily able to find their squares, such as:
$$\begin{align*}
P^{-1}AP &= \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix} \\
(P^{-1}AP)^2 &= \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^2
= \begin{pmatrix}
\lambda_1^2 & & 0 \\
& \ddots & \\
0 & & \lambda_n^2
\end{pmatrix}
\end{align*}$$
And notice that
$$\begin{align*}
(P^{-1}AP)^2 &= P^{-1}A\underbrace{PP^{-1}}_I AP \\
&= P^{-1}AAP \\
&= P^{-1}A^2P
\end{align*}$$
We can then derive an expression for $A^2$:
$$P^{-1}A^2P = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^2 \\
A^2 = P \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^2 P^{-1}
= P \begin{pmatrix}
\lambda_1^2 & & 0 \\
& \ddots & \\
0 & & \lambda_n^2
\end{pmatrix} P^{-1}$$
And so, for
$$(P^{-1}AP)^k = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^k = \begin{pmatrix}
\lambda_1^k & & 0 \\
& \ddots & \\
0 & & \lambda_n^k
\end{pmatrix}:$$
$$\begin{align*}
(P^{-1}AP)^k &= \underbrace{(P^{-1}A\cancel{P})(\cancel{P^{-1}}A\cancel{P})\thinspace\cdots\thinspace(\cancel{P^{-1}}A\cancel{P})(\cancel{P^{-1}}AP)}_k \\
&= P^{-1}\underbrace{AA\thinspace\cdots\thinspace AA}_k P \\
&= P^{-1}A^k P
\end{align*}$$
And again:
$$P^{-1}A^kP = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^k \\
A^k = P \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}^k P^{-1}
= P \begin{pmatrix}
\lambda_1^k & & 0 \\
& \ddots & \\
0 & & \lambda_n^k
\end{pmatrix} P^{-1}$$
Now, the $k$-th power of any diagonalizable matrix can be found… once we find the matrix $P$ and the eigenvalues $\lambda_1,\ldots,\lambda_n$.
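In SymPy, this whole pipeline is one call; a minimal sketch of $A^k = PD^kP^{-1}$, assuming a diagonalizable example matrix:

```python
from sympy import Matrix

A = Matrix([[1, 2], [2, 1]])  # an example diagonalizable matrix
k = 4

# diagonalize() returns (P, D) with A = P D P^{-1}.
P, D = A.diagonalize()

# Power the diagonal entries, then undo the change of basis.
assert P * D**k * P.inv() == A**k
print(P * D**k * P.inv())
```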
Finding eigenvalues and $P$
For an $n\times n$ matrix $A$ and some invertible matrix $P$ where:
$$P^{-1}AP = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix} \\
\Downarrow \\
AP = P\begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}$$
First, we write $P$ as a matrix composed of some column vectors $\vector{v}_i$.
$$P = \begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix}$$
Then, on the left-hand side, we have:
$$\begin{align*}
AP &= A\begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix} \\
&= \begin{pmatrix}
| & & | \\
A\vector{v}_1 & \cdots & A\vector{v}_n \\
| & & |
\end{pmatrix}
\end{align*}$$
And on the right:
$$\begin{align*}
P\begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}
&= \begin{pmatrix}
| & & | \\
\vector{v}_1 & \cdots & \vector{v}_n \\
| & & |
\end{pmatrix}
\begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix} \\
&= \begin{pmatrix}
| & & | \\
\lambda_1\vector{v}_1 & \cdots & \lambda_n\vector{v}_n \\
| & & |
\end{pmatrix}
\end{align*}$$
Putting it together, we have:
$$\begin{align*}
AP &= P\begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix} \\
\begin{pmatrix}
| & & | \\
A\vector{v}_1 & \cdots & A\vector{v}_n \\
| & & |
\end{pmatrix}
&= \begin{pmatrix}
| & & | \\
\lambda_1\vector{v}_1 & \cdots & \lambda_n\vector{v}_n \\
| & & |
\end{pmatrix}
\end{align*}$$
So, if there exists an invertible matrix $P$ where $P^{-1}AP = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}$, then there must exist a basis $\set{\vector{v}_1,\ldots,\vector{v}_n}$ such that $\vector{v}_i\neq\vector{0}$:
$$\begin{array}{ccc}
\begin{pmatrix}
A\vector{v}_1 = \lambda_1\vector{v}_1 \\
\vdots \\
A\vector{v}_n = \lambda_n\vector{v}_n
\end{pmatrix}
&\overset{
\raisebox{1em}{
\boxed{A\vector{v}_i - \lambda_i\vector{v}_i = \vector{0}}
}
}{\iff}
&\begin{pmatrix}
(A-\lambda_1 I)\vector{v}_1 = \vector{0} \\
\vdots \\
(A-\lambda_n I)\vector{v}_n = \vector{0}
\end{pmatrix}
\end{array}$$
Notice on the right-hand side, after subtracting and factoring out the vector $\vector{v}_i$, we have a form that is equivalent to finding the kernel of $A-\lambda_i I$.
We say that $\lambda$ is an eigenvalue of the matrix $A$ if $\ker(A-\lambda I) \neq \set{\vector{0}}$. In other words, $(A-\lambda I)\vector{v}=\vector{0}$ must have a non-trivial solution.
Additionally, if $\lambda$ is an eigenvalue:
- $\ker(A-\lambda I)$ is called the eigenspace of $A$ corresponding to $\lambda$, and
- the non-zero vectors $\vector{v}\in\ker(A-\lambda I)$ are called eigenvectors corresponding to $\lambda$.
To begin, let’s take a simple $2\times2$ matrix: $A=\begin{pmatrix}
1 & 2 \\ 2 & 1
\end{pmatrix}$.
Typically, you’d be asked to:
1. Find eigenvalues of $A$
2. For each eigenvalue $\lambda$, find the basis for the respective eigenspace
3. Diagonalize $A$
1. Find eigenvalues of $A$
We need to find $\lambda$ such that $A\vector{v}=\lambda\vector{v}$ and $\vector{v}\neq\vector{0}$.
$$A\vector{v} = \lambda\vector{v} \\
\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}
\vector{v} = \begin{pmatrix}
\lambda & 0 \\
0 & \lambda
\end{pmatrix} \vector{v}$$
Then, we subtract and factor out $\vector{v}$ to obtain $(A-\lambda I)\vector{v}$.
$$\begin{align*}
(A-\lambda I)\vector{v} &= \left(
\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}
- \lambda\begin{pmatrix}
1 & 0 \\
0 & 1
\end{pmatrix}
\right)\vector{v} \\
&= \left(
\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}
- \begin{pmatrix}
\lambda & 0 \\
0 & \lambda
\end{pmatrix}
\right)\vector{v} \\
&= \begin{pmatrix}
1-\lambda & 2 \\
2 & 1-\lambda
\end{pmatrix}
\vector{v}
= \vector{0}
\end{align*}$$
By definition, $\lambda$ is an eigenvalue $\iff\ker\begin{pmatrix}
1-\lambda & 2 \\
2 & 1-\lambda
\end{pmatrix}\neq\set{\vector{0}}$.
So… how the fuck do we solve for $\lambda$? Well, we could do Gaussian. However, it would be kinda nasty because we would need to make the diagonal $1$… but $\lambda$ is on the diagonal.
Instead, for a $2\times2$ matrix, we can use the fact that
$$\begin{pmatrix}
a & b \\
c & d
\end{pmatrix}^{-1} = \frac{1}{ad-bc}\begin{pmatrix}
d & -b \\
-c & a
\end{pmatrix},\qquad ad-bc\neq0$$
Recall that for a matrix $A=\begin{pmatrix}
a & b \\
c & d
\end{pmatrix}$, its determinant is $\det(A)=ad-bc$.
And that $A^{-1}$ exists if and only if:
- $\det(A)\neq0$
- $\ker(A) = \set{\vector{0}}$
So, for our matrix (again, by definition), $\lambda$ is an eigenvalue if and only if:
$$\det\begin{pmatrix}
1-\lambda & 2 \\
2 & 1-\lambda
\end{pmatrix}=0
\iff
\ker\begin{pmatrix}
1-\lambda & 2 \\
2 & 1-\lambda
\end{pmatrix}\neq\set{\vector{0}}$$
As such:
$$\begin{align*}
\det\begin{pmatrix}
1-\lambda & 2 \\
2 & 1-\lambda
\end{pmatrix}
&= (1-\lambda)(1-\lambda)-2\cdot2 &&= 0\\
&= 1-2\lambda+\lambda^2-4 &&= 0 \\
&= \lambda^2 - 2\lambda - 3 &&= 0 \\
&= (\lambda + 1)(\lambda - 3) &&= 0
\end{align*}
\\
\therefore\lambda=-1,3$$
The eigenvalues for the matrix $\begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix}$ are $-1$ and $3$.
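The same characteristic-polynomial computation, as a small SymPy sketch:

```python
from sympy import Matrix, symbols, solve, eye

lam = symbols("lambda")
A = Matrix([[1, 2], [2, 1]])

# det(A - lambda*I) = 0 is the characteristic equation.
char_poly = (A - lam * eye(2)).det()
print(char_poly.expand())     # lambda**2 - 2*lambda - 3
print(solve(char_poly, lam))  # [-1, 3]

# Or directly: eigenvals() returns {eigenvalue: multiplicity}.
print(A.eigenvals())          # {3: 1, -1: 1}
```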
2. For each eigenvalue $\lambda$, find the basis for the respective eigenspace
From the last part, we have two eigenvalues $\lambda=-1,3$.
First, take $\lambda=-1$. The eigenspace is given by:
$$\begin{align*}
\ker(A-\lambda I) &= \ker(A-(-1)I) \\
&= \ker(A+I) \\
&= \ker\left(
\begin{pmatrix}
1 & 2 \\ 2 & 1
\end{pmatrix} + \begin{pmatrix}
1 & 0 \\ 0 & 1
\end{pmatrix}
\right) \\
&= \ker\begin{pmatrix}
2 & 2 \\ 2 & 2
\end{pmatrix}
\end{align*}$$
Then, to find the kernel, we just Gaussian that bitch.
$$\operatorname{rref}\begin{pmatrix}
2 & 2 \\ 2 & 2
\end{pmatrix} = \begin{pmatrix}
1 & 1 \\ 0 & 0
\end{pmatrix}
\\
\therefore x = -y,\quad y\in\R
\\[1em]
\therefore\ker(A+I) = \Set{
y\begin{pmatrix}
-1 \\ 1
\end{pmatrix}: y\in\R
}$$
Now, do the same for $\lambda=3$.
$$\begin{align*}
\ker(A-\lambda I) &= \ker(A-3I) \\
&= \ker\left(
\begin{pmatrix}
1 & 2 \\ 2 & 1
\end{pmatrix} - 3\begin{pmatrix}
1 & 0 \\ 0 & 1
\end{pmatrix}
\right) \\
&= \ker\left(
\begin{pmatrix}
1 & 2 \\ 2 & 1
\end{pmatrix} - \begin{pmatrix}
3 & 0 \\ 0 & 3
\end{pmatrix}
\right) \\
&= \ker\begin{pmatrix}
-2 & 2 \\ 2 & -2
\end{pmatrix}
\end{align*}$$
And, again Gaussian to find the kernel.
$$\operatorname{rref}\begin{pmatrix}
-2 & 2 \\ 2 & -2
\end{pmatrix} = \begin{pmatrix}
1 & -1 \\ 0 & 0
\end{pmatrix}
\\
\therefore x = y,\quad y\in\R
\\[1em]
\therefore\ker(A-3I) = \Set{
y\begin{pmatrix}
1 \\ 1
\end{pmatrix}: y\in\R
}$$
So, we have that the basis for $\lambda=-1$ is $\Set{\begin{pmatrix}
-1 \\ 1
\end{pmatrix}}$ and for $\lambda=3$ is $\Set{\begin{pmatrix}
1 \\ 1
\end{pmatrix}}$.
Then, we say that $\Set{\begin{pmatrix}
-1 \\ 1
\end{pmatrix}, \begin{pmatrix}
1 \\ 1
\end{pmatrix}}$ is a basis of eigenvectors (an eigenbasis) of $A$. The vectors in that basis are called eigenvectors; a quick check follows below.
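As a check, SymPy's `eigenvects()` reports each eigenvalue with its multiplicity and an eigenspace basis:

```python
from sympy import Matrix

A = Matrix([[1, 2], [2, 1]])

# eigenvects() yields (eigenvalue, multiplicity, basis) triples.
for lam, mult, basis in A.eigenvects():
    print(lam, mult, [v.T for v in basis])
# -1 1 [Matrix([[-1, 1]])]
#  3 1 [Matrix([[1, 1]])]
```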
To illustrate this, take the eigenvectors $\vector{v}_1=\begin{pmatrix}
-1 \\ 1
\end{pmatrix},\vector{v}_2=\begin{pmatrix}
1 \\ 1
\end{pmatrix}$ and transform them using $A$.
Recall that:
$$A\vector{v}_1 = \lambda_1\vector{v}_1 \\
\vdots \\
A\vector{v}_n = \lambda_n\vector{v}_n$$
So, to transform each of our vectors, we simply plug in our eigenvalues to get:
$$A\vector{v}_1 = \lambda_1\vector{v}_1 = -1\begin{pmatrix}
-1 \\ 1
\end{pmatrix} = \begin{pmatrix}
1 \\ -1
\end{pmatrix} \\
A\vector{v}_2 = \lambda_2\vector{v}_2 = 3\begin{pmatrix}
1 \\ 1
\end{pmatrix} = \begin{pmatrix}
3 \\ 3
\end{pmatrix}$$
$$\begin{array}{ccc}
\includegraphics[height=150px]{../assets/notes_wk7_eigenspace_1.svg}
& \raisebox{75px}{$\xrightarrow{A}$}
& \includegraphics[height=150px]{../assets/notes_wk7_eigenspace_2.svg}
\end{array}$$
The lines through the origin are $\ker(A-\lambda I)$ for each eigenvalue $\lambda$.
$$\ker(A-\lambda_1 I) = \ker(A+I) = \Set{
y\begin{pmatrix}
-1 \\ 1
\end{pmatrix}: y \in \R
} = \operatorname{span} \Set{
\begin{pmatrix}
-1 \\ 1
\end{pmatrix}
} \\
\ker(A-\lambda_2 I) = \ker(A-3I) = \Set{
y\begin{pmatrix}
1 \\ 1
\end{pmatrix}: y \in \R
} = \operatorname{span} \Set{
\begin{pmatrix}
1 \\ 1
\end{pmatrix}
}$$
And sure enough, transforming the regular way using matrix multiplication gives the same result for each vector.
$$\begin{align*}
A\vector{v}_1 = \begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix} \begin{pmatrix}
-1 \\ 1
\end{pmatrix}
&= \begin{pmatrix}
1 \\ -1
\end{pmatrix} \\
&= -1\begin{pmatrix}
-1 \\ 1
\end{pmatrix} \\
&= \lambda_1\vector{v}_1 \\
A\vector{v}_2 = \begin{pmatrix}
1 & 2 \\
2 & 1
\end{pmatrix} \begin{pmatrix}
1 \\ 1
\end{pmatrix}
&= \begin{pmatrix}
3 \\ 3
\end{pmatrix} \\
&= 3\begin{pmatrix}
1 \\ 1
\end{pmatrix} \\
&= \lambda_2\vector{v}_2
\end{align*}$$
3. Diagonalize $A$
In general, what we want is something of the form:
$$P^{-1}AP = \begin{pmatrix}
\lambda_1 & & 0 \\
& \ddots & \\
0 & & \lambda_n
\end{pmatrix}$$
$P^{-1}AP$ is called the diagonalization of $A$.
From our previous steps, we found that $A$ has two eigenvalues $\lambda_1=-1,\lambda_2=3$.
Then, the diagonalization of $A$ is
P − 1 A P = ( λ 1 0 0 λ 2 ) = ( − 1 0 0 3 ) P^{-1}AP = \begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix}
= \begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix} P − 1 A P = ( λ 1 0 0 λ 2 ) = ( − 1 0 0 3 )
where $P$ is the matrix made up of the eigenvectors $\vector{v}_1, \vector{v}_2$.
We can also derive this by writing $P$ as a matrix composed of the eigenvectors $\vector{v}_1=\begin{pmatrix}
-1 \\ 1
\end{pmatrix},\vector{v}_2=\begin{pmatrix}
1 \\ 1
\end{pmatrix}$.
Then:
$$P^{-1}AP = \begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix} \\
\begin{align*}
AP &= P\begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix} \\
&= \begin{pmatrix}
| & | \\
\vector{v}_1 & \vector{v}_2 \\
| & |
\end{pmatrix}\begin{pmatrix}
\lambda_1 & 0 \\
0 & \lambda_2
\end{pmatrix}
= \begin{pmatrix}
| & | \\
\lambda_1\vector{v}_1 & \lambda_2\vector{v}_2 \\
| & |
\end{pmatrix} \\
&= \begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}\begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}
= \begin{pmatrix}
1 & 3 \\
-1 & 3
\end{pmatrix} \\
&= P\begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}
\end{align*} \\
\implies P^{-1}AP = \begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}$$
With the diagonalization of $A$, we can now find $A^k$ for any $k\in\N$.
$$P^{-1}A^k P = \begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}^k \\
\therefore A^k = P\begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}^k P^{-1}$$
For $A=\begin{pmatrix}
1 & 2 \\ 2 & 1
\end{pmatrix}$, find $A^{10}$.
Where $P = \begin{pmatrix}
| & | \\
\vector{v}_1 & \vector{v}_2 \\
| & |
\end{pmatrix} = \begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}$. Then:
$$\begin{align*}
A^{10} &= \begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}\begin{pmatrix}
-1 & 0 \\
0 & 3
\end{pmatrix}^{10}\begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}^{-1} \\
&= \begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}\begin{pmatrix}
(-1)^{10} & 0 \\
0 & 3^{10}
\end{pmatrix}
\cdot -\frac{1}{2}\begin{pmatrix}
1 & -1 \\
-1 & -1
\end{pmatrix} \\
&= \begin{pmatrix}
-1 & 1 \\
1 & 1
\end{pmatrix}\begin{pmatrix}
1 & 0 \\
0 & 59049
\end{pmatrix}
\cdot -\frac{1}{2}\begin{pmatrix}
1 & -1 \\
-1 & -1
\end{pmatrix} \\
&= \begin{pmatrix}
29525 & 29524 \\
29524 & 29525
\end{pmatrix}
\end{align*}$$
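A short SymPy check of that arithmetic:

```python
from sympy import Matrix

A = Matrix([[1, 2], [2, 1]])

# A = P D P^{-1}, so A^10 = P D^10 P^{-1}.
P, D = A.diagonalize()
assert P * D**10 * P.inv() == A**10

print(A**10)  # Matrix([[29525, 29524], [29524, 29525]])
```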
For $3\times3$ matrices
Now, what about $3\times3$ matrices? For example, $A=\begin{pmatrix}
1 & 1 & 0 \\
1 & 2 & 1 \\
0 & 0 & 3
\end{pmatrix}$.
We start with finding eigenvalues of $A$ by solving $\det(A-\lambda I) = 0$. Which means, we’ll need to find the determinant of a $3\times3$ matrix.
For a $3\times3$ matrix such as
$$A = \begin{pmatrix}
a_{11} & a_{12} & a_{13} \\
a_{21} & a_{22} & a_{23} \\
a_{31} & a_{32} & a_{33}
\end{pmatrix},$$
the determinant of $A$ is given by:
$$\det(A) = a_{11}\begin{vmatrix}
a_{22} & a_{23} \\
a_{32} & a_{33}
\end{vmatrix} - a_{12}\begin{vmatrix}
a_{21} & a_{23} \\
a_{31} & a_{33}
\end{vmatrix} + a_{13}\begin{vmatrix}
a_{21} & a_{22} \\
a_{31} & a_{32}
\end{vmatrix}$$
Then, we’d solve for
$$\det(A-\lambda I) = 0 \\
\det\begin{pmatrix}
1-\lambda & 1 & 0 \\
1 & 2-\lambda & 1 \\
0 & 0 & 3-\lambda
\end{pmatrix} = 0$$
to find the eigenvalues… and so on.
Notice that for a $2\times2$ matrix, solving for $\lambda$ yields a polynomial of degree two, i.e., $a\lambda^2 + b\lambda + c = 0$. And so, you’d have at most two eigenvalues.
That means that, here, for a $3\times3$ matrix, there will be at most three eigenvalues. And for an $n\times n$ matrix, there’d be at most $n$ eigenvalues.
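For the $3\times3$ example above, the cofactor expansion is mechanical, so here is a minimal sketch that lets SymPy grind out the characteristic polynomial and its roots:

```python
from sympy import Matrix, symbols, solve, eye, factor

lam = symbols("lambda")
A = Matrix([
    [1, 1, 0],
    [1, 2, 1],
    [0, 0, 3],
])

# Characteristic polynomial via the 3x3 determinant.
char_poly = (A - lam * eye(3)).det()
print(factor(char_poly))

# Its roots are the (at most three) eigenvalues.
print(solve(char_poly, lam))
```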
For $4\times4$ matrices and beyond…
Well, you’d need to find the determinant of a $4\times4$ matrix… at which point, you may as well kill yourself.