Week 7

Subspace

For a matrix A = \begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & | \end{pmatrix}.

Kernel

\ker(A) = \set{\vector{x}\in\R^n : A\vector{x}=\vector{0}}

To find a basis for \ker(A):

  1. Find \operatorname{rref}(A).
  2. Get the solution set.
  3. Write the solution as a span of vectors.
  4. Then, the vectors in the span form a basis of \ker(A) (see the sketch below).
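
As a sanity check, here is a minimal sketch of this recipe in Python (sympy, which is not part of the notes, does steps 1–4 for us via `nullspace()`); the matrix is a made-up example:

```python
from sympy import Matrix

# A hypothetical example matrix, just to illustrate the recipe.
A = Matrix([[1, 2, 3],
            [2, 4, 6]])

rref, pivots = A.rref()   # step 1: row reduce
basis = A.nullspace()     # steps 2-4: spanning vectors of the solution set
print(basis)              # [Matrix([[-2], [1], [0]]), Matrix([[-3], [0], [1]])]
```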

Image

\begin{align*} \operatorname{Im}(A) &= \set{A\vector{x}: \vector{x}\in\R^n} \\ &= \operatorname{span}\set{\vector{v}_1,\dots,\vector{v}_n} \end{align*}

To find a basis for \operatorname{Im}(A):

  1. Find \operatorname{rref}(A).
  2. Then, the \vector{v}_i corresponding to the pivot columns form a basis for \operatorname{Im}(A) (see the sketch below).
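
Again a minimal sketch, assuming sympy and a made-up matrix: `rref()` returns the pivot column indices, and we pick out the corresponding columns of the original A:

```python
from sympy import Matrix

# A hypothetical example matrix.
A = Matrix([[1, 2, 0],
            [2, 4, 1]])

_, pivots = A.rref()                 # pivot column indices, here (0, 2)
basis = [A.col(j) for j in pivots]   # take the original columns, not the rref ones
print(basis)                         # [Matrix([[1], [2]]), Matrix([[0], [1]])]
```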

Given a set of a hundred vectors \set{\vector{v}_1, \ldots, \vector{v}_{100}} in \R^5, find a basis for its span.

First, put the vectors into a matrix A:

A = \begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_{100} \\ | & & | \end{pmatrix}

Then, do Gaussian to get its (reduced) row echelon form. Let’s say there are pivots in columns one, three, and 37:

\operatorname{rref}(A) = \begin{pmatrix} 1 & & 0 & & 0 & \\ 0 & & 1 & & 0 & \\ 0 & \cdots & 0 & \cdots & 1 & \cdots\\ 0 & & 0 & & 0 & \\ 0 & & 0 & & 0 & \\ \end{pmatrix}

Then, the corresponding columns of A,

\set{\vector{v}_1, \vector{v}_3, \vector{v}_{37}}

form a basis of \operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_{100}}.

\ker(A) = \set{\vector{x}\in\R^n : A\vector{x} = \vector{0}} \\ \operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_n} = \operatorname{Im}\begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & | \end{pmatrix} \\ A\vector{x} = \sum_{i=1}^n x_i\vector{v}_i

\begin{align*} W &= \operatorname{span}\Set{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 \\ 4 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} } \\ &= \operatorname{Im}\begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 3 & 4 & 1 \\ 1 & 1 & 2 & 1 \end{pmatrix} \subseteq\R^3 \end{align*}

Also note that the four spanning vectors must be linearly dependent, because there are four of them in \R^3.

To find a basis of W, do Gaussian on the matrix:

\operatorname{rref}\begin{pmatrix} 1 & 2 & 3 & 0 \\ 1 & 3 & 4 & 1 \\ 1 & 1 & 2 & 1 \end{pmatrix} = \begin{pmatrix} \boxed{1} & 0 & 1 & 0 \\ 0 & \boxed{1} & 1 & 0 \\ 0 & 0 & 0 & \boxed{1} \end{pmatrix}

The pivots are in columns one, two, and four. Then, go back to the original columns: \set{\vector{v}_1, \vector{v}_2, \vector{v}_4} is a basis of W:

\Set{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} }

This also means that
W = \operatorname{span}\Set{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} }

and that the set \Set{ \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} } is linearly independent.

So, for an m\times n matrix A,

\dim(\ker(A)) = \operatorname{nullity}(A) = n - \operatorname{rank}(A) \\ \dim(\operatorname{Im}(A)) = \operatorname{rank}(A)
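
Here is a quick rank-nullity check on the matrix for W above (a sympy sketch):

```python
from sympy import Matrix

# The matrix whose image is W, from the example above.
A = Matrix([[1, 2, 3, 0],
            [1, 3, 4, 1],
            [1, 1, 2, 1]])

n = A.cols                      # 4 columns
rank = A.rank()                 # 3 (pivots in columns one, two, and four)
nullity = len(A.nullspace())    # 1
assert rank + nullity == n      # rank-nullity theorem
```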

A summary of findings from the three chapters

| For an m\times n matrix A with columns \vector{v}_1, \ldots, \vector{v}_n | Chapter 1: Applying algorithms, A \to \operatorname{rref}(A) | Chapter 2: Linear transformation T(\vector{x})=A\vector{x} | Chapter 3: \set{\vector{v}_1, \ldots, \vector{v}_n} |
| --- | --- | --- | --- |
| Only trivial solution: A\vector{x}=\vector{0} \implies \vector{x}=\vector{0} | No free variables; \operatorname{rank}(A)=n | T is injective; \ker(A)=\set{\vector{0}} | \set{\vector{v}_1, \ldots, \vector{v}_n} is linearly independent |
| A\vector{x}=\vector{b} is solvable for all \vector{b}\in\R^m | Full row rank; \operatorname{rank}(A)=m | T is surjective; \operatorname{Im}(A)=\R^m | \operatorname{span}\set{\vector{v}_1, \ldots, \vector{v}_n}=\R^m |
| If both of the above | A is a square matrix; \operatorname{rank}(A)=m=n | T is bijective | \set{\vector{v}_1, \ldots, \vector{v}_n} is a basis |

No matter what you are trying to do… bring the vectors into a matrix and Gaussian that bitch.

Change of basis

By default, a vector's entries are its coordinates in the standard basis. More generally:

If we have a basis \mathfrak{B}=\set{\vector{v}_1,\ldots,\vector{v}_n} of \R^n, then

\begin{array}{ll} \forall\vector{x}\in\R^n & \vector{x}=\tilde{x}_1\vector{v}_1 + \cdots + \tilde{x}_n\vector{v}_n \\ & [\vector{x}]_\mathfrak{B} \stackrel{\Delta}{=} \begin{pmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_n \end{pmatrix}_\mathfrak{B} \end{array}

Let P=\begin{pmatrix} | && | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | && | \end{pmatrix}. Then,
\begin{align*} \vector{x} &= \tilde{x}_1\vector{v}_1 + \cdots + \tilde{x}_n\vector{v}_n \\ &= P\begin{pmatrix} \tilde{x}_1 \\ \vdots \\ \tilde{x}_n \end{pmatrix} \\ &= P[\vector{x}]_\mathfrak{B} \end{align*}

So, \boxed{\vector{x} = P[\vector{x}]_\mathfrak{B}}.

Let \mathfrak{B}=\Set{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} }. What is \vector{x} if [\vector{x}]_\mathfrak{B}=\begin{pmatrix} 3 \\ -1 \end{pmatrix}?

\begin{align*} \vector{x} &= 3\begin{pmatrix} 1 \\ 2 \end{pmatrix} + (-1)\begin{pmatrix} 2 \\ 1 \end{pmatrix} \\ &= \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -1 \end{pmatrix} \\ &= \begin{pmatrix} 1 \\ 5 \end{pmatrix} \end{align*}
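
Numerically, this is one matrix-vector product (a numpy sketch of the example above):

```python
import numpy as np

P = np.array([[1, 2],
              [2, 1]])      # columns are the basis vectors of B
x_B = np.array([3, -1])     # [x]_B

x = P @ x_B                 # x = P [x]_B
print(x)                    # [1 5]
```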

Homework 6 hint — Book question #55

We want a P such that [\vector{x}]_\mathfrak{R}=P[\vector{x}]_\mathfrak{B}.

Let U = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} and V = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}.

By definition:

\vector{x} = U[\vector{x}]_\mathfrak{B} \\ \vector{x} = V[\vector{x}]_\mathfrak{R}

Rewrite the second one by applying the inverse on both sides:

\vector{x} = V[\vector{x}]_\mathfrak{R} \implies V^{-1}\vector{x} = [\vector{x}]_\mathfrak{R}

Then plug in the expression for \vector{x} in terms of the U transformation.

\begin{align*} [\vector{x}]_\mathfrak{R} &= V^{-1}\vector{x} \\ &= V^{-1}U[\vector{x}]_\mathfrak{B} \end{align*}

By comparison, V^{-1}U is the transformation P.
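
A sketch of the final computation, assuming the U and V above; `np.linalg.solve(V, U)` computes V^{-1}U without explicitly forming the inverse:

```python
import numpy as np

U = np.array([[1, 1],
              [1, 2]])
V = np.array([[1, 3],
              [2, 4]])

P = np.linalg.solve(V, U)   # solves V P = U, i.e. P = V^{-1} U
print(P)
```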

Linear transformation

Now, we look at the change of basis as a linear transformation. Using the definition and applying the inverse, we can relate the transformation matrices A and B.

\begin{CD} \vector{x} @>A>> T(\vector{x}) \\ @V{P^{-1}}VV @VV{P^{-1}}V \\ [\vector{x}]_\mathfrak{B} @>B>> [T(\vector{x})]_\mathfrak{B} \end{CD}

\vector{x} = P[\vector{x}]_\mathfrak{B} \\ [\vector{x}]_\mathfrak{B} = P^{-1}\vector{x}

And so:

\begin{align*} B[\vector{x}]_\mathfrak{B} &= [T(\vector{x})]_\mathfrak{B} \\ BP^{-1}\vector{x} &= P^{-1}T(\vector{x}) \\ &= P^{-1}A\vector{x} \end{align*}

Matching both sides, the compositions satisfy

BP^{-1} = P^{-1}A,

from which we can derive:

A = PBP^{-1} \\ B = P^{-1}AP

Let \mathfrak{B}=\Set{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} }. What is the linear transformation under the standard basis if [T(\vector{x})]_\mathfrak{B}=\begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix}[\vector{x}]_\mathfrak{B}?

Here, we want to find the matrix A because it is the transformation under the standard basis. Recall from the diagram above:

\vector{x}\xrightarrow{A}T(\vector{x})

So, we can use the derived equation A=PBP^{-1} where:

\begin{align*} \mathfrak{B}&=\Set{ \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} } \implies P=\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \\ B &= \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} \end{align*}

Then, just plug in:
\begin{align*} A &= PBP^{-1} \\ &= \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}^{-1} \\ &= \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} -\frac{1}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} \end{pmatrix} \\ &= \begin{pmatrix} -\frac{7}{3} & \frac{8}{3} \\[.5em] -\frac{8}{3} & \frac{13}{3} \end{pmatrix} \end{align*}
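
We can double-check this arithmetic numerically (a numpy sketch):

```python
import numpy as np

P = np.array([[1, 2],
              [2, 1]])
B = np.diag([3, -1])

A = P @ B @ np.linalg.inv(P)
print(A)   # [[-2.3333  2.6667]
           #  [-2.6667  4.3333]]  i.e. (-7/3, 8/3; -8/3, 13/3)
```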

Why do we need to change basis?

Just because we can, we will jump ahead to Chapter 7: Diagonalization to explore why we do this.

Say we are given a square matrix A: how do we find A^n?

A^n = \underbrace{A\cdot A \cdots A\cdot A}_n

Take an easy example, where A=I. Then,

A^n = I^n = I

Or, how about diagonal matrices? For example: A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.

We can just take powers of the diagonal entries — easy!

A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \implies A^2 = \begin{pmatrix} 2^2 & 0 \\ 0 & 3^2 \end{pmatrix} \implies A^n = \begin{pmatrix} 2^n & 0 \\ 0 & 3^n \end{pmatrix}

But… how would we deal with non-diagonal matrices? There is no obvious general form we could guess and prove by induction. And so, this is where change of basis comes in.

Under the standard basis, given a transformation matrix \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}:

\begin{array}{ccc} \includegraphics[height=150px]{../assets/notes_wk7_stdbasis_1.svg} & \raisebox{75px}{$\xrightarrow{T}$} & \includegraphics[height=150px]{../assets/notes_wk7_stdbasis_2.svg} \end{array}

We transform the vectors \vector{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} and \vector{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, such that:

\begin{align*} T(\vector{e}_1) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 2 \\ 0 \end{pmatrix} = 2\vector{e}_1 \\ T(\vector{e}_2) = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \end{pmatrix} = 3\vector{e}_2 \end{align*}

What we want to achieve in general

Given a matrix A under the standard basis, we want to find another basis \mathfrak{B}=\set{\vector{v}_1, \vector{v}_2} and \lambda_1, \lambda_2 \in \R such that:
[A\vector{x}]_\mathfrak{B} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}[\vector{x}]_\mathfrak{B}

Let P=\begin{pmatrix} | & | \\ \vector{v}_1 & \vector{v}_2 \\ | & | \end{pmatrix}. And recall that \vector{x}=P[\vector{x}]_\mathfrak{B}. Then:

\begin{align*} P^{-1}A\vector{x} &= \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} P^{-1}\vector{x} \\ P^{-1}AP &= \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \end{align*}

First, we need to solve for \vector{v}_1 and \lambda_1.

Let [\vector{x}]_\mathfrak{B}=\begin{pmatrix} 1 \\ 0 \end{pmatrix}_\mathfrak{B}; that is, \vector{x}=\vector{v}_1. We have that

[A\vector{v}_1]_\mathfrak{B} = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}_\mathfrak{B} = \begin{pmatrix} \lambda_1 \\ 0 \end{pmatrix}_\mathfrak{B}

Converting back from \mathfrak{B}-coordinates, this says A\vector{v}_1 = \lambda_1\vector{v}_1 + 0\cdot\vector{v}_2 = \lambda_1\vector{v}_1. Repeating with [\vector{x}]_\mathfrak{B}=\begin{pmatrix} 0 \\ 1 \end{pmatrix}_\mathfrak{B}, we get, for \vector{v}_1,\vector{v}_2\neq\vector{0},
A\vector{v}_1 = \lambda_1\vector{v}_1 \\ A\vector{v}_2 = \lambda_2\vector{v}_2.

So in general, we need to solve A\vector{v}=\lambda\vector{v} with \vector{v}\neq\vector{0}, i.e. (A-\lambda I)\vector{v}=\vector{0}. In other words, we need \ker(A-\lambda I) to be non-trivial.


Eigenvalues

Let A be an n\times n matrix. We say that A is diagonalizable if there exists an invertible matrix P such that

P^{-1}AP = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}

where \lambda_1,\ldots,\lambda_n are called the eigenvalues of A.

Why diagonalize?

Because powers of a diagonal matrix are easy to compute. For example, squaring:

\begin{align*} P^{-1}AP &= \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \\ (P^{-1}AP)^2 &= \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^2 = \begin{pmatrix} \lambda_1^2 & & 0 \\ & \ddots & \\ 0 & & \lambda_n^2 \end{pmatrix} \end{align*}

And notice that

\begin{align*} (P^{-1}AP)^2 &= P^{-1}A\underbrace{PP^{-1}}_I AP \\ &= P^{-1}AAP \\ &= P^{-1}A^2P \end{align*}

We can then derive an expression for A2A^2:

P^{-1}A^2P = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^2 \\ A^2 = P \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^2 P^{-1} = P \begin{pmatrix} \lambda_1^2 & & 0 \\ & \ddots & \\ 0 & & \lambda_n^2 \end{pmatrix} P^{-1}

And so, in general, (P^{-1}AP)^k = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^k = \begin{pmatrix} \lambda_1^k & & 0 \\ & \ddots & \\ 0 & & \lambda_n^k \end{pmatrix}, since:

\begin{align*} (P^{-1}AP)^k &= \underbrace{(P^{-1}A\cancel{P})(\cancel{P^{-1}}A\cancel{P})\thinspace\cdots\thinspace(\cancel{P^{-1}}A\cancel{P})(\cancel{P^{-1}}AP)}_k \\ &= P^{-1}\underbrace{AA\thinspace\cdots\thinspace AA}_k P \\ &= P^{-1}A^k P \end{align*}

And again:

P^{-1}A^kP = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^k \\ A^k = P \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}^k P^{-1} = P \begin{pmatrix} \lambda_1^k & & 0 \\ & \ddots & \\ 0 & & \lambda_n^k \end{pmatrix} P^{-1}

Now, any diagonalizable matrix to the power of k can be found… once we find the matrix P and the eigenvalues \lambda_1,\ldots,\lambda_n.
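
Putting the whole recipe into code (a numpy sketch; it assumes A is diagonalizable, and `matrix_power_by_diagonalization` is just a name made up here):

```python
import numpy as np

def matrix_power_by_diagonalization(A, k):
    """Compute A^k = P D^k P^{-1}; assumes A is diagonalizable."""
    eigenvalues, P = np.linalg.eig(A)   # columns of P are eigenvectors
    D_k = np.diag(eigenvalues ** k)     # D^k: raise each eigenvalue to the k-th power
    return P @ D_k @ np.linalg.inv(P)

A = np.array([[1, 2],
              [2, 1]])
print(matrix_power_by_diagonalization(A, 10))   # matches np.linalg.matrix_power(A, 10)
```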

Finding eigenvalues and P

For an n\times n matrix A and some invertible matrix P where:

P^{-1}AP = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \\ \Downarrow \\ AP = P\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}

First, we write P as a matrix composed of some column vectors \vector{v}_i.

P = \begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & | \end{pmatrix}

Then, on the left-hand side, we have:

\begin{align*} AP &= A\begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & | \end{pmatrix} \\ &= \begin{pmatrix} | & & | \\ A\vector{v}_1 & \cdots & A\vector{v}_n \\ | & & | \end{pmatrix} \end{align*}

And on the right:

\begin{align*} P\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} &= \begin{pmatrix} | & & | \\ \vector{v}_1 & \cdots & \vector{v}_n \\ | & & | \end{pmatrix} \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \\ &= \begin{pmatrix} | & & | \\ \lambda_1\vector{v}_1 & \cdots & \lambda_n\vector{v}_n \\ | & & | \end{pmatrix} \end{align*}

Putting it together, we have:

\begin{align*} AP &= P\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} \\ \begin{pmatrix} | & & | \\ A\vector{v}_1 & \cdots & A\vector{v}_n \\ | & & | \end{pmatrix} &= \begin{pmatrix} | & & | \\ \lambda_1\vector{v}_1 & \cdots & \lambda_n\vector{v}_n \\ | & & | \end{pmatrix} \end{align*}

So, if there exists an invertible matrix P where P^{-1}AP = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}, then there must exist a basis \set{\vector{v}_1,\ldots,\vector{v}_n} with each \vector{v}_i\neq\vector{0} such that:

\begin{array}{ccc} \begin{pmatrix} A\vector{v}_1 = \lambda_1\vector{v}_1 \\ \vdots \\ A\vector{v}_n = \lambda_n\vector{v}_n \\ \end{pmatrix} &\overset{ \raisebox{1em}{ \boxed{A\vector{v}_i - \lambda_i\vector{v}_i = \vector{0}} } }{\iff} &\begin{pmatrix} (A-\lambda_1 I)\vector{v}_1 = \vector{0} \\ \vdots \\ (A-\lambda_n I)\vector{v}_n = \vector{0} \\ \end{pmatrix} \end{array}

Notice on the right-hand side, after subtracting and factoring the vector \vector{v}_i, we have a form that is equivalent to finding the kernel of A-\lambda_i I.

We say that \lambda is an eigenvalue of the matrix A if \ker(A-\lambda I) \neq \set{\vector{0}}. In other words, the kernel must be non-trivial.

Additionally, if \lambda is an eigenvalue:

  • \ker(A-\lambda I) is called the eigenspace of A corresponding to \lambda, and
  • every non-zero \vector{v}\in\ker(A-\lambda I) is called an eigenvector corresponding to \lambda.

To begin, let’s take a simple 2\times2 matrix: A=\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}.

Typically, you’d be asked to:

  1. Find the eigenvalues of A
  2. For each eigenvalue \lambda, find a basis for the respective eigenspace
  3. Diagonalize A

1. Find the eigenvalues of A

We need to find \lambda such that A\vector{v}=\lambda\vector{v} and \vector{v}\neq\vector{0}.

A\vector{v} = \lambda\vector{v} \\ \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \vector{v} = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \vector{v}

Then, we subtract and factor \vector{v} to obtain (A-\lambda I)\vector{v}.

\begin{align*} (A-\lambda I)\vector{v} &= \left( \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} - \lambda\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right)\vector{v} \\ &= \left( \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} - \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \right)\vector{v} \\ &= \begin{pmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix} \vector{v} = \vector{0} \end{align*}

By definition, \lambda is an eigenvalue \iff\ker\begin{pmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix}\neq\set{\vector{0}}.

So… how the fuck do we solve for \lambda? Well, we could do Gaussian. However, it would be kinda nasty because we would need to make the diagonal 1… but \lambda is on the diagonal.

Instead, for a 2\times2 matrix, we can use the fact that

\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad-bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix},\qquad ad-bc\neq0

Recall that for a matrix A=\begin{pmatrix} a & b \\ c & d \end{pmatrix}, its determinant is \det(A)=ad-bc.

And that A^{-1} exists if and only if:

  • \det(A)\neq0
  • \ker(A) = \set{\vector{0}}.

So, for our matrix (again, by definition), \lambda is an eigenvalue if and only if:

\det\begin{pmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix}=0 \iff \ker\begin{pmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix}\neq\set{\vector{0}}

As such:

\begin{align*} \det\begin{pmatrix} 1-\lambda & 2 \\ 2 & 1-\lambda \end{pmatrix} &= (1-\lambda)(1-\lambda)-2\cdot2 \\ &= 1-2\lambda+\lambda^2-4 \\ &= \lambda^2 - 2\lambda - 3 \\ &= (\lambda + 1)(\lambda - 3) = 0 \end{align*} \\ \therefore\lambda=-1,3

The eigenvalues for the matrix \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} are -1 and 3.
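
A one-line check with numpy (not how you’d do it on an exam, but reassuring):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])
print(np.linalg.eigvals(A))   # approximately [ 3. -1.] (order may vary)
```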

2. For each eigenvalue \lambda, find a basis for the respective eigenspace

From the last part, we have two eigenvalues \lambda=-1,3.

First, take \lambda=-1. The eigenspace is given by:

\begin{align*} \ker(A-\lambda I) &= \ker(A-(-1)I) \\ &= \ker(A+I) \\ &= \ker\left( \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) \\ &= \ker\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \end{align*}

Then, to find the kernel, we just Gaussian that bitch.

\operatorname{rref}\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \\ \therefore x = -y,\quad y\in\R \\[1em] \therefore\ker(A+I) = \Set{ y\begin{pmatrix} -1 \\ 1 \end{pmatrix}: y\in\R }

Now, do the same for \lambda=3.

\begin{align*} \ker(A-\lambda I) &= \ker(A-3I) \\ &= \ker\left( \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} - 3\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right) \\ &= \ker\left( \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} - \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \right) \\ &= \ker\begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix} \end{align*}

And again, Gaussian to find the kernel.

\operatorname{rref}\begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix} \\ \therefore x = y,\quad y\in\R \\[1em] \therefore\ker(A-3I) = \Set{ y\begin{pmatrix} 1 \\ 1 \end{pmatrix}: y\in\R }

So, we have that the basis for \lambda=-1 is \Set{\begin{pmatrix} -1 \\ 1 \end{pmatrix}} and the basis for \lambda=3 is \Set{\begin{pmatrix} 1 \\ 1 \end{pmatrix}}.

Then, putting these together, \Set{\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \end{pmatrix}} is a basis of \R^2 consisting of eigenvectors of A.
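
sympy can confirm both eigenvalues and their eigenspace bases in one call (a sketch):

```python
from sympy import Matrix

A = Matrix([[1, 2],
            [2, 1]])

# Each entry is (eigenvalue, algebraic multiplicity, basis of the eigenspace).
for eigenvalue, multiplicity, basis in A.eigenvects():
    print(eigenvalue, basis)
# -1 [Matrix([[-1], [1]])]
#  3 [Matrix([[ 1], [1]])]
```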

To illustrate this, take the eigenvectors \vector{v}_1=\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \vector{v}_2=\begin{pmatrix} 1 \\ 1 \end{pmatrix} and transform them using A.

Recall that:

A\vector{v}_1 = \lambda_1\vector{v}_1 \\ \vdots \\ A\vector{v}_n = \lambda_n\vector{v}_n

So, to transform each of our vectors, we simply plug in our eigenvalues to get:

A\vector{v}_1 = \lambda_1\vector{v}_1 = -1\begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \\ A\vector{v}_2 = \lambda_2\vector{v}_2 = 3\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 3 \\ 3 \end{pmatrix}

\begin{array}{ccc} \includegraphics[height=150px]{../assets/notes_wk7_eigenspace_1.svg} & \raisebox{75px}{$\xrightarrow{A}$} & \includegraphics[height=150px]{../assets/notes_wk7_eigenspace_2.svg} \end{array}

The lines through the origin are the kernels \ker(A-\lambda I) for each eigenvalue \lambda.

\ker(A-\lambda_1 I) = \ker(A+I) = \Set{ y\begin{pmatrix} -1 \\ 1 \end{pmatrix}: y \in \R } = \operatorname{span} \Set{ \begin{pmatrix} -1 \\ 1 \end{pmatrix} } \\ \ker(A-\lambda_2 I) = \ker(A-3I) = \Set{ y\begin{pmatrix} 1 \\ 1 \end{pmatrix}: y \in \R } = \operatorname{span} \Set{ \begin{pmatrix} 1 \\ 1 \end{pmatrix} }

And sure enough, transforming the regular way using matrix multiplication also gives the same result for each vector.

\begin{align*} A\vector{v}_1 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} &= \begin{pmatrix} 1 \\ -1 \end{pmatrix} \\ &= -1\begin{pmatrix} -1 \\ 1 \end{pmatrix} \\ &= \lambda_1\vector{v}_1 \\ A\vector{v}_2 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} &= \begin{pmatrix} 3 \\ 3 \end{pmatrix} \\ &= 3\begin{pmatrix} 1 \\ 1 \end{pmatrix} \\ &= \lambda_2\vector{v}_2 \end{align*}

3. Diagonalize A

In general, what we want is something of the form:

P^{-1}AP = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}

P^{-1}AP is called the diagonalization of A.

From our previous steps, we found that A has two eigenvalues \lambda_1=-1, \lambda_2=3.

Then, the diagonalization of A is

P^{-1}AP = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}

where P is the matrix made up of the eigenvectors \vector{v}_1, \vector{v}_2.

We can also derive this by writing P as the matrix composed of the eigenvectors \vector{v}_1=\begin{pmatrix} -1 \\ 1 \end{pmatrix}, \vector{v}_2=\begin{pmatrix} 1 \\ 1 \end{pmatrix}.

Then:

\begin{align*} AP &= \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ -1 & 3 \end{pmatrix} \\ &= \begin{pmatrix} | & | \\ \lambda_1\vector{v}_1 & \lambda_2\vector{v}_2 \\ | & | \end{pmatrix} = P\begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix} \end{align*} \\ \implies P^{-1}AP = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}

With the diagonalization of A, we can now find A^k for any k\in\N.

P^{-1}A^k P = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}^k \\ \therefore A^k = P\begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}^k P^{-1} = P\begin{pmatrix} (-1)^k & 0 \\ 0 & 3^k \end{pmatrix} P^{-1}

For A=\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}, find A^{10}.

Where P = \begin{pmatrix} | & | \\ \vector{v}_1 & \vector{v}_2 \\ | & | \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}. Then:

\begin{align*} A^{10} &= \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} (-1)^{10} & 0 \\ 0 & 3^{10} \end{pmatrix}\begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}^{-1} \\ &= \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 59049 \end{pmatrix} \cdot \left(-\frac{1}{2}\right)\begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix} \\ &= \begin{pmatrix} 29525 & 29524 \\ 29524 & 29525 \end{pmatrix} \end{align*}
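
And indeed, numpy agrees (a sketch using `matrix_power`, which multiplies directly rather than diagonalizing):

```python
import numpy as np

A = np.array([[1, 2],
              [2, 1]])
print(np.linalg.matrix_power(A, 10))
# [[29525 29524]
#  [29524 29525]]
```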

For 3\times3 matrices

Now, what about a 3\times3 matrix? For example, A=\begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 0 & 3 \end{pmatrix}.

We start by finding the eigenvalues of A, solving \det(A-\lambda I) = 0. This means we’ll need the determinant of a 3\times3 matrix.

For a 3\times3 matrix such as

A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},

the determinant of A is given by (expanding along the first row):

\det(A) = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}

Then, we’d solve for

\det(A-\lambda I) = 0 \\ \det\begin{pmatrix} 1-\lambda & 1 & 0 \\ 1 & 2-\lambda & 1 \\ 0 & 0 & 3-\lambda \end{pmatrix} = 0

to find the eigenvalues… and so on.
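
For this 3\times3 example, sympy can expand the characteristic polynomial and solve it symbolically (a sketch; `charpoly` uses \det(\lambda I - A), which has the same roots, and the printed output format may vary):

```python
from sympy import Matrix, symbols

lam = symbols('lambda')
A = Matrix([[1, 1, 0],
            [1, 2, 1],
            [0, 0, 3]])

p = A.charpoly(lam)           # characteristic polynomial det(lambda*I - A)
print(p.as_expr().factor())   # (lambda - 3)*(lambda**2 - 3*lambda + 1)
print(A.eigenvals())          # {3: 1, 3/2 - sqrt(5)/2: 1, 3/2 + sqrt(5)/2: 1}
```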

Notice that for a 2\times2 matrix, solving for \lambda yields a polynomial of degree two, i.e. a\lambda^2 + b\lambda + c = 0. And so, you’d have at most two eigenvalues.

That means that, here, for a 3\times3 matrix, there will be at most three eigenvalues. And for an n\times n matrix, there’d be at most n eigenvalues.

For 4\times4 matrices and beyond…

Well, you’d need to find the determinant of a 4\times4 matrix. Which, at this point, you may as well kill yourself.
