Final exam
Question 1. (10 points)
(a) What is the definition of a basis of a vector space $V$?

$\set{v_1, \ldots, v_n}$ forms a basis for a vector space $V$ if and only if the vectors are linearly independent and span the space $V$.
(b) What is the definition of the dimension of a vector space $V$?
The dimension of a vector space $V$ is the number of vectors in a basis of $V$.
(c) Explain why the vector space $\mathcal{M}_{3,2}$, the set of all $3 \times 2$ matrices, has dimension $6$.
Consider a matrix $\begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}\in\mathcal{M}_{3,2}$. This matrix can be expanded as

$$
\begin{split}
\begin{bmatrix}
a & b \\ c & d \\ e & f
\end{bmatrix} &= a \begin{bmatrix}
1 & 0 \\ 0 & 0 \\ 0 & 0
\end{bmatrix} + b \begin{bmatrix}
0 & 1 \\ 0 & 0 \\ 0 & 0
\end{bmatrix} + c \begin{bmatrix}
0 & 0 \\ 1 & 0 \\ 0 & 0
\end{bmatrix} \\ &\quad+ d \begin{bmatrix}
0 & 0 \\ 0 & 1 \\ 0 & 0
\end{bmatrix} + e \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 1 & 0
\end{bmatrix} + f \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 1
\end{bmatrix}.
\end{split}
$$
By inspection,
$$
\Set{
\begin{bmatrix}
1 & 0 \\ 0 & 0 \\ 0 & 0
\end{bmatrix}, \begin{bmatrix}
0 & 1 \\ 0 & 0 \\ 0 & 0
\end{bmatrix}, \begin{bmatrix}
0 & 0 \\ 1 & 0 \\ 0 & 0
\end{bmatrix}, \begin{bmatrix}
0 & 0 \\ 0 & 1 \\ 0 & 0
\end{bmatrix}, \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 1 & 0
\end{bmatrix}, \begin{bmatrix}
0 & 0 \\ 0 & 0 \\ 0 & 1
\end{bmatrix}
}
$$
are linearly independent, and every matrix in $\mathcal{M}_{3,2}$ can be expanded in terms of them. Hence, they form a basis of $\mathcal{M}_{3,2}$, and its dimension is six.
Question 2. (10 points) Find the answers to the following problems. Write a brief solution to explain.
a. Suppose that $A$ is an $8 \times 17$ matrix and the kernel of $A$ has dimension $12$. What is the dimension of $\operatorname{Im}(A)$?
If $\dim\ker(A) = 12$, then $\operatorname{rref}(A)$ has $17 - 12 = 5$ pivot columns, because the other $12$ columns correspond to free variables. The corresponding pivot columns of $A$ form a basis of $\operatorname{Im}(A)$. As such, the dimension of $\operatorname{Im}(A)$ is five.
b. Find the inverse of the matrix $\begin{pmatrix}\cos \theta & \sin \theta \\ -\sin \theta & \cos \theta\end{pmatrix}$.
$$
\begin{pmatrix}
\cos \theta & \sin \theta \\
-\sin \theta & \cos \theta
\end{pmatrix}^{-1} =
\frac{1}{\cos^2\theta+\sin^2\theta}\begin{pmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta
\end{pmatrix} =
\begin{pmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta
\end{pmatrix}
$$
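As a quick numerical sanity check (not part of the required written solution), the claimed inverse can be verified for a sample angle with NumPy; the angle and variable names below are arbitrary choices.

```python
import numpy as np

theta = 0.7  # arbitrary sample angle
M = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
M_inv = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

# Both checks should print True (up to floating-point error).
print(np.allclose(M @ M_inv, np.eye(2)))
print(np.allclose(np.linalg.inv(M), M_inv))
```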
c. Find the dimension of the following subspace of $\R^4$: $W = \set{(x,y,z,w) : x + y + z + w = 0}$.
$$
x = -y-z-w \\
W = \Set{
\begin{pmatrix}
-y-z-w \\ y \\ z \\ w
\end{pmatrix}: y,z,w \in\R
} = \operatorname{span}\Set{
\begin{pmatrix}
-1 \\ 1 \\ 0 \\ 0
\end{pmatrix},
\begin{pmatrix}
-1 \\ 0 \\ 1 \\ 0
\end{pmatrix},
\begin{pmatrix}
-1 \\ 0 \\ 0 \\ 1
\end{pmatrix}
} \\
\therefore \dim W = 3
$$
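The same conclusion can be checked numerically: $W$ is the kernel of the $1\times 4$ matrix $[1\ 1\ 1\ 1]$, so by rank–nullity its dimension is $4 - 1 = 3$. A minimal NumPy sketch (the variable names are my own):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 1.0]])  # W = ker(A)

# dim W = (number of columns) - rank(A), by rank-nullity.
print(A.shape[1] - np.linalg.matrix_rank(A))  # 3

# The three claimed spanning vectors are linearly independent
# and each satisfies x + y + z + w = 0.
B = np.array([[-1, 1, 0, 0],
              [-1, 0, 1, 0],
              [-1, 0, 0, 1]], dtype=float).T
print(np.linalg.matrix_rank(B))  # 3
print(np.allclose(A @ B, 0))     # True
```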
d. Find the determinant of the matrix $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 100 & 100 & 100\end{pmatrix}$.
$$
\begin{pmatrix}
1 & 1 & 1 \\
1 & 2 & 3 \\
100 & 100 & 100
\end{pmatrix}
\xrightarrow{R_2 - R_1}
\begin{pmatrix}
1 & 1 & 1 \\
0 & 1 & 2 \\
100 & 100 & 100
\end{pmatrix}
\xrightarrow{R_3 - 100R_1}
\begin{pmatrix}
1 & 1 & 1 \\
0 & 1 & 2 \\
0 & 0 & 0
\end{pmatrix}
\\[1em]
\therefore \det A = 1\cdot1\cdot0 = 0
$$
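For a quick check (not required by the exam), the determinant can also be computed directly with NumPy; note that the third row is $100$ times the first, which already forces $\det A = 0$.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 2, 3],
              [100, 100, 100]], dtype=float)
print(np.linalg.det(A))  # 0.0 (up to floating-point error)
```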
Question 3. (15 points) Let $A = \begin{bmatrix} 1 & 3 & 4 & 5 \\ 0 & 0 & 2 & 6 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0\end{bmatrix}$
(a) Using the Gram–Schmidt process, find an orthogonal basis for $\operatorname{Im}(A)$.
$$
\operatorname{rref}(A) = \begin{bmatrix}
1 & 3 & 0 & -7 \\
0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\implies
\operatorname{Im}(A) = \operatorname{span}\Set{
\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix},
\begin{pmatrix}
4 \\ 2 \\ 0 \\ 0
\end{pmatrix}
}
$$

The pivots are in columns 1 and 3, so the first and third columns of $A$ form a basis of $\operatorname{Im}(A)$.
Let $\vector{v}_1 = \begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}$. Then:
$$
\begin{align*}
\vector{v}_2 &= \begin{pmatrix}
4 \\ 2 \\ 0 \\ 0
\end{pmatrix} - \frac{\left\langle\begin{pmatrix}
4 \\ 2 \\ 0 \\ 0
\end{pmatrix}, \begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}\right\rangle}{
\left\lVert\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}\right\rVert^2
}\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix} \\
&= \begin{pmatrix}
4 \\ 2 \\ 0 \\ 0
\end{pmatrix} - 4\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix} \\
&= \begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}
\end{align*}
$$
As such, an orthogonal basis for $\operatorname{Im}(A)$ is $\set{\vector{v}_1, \vector{v}_2} = \Set{
\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix},\begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}
}$.
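A short NumPy sketch of the same Gram–Schmidt step, useful as a sanity check (the variable names are my own):

```python
import numpy as np

a1 = np.array([1.0, 0.0, 0.0, 0.0])  # first pivot column of A
a2 = np.array([4.0, 2.0, 0.0, 0.0])  # third pivot column of A

v1 = a1
v2 = a2 - (a2 @ v1) / (v1 @ v1) * v1  # subtract the projection onto v1

print(v2)       # [0. 2. 0. 0.]
print(v1 @ v2)  # 0.0 -> v1 and v2 are orthogonal
```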
(b) Find a basis for the orthogonal complement of $\operatorname{Im}(A)$.
$$
A^\top = \begin{bmatrix}
1 & 0 & 0 & 0 \\
3 & 0 & 0 & 0\\
4 & 2 & 0 & 0\\
5 & 6 & 0 & 0
\end{bmatrix}
\implies
\operatorname{rref}(A^\top) = \begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0\\
0 & 0 & 0 & 0\\
0 & 0 & 0 & 0
\end{bmatrix}
\\[1em]
\therefore\operatorname{Im}(A)^\perp = \ker(A^\top) = \operatorname{span}\Set{
\begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix},
\begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix}
}
$$
As such, a basis for the orthogonal complement of $\operatorname{Im}(A)$ is $\Set{
\begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix},
\begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix}
}$.
(c) Let $\mathbf{b} = \begin{pmatrix}0 \\ 2 \\ 1 \\ -1\end{pmatrix}$.
(i) Find the orthogonal projection of $\mathbf{b}$ onto $\operatorname{Im}(A)$.
From (a), an orthogonal basis for $\operatorname{Im}(A)$ is $\Set{
\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix},\begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}
}$. As such:
$$
\begin{align*}
\operatorname{proj}_{\operatorname{Im}(A)} (\mathbf{b})
&= \frac{\left\langle\begin{pmatrix}
0 \\ 2 \\ 1 \\ -1
\end{pmatrix}, \begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}\right\rangle}{\left\lVert
\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}
\right\rVert^2}\begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix}
+ \frac{\left\langle\begin{pmatrix}
0 \\ 2 \\ 1 \\ -1
\end{pmatrix}, \begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}\right\rangle}{\left\lVert
\begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}
\right\rVert^2}\begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix} \\
&= 0 \begin{pmatrix}
1 \\ 0 \\ 0 \\ 0
\end{pmatrix} + \frac{4}{4}\begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix} \\
&= \begin{pmatrix}
0 \\ 2 \\ 0 \\ 0
\end{pmatrix}
\end{align*}
$$
(ii) Find the orthogonal projection of $\mathbf{b}$ onto the orthogonal complement of $\operatorname{Im}(A)$.
From (b), an orthogonal basis for the orthogonal complement of $\operatorname{Im}(A)$ is $\Set{
\begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix},
\begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix}
}$. As such:
$$
\begin{align*}
\operatorname{proj}_{\operatorname{Im}(A)^\perp} (\mathbf{b})
&= \frac{\left\langle\begin{pmatrix}
0 \\ 2 \\ 1 \\ -1
\end{pmatrix}, \begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix}\right\rangle}{\left\lVert
\begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix}
\right\rVert^2}\begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix}
+ \frac{\left\langle\begin{pmatrix}
0 \\ 2 \\ 1 \\ -1
\end{pmatrix}, \begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix}\right\rangle}{\left\lVert
\begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix}
\right\rVert^2}\begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix} \\
&= \frac{1}{1} \begin{pmatrix}
0 \\ 0 \\ 1 \\ 0
\end{pmatrix} - \frac{1}{1} \begin{pmatrix}
0 \\ 0 \\ 0 \\ 1
\end{pmatrix} \\
&= \begin{pmatrix}
0 \\ 0 \\ 1 \\ -1
\end{pmatrix}
\end{align*}
$$
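Both projections can be double-checked numerically; since $\operatorname{Im}(A)$ and its orthogonal complement decompose $\R^4$, the two projections should add back up to $\mathbf{b}$. A minimal sketch (the function and variable names are my own):

```python
import numpy as np

b  = np.array([0.0, 2.0, 1.0, -1.0])
v1 = np.array([1.0, 0.0, 0.0, 0.0])
v2 = np.array([0.0, 2.0, 0.0, 0.0])
w1 = np.array([0.0, 0.0, 1.0, 0.0])
w2 = np.array([0.0, 0.0, 0.0, 1.0])

def proj(b, basis):
    # Orthogonal projection of b onto span(basis); assumes basis is orthogonal.
    return sum((b @ v) / (v @ v) * v for v in basis)

p = proj(b, [v1, v2])  # projection onto Im(A)
q = proj(b, [w1, w2])  # projection onto the orthogonal complement of Im(A)
print(p)                      # [0. 2. 0. 0.]
print(q)                      # [ 0.  0.  1. -1.]
print(np.allclose(p + q, b))  # True
```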
Question 4. (10 points) Suppose that we want to find the least squares best-fitting hyperplane $z = Ax + By + C$ for a set of data points $(x_1,y_1,z_1),\ldots,(x_k,y_k,z_k)$. Explain step by step the procedure we need to follow.
First, interpret the data points $(x_1,y_1,z_1),\ldots,(x_k,y_k,z_k)$ as a system of equations:
$$
\left\{\begin{array}{c}
z_1 = Ax_1 + By_1 + C \\
\vdots \\
z_k = Ax_k + By_k + C
\end{array}\right.
$$
Then, the system can be written in matrix form $M\mathbf{\hat{x}} = \vector{b}$, where $M$ is the data matrix and $\mathbf{\hat{x}}$ collects the unknown coefficients (we write $M$ rather than $A$ to avoid clashing with the coefficient $A$ of the hyperplane):
$$
\begin{bmatrix}
z_1 \\ \vdots \\ z_k
\end{bmatrix} = \begin{bmatrix}
x_1 & y_1 & 1 \\[.5em]
\vdots & \vdots & \vdots \\
x_k & y_k & 1
\end{bmatrix} \begin{pmatrix}
\hat{A} \\ \hat{B} \\ \hat{C}
\end{pmatrix}
$$
Finally, to find $\mathbf{\hat{x}}$, note that this system is generally overdetermined and need not have an exact solution, so we apply $M^\top$ to both sides and solve the normal equations (assuming $M^\top M$ is invertible):
$$
M^\top M\mathbf{\hat{x}} = M^\top\vector{b} \\
\therefore \mathbf{\hat{x}} = (M^\top M)^{-1}M^\top \vector{b}
$$
Thus, $\mathbf{\hat{x}} = \begin{pmatrix}
\hat{A} \\ \hat{B} \\ \hat{C}
\end{pmatrix}$ is the least squares solution, giving us the best-fitting hyperplane $z = \hat{A}x + \hat{B}y + \hat{C}$.
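A minimal NumPy sketch of this procedure on made-up data (the sample points and variable names are assumptions for illustration, not part of the exam):

```python
import numpy as np

# Hypothetical data points (x_i, y_i, z_i).
pts = np.array([[0.0, 0.0, 1.1],
                [1.0, 0.0, 3.0],
                [0.0, 1.0, 0.2],
                [1.0, 1.0, 2.1],
                [2.0, 1.0, 4.0]])
x, y, z = pts[:, 0], pts[:, 1], pts[:, 2]

# Design matrix M with rows (x_i, y_i, 1); the right-hand side is z.
M = np.column_stack([x, y, np.ones_like(x)])

# Solve the normal equations M^T M xhat = M^T z.
xhat = np.linalg.solve(M.T @ M, M.T @ z)
A_hat, B_hat, C_hat = xhat
print(A_hat, B_hat, C_hat)

# Cross-check against NumPy's built-in least squares solver.
print(np.allclose(xhat, np.linalg.lstsq(M, z, rcond=None)[0]))  # True
```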
Question 5 (15 points)
(a) State the definition of eigenvalue and eigenvectors of a matrix $A$.
We say that $\lambda$ is an eigenvalue of $A$ if we can find some vector $\vector{v}\neq\vector{0}$ such that $A\vector{v}=\lambda\vector{v}$. Subsequently, $\vector{v}$ is an eigenvector associated with $\lambda$.
(b) State the definition of geometric multiplicity and algebraic multiplicity of the eigenvalue $\lambda$ for the matrix $A$.
The geometric multiplicity of an eigenvalue $\lambda$ is $\dim\ker(A-\lambda I)$.

The algebraic multiplicity of an eigenvalue $\lambda_i$ is the highest power $p$ such that $(\lambda - \lambda_i)^p$ is a factor of $\det(A - \lambda I)$.
(c) Let $A = \begin{pmatrix} 3 & -2 & 4 & -4 \\ 1 & 0 & 2 & -2 \\ -1 & 1 & -1 & 2 \\ -1 & 1 & -2 & 3\end{pmatrix}$. Find the eigenvalues of $A$ (computer is allowed, but you need to write down the polynomial equation required to solve) and determine if $A$ is diagonalizable.
$$
A-\lambda I = \begin{pmatrix}
3-\lambda & -2 & 4 & -4 \\
1 & -\lambda & 2 & -2 \\
-1 & 1 & -1-\lambda & 2 \\
-1 & 1 & -2 & 3-\lambda
\end{pmatrix}
\\[1em]
\det(A-\lambda I) = (\lambda-1)^3(\lambda-2) = 0 \\
\therefore\lambda = 1,2
$$
The eigenvalues of $A$ are $1$ and $2$, with algebraic multiplicities of $3$ and $1$, respectively.
For $A$ to be diagonalizable, the geometric multiplicity must equal the algebraic multiplicity for every eigenvalue.
For $\lambda = 1$,
$$
\operatorname{rref}(A-I) = \operatorname{rref}\begin{pmatrix}
3-1 & -2 & 4 & -4 \\
1 & 0-1 & 2 & -2 \\
-1 & 1 & -1-1 & 2 \\
-1 & 1 & -2 & 3-1
\end{pmatrix} = \begin{pmatrix}
1 & -1 & 2 & -2 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
$$
There are three free variables, so $\dim\ker(A-I)=3$, and the geometric and algebraic multiplicities for $\lambda=1$ match.
For $\lambda = 2$,
$$
\operatorname{rref}(A-2I) = \operatorname{rref}\begin{pmatrix}
3-2 & -2 & 4 & -4 \\
1 & 0-2 & 2 & -2 \\
-1 & 1 & -1-2 & 2 \\
-1 & 1 & -2 & 3-2
\end{pmatrix} = \begin{pmatrix}
1 & 0 & 0 & 2 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0
\end{pmatrix}
$$
There is one free variable, so $\dim\ker(A-2I)=1$, and the geometric and algebraic multiplicities for $\lambda=2$ also match.
As such, we conclude that A A A is diagonalizable.
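These computations can be verified numerically (a sketch, not part of the written solution; it simply checks the characteristic polynomial and that a full set of independent eigenvectors exists):

```python
import numpy as np

A = np.array([[ 3, -2,  4, -4],
              [ 1,  0,  2, -2],
              [-1,  1, -1,  2],
              [-1,  1, -2,  3]], dtype=float)

# Characteristic polynomial of A vs. the expansion of (x - 1)^3 (x - 2).
print(np.round(np.poly(A), 6))  # [ 1. -5.  9. -7.  2.]
print(np.poly([1, 1, 1, 2]))    # [ 1. -5.  9. -7.  2.]

# Eigenvalues and eigenvectors; a rank-4 eigenvector matrix is consistent
# with A having four linearly independent eigenvectors (diagonalizable).
w, V = np.linalg.eig(A)
print(np.round(w.real, 6))       # 1, 1, 1, 2 in some order
print(np.linalg.matrix_rank(V))  # 4
```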
Question 6. (10 points) Let $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5$ be any 5 vectors in a vector space $V$ of dimension $4$. Determine if the following statements are correct. Explain.
(i) These 5 vectors must be linearly dependent.
True. Since $\dim V = 4$, at most four of these vectors can be linearly independent, so at least one of the five must be a linear combination of the others. Hence the five vectors are linearly dependent.
False. Consider $\mathbf{v}_1 = \mathbf{v}_2 = \mathbf{v}_3 = \mathbf{v}_4 = \mathbf{v}_5$.
True. Let $W\subseteq V$ be the subspace $W=\operatorname{span}\set{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5}$. A basis of $W$ is obtained by keeping a maximal linearly independent subset of these vectors. To be clear, this only gives a basis of $W$, which need not span the whole vector space $V$.
Question 7. (15 points)
(a) Define rigorously the least squares solution for the system $A\mathbf{x} = \mathbf{b}$. Using your definition, explain why, if the system $A\mathbf{x} = \mathbf{b}$ has a solution $\mathbf{x}_0$, then $\mathbf{x}_0$ must be the least squares solution.
The least squares solution for the system $A\mathbf{x}=\mathbf{b}$ is a vector $\mathbf{\hat{x}}$ such that $\|\mathbf{b}-A\mathbf{\hat{x}}\|$ is minimized. More concretely, it is a vector satisfying

$$\|\mathbf{b}-A\mathbf{\hat{x}}\| \le \|\mathbf{b}-A\mathbf{x}\|$$

for all $\mathbf{x}$. In other words, the whole point of finding the least squares solution is to satisfy the system $A\mathbf{x}=\mathbf{b}$ as closely as possible.
However, if $A\mathbf{x}=\mathbf{b}$ has a solution $\mathbf{x}_0$, then $\|\mathbf{b}-A\mathbf{x}_0\| = 0$, which is the minimum possible value of this distance. Hence, $\mathbf{x}_0$ must be the least squares solution.
(b) Let $A$ be an $m \times n$ matrix with $\operatorname{rank}(A) = n$. Let also $A = U\Sigma V^\top$ be its singular value decomposition. Show that the least square solution of the system $A\mathbf{x} = \mathbf{b}$ is equal to $\hat{\mathbf{x}} = \frac{\langle \mathbf{b}, \mathbf{u}_1 \rangle}{\sigma_1}\mathbf{v}_1 + \cdots + \frac{\langle \mathbf{b}, \mathbf{u}_n \rangle}{\sigma_n}\mathbf{v}_n$
If $A$ is an $m\times n$ matrix with $\operatorname{rank}(A)=n$, then $A^\top A$ is an $n\times n$ matrix with $\operatorname{rank}(A^\top A)=n$, so $A^\top A$ is invertible.
Then, let
$$
U = \begin{bmatrix}
| && | \\
\mathbf{u}_1 & \cdots & \mathbf{u}_n \\
| && |
\end{bmatrix},\quad
\Sigma = \begin{bmatrix}
\sigma_1 && \\
& \ddots & \\
&& \sigma_n
\end{bmatrix},\quad
V = \begin{bmatrix}
| && | \\
\mathbf{v}_1 & \cdots & \mathbf{v}_n \\
| && |
\end{bmatrix},
$$
where we use the reduced singular value decomposition: $U$ is $m\times n$ with $U^\top U = I$, $\Sigma$ is the square $n\times n$ diagonal matrix of singular values (so $\Sigma^\top = \Sigma$), and $V$ is $n\times n$ orthogonal, so $V^\top V = VV^\top = I$.
Since $A^\top A$ is invertible, the least squares solution of the system $A\mathbf{x}=\mathbf{b}$ is

$$\mathbf{\hat{x}} = (A^\top A)^{-1} A^\top \mathbf{b}.$$
Now, let's unpack $A^\top A$.
$$
A = U\Sigma V^\top \implies A^\top = V\Sigma^\top U^\top = V\Sigma U^\top \\[1em]
\begin{align*}
A^\top A &= V\Sigma \underbrace{U^\top U}_I \Sigma V^\top \\
&= V\begin{bmatrix}
\sigma_1 && \\
& \ddots & \\
&& \sigma_n
\end{bmatrix}\begin{bmatrix}
\sigma_1 && \\
& \ddots & \\
&& \sigma_n
\end{bmatrix}V^\top \\
&= V\begin{bmatrix}
\sigma_1^2 && \\
& \ddots &\\
&& \sigma_n^2
\end{bmatrix}V^\top
\end{align*}
$$
Then, taking the inverse yields:
$$
\begin{align*}
(A^\top A)^{-1} &= \left(V\begin{bmatrix}
\sigma_1^2 && \\
& \ddots &\\
&& \sigma_n^2
\end{bmatrix}V^\top\right)^{-1} \\
&= V\begin{bmatrix}
\frac{1}{\sigma_1^2} && \\
& \ddots &\\
&& \frac{1}{\sigma_n^2}
\end{bmatrix}V^\top
\end{align*}
$$

since $V^{-1} = V^\top$.
Finally, plugging it into the expression for $\mathbf{\hat{x}}$:
$$
\begin{align*}
\mathbf{\hat{x}} &= (A^\top A)^{-1} A^\top \mathbf{b} \\
&= V\begin{bmatrix}
\frac{1}{\sigma_1^2} && \\
& \ddots &\\
&& \frac{1}{\sigma_n^2}
\end{bmatrix}\underbrace{V^\top V}_I \Sigma U^\top \mathbf{b} \\
&= \begin{bmatrix}
| && | \\
\mathbf{v}_1 & \cdots & \mathbf{v}_n \\
| && |
\end{bmatrix}
\begin{bmatrix}
\frac{1}{\sigma_1^2} && \\
& \ddots &\\
&& \frac{1}{\sigma_n^2}
\end{bmatrix}
\begin{bmatrix}
\sigma_1 && \\
& \ddots & \\
&& \sigma_n
\end{bmatrix}
\begin{bmatrix}
\mathbf{u}_1^\top \\ \vdots \\ \mathbf{u}_n^\top
\end{bmatrix} \mathbf{b} \\
&= \begin{bmatrix}
| && | \\
\mathbf{v}_1 & \cdots & \mathbf{v}_n \\
| && |
\end{bmatrix}
\begin{bmatrix}
\frac{1}{\sigma_1} && \\
& \ddots &\\
&& \frac{1}{\sigma_n}
\end{bmatrix}
\begin{bmatrix}
\langle\mathbf{b}, \mathbf{u}_1\rangle \\
\vdots \\
\langle\mathbf{b}, \mathbf{u}_n\rangle
\end{bmatrix} \\
&= \begin{bmatrix}
| && | \\
\mathbf{v}_1 & \cdots & \mathbf{v}_n \\
| && |
\end{bmatrix}
\begin{bmatrix}
\frac{\langle\mathbf{b}, \mathbf{u}_1\rangle}{\sigma_1} \\
\vdots \\
\frac{\langle\mathbf{b}, \mathbf{u}_n\rangle}{\sigma_n}
\end{bmatrix} \\
&= \frac{\langle \mathbf{b}, \mathbf{u}_1 \rangle}{\sigma_1}\mathbf{v}_1
+ \cdots
+ \frac{\langle \mathbf{b}, \mathbf{u}_n \rangle}{\sigma_n}\mathbf{v}_n
\end{align*}
$$
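A small NumPy check of this formula against a direct least squares solve; the matrix dimensions and random data below are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 6, 3
A = rng.standard_normal((m, n))  # full column rank with probability 1
b = rng.standard_normal(m)

# Reduced SVD: U is m x n, s holds sigma_1..sigma_n, Vt has rows v_i^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# x_hat = sum_i <b, u_i> / sigma_i * v_i
x_hat = sum((b @ U[:, i]) / s[i] * Vt[i, :] for i in range(n))

# Compare with NumPy's least squares solution.
print(np.allclose(x_hat, np.linalg.lstsq(A, b, rcond=None)[0]))  # True
```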
Question 8. (15 points) Let $\mathcal{P}_n$ be the vector space of polynomials of degree at most $n$. Let $W_1 = \set{P(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n : P(1) = 0}$.
(i) Show that $W_1$ is a subspace of $\mathcal{P}_n$.
Checking if $W_1$ is closed under addition
Consider $P_1,P_2\in W_1$. Then, $P_1(1) = 0$ and $P_2(1) = 0$. Subsequently,

$$(P_1 + P_2)(1) = P_1(1) + P_2(1) = 0 + 0 = 0.$$

Hence, $P_1 + P_2\in W_1$.
Checking if $W_1$ is closed under scalar multiplication
Consider $P\in W_1$ and $\alpha\in\R$. Then, $P(1) = 0$, so $(\alpha P)(1) = \alpha P(1) = \alpha(0) = 0$.

Hence, $\alpha P\in W_1$.
Further, $W_1$ contains the zero polynomial, so it is nonempty. Since $W_1$ is nonempty and closed under addition and scalar multiplication, it is a subspace of $\mathcal{P}_n$.
(ii) Find a basis for $W_1$.
Consider a general polynomial $P(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$. For $P\in W_1$, we need $P(1)=0$:

$$
\begin{align*}
P(1) &= a_0 + a_1(1) + a_2(1)^2 + \cdots + a_n(1)^n \\
&= a_0 + a_1 + a_2 + \cdots + a_n = 0
\end{align*}
\\
\therefore a_0 + a_1 + a_2 + \cdots + a_n = 0
$$
So, we need the sum of the coefficients to be zero. An easy way to satisfy this, for each degree $i = 1, \ldots, n$, is to:

- let the coefficient of the degree-zero term be $a_0=1$,
- let the coefficient of the degree-$i$ term be $a_i=-1$,
- and set the coefficients of all other terms to zero.
$$
a_0 + \cancel{a_1 x + a_2x^2 + \cdots + a_{n-1}x^{n-1}} + a_nx^n
\;\xrightarrow{\;a_0 = 1,\ a_n = -1\;}\;
1 - x^n
$$
Simply put, we want to kill off the middle terms so that we get polynomials that evaluate to $1-1^i=0$ at $x = 1$.
And so, we have:
$$
\begin{align*}
P_1(x) &= 1 - x \\
P_2(x) &= 1 - x^2 \\
&\vdots \\
P_n(x) &= 1-x^n
\end{align*}
$$
By observation, $P_i(1) = 1 - 1^i = 0$, so each $P_i\in W_1$.

These polynomials are linearly independent: if $c_1P_1(x) + c_2P_2(x) + \cdots + c_nP_n(x) = 0$ for some scalars $c_1,c_2,\ldots,c_n\in\R$, then comparing the coefficient of $x^i$ on both sides gives $-c_i = 0$ for each $i = 1,\ldots,n$.

They also span $W_1$: if $P(x) = a_0 + a_1x + \cdots + a_nx^n \in W_1$, then $a_0 = -(a_1 + a_2 + \cdots + a_n)$, so

$$P(x) = -a_1(1-x) - a_2(1-x^2) - \cdots - a_n(1-x^n).$$

As such, a basis of $W_1$ is $\Set{1-x,\ 1-x^2,\ \ldots,\ 1-x^n}$.
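For a concrete sanity check, the coefficient vectors of $1-x^i$ can be stacked into a matrix and inspected numerically; the degree $n=4$ below is an arbitrary sample choice.

```python
import numpy as np

n = 4  # sample degree

# Column j holds the coefficients (constant term first) of 1 - x^(j+1).
basis = np.zeros((n + 1, n))
basis[0, :] = 1                      # constant coefficient a_0 = 1
for i in range(1, n + 1):
    basis[i, i - 1] = -1             # coefficient of x^i is -1

print(np.linalg.matrix_rank(basis))  # n -> the n polynomials are independent
print(basis.sum(axis=0))             # all zeros -> each polynomial vanishes at x = 1
```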
We now let $W = \set{P(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n : P(i) = 0, \text{ for all } i = 1, 2, \ldots, n}$.

(iii) Google online the definition of the "Vandermonde matrix" and write down the determinant of the Vandermonde matrix.
From Wikipedia:
In linear algebra, a Vandermonde matrix, named after Alexandre-Théophile Vandermonde, is a matrix with the terms of a geometric progression in each row: an $(m + 1) \times (n + 1)$ matrix
$$
V = V(x_0, x_1, \cdots, x_m) =
\begin{bmatrix}
1 & x_0 & x_0^2 & \dots & x_0^n \\
1 & x_1 & x_1^2 & \dots & x_1^n \\
1 & x_2 & x_2^2 & \dots & x_2^n \\
\vdots & \vdots & \vdots & \ddots &\vdots \\
1 & x_m & x_m^2 & \dots & x_m^n
\end{bmatrix}
$$
with entries $V_{i,j} = x_i^j$, the $j$th power of the number $x_i$, for all zero-based indices $i$ and $j$.
The determinant of a square Vandermonde matrix (when $n=m$) is called a Vandermonde determinant or Vandermonde polynomial. Its value is:
$$\det(V) = \prod_{0 \le i < j \le n} (x_j - x_i).$$
This is non-zero if and only if all $x_i$ are distinct (no two are equal), making the Vandermonde matrix invertible.
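The determinant formula can be spot-checked with NumPy's built-in Vandermonde constructor; the nodes below are an arbitrary sample.

```python
import numpy as np
from itertools import combinations

x = np.array([1.0, 2.0, 4.0, 7.0])     # distinct sample nodes
V = np.vander(x, increasing=True)      # rows (1, x_i, x_i^2, x_i^3)

det_formula = np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])
print(np.linalg.det(V), det_formula)              # both 540 (up to rounding)
print(np.isclose(np.linalg.det(V), det_formula))  # True
```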
(iv) Using the Vandermonde matrix, show that $W = \set{0}$.
If $P\in W$, then it is a polynomial of the form

$$P(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$$

such that it satisfies

$$P(x_0) = y_0,\quad P(x_1) = y_1,\quad \ldots,\quad P(x_m) = y_m.$$
In our case, we have that
$$
\vector{x} = \begin{bmatrix}
x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_m
\end{bmatrix} = \begin{bmatrix}
1 \\ 2 \\ 3 \\ \vdots \\ m
\end{bmatrix}
\quad\text{and}\quad
\vector{y} = \begin{bmatrix}
y_0 \\ y_1 \\ y_2 \\ \vdots \\ y_m
\end{bmatrix} = \begin{bmatrix}
P(x_0) \\ P(x_1) \\ P(x_2) \\ \vdots \\ P(x_m)
\end{bmatrix} = \begin{bmatrix}
0 \\ 0 \\ 0 \\ \vdots \\ 0
\end{bmatrix}.
$$
Here, we note that $\vector{y}=\vector{0}$ because
$$
\begin{align*}
P\in W &\iff P(x_0) = P(x_1) = P(x_2) = \cdots = P(x_m) = 0 \\
&\iff P(1) = P(2) = P(3) = \cdots = P(m) = 0.
\end{align*}
$$
Then, our Vandermonde matrix $V$ is given by:
$$
V=
\begin{bmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^n \\
1 & x_1 & x_1^2 & \cdots & x_1^n \\
1 & x_2 & x_2^2 & \cdots & x_2^n \\
\vdots & \vdots & \vdots & \ddots &\vdots \\
1 & x_m & x_m^2 & \cdots & x_m^n
\end{bmatrix}=
\begin{bmatrix}
1 & 1 & 1^2 & \cdots & 1^n \\
1 & 2 & 2^2 & \cdots & 2^n \\
1 & 3 & 3^2 & \cdots & 3^n \\
\vdots & \vdots & \vdots & \ddots &\vdots \\
1 & m & m^2 & \cdots & m^n
\end{bmatrix}
$$
Let $\vector{a} = \begin{bmatrix}
a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n
\end{bmatrix}$ be the column vector of coefficients. Then, we can construct a system $V\vector{a} = \vector{y}$. Since $\vector{y}=\vector{0}$, this is a homogeneous system.
$$
V\vector{a} = \vector{y} \\
\begin{bmatrix}
1 & 1 & 1^2 & \cdots & 1^n \\
1 & 2 & 2^2 & \cdots & 2^n \\
1 & 3 & 3^2 & \cdots & 3^n \\
\vdots & \vdots & \vdots & \ddots &\vdots \\
1 & m & m^2 & \cdots & m^n
\end{bmatrix}
\begin{bmatrix}
a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_n
\end{bmatrix} =
\begin{bmatrix}
0 \\ 0 \\ 0 \\ \vdots \\ 0
\end{bmatrix}
$$
This homogeneous system certainly has the trivial solution $\vector{a}=\vector{0}$; it remains to check that it is the only one. Since $m = n$, $V$ is square, and since the $x_i$ are distinct (consecutive positive integers), $\det(V)\neq 0$, so $V$ is invertible. Therefore the system has only the trivial solution, and $\vector{a}=V^{-1}\vector{y}=\vector{0}$.
Since

$$\vector{a} = \vector{0} \implies a_0=\cdots=a_n=0,$$

such a polynomial $P\in W$ must be the zero polynomial. Hence, $W = \set{0}$ is the trivial subspace.
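The key step, that a square Vandermonde matrix on distinct nodes forces the homogeneous system to have only the trivial solution, can be illustrated numerically (the nodes below are an arbitrary sample choice):

```python
import numpy as np

nodes = np.arange(1.0, 6.0)              # distinct nodes 1, 2, 3, 4, 5
V = np.vander(nodes, increasing=True)    # square 5x5 Vandermonde matrix

print(np.linalg.det(V) != 0)             # True: V is invertible
a = np.linalg.solve(V, np.zeros(len(nodes)))
print(a)                                 # [0. 0. 0. 0. 0.] -> only the trivial solution
```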