Week 6 Lecture Notes

This week we’ll continue to look at least squares regression using linear algebra.

Monday, February 17

Last week, we introduced the four fundamental subspaces of a matrix. We ended with the following exercise that we didn’t have time to finish:

  1. (Exercise) Find the range and null space for the 3-by-3 matrix $e e^T = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$.

  2. (Exercise) Why is every column of $A$ in $\operatorname{Range}(A)$?

  3. (Exercise) Prove that $\operatorname{Null}(A) \perp \operatorname{Range}(A^T)$.
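A quick numerical sketch of Exercise 1 (assuming numpy is available): we can check the rank of $e e^T$ and recover an orthonormal basis for its null space from the SVD.

```python
import numpy as np

# The 3-by-3 all-ones matrix e e^T from Exercise 1.
e = np.ones((3, 1))
A = e @ e.T

# Every column of A is a copy of e, so Range(A) = span{e} and rank(A) = 1.
print(np.linalg.matrix_rank(A))  # 1

# The rows of Vt whose singular values vanish form an orthonormal basis
# for Null(A); here that is the plane { v : v_1 + v_2 + v_3 = 0 }.
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[np.isclose(s, 0)]
print(null_basis.shape[0])               # 2  (= n - rank)
print(np.allclose(A @ null_basis.T, 0))  # True
```

This matches the pencil-and-paper answer: the range is the line spanned by $e$ and the null space is the 2-dimensional plane of vectors whose entries sum to zero.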

Definition (Complementary Subspaces)

Two subspaces $U, V \subseteq \mathbb{R}^n$ are called complementary if $U \cap V = \{ 0 \}$ and $U \cup V$ spans $\mathbb{R}^n$. An equivalent condition is that $\dim U + \dim V = n$ while $\dim (U \cap V) = 0$.


Definition (Orthogonal Complement)

The orthogonal complement of a set $V \subseteq \mathbb{R}^n$ is the subspace $V^{\perp} = \{ w \in \mathbb{R}^n : w^T v = 0 \text{ for all } v \in V \}$.

Note that two subspaces $U, V \subseteq \mathbb{R}^n$ are orthogonal complements if and only if they are orthogonal and complementary. In particular, if $U = V^{\perp}$, then $V = U^{\perp}$ and $(V^{\perp})^{\perp} = V$ as long as $V$ is a subspace.

Theorem (The Fundamental Theorem of Linear Algebra)

For any matrix $A \in \mathbb{R}^{m \times n}$,

  1. $\operatorname{Null}(A) = \operatorname{Range}(A^T)^\perp$.
  2. $\operatorname{Null}(A^T) = \operatorname{Range}(A)^\perp$.

In other words, the row space and null space of $A$ are orthogonal complements in the domain of $A$, which is $\mathbb{R}^n$, and the column space and left null space of $A$ are orthogonal complements in the codomain of $A$, which is $\mathbb{R}^m$. Furthermore, $\operatorname{Range}(A)$ and $\operatorname{Range}(A^T)$ have the same dimension, namely the rank of $A$.
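The theorem can be checked numerically. The sketch below (a hypothetical example, assuming numpy) builds a rank-deficient matrix and verifies that the row space and null space are orthogonal and that their dimensions add up to $n$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A 5x4 matrix of rank 2, built as a product of thin factors.
B = rng.standard_normal((5, 2))
C = rng.standard_normal((2, 4))
A = B @ C

# The SVD splits the rows of Vt into a basis for Range(A^T)
# (rows paired with nonzero singular values) and a basis for Null(A).
_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
row_basis = Vt[:rank]
null_basis = Vt[rank:]

print(np.allclose(A @ null_basis.T, 0))          # True: A kills Null(A)
print(np.allclose(row_basis @ null_basis.T, 0))  # True: Null(A) ⟂ Range(A^T)
print(rank + null_basis.shape[0])                # 4 = n, as the theorem predicts
```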

Applications

Application 1. In linear regression, we needed to find the vector $\hat{y} \in \operatorname{Range}(X)$ that is closest to $y$. Geometrically, this is the same as requiring $\hat{y} - y$ to be orthogonal to $\operatorname{Range}(X)$. What is the set of vectors orthogonal to $\operatorname{Range}(X)$?

  1. Use the Fundamental Theorem of Linear Algebra to show that $\hat{y} - y$ is orthogonal to $\operatorname{Range}(X)$ if and only if $X^T (\hat{y} - y) = 0$.

If $\hat{y} = Xb$, then $X^T(\hat{y} - y) = 0$ can be rewritten as $X^T X b = X^T y$, which is the normal equation for least squares regression.
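A minimal sketch of the normal equations in action, using made-up data (assuming numpy): we solve $X^T X b = X^T y$ directly, confirm the residual is orthogonal to $\operatorname{Range}(X)$, and cross-check against numpy's least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: a line with intercept 2 and slope 3, plus noise.
t = rng.uniform(0, 1, size=20)
X = np.column_stack([np.ones_like(t), t])   # design matrix with intercept
y = 2.0 + 3.0 * t + 0.1 * rng.standard_normal(20)

# Solve the normal equations X^T X b = X^T y for b.
b = np.linalg.solve(X.T @ X, X.T @ y)

# The residual Xb - y is orthogonal to Range(X): X^T (Xb - y) = 0.
print(np.allclose(X.T @ (X @ b - y), 0))  # True

# Cross-check against numpy's built-in least-squares solver.
b_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(b, b_lstsq))  # True
```

In practice one solves the linear system (or uses a QR/SVD-based routine like `lstsq`) rather than forming $(X^T X)^{-1}$ explicitly, which is slower and less numerically stable.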

Application 2. To solve the normal equations for $b$, it helps if $X^T X$ is an invertible matrix. It turns out that $X^T X$ is invertible if and only if the columns of $X$ are linearly independent.

  1. For any $X \in \mathbb{R}^{m \times n}$, show that $\operatorname{Null}(X) = \operatorname{Null}(X^T X)$. Hint: To prove that $\operatorname{Null}(X^T X) \subseteq \operatorname{Null}(X)$, it helps to prove that if $b \in \operatorname{Null}(X^T X)$, then $\|Xb\| = 0$, which means that $Xb = 0$.
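A numerical illustration of this exercise (a constructed example, assuming numpy): when the columns of $X$ are dependent, the same vector lies in both $\operatorname{Null}(X)$ and $\operatorname{Null}(X^T X)$, and the two matrices have the same rank, hence the same nullity.

```python
import numpy as np

rng = np.random.default_rng(2)

# A rank-deficient X: the third column is the sum of the first two,
# so Null(X) is nontrivial.
X = rng.standard_normal((6, 2))
X = np.column_stack([X, X[:, 0] + X[:, 1]])

# b = (1, 1, -1) is in Null(X) by construction...
b = np.array([1.0, 1.0, -1.0])
print(np.allclose(X @ b, 0))        # True
# ...and also in Null(X^T X), as the exercise claims.
print(np.allclose(X.T @ X @ b, 0))  # True

# X and X^T X have the same rank, so the same nullity (here both 2).
print(np.linalg.matrix_rank(X) == np.linalg.matrix_rank(X.T @ X))  # True
```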

We ran out of time before we could answer these last two questions, but they are good linear algebra review questions:

  1. Why are the columns of $X$ linearly independent if and only if $\operatorname{Null}(X) = \{ 0 \}$?

  2. Why is $X^T X$ invertible if and only if $\operatorname{Null}(X^T X) = \{ 0 \}$? Hint: When is $X^T X$ onto? When is it 1-to-1?
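As a small sanity check on the invertibility claim (a constructed example, assuming numpy): with independent columns the Gram matrix $X^T X$ has full rank, while duplicating a column dependency makes it singular.

```python
import numpy as np

rng = np.random.default_rng(3)

# Independent columns: the 3x3 Gram matrix X^T X has full rank,
# so it is invertible.
X_good = rng.standard_normal((5, 3))
print(np.linalg.matrix_rank(X_good.T @ X_good))  # 3

# Dependent columns (third = first + second): X^T X is singular,
# and the normal equations no longer have a unique solution.
X_bad = X_good.copy()
X_bad[:, 2] = X_bad[:, 0] + X_bad[:, 1]
G = X_bad.T @ X_bad
print(np.linalg.matrix_rank(G))  # 2
```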