This week we’ll continue looking at least squares regression using linear algebra.
Last week, we introduced the four fundamental subspaces of a matrix. We ended with the following exercise that we didn’t have time to finish:
(Exercise) Find the range and null space of the 3-by-3 matrix \[e e^T = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.\]
(Exercise) Why is every column of \(A\) in \(\operatorname{Range}(A)\)?
(Exercise) Prove that \(\operatorname{Null}(A) \perp \operatorname{Range}(A^T)\).
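Since we ran out of time, here is one way to see the answers (a sketch, for anyone reviewing on their own). For the first exercise, every output of \(e e^T\) is a multiple of \(e\): \[(e e^T) x = e (e^T x) = (x_1 + x_2 + x_3)\, e,\] so \(\operatorname{Range}(e e^T) = \operatorname{span}\{ e \}\) and \(\operatorname{Null}(e e^T) = \{ x : x_1 + x_2 + x_3 = 0 \}\), a 2-dimensional plane. For the second, column \(j\) of \(A\) equals \(A e_j\), the image of the \(j\)th standard basis vector, so it lies in \(\operatorname{Range}(A)\). For the third, if \(x \in \operatorname{Null}(A)\) and \(v = A^T c \in \operatorname{Range}(A^T)\), then \(v^T x = (A^T c)^T x = c^T A x = c^T 0 = 0\).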
Two subspaces \(U, V \subseteq \mathbb{R}^n\) are called complementary if \(U \cap V = \{ 0 \}\) and \(U + V = \mathbb{R}^n\), i.e., the union \(U \cup V\) spans \(\mathbb{R}^n\). An equivalent condition is that \(\dim U + \dim V = n\) and \(\dim (U \cap V) = 0\).
The orthogonal complement of a set \(V \subseteq \mathbb{R}^n\) is the subspace \[V^{\perp} = \{ w \in \mathbb{R}^n : w^T v = 0 \text{ for all } v \in V \}.\]
Note that two subspaces \(U, V \subseteq \mathbb{R}^n\) are orthogonal complements if and only if they are orthogonal and complementary. In particular, if \(V\) is a subspace and \(U = V^{\perp}\), then \(V = U^{\perp}\), so \((V^{\perp})^{\perp} = V\).
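As a concrete example (tying back to the first exercise above), take \(V = \operatorname{span}\{ e \} \subseteq \mathbb{R}^3\). Then \[V^{\perp} = \{ w \in \mathbb{R}^3 : w^T e = 0 \} = \{ w : w_1 + w_2 + w_3 = 0 \},\] and indeed \(\dim V + \dim V^{\perp} = 1 + 2 = 3\) with \(V \cap V^{\perp} = \{ 0 \}\), so \(V\) and \(V^{\perp}\) are orthogonal and complementary.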
For any matrix \(A \in \mathbb{R}^{m \times n}\), \[\operatorname{Null}(A) = \operatorname{Range}(A^T)^{\perp} \quad \text{and} \quad \operatorname{Null}(A^T) = \operatorname{Range}(A)^{\perp}.\]
In other words, the row space and null space of \(A\) are orthogonal complements in the domain of \(A\), which is \(\mathbb{R}^n\), and the column space and left null space of \(A\) are orthogonal complements in the codomain of \(A\), which is \(\mathbb{R}^m\). Furthermore, \(\operatorname{Range}(A)\) and \(\operatorname{Range}(A^T)\) have the same dimension, which is the rank of \(A\).
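If you'd like to check these relationships numerically, here is a small sketch using NumPy and SciPy (the matrix \(A\) below is arbitrary, chosen only for illustration):

```python
import numpy as np
from scipy.linalg import null_space

# An arbitrary 3-by-4 example; the second row repeats the first, so rank(A) = 2.
A = np.array([[1., 2., 3., 4.],
              [1., 2., 3., 4.],
              [0., 1., 0., 1.]])
m, n = A.shape
rank = np.linalg.matrix_rank(A)

N = null_space(A)     # orthonormal basis for Null(A)
Nt = null_space(A.T)  # orthonormal basis for Null(A^T), the left null space

# Every row of A (i.e., every spanning vector of Range(A^T)) is orthogonal
# to every basis vector of Null(A), so A @ N should be numerically zero.
print(np.allclose(A @ N, 0))      # True: Null(A) is orthogonal to Range(A^T)
print(np.allclose(A.T @ Nt, 0))   # True: Null(A^T) is orthogonal to Range(A)

# Complementary dimensions: rank + dim Null(A) = n and rank + dim Null(A^T) = m.
print(rank + N.shape[1] == n)     # True
print(rank + Nt.shape[1] == m)    # True
```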
Application 1. In linear regression, we needed to find the vector \(\hat{y} \in \operatorname{Range}(X)\) that is closest to \(y\). Geometrically, this is the same as requiring \(\hat{y} - y\) to be orthogonal to \(\operatorname{Range}(X)\). What is the set of vectors orthogonal to \(\operatorname{Range}(X)\)? By the theorem above, it is exactly \(\operatorname{Null}(X^T)\).
If \(\hat{y} = Xb\), then the orthogonality condition \(X^T( \hat{y} - y) = 0\) can be rewritten as \(X^T X b = X^T y\), which gives the normal equations for least squares regression.
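To make this concrete, here is a short sketch with made-up data (assuming NumPy is available; the coefficients 2.0 and 0.5 below are invented for illustration) that solves the normal equations directly and checks the orthogonality condition:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: a noisy line y = 2 + 0.5 t, with an intercept column in X.
n = 20
t = rng.uniform(0, 10, size=n)
y = 2.0 + 0.5 * t + rng.normal(scale=0.3, size=n)
X = np.column_stack([np.ones(n), t])

# Solve the normal equations X^T X b = X^T y for the coefficients b.
b = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b

# The residual y_hat - y should be orthogonal to Range(X): X^T (y_hat - y) = 0.
print(np.allclose(X.T @ (y_hat - y), 0))
# A library least squares solver should recover the same coefficients.
print(np.allclose(b, np.linalg.lstsq(X, y, rcond=None)[0]))
```

(In practice, `np.linalg.lstsq` or a QR factorization is numerically preferable to forming \(X^T X\) explicitly, but solving the normal equations directly matches the derivation above.)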
Application 2. To solve the normal equations for \(b\), it helps if \(X^T X\) is an invertible matrix. It turns out that \(X^T X\) is invertible if and only if the columns of \(X\) are linearly independent.
We ran out of time before we could answer these last two questions, but they are good linear algebra review questions:
Why are the columns of \(X\) linearly independent if and only if \(\operatorname{Null}(X) = \{ 0 \}\)?
Why is \(X^T X\) invertible if and only if \(\operatorname{Null}(X^T X) = \{ 0 \}\)? Hint: When is \(X^T X\) onto? When is it 1-to-1?
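A hint that unlocks both questions, leaving the details for you to fill in: for any \(b \in \mathbb{R}^n\), \[b^T X^T X b = (Xb)^T (Xb) = \| Xb \|^2,\] so \(X^T X b = 0\) forces \(Xb = 0\). It follows that \(\operatorname{Null}(X^T X) = \operatorname{Null}(X)\), which connects the invertibility of \(X^T X\) to the linear independence of the columns of \(X\).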